Solved

Empty values load differently in Python SDK and DB-Extractor

  • 28 March 2024
  • 5 replies
  • 45 views

Hi, everyone

I'm trying to load sequences using the Python SDK by inserting dataframes. Some columns in the dataframe contain both NaN and numeric values.

Dataframe: 

 

When NaN values are loaded, they are uploaded as 'null'. I've tried replacing them with empty strings, but the column definitions get in the way: a column that holds numeric values must have the data type Double or Long, and empty strings conflict with that.
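To make the conflict concrete, here is a minimal pandas-only sketch. The column name `pressure` and its values are made up for illustration, since the original dataframe is not shown in the post. It suggests why empty strings clash with a numeric column: after `fillna('')` the column no longer holds a single type.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the real dataframe from the post is not available.
df = pd.DataFrame({"pressure": [1.5, np.nan, 3.0]})

# NaN is itself a float, so the column stays numeric.
print(df["pressure"].dtype)

# Filling with empty strings forces pandas to fall back to a generic
# object column that mixes numbers and strings.
filled = df.fillna("")
print(filled["pressure"].dtype)

# Collect the distinct Python types now present in the column.
value_types = sorted({type(v).__name__ for v in filled["pressure"]})
print(value_types)
```

A column like `filled["pressure"]` contains both numbers and strings, which is presumably what a Double or Long column definition rejects.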

This is the code to load a sequence:

 

Results in CDF using Python SDK:

 

However, when loading the sequence via the DB-Extractor, empty strings are uploaded as such and not as 'null'.

 

Is there any way to load via SDK so that NaN values are uploaded as empty and not 'null'? 

 

Regards,

Karina Saylema

@Aditya Kotiyal  @HanishSharma  @Jason Dressel  @Jairo Salaya @Liliana Sierra 

 

Best answer by Jason Dressel 29 March 2024, 15:24

5 replies

@Karina Saylema I believe you can run the following code on your input DataFrame before uploading to CDF. It replaces the NaNs with '' (empty string):

df.fillna('', inplace=True)

Let me know if this works :).

-Jason

Hi, I tried filling the NaNs with empty strings, but that conflicts with the numeric values in the same column, because all elements in a column must be of the same type. I got this error when filling NaN with empty strings:

 

@Karina Saylema I think you can change the dtype of the column or create a new string column based on the double column of the pandas DataFrame.
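As a rough illustration of the second option (a new string column derived from the double column), here is a minimal sketch. The column names `pressure` and `pressure_str` and the helper `to_string_column` are hypothetical, not taken from the thread, and the upload step itself is left out because the original upload code is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the numeric column with gaps.
df = pd.DataFrame({"pressure": [1.5, np.nan, 3.0]})


def to_string_column(series: pd.Series) -> pd.Series:
    """Render each number as text and each NaN as an empty string."""
    return series.map(lambda v: "" if pd.isna(v) else str(v))


# Keep the original double column and add a string version next to it.
df["pressure_str"] = to_string_column(df["pressure"])
print(df["pressure_str"].tolist())  # ['1.5', '', '3.0']
```

Every value in `pressure_str` is a string, so the column is uniform. The matching sequence column would presumably have to be defined with a string type rather than Double or Long for the empty values to be stored as empty strings.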

Hi @Karina Saylema,

Have you had a chance to try out Jason's suggestion?

Hi Jason, Dilini,

I already tried it and everything worked fine, thank you very much for your help :). 
