I’ve been using the retrieve_dataframe function in cognite-sdk and am missing a way to get a dataframe of fixed size when making separate requests. When requesting aggregates of data points in a time series, nothing is returned for a granularity interval that contains no data points. For example, I have used:
df = cdfClientRef.datapoints.retrieve_dataframe(external_id = '', start = '10d-ago', end = 'now', aggregates = ['average'], granularity = '1d')
If there are no data points in that period, I get an empty dataframe. If there are no data points within a single day in this example, that row is missing from the output.
Empty DataFrame
Columns: [|average]
Index: []
If I instead request several time series, the missing data is returned as NaN to fill out the dataframe:
|average |average
2022-08-30 NaN 3803.468955
2022-08-31 NaN 3837.437594
2022-09-01 NaN 3826.125239
2022-09-02 NaN 3841.227053
2022-09-03 NaN 3834.526855
2022-09-04 NaN 3839.324081
2022-09-05 NaN 3829.446260
2022-09-06 NaN 3833.002560
2022-09-07 NaN 3853.340727
2022-09-08 NaN 3854.664126
It would be useful to have an option that ensures the output dataframe contains an entry for each requested granularity interval, so that instead of an empty dataframe the output would be:
|average
2022-08-30 NaN
2022-08-31 NaN
2022-09-01 NaN
2022-09-02 NaN
2022-09-03 NaN
2022-09-04 NaN
2022-09-05 NaN
2022-09-06 NaN
2022-09-07 NaN
2022-09-08 NaN
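Until such an option exists, padding the result on the client side is one possible workaround. The following is a minimal sketch using only pandas. It assumes the returned dataframe has a DatetimeIndex whose timestamps fall on the start of each granularity interval. `pad_to_granularity` is a hypothetical helper written for illustration and is not part of cognite-sdk. The explicit start and end dates are also illustrative.

```python
import pandas as pd


def pad_to_granularity(df, start, end, freq="D", columns=None):
    """Hypothetical helper: reindex df onto a complete time index.

    Intervals with no row in df appear as NaN rows in the result.
    """
    # Build every interval start from `start` up to, but not including, `end`.
    full_index = pd.date_range(start=start, end=end, freq=freq)
    full_index = full_index[full_index < pd.Timestamp(end)]
    # Keep the original columns unless the caller names them explicitly,
    # which is needed when the input frame has no columns at all.
    if columns is None:
        columns = df.columns
    return df.reindex(index=full_index, columns=columns)


# Stand-in for the empty result described above (no data points in the range).
empty = pd.DataFrame(columns=["|average"], index=pd.DatetimeIndex([]), dtype=float)
padded = pad_to_granularity(empty, "2022-08-30", "2022-09-09")
print(padded)
```

With a daily frequency and these dates, the padded frame has one row per day from 2022-08-30 through 2022-09-08, all NaN. Rows that do exist in the input are kept unchanged, provided their timestamps match the generated index exactly.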
Hi, Thomas
Great question! May I ask which SDK you are using? Which language?
Knut
Hi Knut,
I am using the Python SDK.
Thomas
Hi Thomas,
Thanks for the feedback on the Python SDK. We will have a look at how it can fit into our roadmap! We would love to get more feedback on the SDKs, so any additional input is greatly appreciated.
Regards,
Omar