Question

Monthly Aggregation: End date discrepancy

  • 25 June 2024
  • 7 replies

Hi,

I am using the time zone and calendar granularity feature (Release timezone and calendar features in DatapointsAPI (beta) by haakonvt · Pull Request #1779 · cognitedata/cognite-sdk-python · GitHub).

I am facing an issue with the following call:

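from datetime import datetime
from zoneinfo import ZoneInfo

# `client` is an authenticated CogniteClient; `list` is my variable holding the external IDs
dps_lst = client.time_series.data.retrieve_dataframe_in_tz(
    external_id=list,
    start=datetime(2023, 6, 24, tzinfo=ZoneInfo("America/New_York")),
    end=datetime(2024, 6, 27, tzinfo=ZoneInfo("America/New_York")),
    aggregates="average",
    granularity="1month",
)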

 

The end date is given as 27 June 2024. I expect the June 2024 aggregate to be calculated over 1 June to 27 June, but the output I receive covers 1 June to 30 June. The function does not take the “end” date into account when calculating the last aggregate.

 


7 replies

Userlevel 3

Yes, the time range is extended to whole intervals.
If you want to retrieve a subset of the interval, you can use a finer granularity. In this case, 27 days (or 3 days for 24th to 27th).

Badge

Kindly note that the question is asking for data over a 1-year period from 2023 to 2024. Could you please clarify the date ranges over which the data will be aggregated?

Userlevel 3

My mistake.
With the API, we aggregate over whole periods, in this case, whole months. Every month from June 2023 to June 2024 (inclusive), to be specific.
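To make the "whole months" behaviour concrete, here is a minimal sketch that lists the calendar months touched by a query range. It is only an illustration: `covered_months` is a hypothetical helper, not part of the SDK, and it assumes naive datetimes and an exclusive `end`.

```python
from datetime import datetime


def covered_months(start: datetime, end: datetime) -> list[tuple[int, int]]:
    """Return (year, month) for every calendar month that overlaps [start, end)."""
    if start >= end:
        raise ValueError("start must be before end")
    months = []
    year, month = start.year, start.month
    # `end` is treated as exclusive, so an end exactly on the first of a
    # month does not pull that month in.
    while datetime(year, month, 1) < end:
        months.append((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


months = covered_months(datetime(2023, 6, 24), datetime(2024, 6, 27))
print(len(months), months[0], months[-1])
```

For the range in the question this gives 13 months, from June 2023 through June 2024, each of which is aggregated in full.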

An alternative is to query for

  • 7day granularity with start = June 24th 2023 and end=July 1st 2023
  • 1month granularity with start=July 1st 2023 and end=June 1st 2024
  • 26day granularity with start=June 1st 2024 and end=June 27th 2024
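The three queries above can be derived from any start and end date. The sketch below is one way to do it, not an SDK feature: `split_range` is a hypothetical helper, it assumes `start` and `end` fall on midnight so the partial months are whole numbers of days, and the granularity strings follow the spelling used in the list above.

```python
from datetime import datetime

Segment = tuple[datetime, datetime, str]  # (start, end, granularity)


def month_start(dt: datetime) -> datetime:
    """Midnight on the first day of dt's month."""
    return dt.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def next_month_start(dt: datetime) -> datetime:
    """Midnight on the first day of the month after dt's month."""
    first = month_start(dt)
    if first.month == 12:
        return first.replace(year=first.year + 1, month=1)
    return first.replace(month=first.month + 1)


def split_range(start: datetime, end: datetime) -> list[Segment]:
    """Split [start, end) into leading partial month, whole months, trailing partial month."""
    if start >= end:
        raise ValueError("start must be before end")
    first_full = start if start == month_start(start) else next_month_start(start)
    last_full = month_start(end)
    if first_full > last_full:
        # start and end lie inside the same month: a single partial segment
        return [(start, end, f"{(end - start).days}day")]
    segments: list[Segment] = []
    if start < first_full:
        segments.append((start, first_full, f"{(first_full - start).days}day"))
    if first_full < last_full:
        segments.append((first_full, last_full, "1month"))
    if last_full < end:
        segments.append((last_full, end, f"{(end - last_full).days}day"))
    return segments


for segment in split_range(datetime(2023, 6, 24), datetime(2024, 6, 27)):
    print(segment)
```

Each returned tuple can then be passed as `start`, `end` and `granularity` to a separate retrieve call.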

Hi,

For the above-mentioned example, monthly granularity is needed and expected.

For example:

1-30 June 2023

1-31 July 2023, and so on.

But when it comes to June 2024, the function must respect the “end” datetime of 27 June.

So the output for June 2024 must cover 1-27 June 2024 (and ignore 28-30 June).

Can the end date be given priority over the last day of the month?

Userlevel 3

Note that the call you are making, .retrieve_dataframe_in_tz, does not use the API for monthly aggregation; it uses a client-side aggregation implemented in the SDK. This method has now been deprecated, so please use the alternative call

from datetime import datetime
from zoneinfo import ZoneInfo

dps_lst = client.time_series.data.retrieve_dataframe(
    external_id=list,
    start=datetime(2023, 6, 24),
    end=datetime(2024, 6, 27),
    timezone=ZoneInfo("America/New_York"),
    aggregates="average",
    granularity="1month",
)

to use the API to perform the aggregation instead. On the newest version of the SDK there should have been a warning telling you not to use the .retrieve_dataframe_in_tz method.

Userlevel 4

Just a quick comment: After the official release, 

...the following SDK methods use the API directly:

  • client.time_series.data.retrieve
  • client.time_series.data.retrieve_arrays
  • client.time_series.data.retrieve_dataframe

...while the method client.time_series.data.retrieve_dataframe_in_tz, which predates API support, does this “client side”. I’d recommend that you start using the other methods, as this one is deprecated (as per the warning) and will be removed in due time.

 

On the topic of respecting start and end time in the first and last aggregate interval: this has never been supported in either the API or the SDK, and is something you would need to do yourself (and then stitch together with the full-length in-between intervals).
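As a rough illustration of the stitching step, the sketch below fetches each segment separately and merges the rows in time order. Everything here is an assumption for illustration: `stitch` is a hypothetical helper, and `fetch` stands in for whatever wrapper you write around one of the retrieve methods, returning plain (timestamp, value) rows.

```python
from datetime import datetime
from typing import Callable

Row = tuple[datetime, float]  # (interval start, aggregate value)
Segment = tuple[datetime, datetime, str]  # (start, end, granularity)


def stitch(
    segments: list[Segment],
    fetch: Callable[[datetime, datetime, str], list[Row]],
) -> list[Row]:
    """Fetch each segment on its own and merge all rows in time order."""
    rows: list[Row] = []
    for seg_start, seg_end, granularity in segments:
        # `fetch` is a placeholder for your own call into the SDK
        rows.extend(fetch(seg_start, seg_end, granularity))
    return sorted(rows, key=lambda row: row[0])
```

The partial first and last intervals then appear as their own rows next to the whole-month rows, so the last value only reflects data up to the requested end.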

Userlevel 4

Hi @RincyC,

We are following up to see whether you're satisfied with the responses you've received? 
