
Compute timeseries standard deviation with synthetic timeseries

  • July 22, 2025
  • 6 replies
  • 58 views

Hello,

I’m looking to compute the standard deviation of a timeseries on the fly with synthetic timeseries.

I expected to use this pseudocode formula: sqrt(avg(pow(TS{externalid}-avg(TS{externalid}),2))) with this endpoint:

client.time_series.data.synthetic.query(
    expressions=expression,
    start="2w-ago",
    end="now"
)

Unfortunately, avg expects at least 2 inputs. I tried switching to the aggregate feature, but I found it is available only for timeseries, not for synthetic timeseries.

expression = '''
sqrt(
avg(
pow(
ts{ID} - ts{ID, aggregate="average", granularity="14d"},
2
)
)
)'''

Do you have any tips or a workaround to compute this value when the “start” value changes? Don't hesitate to explain.

I'm open to any opportunity to calculate this metric using another method.

Thanks in advance,

Pierre

 

Edit: I found this function in an additional library: Rolling standard deviation of data points time delta (indsl 8.7.0 documentation), but I’m looking for an answer without an additional package, if possible.



Mithila Jayalath
Seasoned Practitioner

@RAMBOURG Pierre I’ll check on this and get back to you with an update.


matiasholte
Practitioner
  • Backend developer
  • Answer
  • July 25, 2025

Hi Pierre!
Thank you for a great question!
Unfortunately, synthetic time series are limited to calculations at single points in time.
avg() is thus a tool for averaging a list of inputs at a single point in time, not for averaging across time.

We do have the variance available as a regular aggregate function, but we have not exposed it in synthetic, as it is not well defined what value to use when there is a gap in the data for one of the input time series.

If you just want the standard deviation, you could fetch the variance, and then sqrt() it locally. Aggregation periods without data will be omitted from the output.
Hope this helps
Matias
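The suggestion above can be sketched in Python roughly as follows. This is a minimal sketch, not a confirmed recipe: it assumes `client` is an authenticated `CogniteClient`, that the variance aggregate is requested under the name `discrete_variance`, and that the returned object exposes `timestamp` and `discrete_variance` lists. The function name `stddev_from_variance` and the `14d` default are illustrative only.

```python
import math


def stddev_from_variance(client, external_id, start="2w-ago", end="now",
                         granularity="14d"):
    """Fetch the variance aggregate and take the square root locally.

    Assumption: `client` behaves like an authenticated CogniteClient and
    "discrete_variance" is the aggregate name for the variance.
    """
    dps = client.time_series.data.retrieve(
        external_id=external_id,
        start=start,
        end=end,
        aggregates="discrete_variance",  # assumed aggregate name
        granularity=granularity,
    )
    # Aggregation periods without data are omitted from the output,
    # so only periods that actually have a variance appear here.
    return [
        (timestamp, math.sqrt(variance))
        for timestamp, variance in zip(dps.timestamp, dps.discrete_variance)
    ]
```

Because the square root is taken client-side, nothing here depends on synthetic time series; each returned pair is the start of an aggregation period (as returned by the API) and the standard deviation over that period.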
 


Thanks @matiasholte, I'll try that. I have a good feeling about it!


@matiasholte Your solution seems to work, thanks. But I have a question about the following example:

I was expecting only one value in the result, so why are there 2? Why doesn’t the “2w” granularity match the 2w span of data requested?


matiasholte
Practitioner
  • Backend developer
  • August 11, 2025

@RAMBOURG Pierre The 2 return values come from rounding.
A 2w granularity must start at midnight in the given time zone, while start="2w-ago" is not rounded and depends on the time of the request.

In your example, you sent the request during the day of 2025-08-05.
2w-ago was sometime during the day of 2025-07-22. That is rounded down to the start of that day (midnight).
2w after that is the start of 2025-08-05, which is some hours before the end (now), so that aggregate is included too.

See documentation here: https://developer.cognite.com/dev/concepts/aggregation/#aggregation-in-cognite-data-fusion
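The rounding described above can be reproduced with a small, standalone model. This is only an illustration of the arithmetic, not the actual service logic: it assumes UTC, and the 10:00 request time is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def bucket_starts(start, end, days=14):
    """Return the start of each aggregation bucket overlapping [start, end)."""
    # Round the start down to midnight (UTC is assumed in this sketch).
    current = start.replace(hour=0, minute=0, second=0, microsecond=0)
    step = timedelta(days=days)
    starts = []
    while current < end:
        starts.append(current)
        current += step
    return starts


# Hypothetical request time during the day of 2025-08-05.
now = datetime(2025, 8, 5, 10, 0, tzinfo=timezone.utc)
two_weeks_ago = now - timedelta(weeks=2)

buckets = bucket_starts(two_weeks_ago, now)
print([b.date().isoformat() for b in buckets])
# → ['2025-07-22', '2025-08-05']
```

The second bucket starts at midnight on 2025-08-05, a few hours before `now`, which is why two aggregates come back for a single 2w window.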
 

If you only want one result, you can:
1. Set limit=1.
2. Set end="1d-ago" (or "13d-ago", for that matter).
3. Ignore the later results.
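As a rough sketch of the first option, the request could pass `limit=1` and read only the first aggregate. The same assumptions apply as for any call against the Python SDK here: `client` is taken to be an authenticated `CogniteClient`, `discrete_variance` is assumed to be the variance aggregate name, and `first_bucket_stddev` is a hypothetical helper name.

```python
import math


def first_bucket_stddev(client, external_id, granularity="14d"):
    """Return the standard deviation of the first aggregation bucket only.

    Assumption: `client` behaves like an authenticated CogniteClient and
    "discrete_variance" is the aggregate name for the variance.
    """
    dps = client.time_series.data.retrieve(
        external_id=external_id,
        start="2w-ago",
        end="now",
        aggregates="discrete_variance",  # assumed aggregate name
        granularity=granularity,
        limit=1,  # option 1: keep only the first aggregate
    )
    values = dps.discrete_variance
    if not values:
        # Buckets without data are omitted, so the result may be empty.
        return None
    return math.sqrt(values[0])
```

Options 2 and 3 need no extra parameters: either pass an earlier `end`, or keep the request unchanged and read only the first element of the result.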

Best,

Matias


@matiasholte sorry for the delay, many thanks for your detailed answer, you rock!