Skip to main content
Question

Support Aggregation of Mixed Numeric and String Time Series in SDK Calls

  • May 29, 2026
  • 1 reply
  • 6 views

Forum|alt.badge.img+7

 

Summary

Enable aggregation operations (such as step_interpolation) on requests containing both numeric and string time series in the same SDK call.

Currently, aggregation works correctly for numeric time series, but the request fails when a string/alphanumeric time series is included alongside numeric series. This limitation prevents users from retrieving synchronized operational datasets that combine process measurements and categorical/status information in a single query.

Problem Description

When using the Cognite SDK method retrieve_dataframe() with aggregation enabled, requests containing mixed data types fail if at least one time series is of type string.

Example:

  • Numeric-only aggregation → works correctly
  • Numeric + string aggregation → fails

The expected behavior for string time series using step_interpolation is to replicate the last known value across the aggregation interval, similar to how industrial historians such as PI System behave.

Example Use Case

Operational dashboards and analytics workflows frequently require:

  • Numeric process variables (temperature, pressure, flow, etc.)
  • Text/status variables (campaign, mode, operational state, equipment status)

These datasets must often be retrieved together using the same timestamp alignment and granularity.

Example:

df = client.time_series.data.retrieve_dataframe(
external_id=[
"numeric_ts_1",
"numeric_ts_2",
"string_ts_status"
],
start="6h-ago",
end="now",
aggregates="step_interpolation",
granularity="1h",
timezone="UTC-03:00"
)

Currently, including "string_ts_status" causes the aggregation request to fail.

Expected Behavior

For string time series:

  • step_interpolation should return the last known value within the aggregation window.
  • Mixed-type aggregation requests should succeed without breaking the entire query.
  • Returned DataFrames should preserve both numeric and textual columns properly aligned by timestamp.

Current Behavior

  • Aggregation succeeds for numeric time series only.
  • Aggregation fails when a string/alphanumeric time series is included.
  • Users must split requests into multiple calls and manually merge results afterward.

Proposed Enhancement

Implement support for:

  • Aggregation of string time series using step_interpolation
  • Mixed numeric and string aggregation in the same SDK/API request
  • Graceful handling of heterogeneous time series types

1 reply

Forum|alt.badge.img+7

We are aware of existing feature requests related to machine-state time series and predefined operational states (e.g. OPEN/CLOSED, ON/OFF/STANDBY). LINK

However, this request addresses a broader and different use case.

Our requirement is not limited to predefined machine states or specialized state-based time series types. Instead, we require support for generic string/alphanumeric time series aggregation together with numeric time series within the same SDK/API request.

Examples include:

  • Campaign identifiers
  • Operational modes
  • Product names
  • Routing identifiers
  • Free-text operational tags
  • External categorical process information

The key requirement is the ability to:

  • include string and numeric TS in the same retrieve_dataframe() call;
  • apply step_interpolation consistently;
  • preserve timestamp alignment across all series;
  • avoid failures caused by mixed data types.

This capability is especially important for migration and interoperability scenarios with systems such as PI System, where mixed-type aggregation workflows are already supported.

Expected Behavior
The expected behavior is to replicate the latest known string value until a new value is received, similarly to how industrial historians such as PI System handle step interpolation for categorical data.

In the example below, the value DSTNL is preserved across consecutive timestamps until the transition to DSTNP, demonstrating the expected step interpolation behavior for string values.