Product Ideas Pipeline

1290 Ideas

Luciana Velasco Medani
Practitioner ⭐️⭐️⭐️

Feature Request: Individualized Pagination Endpoint for Time Series Datapoints API
Gathering Interest

Hi,

It would be great to have a new endpoint based on the one below:
https://api-docs.cognite.com/20230101/tag/Time-series/operation/getMultiTimeSeriesDatapoints

Currently, this endpoint allows requesting multiple items in a single call and returns the next page individually for each requested item.

For Celonis’ REST extractor, this pagination model creates a limitation, as the platform works better with individual pagination, where each response contains only one component and a single pagination continuation token.

Today, to enable this extraction, a Python-based workaround is being used within the environment, since the current model is not compatible with the standard extractor.

We would like to check whether the endpoint can be made available in a more individualized format, with one time series per response, or whether an alternative endpoint already supports this scenario. This would greatly improve the standardization and stability of the integration.

Sharing the user’s assessment of the available data extraction methods:

Extraction Builder
This solution, offered by Celonis, works very well for REST APIs. However, pagination is not supported for the current endpoint structure. To solve this, it would be enough to update the endpoint, or create a new one, so that it returns only one result per response instead of a nested list structure.

SDK
The SDK offers a flexible solution but carries higher implementation risk. It would require using an existing middleware layer to orchestrate the extractions, which is currently considered unfeasible due to resource constraints, access dependencies, and a longer development timeline compared to other alternatives.
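To illustrate the "one time series per response, one continuation token" model requested above, here is a minimal Python sketch. It does not call the real Cognite API: `fetch_page`, the page shape (`datapoints`, `nextCursor`) and the in-memory fake backend are assumptions made up for illustration only.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical page shape: {"datapoints": [...], "nextCursor": str or None}.
Page = Dict[str, object]
FetchPage = Callable[[str, Optional[str]], Page]


def iter_single_series(fetch_page: FetchPage, external_id: str) -> Iterator[object]:
    """Yield all datapoints for ONE time series, following a single cursor."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(external_id, cursor)
        yield from page["datapoints"]
        cursor = page.get("nextCursor")
        if not cursor:
            # No continuation token: this series is exhausted.
            break


def make_fake_fetch(data: Dict[str, List[object]], page_size: int) -> FetchPage:
    """In-memory stand-in for an HTTP call (assumption, for demonstration)."""

    def fetch_page(external_id: str, cursor: Optional[str]) -> Page:
        start = int(cursor) if cursor else 0
        end = start + page_size
        points = data[external_id]
        next_cursor = str(end) if end < len(points) else None
        return {"datapoints": points[start:end], "nextCursor": next_cursor}

    return fetch_page


fetch = make_fake_fetch({"ts1": [1, 2, 3, 4, 5], "ts2": [10, 20]}, page_size=2)
all_points = {ts: list(iter_single_series(fetch, ts)) for ts in ("ts1", "ts2")}
```

Each call handles exactly one series and one token, which is the shape a generic REST extractor with flat pagination can follow; the trade-off is more requests than a multi-item call.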

Anders Brakestad
Seasoned ⭐️⭐️⭐️

Atlas AI Query Tool: Resolve graph relations for deterministic agent evaluation
Gathering Interest

Hello :)

Use case
As an AI evaluation engineer, I want graph relations returned by the agent to be resolved into the underlying equipment and time series instance IDs in the structured JSON output, so that recall metrics reflect the instances the agent actually identified, not only the relation objects it retrieved.

Background
I am evaluating agents that retrieve process data from an industrial knowledge graph in Cognite Data Fusion. I use a simple Python-based recall evaluator that compares expected equipment/time series IDs against IDs returned in the agent’s structured JSON output.

Current behavior
The agent may identify the correct equipment or time series in the natural-language response, but the structured JSON sometimes only contains the relation objects it used to infer them. This is mostly invisible to the end user, because the answer can still be correct. The issue is automated evaluation: correct retrievals may be undercounted because the expected equipment/time series IDs are missing from the JSON. I am spending too much time prompt engineering around this. The agent can produce the desired JSON structure sometimes, but not in a stable or deterministic manner.

Desired behavior
When graph relations are returned, the agent should perform one or more final listInstances calls to resolve/dereference them into the underlying equipment and time series instance IDs. This would make the structured JSON reflect the natural-language response, enabling faster, cheaper, and more deterministic recall evaluation for quality control and governance.
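The undercounting described above can be shown with a small recall sketch in Python. The JSON shape used here (`instances`, `relations`, `source`, `target`, `externalId`) is a hypothetical stand-in, not the actual Atlas AI output format, and the "resolution" step simply reads IDs off the relation endpoints rather than calling listInstances.

```python
from typing import Dict, List, Set


def returned_ids(output: Dict[str, List[dict]], resolve_relations: bool = False) -> Set[str]:
    """Collect instance IDs from a (hypothetical) structured agent output."""
    ids = {inst["externalId"] for inst in output.get("instances", [])}
    if resolve_relations:
        # Dereference each relation into the IDs of its two endpoints.
        for rel in output.get("relations", []):
            ids.add(rel["source"])
            ids.add(rel["target"])
    return ids


def recall(expected: Set[str], returned: Set[str]) -> float:
    """Fraction of expected IDs that appear in the returned set."""
    if not expected:
        return 1.0
    return len(expected & returned) / len(expected)


expected = {"pump-1", "ts-1"}
agent_output = {
    "instances": [{"externalId": "pump-1"}],
    # The time series is only reachable through this relation object.
    "relations": [{"source": "pump-1", "target": "ts-1"}],
}

recall_raw = recall(expected, returned_ids(agent_output))
recall_resolved = recall(expected, returned_ids(agent_output, resolve_relations=True))
```

With unresolved relations the evaluator sees only half of the expected IDs, even though the agent effectively found both; resolving the relations before scoring closes that gap.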

Oussama ALLALI
Seasoned ⭐️⭐️

Transformations - Non-deterministic silent upsert when duplicate externalIds are split across workers: request for configurable deduplication strategy
Gathering Interest

Observed behavior
When a CDF Transformation produces rows with duplicate externalId values, the behavior depends on how CDF partitions the data across workers:

Duplicates within the same API request (same batch) → the API raises an error → the transformation fails visibly ✅
Duplicates split across multiple API requests (different CDF partitions/workers) → each request succeeds individually → the transformation completes with status "success", but which version of the node was actually written to the knowledge graph is unknown and non-deterministic ❌

Problem
The second case is the dangerous one:

Silent and invisible: the run reports success, no alert is triggered, no engineer investigates. The data in the knowledge graph may be incomplete or wrong with no trace.
Non-deterministic: which duplicate "wins" depends entirely on which CDF worker flushes first. Two identical runs on the same input can produce different results.
No user control: there is no way to express intent: "fail if duplicates exist", "always keep the latest", "deduplicate before write". The behavior is undefined and undocumented at the transformation level.

Feature Request
Add a conflict resolution option at the transformation level for duplicate externalId within a single run:

Option | Behavior
fail | Abort the run and report an error if any duplicate externalId is detected across all batches
keep_last | Last row processed wins
deduplicate_before_write | CDF deduplicates globally before dispatching to the API
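As a rough illustration of the requested option, the following Python sketch applies a strategy to the full row set before any batching. It is not how CDF Transformations work today; `resolve_duplicates` and `DuplicateExternalIdError` are hypothetical names. Only `fail` and `keep_last` are sketched, because the request does not say which row `deduplicate_before_write` would keep.

```python
from collections import Counter
from typing import Dict, List


class DuplicateExternalIdError(ValueError):
    """Raised by the 'fail' strategy when any externalId occurs more than once."""


def resolve_duplicates(rows: List[Dict], strategy: str) -> List[Dict]:
    """Apply a duplicate-handling strategy across ALL rows, before batching."""
    if strategy == "fail":
        counts = Counter(row["externalId"] for row in rows)
        dupes = sorted(key for key, n in counts.items() if n > 1)
        if dupes:
            raise DuplicateExternalIdError(f"duplicate externalId values: {dupes}")
        return list(rows)
    if strategy == "keep_last":
        latest: Dict[str, Dict] = {}
        for row in rows:
            # Later rows overwrite earlier ones with the same externalId.
            latest[row["externalId"]] = row
        return list(latest.values())
    raise ValueError(f"unknown strategy: {strategy!r}")


rows = [
    {"externalId": "a", "value": 1},
    {"externalId": "b", "value": 2},
    {"externalId": "a", "value": 3},
]
kept = resolve_duplicates(rows, "keep_last")
```

Because the check runs on the whole input rather than per request, the outcome no longer depends on how rows are partitioned across workers, provided the input order itself is deterministic.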