Problem Statement
The Cognite DB Extractor currently requires space to be a static, query-level configuration parameter, forcing users to create multiple identical extraction queries when distributing data across spaces based on hierarchical entity relationships.
When working with hierarchical data structures, each distinct parent entity requires a separate extraction query, even if the source data and transformations are identical.
Asset hierarchy:
- L1: Assets
- L2: Area
- L3: Field (defines CDF space)
- L4: Installation
- L5: Well
Currently, if there are 10 distinct fields across multiple assets, the DB Extractor requires 10 separate extraction queries to distribute timeseries data across 10 spaces - one query per field, even though they all read from the same source table with identical transformations.
With multiple assets, areas, and fields, this results in hundreds of extraction queries for a single source table. This creates:
- Configuration Bloat: Hundreds of nearly identical YAML configurations to maintain
- Filter Duplication: Adding or modifying a filter condition requires updating hundreds of queries
- Scheduling Overhead: The same database table is scanned multiple times per ingestion cycle instead of once
- Resource Inefficiency: Hundreds of independent jobs running on the same data multiplies query load
- Error Management: One failed field extraction leaves only that space with stale data; manual recovery required per space
- Maintenance Burden: Updating logic across all queries is error-prone and time-consuming
- Scalability: Query count grows in step with the number of fields (one query per space), so every new field adds another near-identical query
Proposed Solution
Allow space to be derived dynamically from query result columns using template syntax, similar to how external_id is generated.
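To illustrate the intended behavior, here is a minimal Python sketch of per-row space resolution. It is not the DB Extractor's implementation or configuration syntax: the `${...}` placeholders come from Python's `string.Template`, and the column names (`field_id`, `well_id`), the template strings, and the function `group_rows_by_space` are all hypothetical. The point is only that a single query result can be fanned out to many spaces when the space is computed from each row.

```python
from collections import defaultdict
from string import Template

# Hypothetical templates: each ${name} refers to a column in the query result.
SPACE_TEMPLATE = Template("sp_field_${field_id}")
EXTERNAL_ID_TEMPLATE = Template("ts_${well_id}")


def group_rows_by_space(rows, space_template=SPACE_TEMPLATE):
    """Group query result rows by the space derived from each row's columns."""
    grouped = defaultdict(list)
    for row in rows:
        # substitute() raises KeyError if a referenced column is missing,
        # so a bad template fails loudly instead of writing to a wrong space.
        space = space_template.substitute(row)
        external_id = EXTERNAL_ID_TEMPLATE.substitute(row)
        grouped[space].append({"space": space, "external_id": external_id})
    return dict(grouped)


# One result set from one query, covering several fields.
rows = [
    {"field_id": "alpha", "well_id": "W1"},
    {"field_id": "alpha", "well_id": "W2"},
    {"field_id": "bravo", "well_id": "W3"},
]
result = group_rows_by_space(rows)
```

With this approach the source table is read once per ingestion cycle, and adding a new field requires no new query, only new rows carrying a different `field_id`.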
