How-To: Use a CSV file to populate data points for a time series (data model) with the CDF DB Extractor

October 30, 2025

tmolbach
Practitioner

Objective

Get started with setting up the Cognite DB Extractor on Windows to upload CSV data points from a local file into CDF time series data model objects (CogniteTimeSeries).

Assumptions

Windows 10/11
You have an existing .env file from your CDF Toolkit setup
You have access to ingest data into CDF time series
CSV file located at: C:\Cognite\Data\csv\timeseries-values.csv

Example CSV file

Copy/paste this content to your CSV file if you need example data points:

"key","externalId","status","timestamp","value"
"ts_A1234.PD_1749064081000","ts_A1234.PD","Good","1749064081000","106.98"
"ts_A1234.PD_1749116640000","ts_A1234.PD","Good","1749116640000","114.41"
"ts_A1234.PD_1749127740000","ts_A1234.PD","Good","1749127740000","100.61"
"ts_A1234.PD_1749132780000","ts_A1234.PD","Good","1749132780000","105.52"
"ts_A1234.PD_1749134339000","ts_A1234.PD","Good","1749134339000","112.02"
"ts_A1234.PD_1749147420000","ts_A1234.PD","Good","1749147420000","96.53"
"ts_A1234.PD_1749148257000","ts_A1234.PD","Good","1749148257000","101.62"
"ts_A1234.PD_1749154741000","ts_A1234.PD","Good","1749154741000","99.96"

Notes:

The CSV must include columns: externalId, timestamp, and value.
The extractor inserts data points into the time series identified by the externalId.
If a time series does not exist, the extractor creates a minimal time series with only the externalId. Additional attributes (name, description, unit, etc.) must be added afterwards (see step 6).
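
Before running the extractor, it can help to sanity-check the file against the notes above. The following is a minimal sketch in Python using only the standard library. The helper name check_rows is hypothetical (it is not part of the DB Extractor), and the checks that timestamps are integers and values are numeric are assumptions based on the example data, not documented extractor requirements.

```python
import csv
import io

REQUIRED_COLUMNS = {"externalId", "timestamp", "value"}

def check_rows(csv_text):
    """Return a list of problems found in the CSV text (empty list means OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return [f"missing columns: {sorted(missing)}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not (row["externalId"] or ""):
            problems.append(f"line {line_no}: empty externalId")
        if not (row["timestamp"] or "").isdigit():
            problems.append(f"line {line_no}: timestamp is not an integer")
        try:
            float(row["value"] or "")
        except ValueError:
            problems.append(f"line {line_no}: value is not numeric")
    return problems

# Two rows copied from the example CSV above
sample = '''"key","externalId","status","timestamp","value"
"ts_A1234.PD_1749064081000","ts_A1234.PD","Good","1749064081000","106.98"
"ts_A1234.PD_1749116640000","ts_A1234.PD","Good","1749116640000","114.41"
'''
print(check_rows(sample))
```

To check the real file, read it with open(path, newline="", encoding="utf-8").read() and pass the text to check_rows.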

Explanation (Steps)

1 — Download and Extract

Download the Windows DB Extractor executable from CDF:
Data management → Extractors → Search for: DB Extractor

Extract to:

C:\Cognite\db-extractor\

2 — Copy file with environment variables

Place your .env file in the same folder as the extractor executable, e.g.:

C:\Cognite\db-extractor\.env

Your .env contains the environment variables the extractor reads, at a minimum:

CDF_URL=https://<your-cdf-host>

CDF_PROJECT=<your-project>

IDP_CLIENT_ID=…

IDP_CLIENT_SECRET=…

IDP_TOKEN_URL=https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token

IDP_SCOPES=https://<your-cluster>.cognitedata.com/.default
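
A quick way to confirm that the .env file defines all six variables listed above is sketched below. The function names parse_env and missing_keys are hypothetical, the parser only handles plain KEY=VALUE lines (no quoting or export prefixes), and the values in the example text are placeholders.

```python
REQUIRED_KEYS = [
    "CDF_URL", "CDF_PROJECT", "IDP_CLIENT_ID",
    "IDP_CLIENT_SECRET", "IDP_TOKEN_URL", "IDP_SCOPES",
]

def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")  # split on the first "=" only
        values[key.strip()] = value.strip()
    return values

def missing_keys(text):
    """Return the required keys that are absent or empty in the .env text."""
    values = parse_env(text)
    return [key for key in REQUIRED_KEYS if not values.get(key)]

# Placeholder content, deliberately incomplete
example = """
CDF_URL=https://example-cluster.cognitedata.com
CDF_PROJECT=my-project
IDP_CLIENT_ID=abc
"""
print(missing_keys(example))
```

For the example text this reports the three IDP variables that were left out. For a real check, pass the contents of C:\Cognite\db-extractor\.env instead.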

3 — Create Configuration File

Save the following as C:\Cognite\db-extractor\dbextractor-csv-to-timeseries-config.yml and update the destination space (springfield_instances in this example) to match your configuration:

# Configuration template for the Cognite DB Extractor version 2.x

# Config schema version (this template is for v2)
version: 2

# Configure logging to standard out (console) and/or file. 
logger:
  console:
    level: INFO

# Information about CDF project
cognite:
  host: ${CDF_URL}
  project: ${CDF_PROJECT}

  idp-authentication:
    client-id: ${IDP_CLIENT_ID}
    token-url: ${IDP_TOKEN_URL}
    secret: ${IDP_CLIENT_SECRET}

    scopes:
      - ${IDP_SCOPES}

databases:
  -
    type: spreadsheet
    name: local-csv-file
    # path to your local spreadsheet file
    path: C:\Cognite\Data\csv\timeseries-values.csv

queries:
  -
    # User-given name for query
    name: test-local-csv-file
    # Name of database to use (as specified in the databases section)
    database: local-csv-file
    # Name of the Excel sheet you want to query.
    sheet: Sheet1

    # The extractor expects SQL syntax in order to query data from a local spreadsheet
    query: >
      SELECT
      *
      FROM
      Sheet1

    # Where to upload data: time series in the Core Data Model
    destination:
      type: time_series
      destination_mode: cdm
      data-model:
        space: springfield_instances

Time series objects are created as CogniteTimeSeries objects in the Core Data Model.
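
The configuration above refers to environment variables through ${NAME} placeholders, so a mismatch between the config and the .env file is an easy mistake to make. The sketch below lists placeholders that have no matching variable. It is an illustration only: unresolved_placeholders is a hypothetical helper, and the regular expression is an assumption about the placeholder syntax based on the config shown here, not a description of how the extractor resolves variables internally.

```python
import re

# Matches ${NAME} where NAME looks like an environment variable name
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

def unresolved_placeholders(config_text, env):
    """Return names used as ${NAME} in config_text that are missing from env."""
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if name not in env]

# Fragment of the configuration above
config_text = """
cognite:
  host: ${CDF_URL}
  project: ${CDF_PROJECT}
"""
print(unresolved_placeholders(config_text, {"CDF_URL": "https://example"}))
```

With only CDF_URL defined, the fragment reports CDF_PROJECT as unresolved. Passing os.environ as env checks against the variables of the current shell session.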

Best practice tip: Keep a local stub of the configuration and manage the rest of the config in a CDF extractor pipeline: https://docs.cognite.com/cdf/integration/guides/interfaces/configure_integrations

4 — Run the Extractor

cd C:\Cognite\db-extractor
.\dbextractor-standalone-v3.9.0-win32.exe dbextractor-csv-to-timeseries-config.yml

5 — Verify in CDF

Open CDF → Industrial Tools → Search
Click ⚙️ (cog wheel) → Cognite Core Data Model
Select Time series → Filter: "name is not set"
You should see a time series with externalId ts_A1234.PD (but no name or description) and a few data points around June 5, 2025.
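
To compare the data points shown in CDF with the CSV, the timestamp column can be converted to readable dates. This is a minimal sketch assuming the timestamps are milliseconds since the Unix epoch, which is consistent with the example values. The function name ms_to_utc is hypothetical.

```python
from datetime import datetime, timezone

def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# First and last timestamps from the example CSV
timestamps_ms = [1749064081000, 1749154741000]
for ms in timestamps_ms:
    print(ms, ms_to_utc(ms).isoformat())
# 1749064081000 2025-06-04T19:08:01+00:00
# 1749154741000 2025-06-05T20:19:01+00:00
```

The example data therefore spans the evening of June 4 to the evening of June 5, 2025 in UTC. The dates shown in the CDF user interface may be shifted according to your local time zone.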

6 — What now? Populate the time series attributes?

After ingesting the data points, you can use a CDF transformation to populate the time series with additional attributes (name, description, unit, etc.):

Open CDF → Data modeling → Transformations → Create transformation
Set the source to a RAW database and table that contain the time series metadata
Map metadata to time series properties
Configure the transformation to upsert or update the time series
Run the transformation so that your time series objects are fully populated

Tip: You typically populate the time series objects first and then load data points later as a continuous process.

[Screenshot: example of running the database extractor]

More about configuring the DB Extractor can be found here: https://docs.cognite.com/cdf/integration/guides/extraction/configuration/db