
How-To: Getting started with the CDF File Extractor from local folder to data model

  • October 29, 2025

tmolbach
Practitioner

Objective

Set up the CDF File Extractor on Windows so that it uploads files from a local folder into Cognite Data Fusion (CDF) as CogniteFile objects in the Core Data Model.

 

Assumptions

  • Windows 10/11

  • Local folder with files to upload: C:\Cognite\Data\pdf-files

  • You have an existing .env file from CDF Toolkit setup

  • You are using and extending the Cognite Core Data Model 

 

Explanation (Steps)

 

1 — Download and Extract

  1. Download the Windows File Extractor executable from CDF: Data management → Extractors → Search for: File Extractor.

  2. Extract to:

    C:\Cognite\file-extractor\

 

2 — Copy file with environment variables

Place your .env file in the same folder as the extractor executable, e.g.:

C:\Cognite\file-extractor\.env

The .env file must contain at least the following environment variables, which the extractor reads and the configuration file in step 3 references:

CDF_URL=https://<your-cdf-host>
CDF_PROJECT=<your-project>
IDP_CLIENT_ID=…
IDP_CLIENT_SECRET=…
IDP_TOKEN_URL=https://login.microsoftonline.com/ee...04/oauth2/v2.0/token
IDP_SCOPES=https://<your-cluster>.cognitedata.com/.default
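
A quick way to catch a missing or empty variable before running the extractor is a small pre-flight check. The following is a minimal Python sketch, not part of the extractor: it assumes the .env file uses plain KEY=VALUE lines, and the helper names parse_env_text and missing_vars are made up for this example.

```python
from pathlib import Path

# Variable names taken from the .env listing above.
REQUIRED_VARS = [
    "CDF_URL",
    "CDF_PROJECT",
    "IDP_CLIENT_ID",
    "IDP_CLIENT_SECRET",
    "IDP_TOKEN_URL",
    "IDP_SCOPES",
]


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes, if any (assumption: no escaped quotes).
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str]) -> list[str]:
    """Return the required variables that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    env_path = Path(r"C:\Cognite\file-extractor\.env")  # location from step 2
    if not env_path.exists():
        print(f"No .env file found at {env_path}")
    else:
        missing = missing_vars(parse_env_text(env_path.read_text(encoding="utf-8")))
        print("Missing variables:", ", ".join(missing) if missing else "none")
```

The check only confirms that each variable has a value; it does not validate the credentials themselves.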

3 — Create Configuration File

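Save the following as file-extractor-folder-config.yml in C:\Cognite\file-extractor\ (the folder you run the extractor from in step 4), and update the target space (space) and the folder path (path) to match your setup: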

# Configuration template for the Cognite File Extractor version 0.1.0

# Configure logging to standard out (console) and/or file.
logger:
  console:
    level: INFO

# Information about CDF project
cognite:
  # Read these from environment variables
  host: ${CDF_URL}
  project: ${CDF_PROJECT}

  idp-authentication:
    client-id: ${IDP_CLIENT_ID}
    token-url: ${IDP_TOKEN_URL}
    secret: ${IDP_CLIENT_SECRET}
    scopes:
      - ${IDP_SCOPES}

# Information about files to extract
files:
  # Destination mode for uploaded files. Whether to use the CDM data model or the classic file model.
  destination-mode: cdm
  data_model:
    space: springfield_instances

  # Information about file provider
  file-provider:
    type: local

    # For local files: Absolute or relative path to directory to watch
    path: C:\Cognite\Data\pdf-files

Files are created as CogniteFile objects in the Core Data Model.
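
Because the configuration pulls its credentials from ${...} placeholders, a misspelled variable name tends to show up only as an authentication error at run time. The sketch below is one way to cross-check the placeholders against the variables you have defined. It is an illustration only: the function names are hypothetical, the comment stripping is deliberately naive, and the extractor's own substitution logic may differ in details.

```python
import re

# Matches ${NAME} placeholders such as ${CDF_URL}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def referenced_vars(config_text: str) -> list[str]:
    """Return distinct placeholder names in order of first appearance."""
    names: list[str] = []
    for line in config_text.splitlines():
        # Naive comment handling: ignore everything after the first '#'.
        code = line.split("#", 1)[0]
        for name in PLACEHOLDER.findall(code):
            if name not in names:
                names.append(name)
    return names


def undefined_vars(config_text: str, env: dict[str, str]) -> list[str]:
    """Return placeholders that have no non-empty value in env."""
    return [name for name in referenced_vars(config_text) if not env.get(name)]


if __name__ == "__main__":
    sample = "cognite:\n  host: ${CDF_URL}\n  project: ${CDF_PROJECT}\n"
    # Only CDF_URL is defined here, so CDF_PROJECT is reported.
    print(undefined_vars(sample, {"CDF_URL": "https://example"}))  # prints ['CDF_PROJECT']
```

To use it on the real file, read file-extractor-folder-config.yml as text and pass in a dictionary built from your .env file.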

Best practice tip: Keep only a minimal local stub of the configuration and manage the rest of the config remotely in a CDF extraction pipeline: https://docs.cognite.com/cdf/integration/guides/interfaces/configure_integrations

 

4 — Run the Extractor

cd C:\Cognite\file-extractor

.\file-extractor-standalone-2.9.1-win32.exe .\file-extractor-folder-config.yml

 

5 — Verify in CDF

  1. Open CDF → Industrial Tools → Search

  2. Click ⚙️ (cog wheel) → Cognite Core Data Model 

  3. Uploaded files appear as File instances
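
To judge whether everything arrived, it helps to know how many files the source folder holds. The following Python sketch counts local files by extension so you can compare against what Search shows. It is a rough aid under assumptions: the function name summarize_folder is made up for this example, and whether the extractor descends into subfolders depends on its configuration, so the recursive flag is there for you to match your setup.

```python
from pathlib import Path


def summarize_folder(folder: Path, recursive: bool = True) -> dict[str, int]:
    """Count files under folder, grouped by lower-cased file extension."""
    pattern = "**/*" if recursive else "*"
    counts: dict[str, int] = {}
    for path in folder.glob(pattern):
        if path.is_file():
            ext = path.suffix.lower() or "(no extension)"
            counts[ext] = counts.get(ext, 0) + 1
    return counts


if __name__ == "__main__":
    folder = Path(r"C:\Cognite\Data\pdf-files")  # source folder from the assumptions
    if not folder.is_dir():
        print(f"Folder not found: {folder}")
    else:
        summary = summarize_folder(folder)
        for ext, count in sorted(summary.items()):
            print(f"{ext}: {count}")
        print("Total files:", sum(summary.values()))
```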

 

Note: To populate an extended data model (e.g. myFile extending CogniteFile) so that these files are visible in the extended model, use a transformation that reads CogniteFile and outputs myFile.

 

[Figure: Example of running the file extractor]

More about configuring the File Extractor can be found here: https://docs.cognite.com/cdf/integration/guides/extraction/configuration/file-extractor/