Cognite extractors for streaming real-time data into CDF

Hi, I wanted to understand whether Cognite data extractors can stream real-time data into CDF (with or without delay) for time series analysis, or do we need to schedule the extractor and pull a specific number of rows every time it runs?

2 replies

Userlevel 2

Hey @vaibhavsancheti25, if you are referring to the PI extractor, it is specifically designed to run continuously and stream data.

From the documentation:

The Cognite PI extractor connects to the OSISoft PI Data Archive and detects and streams time series into Cognite Data Fusion (CDF) in near real-time. In parallel, the extractor ingests historical data (backfill) to make all time series available in CDF. The PI points in the PI Data Archive correspond to the time series in CDF.

Userlevel 1

Roman has provided the solution for continuous streaming of data from PI (the no-delay option).


My solution is for the “delay” scenario, where data is not streamed continuously into CDF. In our use case, the time series data was stored in an Oracle database. We used the DB extractor to set up the data pipeline and configured it to extract data incrementally, so each run picks up only the rows added since the previous run. We run the extractor every 10 minutes, which gives us near real-time data in CDF.

You can find the documentation for the extractor at About the Cognite DB extractor | Cognite Documentation

In the configuration section (Configuration settings | Cognite Documentation), you can find instructions on how to set up incremental extraction.
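
To make the idea of an incremental extract concrete, here is a minimal Python sketch of the general pattern: remember the highest timestamp already extracted and ask only for newer rows on the next run. This is not the DB extractor's code or configuration. The table name `readings`, its columns, and the function `extract_increment` are made up for illustration, and an in-memory SQLite database stands in for Oracle.

```python
import sqlite3


def extract_increment(conn, last_seen):
    """Return rows with ts > last_seen, plus the updated high-water mark.

    Hypothetical helper: it only illustrates the incremental pattern.
    """
    rows = conn.execute(
        "SELECT ts, tag, value FROM readings WHERE ts > ? ORDER BY ts",
        (last_seen,),
    ).fetchall()
    # If nothing new arrived, keep the previous mark unchanged.
    new_mark = rows[-1][0] if rows else last_seen
    return rows, new_mark


# Stand-in source database (assumption: the real source would be Oracle).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (ts INTEGER, tag TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, "pump1", 10.0), (2, "pump1", 10.5)],
)

# First scheduled run: everything is new.
first_batch, mark = extract_increment(conn, last_seen=0)

# A new row arrives before the next scheduled run.
conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (3, "pump1", 11.0))

# Second scheduled run: only the row added since the first run is returned.
second_batch, mark = extract_increment(conn, mark)

# Third run with no new data: empty batch, mark unchanged.
third_batch, mark = extract_increment(conn, mark)
```

In a real scheduled setup the high-water mark has to survive between runs (for example in a file or a table). With the DB extractor, incremental behavior is set up through its configuration rather than written by hand, so the configuration documentation above is the place to check the actual settings.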

One last note: we run the extractor using the Windows Task Scheduler.

Please let me know if you have any additional questions.