Solved

Data Streaming with CDF

  • 8 October 2021
  • 1 reply
  • 109 views

I am trying to stream data from CDF to Azure Event Hub with the Python SDK, but I cannot find anything related to streaming datasets. The only option I have found so far is

dps = c.datapoints.retrieve_latest(id=184691546499795)

which would need a trigger of some sort to keep it running. The data in CDF are time series from different types of sensors. Is there any documentation on streaming data from CDF that I could look at, or is streaming not really supported?

 

Best answer by Kjetil Halvorsen 11 October 2021, 18:54

Hi Niros.

We don’t offer native data push/stream capabilities in CDF yet. It is a capability that we have on our radar, but it is not part of our short-to-medium-term roadmap.

So you have to look at some variation of setting up an agent that polls data from CDF, keeps a watermark, and pushes to your destination. When polling data from CDF, it is worth noting the following:

  • The time series API is eventually consistent. That is, there may be a small delay from when a data point is published to CDF until it is queryable. Also, CDF does not guarantee that data points become available in sorted order. The consequence is that you probably want to let the data settle for a few seconds before you query for it. That is, your client should implement a “polling offset” so that each query only covers the time window up to “t - <polling offset>”, where t is the current time.
  • If your data source publishes historic data points, CDF does not have a good way of communicating that to you: there is no “last updated time” on the data points themselves. So, if your source is capable of historical updates, you need to balance the requirement of completeness (capturing all data point updates) against cost (the complexity of capturing historical changes).
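To make the watermark and polling offset concrete, here is a minimal Python sketch of such an agent. It is only an illustration under stated assumptions, not an official recipe: `PollingAgent`, `fetch`, `push`, and `run_forever` are hypothetical names, `fetch(start_ms, end_ms)` is assumed to return `(timestamp_ms, value)` pairs for the half-open window `[start_ms, end_ms)`, and `push` stands in for whatever sends a batch to your destination (for example an Event Hub producer). It does not try to capture historical updates behind the watermark.

```python
import time
from typing import Callable, Iterable, List, Tuple

Datapoint = Tuple[int, float]  # (timestamp in ms since epoch, value)


class PollingAgent:
    """Polls a time window, pushes the result, then advances a watermark."""

    def __init__(
        self,
        fetch: Callable[[int, int], Iterable[Datapoint]],  # hypothetical: wraps your CDF query
        push: Callable[[List[Datapoint]], None],           # hypothetical: wraps your destination
        start_ms: int,
        polling_offset_ms: int = 5000,
    ) -> None:
        self.fetch = fetch
        self.push = push
        self.watermark_ms = start_ms            # everything before this has been pushed
        self.polling_offset_ms = polling_offset_ms

    def poll_once(self, now_ms: int) -> int:
        """Query [watermark, now - offset) and return the number of points pushed."""
        end_ms = now_ms - self.polling_offset_ms  # let recent data settle first
        if end_ms <= self.watermark_ms:
            return 0  # nothing old enough to query yet
        # Sort locally, since arrival order is not guaranteed.
        points = sorted(self.fetch(self.watermark_ms, end_ms))
        if points:
            self.push(points)
        # Advance only after a successful push, so a failed push is retried
        # on the next poll (at-least-once delivery).
        self.watermark_ms = end_ms
        return len(points)


def run_forever(agent: PollingAgent, interval_s: float = 10.0) -> None:
    """Simple trigger loop; a scheduler or cron job would work as well."""
    while True:
        agent.poll_once(int(time.time() * 1000))
        time.sleep(interval_s)
```

In a real agent, `fetch` would wrap a data point query for the window in the Python SDK, and the watermark would be persisted somewhere durable so that a restart resumes where the previous run stopped. A larger polling offset trades latency for a lower risk of missing late-arriving points.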

Unfortunately, there is no simple recipe for this.

We will implement experimental support for streaming (including data points) in the community Java SDK (https://github.com/cognitedata/cdf-sdk-java) within the next month. That could probably give you some pointers on how to implement a similar agent in Python.

 
