Realtime use cases vs near realtime use cases

Question

I was introduced to CDF in the last 2 weeks. It is very interesting, and I am still learning my way through it.

Just wondering if CDF has been used for real-time monitoring use cases, as opposed to predictive/BI-type use cases. For example, how soon can alerts be triggered after a new abnormal measurement is detected? I imagine it depends on a lot of factors, such as network latency, source extractors, and transform functions and schedules. Is there a threshold (say, a 1-hour response time) beyond which CDF can be used?

Einar Omang · Accepted Answer

CDF itself is quick to display new data. For datapoint extractors we generally aim for latency below 10 seconds, though it depends heavily on the source system. In practice we often see latency of around 2 seconds, typically due to buffering: the source system buffers and releases data once per second, and the extractor does something similar, so the actual latency falls between some minimum (200-500 ms) and maximum (2000-2500 ms).
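To make the buffering arithmetic above concrete, here is a minimal sketch of a simplified latency model. It assumes each buffering stage flushes on a fixed interval and that there is some fixed base delay for network and processing; the function name and parameters are hypothetical and not part of CDF or any extractor.

```python
from typing import Sequence, Tuple


def latency_bounds_ms(
    base_ms: float, flush_intervals_ms: Sequence[float]
) -> Tuple[float, float]:
    """Estimate best and worst case end-to-end latency in milliseconds.

    Simplified model (an assumption, not measured behaviour):
    - base_ms is the fixed network and processing delay.
    - Each stage holds data until its next periodic flush.
    Best case: the datapoint arrives just before every flush, so only
    the base delay applies. Worst case: it arrives just after every
    flush and waits a full interval at each stage.
    """
    best = base_ms
    worst = base_ms + sum(flush_intervals_ms)
    return best, worst


if __name__ == "__main__":
    # Two stages (source system and extractor), each flushing once per second.
    for base in (200, 500):
        print(base, latency_bounds_ms(base, [1000, 1000]))
```

With a base delay of 200 to 500 ms and two one-second buffers, this model gives a worst case of 2200 to 2500 ms, roughly in line with the range quoted above.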

CDF timeseries are immediately consistent, so any ingested datapoints are immediately available. Of course, you would have to factor in the time to fetch the data and how frequently your use case fetches it from CDF, which may add another second or two.
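As a rough illustration of how the fetch step contributes to alerting delay, the sketch below checks the latest datapoint against a threshold and reports how old that datapoint was when it was detected. The `fetch_latest` callable is a hypothetical stand-in for whatever client call your use case uses to read the newest datapoint from CDF; it is an assumption, not a real SDK function.

```python
from typing import Callable, Optional, Tuple

# (timestamp in seconds, value); a hypothetical shape for one datapoint.
Datapoint = Tuple[float, float]


def check_for_alert(
    fetch_latest: Callable[[], Datapoint],
    threshold: float,
    now_s: float,
) -> Optional[float]:
    """Return the detection delay in seconds if the latest value is abnormal.

    fetch_latest is a hypothetical callable that returns the newest
    datapoint. The delay is the time between the datapoint's timestamp
    and the moment this check runs, so it includes extractor latency,
    fetch time, and however long the poller waited between checks.
    Returns None when the value does not exceed the threshold.
    """
    timestamp_s, value = fetch_latest()
    if value > threshold:
        return now_s - timestamp_s
    return None


def worst_case_alert_delay_s(
    ingest_max_s: float, poll_interval_s: float, fetch_s: float
) -> float:
    """Upper bound on alert delay: worst ingestion latency, plus a full
    polling interval of waiting, plus the time the fetch itself takes."""
    return ingest_max_s + poll_interval_s + fetch_s
```

Under these assumptions, a 2.5 second worst-case ingestion latency, a 1 second polling interval, and a 0.5 second fetch give an upper bound of 4 seconds from measurement to alert.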

This all depends a lot on the source system and extractor.

Alex Narayanan · Answer

Einar, thanks for your answers, this helps a lot!
