CDF Data model

Question

We have a unique use case in which data points are collected from a PI server for a refinery plant, alongside other refinery data such as crude assays, diet, and mass balance. All of these data points (in tabular format) are collected, and formulas are then applied to them to obtain properties (such as swing cut %, CBLISS, Vol%, etc.) and to create derived tables. An input feed built from the actual data is then sent to Petro-Sim (a mathematical simulation and modeling tool), and the output from that tool is collected and stored. After all this data wrangling, charts are produced for some 50+ crude product variants to track yield (actual data, non-linear model data, linear model data). By design, we plan to use CDF RAW tables, do all the computations there, and create derived RAW tables to fulfil the purpose. We would then use the Petro-Sim connector and store the output from the modeling tool in RAW tables as well.

So we do not see the typical CDF data constructs such as assets, sequences, work orders, maintenance data, files, events, labels, 3D diagrams, etc. Our data model therefore looks quite unusual.

Can someone tell us whether our approach to implementing this in CDF is right, or whether our interpretation is wrong?

Please share pointers and any guides on how to design and architect a solution model.


Best answer by Everton Colling 20 June 2023, 14:56


Everton Colling · Accepted Answer

Hi Eashwar! Data models are quite flexible: anything that you represent in RAW can be represented in a data model as well. With data models, you also have the capability to create custom types and to define relationships and dependencies between them, which can simplify data consumption.

We are working on a library of sample data models to help our customers get started, but that’s not available today. To help you get started, here’s a very simple example of a crude diet data model that contains time-based entries for the crude feed of distillation units. You can easily extend it by adding more properties and relationships according to the data available and the use case requirements.

    type CrudeAssay {
        name: String!
        id: Int!
        volume: Float
        volumeFraction: Float
        mass: Float
        massFraction: Float
    }

    type CrudeDiet {
        timestamp: Timestamp
        refinery: String
        distillationUnit: String
        diet: [CrudeAssay]
    }

For more information about data models, please check the documentation: https://docs.cognite.com/cdf/data_modeling/guides/create_dm/
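As a rough illustration of how rows shaped like the CrudeAssay and CrudeDiet types might be assembled in Python before being loaded into a data model, here is a minimal sketch. The dataclasses only mirror the example types; the helper `fill_volume_fractions`, the snake_case field names, and the sample values are hypothetical, and no CDF SDK calls are shown.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class CrudeAssay:
    # Mirrors the CrudeAssay type; only name and id are required.
    name: str
    id: int
    volume: Optional[float] = None
    volume_fraction: Optional[float] = None
    mass: Optional[float] = None
    mass_fraction: Optional[float] = None


@dataclass
class CrudeDiet:
    # Mirrors the CrudeDiet type: one time-based entry per distillation unit.
    timestamp: datetime
    refinery: str
    distillation_unit: str
    diet: List[CrudeAssay] = field(default_factory=list)


def fill_volume_fractions(diet: CrudeDiet) -> CrudeDiet:
    """Set volume_fraction on each assay as its share of the total volume.

    Hypothetical helper: raises ValueError if any volume is missing or the
    total volume is not positive.
    """
    volumes = [assay.volume for assay in diet.diet]
    if any(v is None for v in volumes):
        raise ValueError("every assay in the diet needs a volume")
    total = sum(volumes)
    if total <= 0:
        raise ValueError("total diet volume must be positive")
    for assay in diet.diet:
        assay.volume_fraction = assay.volume / total
    return diet


# Sample values are made up for illustration only.
example = CrudeDiet(
    timestamp=datetime(2023, 6, 20, tzinfo=timezone.utc),
    refinery="Refinery A",
    distillation_unit="CDU-1",
    diet=[
        CrudeAssay(name="Crude X", id=1, volume=300.0),
        CrudeAssay(name="Crude Y", id=2, volume=100.0),
    ],
)
fill_volume_fractions(example)
```

Keeping the derived fraction next to the raw volume in one record is the kind of dependency that a typed data model makes explicit, where a RAW table would hold it as an untyped column.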

Everton Colling · Answer

Hi Eashwar! The use case you are describing is an extremely valuable one, and it’s definitely solvable using Cognite Data Fusion.

I see that you are planning to rely on RAW tables, but I would not recommend that for this case. RAW is intended as a staging area that holds data dumped from extractors until it is transformed into clean data (data stored in high-performance CDF resources). I suggest creating a well-defined data model for the source data and another one for the “calculated data”, to store the multiple attributes and their dependencies. Any time-indexed data should be stored in time series (which are referenced in the data model).

To populate the data models with the source data, you can use Transformations or Cognite Functions. To populate the data model for the calculations (the formulas that generate the new “tables”), I suggest using Cognite Functions so that you can include error handling, logging, and data quality checks.

When it comes to the CDF Petro-SIM connector, it’s an early adopter capability that is enabled for selected customers. If you are interested, we can schedule a meeting to discuss enrolling in the early adopter program.
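To make the suggestion about error handling, logging, and data quality checks more concrete, here is a minimal sketch of a calculation step in plain Python. It is named `handle` after the entry point that Cognite Functions look for, but everything else is an assumption for illustration: the record fields, the `check_record` helper, and the placeholder yield formula (product volume over feed volume) are hypothetical, the `client` argument is unused, and writing the results back to CDF is left out.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger("yield_calculations")

# Hypothetical input fields; a real pipeline would use its own schema.
REQUIRED_FIELDS = ("timestamp", "unit", "feed_volume", "product_volume")


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of data quality problems (empty if the record is usable)."""
    problems = [f"missing {key}" for key in REQUIRED_FIELDS if record.get(key) is None]
    if problems:
        return problems
    for key in ("feed_volume", "product_volume"):
        if not isinstance(record[key], (int, float)):
            problems.append(f"{key} is not a number")
    if problems:
        return problems
    if record["feed_volume"] <= 0:
        problems.append("feed_volume must be positive")
    if record["product_volume"] < 0:
        problems.append("product_volume must not be negative")
    return problems


def handle(data: Dict[str, Any], client: Any = None) -> Dict[str, Any]:
    """Compute a placeholder yield (vol%) per record and report rejected records."""
    results, rejected = [], []
    for record in data.get("records", []):
        problems = check_record(record)
        if problems:
            # Log and keep going, so one bad row does not stop the whole run.
            logger.warning("Rejected record %r: %s", record, "; ".join(problems))
            rejected.append({"record": record, "problems": problems})
            continue
        results.append({
            "timestamp": record["timestamp"],
            "unit": record["unit"],
            "yield_vol_pct": 100.0 * record["product_volume"] / record["feed_volume"],
        })
    logger.info("Computed %d results, rejected %d records", len(results), len(rejected))
    return {"results": results, "rejected": rejected}
```

Returning the rejected records together with the reasons, instead of dropping them silently, makes it possible to trace gaps in the yield charts back to specific source rows.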
