Sesam.io is a master data hub that simplifies the process of making up-to-date master data available in a data platform architecture.
Cognite CDF, well, you all know what it is, right?
And we are going to send some data from Sesam to CDF without creating or deploying any connectors, using only the out-of-the-box functionality that Sesam provides.
What we need:
- A Cognite CDF project configured with an OAuth2 identity provider, such as Azure Active Directory
- Credentials: client ID, client secret, OAuth2 token URL and a scope, with access capabilities that allow us to write to Raw tables
- A provisioned Sesam node with some data
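Before configuring anything in Sesam, it can be handy to check that the credentials actually work. The sketch below uses only the Python standard library and builds a standard OAuth2 client-credentials token request, which is the grant an identity provider such as Azure Active Directory accepts for service-to-service access. The function names build_token_request and fetch_token are made up for this illustration, and the token URL, client id, secret and scope are the placeholders from the list above, not real values.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope):
    """Build (but do not send) an OAuth2 client-credentials token request."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("utf-8")
    return urllib.request.Request(
        token_url,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def fetch_token(request):
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["access_token"]
```

Calling fetch_token(build_token_request(...)) with real values should return a bearer token if the client id, secret and scope are valid; an HTTP error at this stage usually points to a problem with the credentials rather than with Sesam.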
First, log in to our Sesam instance and create the endpoint system we will send data through (I assume you already have some data you want to send to CDF): choose “Systems” in the left-side menu and then “New system” on the top bar. We will use the CDF REST API and push data by making HTTP requests towards CDF. Sesam provides a built-in REST connector that supports basic and OAuth2 authentication.
system config
{
  "_id": "cdf-rest-connector",
  "type": "system:rest",
  "oauth2": {
    "client_id": "<oauth client id>",
    "client_secret": "<oauth client secret>",
    "scope": ["<oauth scope, e.g https://api.cognitedata.com/.default>"],
    "token_url": "<token url e.g https://login.microsoftonline.com/:tenant_id/oauth2/v2.0/token>"
  },
  "operations": {
    "push-to-raw": {
      "method": "POST",
      "payload-type": "json",
      "url": "api/v1/projects/{{properties.project}}/raw/dbs/{{properties.db_name}}/tables/{{properties.table_name}}/rows?ensureParent=true"
    }
  },
  "rate_limiting_delay": 5,
  "rate_limiting_retries": 3,
  "url_pattern": "https://api.cognitedata.com/%s",
  "verify_ssl": true
}
We defined one endpoint operation, “push-to-raw”, which multiple pipes can use to send data to different CDF projects, Raw databases and tables by substituting the curly-braced properties. We also defined a retry policy (three retries with a delay of 5) to guard against CDF rate limiting.
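To make the substitution concrete, here is a small Python sketch of how the final request URL comes together from the “url_pattern”, the operation “url” and a pipe's properties. It only imitates the placeholder replacement with a regular expression and is not Sesam's actual implementation; the function name render_url and the example property values are made up for illustration.

```python
import re

URL_PATTERN = "https://api.cognitedata.com/%s"
OPERATION_URL = (
    "api/v1/projects/{{properties.project}}/raw/dbs/{{properties.db_name}}"
    "/tables/{{properties.table_name}}/rows?ensureParent=true"
)


def render_url(url_pattern, operation_url, properties):
    """Fill {{properties.<name>}} placeholders, then apply the %s pattern."""
    def substitute(match):
        # Look up the placeholder name in the pipe's properties.
        return str(properties[match.group(1)])

    path = re.sub(r"\{\{\s*properties\.(\w+)\s*\}\}", substitute, operation_url)
    return url_pattern % path


example = render_url(
    URL_PATTERN,
    OPERATION_URL,
    {"project": "my-project", "db_name": "my-db", "table_name": "my-table"},
)
print(example)
```

A second pipe with different property values would produce a different URL from the same operation, which is why one system definition is enough for many targets.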
Then we need to connect this system with the data we want to send. We do this by creating a “pipe” that takes data from a Sesam dataset and pushes it through our REST connector.
pipe config
{
  "_id": "cdf-rest-endpoint",
  "type": "pipe",
  "source": {
    "type": "dataset",
    "dataset": "dataset-with-data-we-want-to-send"
  },
  "sink": {
    "type": "rest",
    "system": "cdf-rest-connector",
    "operation": "push-to-raw",
    "properties": {
      "db_name": "<our-raw-database>",
      "project": "<our-cdf-project>",
      "table_name": "<our-raw-table>"
    }
  },
  "transform": {
    "type": "dtl",
    "rules": {
      "default": [
        ["filter",
          ["not", "_S._deleted"]
        ],
        ["add", "payload",
          ["dict", "items",
            ["list",
              ["apply", "make_payload", "_S."]
            ]
          ]
        ]
      ],
      "make_payload": [
        ["add", "key", "_S._id"],
        ["add", "columns", "_S."]
      ]
    }
  }
}
The purpose of this pipe is to fetch data from an existing dataset, here called “dataset-with-data-we-want-to-send”, and shape it as required by both the Sesam REST sink and the CDF Raw API. We also filter out deleted entities with the “filter” function. Sesam keeps track of deleted entities, but we don’t necessarily need them in CDF.
The “properties” section contains the attributes that replace the placeholders in our REST system, so we can point many pipes at one REST connector.
The “make_payload” transformation simply uses the Sesam entity id, denoted by _id, as the Raw row key and then puts all of the source entity’s columns into the output entity.
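For readers who find DTL hard to read at first, the following Python sketch mirrors what the “default” and “make_payload” rules do to a single entity. It is a simplified imitation, not how Sesam evaluates DTL: it assumes the entity is a plain dictionary, copies every property (including the underscore-prefixed ones) into “columns”, and ignores any bookkeeping properties Sesam adds to the output on its own. The sample entity is invented.

```python
def make_payload(entity):
    """Mirror of the "make_payload" rule: one CDF Raw row per entity."""
    return {"key": entity["_id"], "columns": dict(entity)}


def transform(entity):
    """Mirror of the "default" rule: drop deleted entities, wrap the rest."""
    if entity.get("_deleted"):
        return None  # filtered out, nothing is sent to CDF
    return {"payload": {"items": [make_payload(entity)]}}


sample = {"_id": "pump-1", "_deleted": False, "name": "Feed pump", "site": "A"}
result = transform(sample)
print(result)
```

The value under “payload” has the shape the CDF Raw API expects when inserting rows: an “items” list whose elements each carry a “key” and a “columns” object.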
Well, that’s it. Now you can press the “start” button and relax in your comfortable chair while the data is being pushed into CDF.