Cognite Replicator

  • October 4, 2022
  • 7 replies
  • 218 views

Pierre Pernot
Seasoned Practitioner

A common setup that we have at Cognite is three different environments: development, test, and production, with one dedicated Cognite Data Fusion (CDF) project for each. This setup allows us to develop new features without interfering with production data, and it makes the deployment of new features to production much safer, since they are tested beforehand.

Sometimes you still want the data you used during your tests to be available in your production environment. For example, if you extracted data from sensors and saved it as time series in CDF, you may want to keep this historical data when you move to production. The same goes for files, assets, data sets, functions, etc. Another example is replicating data from a full production CDF project to one of your customers' projects, which contains only a subset of the source project.

At Cognite, we have a Python package (available at https://pypi.org/project/cognite-replicator/) whose purpose is to copy selected resources from one project to another. It is quite easy to use: you only have to fill in a config file with some authentication information, the resources you want to replicate, and a few additional options, and then run it from a CI/CD pipeline. The time it takes depends on the data you replicate, but it is generally less than a minute. This package makes the lead time between development and production much shorter, and allows you to continuously add new features to your production environment, at scale.
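To give an idea of what this looks like, here is a minimal sketch using the package's Python modules directly (the project names, API-key environment variables, and client construction below are placeholders to adapt to your own setup; the README has the authoritative config file schema):

```python
import os

from cognite.client import CogniteClient
from cognite.replicator import assets, events, time_series

# Placeholder credentials and project names: adapt these to your setup.
client_src = CogniteClient(
    api_key=os.environ["SRC_API_KEY"],
    project="my-dev-project",
    client_name="replicator-example",
)
client_dst = CogniteClient(
    api_key=os.environ["DST_API_KEY"],
    project="my-prod-project",
    client_name="replicator-example",
)

# Copy the selected resource types from the source to the destination project.
assets.replicate(client_src, client_dst)
events.replicate(client_src, client_dst)
time_series.replicate(client_src, client_dst)
```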

This package is still being updated based on customer requests: if you would like to see a new feature in it, please let us know!

@Gaetan Helness, you also worked with the replicator package: any tips or best practices you would like to share?

7 replies

Ben Brandt
Seasoned
  • October 4, 2022

The Cognite Replicator has worked great for us for refreshing our Dev and QA environments from Prod. The only gap we found was that the annotations linking assets to 3D models or images were missing, but I have not verified whether this is still an issue.

Any bugs we find or suggestions we have can be submitted as an issue on the GitHub repo, like this one:
Add Support For Mapping of Annotations · Issue #174 · cognitedata/cognite-replicator (github.com)


Pierre Pernot
Seasoned Practitioner
  • Author
  • October 5, 2022

Thank you @Ben Brandt for that clarification!


APSHANKAR Sagar
Committed

Where do you run the Cognite Replicator? From the description in the link, I have the impression that it runs on some third computer or EC2 instance, separate from CDF.

Is there anyone who is using it from inside CDF itself, perhaps as a CDF Function?


  • Active
  • May 30, 2024

I tried to run the cognite-replicator script through a CDF Function, but it neither prints any error nor gives any output, so I cannot tell what exactly is happening at the end of the CDF Function.

It would be good to run it within a Cognite Function so that we can monitor it and keep track of runs.

Can anyone help here?
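For reference, here is roughly the shape of what I tried (a simplified sketch; the source project name and the way I pass credentials are placeholders, not my exact code):

```python
import os

from cognite.client import CogniteClient
from cognite.replicator import assets, time_series


def handle(client, data):
    # `client` is authenticated against the project the Function runs in;
    # I use it as the destination. The source client below is a placeholder.
    client_src = CogniteClient(
        api_key=os.environ["SRC_API_KEY"],
        project="my-dev-project",
        client_name="replicator-function",
    )

    print("Starting replication...")  # stdout should show up in the Function logs
    assets.replicate(client_src, client)
    time_series.replicate(client_src, client)
    print("Replication finished.")

    # Returning a value makes the result visible on the function call.
    return {"status": "done"}
```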


Sendil

Please let me know whether this Replicator can be used to replicate production CDF data to other data storage, such as an S3 bucket or Databricks. Will this replicator scale to copy that data in near real time, and does it provide fault tolerance?


APSHANKAR Sagar
Committed

Please let me know whether this Replicator can be used to replicate production CDF data to other data storage, such as an S3 bucket or Databricks. Will this replicator scale to copy that data in near real time, and does it provide fault tolerance?

Hello Sendil,

AFAIK, no, this isn't for that. It simply lets you copy CDF data objects from one project to another.


Sendil

OK, thanks for the response, Apshankar. I felt the same when I looked at it: this code is not scalable for production replication. We need an approach to replicate the data from the CDF data model to the client's S3 bucket.