Gathering Interest

Cognite Data Fusion: Data extract enhancement request

Related products: Search and Data Exploration
  • Marius Biermann
  • Lukas Volker

As a data validation end user, I need the ability to extract data from CDF and compare it to the source data platform. Currently in CDF, there is no option to “select all”, so all events have to be manually extracted for download out of CDF. Additionally, when the metadata is downloaded, it comes through as a .json file, which is not easily comparable to other system extracts. I need the file to be extracted from CDF in a user-friendly format such as CSV so that it can be compared to the source data for easy data integrity validation.

3 replies

Anita Hæhre
Seasoned Practitioner
  • Head of Academy and Community
  • 590 replies
  • May 31, 2022
Updated idea status: New → Gathering Interest

Noah Karsky
Practitioner
  • Data Engineer
  • 6 replies
  • December 9, 2022

It sounds like you are downloading individual events by clicking the download button. I see the pain of not having a way to download bulk data directly in the Fusion UI; however, there are several other methods you can use to explore data in CDF.

Power BI would be a good option for someone who does not wish to code.

Some code approaches:
I've tackled a similar problem by leveraging the Cognite Spark Data Source (some info in Cognite Docs), where you can execute SQL to read the entirety of a particular resource type. The metadata is still in a similar format, but following the instructions here should flatten it out into columns. This approach works best if you are able to use Databricks/PySpark.

If you do not have access to Databricks/PySpark, a similar workflow can be accomplished using the Python SDK: you can list all of a particular resource type, retrieve it as a pandas DataFrame, and adapt the code snippet below to flatten the metadata.

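# Import pandas (cognite-sdk is also required); `c` is assumed to be an authenticated CogniteClient
import pandas as pd
# List every event (limit=-1 removes the cap) and convert the result to a DataFrame
cdf_df = c.events.list(limit=-1).to_pandas()
# Flatten the metadata dicts into one column per key; events without metadata become empty rows
cdf_df = pd.concat([cdf_df.drop('metadata', axis=1), pd.json_normalize(cdf_df['metadata'].apply(lambda m: m if isinstance(m, dict) else {}).tolist())], axis=1)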
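Since the original request was for a CSV file, here is a minimal sketch of the same flattening idea using only the Python standard library, ending with CSV output. It is not taken from the SDK or the docs: the helper names `flatten_event` and `events_to_csv`, the `metadata.` column prefix, and the sample records are all hypothetical, and the records are plain dicts standing in for events retrieved from CDF.

```python
import csv
import io


def flatten_event(event: dict, prefix: str = "metadata.") -> dict:
    """Return a copy of `event` with its metadata dict promoted to top-level keys."""
    flat = {k: v for k, v in event.items() if k != "metadata"}
    # Events without metadata (missing key or None) contribute no extra columns
    for key, value in (event.get("metadata") or {}).items():
        flat[prefix + key] = value
    return flat


def events_to_csv(events: list, out) -> list:
    """Write flattened events to the file-like object `out` and return the header."""
    rows = [flatten_event(e) for e in events]
    # Union of all keys, in first-seen order, so every metadata field gets a column
    header = list(dict.fromkeys(k for row in rows for k in row))
    writer = csv.DictWriter(out, fieldnames=header, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return header


# Hypothetical sample records standing in for events pulled from CDF
sample_events = [
    {"id": 1, "type": "maintenance", "metadata": {"site": "A", "priority": "high"}},
    {"id": 2, "type": "inspection", "metadata": {"site": "B"}},
    {"id": 3, "type": "inspection"},
]

buffer = io.StringIO()
columns = events_to_csv(sample_events, buffer)
csv_text = buffer.getvalue()
```

To write a file instead of an in-memory buffer, pass `open("events.csv", "w", newline="")` as `out`. With the Python SDK, the input list could probably be built from something like `[e.dump() for e in c.events.list(limit=-1)]`, assuming `dump()` returns a dict with a `metadata` key in your SDK version.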


 


 



How about enabling better search and download capabilities in the CDF UI/UX?



