Cognite Data Fusion: Data extract enhancement request

Related products: Search and Data Exploration

As a data validation end user, I need the ability to extract data out of CDF and compare it to the source data platform. Currently in CDF, there is no option to "select all", so every event has to be selected manually for download. Additionally, when the metadata is downloaded, it comes through as a .json file, which is not easily comparable to other system extracts. I need the file to be extracted from CDF in a user-friendly format such as CSV so that it can be compared to source data for easy data integrity validation.

Updated idea status: New → Gathering Interest

It sounds like you are downloading individual events by clicking the download button. I do see the pain of not having a way to download data in bulk directly from the Fusion UI; however, there are several other methods you can use to explore data in CDF.

Power BI would be a good option for someone who does not wish to code.

Some code approaches:
I've tackled a similar problem by leveraging the Cognite Spark Data Source (some info in Cognite Docs), where you can execute SQL to read the entirety of a particular resource type. The metadata is still in a similar format, but following the instructions here should flatten it out into columns. This approach works best if you are able to use Databricks/PySpark.

If you do not have access to Databricks/PySpark, a similar workflow can be accomplished using the Python SDK: you can list all of a particular resource type, retrieve it as a pandas DataFrame, and adapt the code snippet below to flatten the metadata.

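# Import pandas and the Cognite SDK
import pandas as pd
from cognite.client import CogniteClient
c = CogniteClient()  # assumes client credentials/configuration are set up elsewhere
cdf_df = c.events.list(limit=-1).to_pandas()  # limit=-1 retrieves all events
cdf_df = pd.concat([cdf_df.drop('metadata', axis=1), pd.json_normalize([m if isinstance(m, dict) else {} for m in cdf_df['metadata']])], axis=1)  # events without metadata become empty rows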
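
Once the metadata is flattened, the remaining steps in the original request are writing a CSV and comparing it to the source extract. The following is only a minimal, dependency-free sketch of those steps on plain dictionaries. The helper names (flatten_events, to_csv_text, diff_by_key), the "metadata." column prefix, the use of external_id as the matching key, and the sample records are all assumptions made for illustration and are not part of CDF or the SDK.

```python
import csv
import io

def flatten_events(events):
    """Copy each event dict, lifting 'metadata' keys to top-level 'metadata.<key>' fields."""
    rows = []
    for event in events:
        row = {k: v for k, v in event.items() if k != "metadata"}
        # Events with missing or empty metadata simply contribute no extra columns.
        for key, value in (event.get("metadata") or {}).items():
            row[f"metadata.{key}"] = value
        rows.append(row)
    return rows

def to_csv_text(rows):
    """Serialize rows to CSV text; the header is the union of all keys in first-seen order."""
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def diff_by_key(cdf_rows, source_rows, key="external_id"):
    """Return keys missing on either side, and keys whose source fields differ in CDF."""
    cdf = {r[key]: r for r in cdf_rows}
    src = {r[key]: r for r in source_rows}
    missing_in_cdf = sorted(src.keys() - cdf.keys())
    missing_in_source = sorted(cdf.keys() - src.keys())
    mismatched = sorted(
        k for k in cdf.keys() & src.keys()
        if any(str(cdf[k].get(f, "")) != str(v) for f, v in src[k].items())
    )
    return missing_in_cdf, missing_in_source, mismatched

# Hypothetical sample data standing in for a CDF extract and a source-system extract.
sample_events = [
    {"external_id": "ev-1", "type": "workorder", "metadata": {"site": "A", "priority": "2"}},
    {"external_id": "ev-2", "type": "alarm", "metadata": None},
]
source_rows = [
    {"external_id": "ev-1", "type": "workorder", "metadata.site": "B"},
    {"external_id": "ev-3", "type": "alarm"},
]

flat = flatten_events(sample_events)
csv_text = to_csv_text(flat)
missing_in_cdf, missing_in_source, mismatched = diff_by_key(flat, source_rows)
```

With real data, the list of event dictionaries would come from the SDK or from the downloaded .json, and csv_text could be written to a file for comparison in a spreadsheet. Values are compared as strings here, which is a simplification; numeric or timestamp fields may need normalizing first.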

How about enabling a better search and download capability with UI/UX in CDF?