@Jatin Sablok, there’s a built-in CDF function for this: https://docs.cognite.com/cdf/integration/guides/transformation/write_sql_queries#dataset_id

You pass the external_id and it resolves the id for you. It scales across CDF dev/stage/prod projects this way :)

Jason
@Risa Shereen Without knowing your exact use case (one-time batch or recurring incremental process), you could accomplish this with a combination of several approaches:

You could run the end-to-end data pipeline from the source systems into the new CDF project. This demonstrates the repeatability of your pipelines and is possibly the ‘easiest’ option.

Alternatively, you can use CDF Transformations to read from one CDF project and write to another. Have a look at the credentials for each Transformation: you’ll see that there are separate read and write credentials. This works for Assets, Time series, and Sequences metadata. To move the bulk data (time series datapoints, sequence rows, files), you will need to use the SDK; see the sketch below.

-Jason
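For the bulk data, here is a minimal sketch of the datapoints piece, assuming two authenticated CogniteClient instances (one per project) and that the time series already exist in the target project with matching external IDs; the external IDs below are hypothetical:

```python
from cognite.client import CogniteClient

# Hypothetical clients; each would be configured with its own project's credentials.
source = CogniteClient()  # reads from the source CDF project
target = CogniteClient()  # writes to the target CDF project

for xid in ["ts-CO2", "ts-H20"]:  # hypothetical time series external IDs
    # Retrieve the raw datapoints from the source project...
    dps = source.time_series.data.retrieve(external_id=xid, start=0, end="now")
    # ...and insert them under the same external_id in the target project.
    target.time_series.data.insert(list(zip(dps.timestamp, dps.value)), external_id=xid)
```

The pattern is the same for sequence rows and files (read with one client, write with the other), just via the sequences and files APIs instead.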
@Parth Sinha I would start here: https://docs.cognite.com/cdf/dashboards/guides/grafana/admin_oidc

There’s some work that may need to be done on the EntraID side.

-Jason
@Karina Saylema Functions limits are based on the underlying cloud provider limitations. These are documented here: https://docs.cognite.com/cdf/functions/technical

I’d like to understand your workflow intent for using Cognite Functions; I’d be happy to help with solutioning. Could you leverage CDF Transformations? Transformations can read from CDF Raw and write to many targets: CDF staging (Raw), CDF resource types, and data models. Could you partition your use case?

-Jason
@Karina Saylema I think what’s happening is that chunk.set_index() is resetting the index for each chunk, which results in the sequence rows being overwritten because the row indexes are the same as in the prior batch.

I’m not a pandas expert, but something like explicitly setting the index based on your batch size might be what you are looking for: https://pandas.pydata.org/docs/reference/api/pandas.RangeIndex.html

Hope this helps,
Jason
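For example, a minimal sketch of that idea, assuming a fixed batch size and an enumerated iterable of DataFrame chunks (batch_size, chunk_no, and chunks are all illustrative names, not from your code):

```python
import pandas as pd

batch_size = 10_000  # illustrative; match the chunk size you read with

for chunk_no, chunk in enumerate(chunks):  # "chunks" is your iterable of DataFrames
    start = chunk_no * batch_size
    # Give every chunk a unique, non-overlapping row index so a later chunk
    # does not overwrite the sequence rows written by an earlier one.
    chunk.index = pd.RangeIndex(start=start, stop=start + len(chunk))
    # ... insert the chunk's rows into the sequence here ...
```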
@Karina Saylema I think you can change the dtype of the column, or create a new string column based on the double column, in the pandas DataFrame.
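Something like this, as a minimal sketch with a hypothetical column name:

```python
import pandas as pd

df = pd.DataFrame({"value": [1.5, 2.0, 3.25]})  # hypothetical double column

df["value_str"] = df["value"].astype(str)  # new string column alongside the original
# or convert the column in place:
df["value"] = df["value"].astype(str)
```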
@Karina Saylema I believe you would run the following code on your input DataFrame before uploading to CDF:

```python
df.fillna('', inplace=True)
```

Essentially, replace the NaNs with '' (empty string). Let me know if this works :)

-Jason
@Satish Basa To create the “tags” in CDF, as an example, you could try something like:

```sql
select concat('ts-', name) as externalId, name as name
from values ('H20'), ('CO2') as tab(name)
```

Then you’ll run 2 transformations (one for CO2, one for H20):

```sql
select
  'ts-CO2' as externalId,  -- CHANGE this for H20
  -- you'll need to enter the correct format
  to_timestamp(datetime, 'FORMAT PATTERN') as timestamp,
  CO2 as value  -- CHANGE this for H20
from `_cdf`.your_cdf_raw_data_table
```

Hope this helps,
-Jason
@dalvaniamp, If you created this calculated time series via the SDK, you can set the asset_id property of the time series object at the time of creation: https://cognite-sdk-python.readthedocs-hosted.com/en/latest/time_series.html#create-time-series

If you created your calculated time series via Charts “save and schedule”, I believe that you will need to run a post-process (via SDK, contextualization pipeline, or equivalent) to set asset_id. If this is the case, it seems like a nice feature request to have these set at the time of creation 😉.

-Jason
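For the SDK path, a minimal sketch assuming cognite-sdk v7 and a hypothetical asset external_id:

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeriesWrite

client = CogniteClient()

asset = client.assets.retrieve(external_id="my-asset")  # hypothetical asset
client.time_series.create(
    TimeSeriesWrite(
        external_id="my-calculated-ts",   # hypothetical
        name="My calculated time series",
        asset_id=asset.id,                # link to the asset at creation time
    )
)
```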
@Rajendra Pasupuleti You have at least 2 options here:

You can rerun your data pipeline from source to Raw in the Prod environment. I would argue this is best practice, but maybe not ideal for your circumstances.

Alternatively, you can use a CDF Transformation to read from Raw staging in Dev and write to CDF Prod. CDF Transformations support a read credential as well as a write credential when you configure transformation scheduling. Whether this transformation is managed in Dev or Prod is up to you; see the sketch below.

Hope this helps,
Jason
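For the second option, a minimal sketch assuming cognite-sdk v7; all credential values, names, and the query are placeholders:

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import (
    OidcCredentials,
    TransformationDestination,
    TransformationWrite,
)

client = CogniteClient()

transformation = TransformationWrite(
    external_id="dev-raw-to-prod-assets",        # hypothetical
    name="Copy dev Raw to prod assets",
    destination=TransformationDestination.assets(),
    query="select ... from `my_db`.`my_table`",  # your transformation SQL
    # Read credential: scoped to the Dev project.
    source_oidc_credentials=OidcCredentials(
        client_id="...",
        client_secret="...",
        scopes="https://<cluster>.cognitedata.com/.default",
        token_uri="https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token",
        cdf_project_name="my-dev-project",
    ),
    # Write credential: scoped to the Prod project.
    destination_oidc_credentials=OidcCredentials(
        client_id="...",
        client_secret="...",
        scopes="https://<cluster>.cognitedata.com/.default",
        token_uri="https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token",
        cdf_project_name="my-prod-project",
    ),
)
client.transformations.create(transformation)
```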
@Sangavi M, I set it to an empty string (“”).

-Jason
@Shweta Sorry for the delay. It appears there may have been a bug in the SDK that is now fixed. Please try with the following code snippet:

```python
from cognite.client.data_classes import ExtractionPipelineRunWrite

run = ExtractionPipelineRunWrite(
    extpipe_external_id="csv:oee:datapoints",
    status="failure",
    message="custom error message",
)
result = client.extraction_pipelines.runs.create(run)
result
```

-Jason
@Sangavi M Please try upgrading to the latest Cognite Python SDK; I understand there was a bug. The following code snippet works for me with v7.22:

```python
from cognite.client.data_classes import ExtractionPipelineContact, ExtractionPipelineUpdate

update = ExtractionPipelineUpdate(external_id="csv:oee:datapoints")
update.contacts.set([ExtractionPipelineContact("jason", "jason.dressel@cognite.com", "sa", False)])
update.description.set("")
res = client.extraction_pipelines.update(update)
res
```

-Jason
@Shreya Pandey, Have a look at: https://docs.cognite.com/cdf/integration/guides/interfaces/setup_data_factory/

Hope this helps,
Jason
@Alex Narayanan, the DB Extractor now supports local spreadsheet files: https://docs.cognite.com/cdf/integration/guides/extraction/db/db_configuration#databases
@Shweta What version of the Python SDK are you using? You may need to upgrade. Try just using ExtractionPipelineRun.

-Jason
@Shweta Each extraction run can have its own message:
https://cognite-sdk-python.readthedocs-hosted.com/en/latest/data_ingestion.html#cognite.client._api.extractionpipelines.ExtractionPipelineRunsAPI.create
https://cognite-sdk-python.readthedocs-hosted.com/en/latest/data_ingestion.html#cognite.client.data_classes.extractionpipelines.ExtractionPipelineRunWrite

Is this what you need?

-Jason
@Dexter Nguyen Displaying custom metadata fields for Datasets in the UI is not possible today. Please share this feature request with @Aditya Kotiyal.

-Jason
@Neerajkumar Bhatewara @Gargi Bhosale, was Hakon able to answer your queries? Are there any outstanding questions?
@Dexter Nguyen Is there a potential race condition here? Would a thread try to upload datapoints before the time series is actually created? Given that you only observe this behavior during concurrent jobs, this seems likely. Is it possible to have 2 job types? First, create all the necessary time series objects (you can POST up to 1000 time series objects in a single request). Second, post the datapoints.

-Jason
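A minimal sketch of that split, assuming cognite-sdk v7; the names and counts are illustrative:

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeriesWrite

client = CogniteClient()

# Job type 1: create every time series up front. The SDK batches the POSTs
# (up to 1000 time series per request) for you.
client.time_series.create(
    [TimeSeriesWrite(external_id=f"sensor-{i}", name=f"Sensor {i}") for i in range(1000)]
)

# Job type 2: with the time series guaranteed to exist, concurrent jobs can
# now safely insert datapoints.
client.time_series.data.insert_multiple(
    [
        {"external_id": "sensor-0", "datapoints": [(1700000000000, 42.0)]},
        {"external_id": "sensor-1", "datapoints": [(1700000000000, 17.5)]},
    ]
)
```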
@Dietmar Winkler Can you please try again? I believe the links are active again: https://indsl.docs.cognite.com/contribute.html

Regards,
Jason
@VamsiGrandhi Have you followed the links below? Downloading the collection and setting up environments will come in very handy 😀. Looking at your setup, you need to change your scope parameter to be https://{{cluster}}.cognitedata.com/user_impersonation

https://api-docs.cognite.com/20230101/#section/Postman
https://developer.cognite.com/dev/guides/postman/

-Jason
@Stuart Donaldson, Unit support in Data Models is on the CDF roadmap. I’ll need @Everton Colling to provide an update on the timeline of capabilities.

Jason

PS: I understand the current focus is unit support for time series datapoints.
@Niranjan Madhukar Karvekar This is not currently supported, but it is on the short-term roadmap, scheduled for the October release.

-Jason
@Viswanadha Sai Akhil Pujyam, What’s helpful in these instances is to output the result of your transformation to a CDF Raw table. This way, you can query for instances where startTime is greater than endTime, which will help debug your transformation. Looking at your transformation, it does not explicitly filter out instances where start is greater than end; dealing with this can be tricky.

Jason
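A minimal sketch of that debugging step, assuming the transformation output lands in a Raw table (the db and table names here are hypothetical):

```python
from cognite.client import CogniteClient

client = CogniteClient()

# Pull the transformation's output rows from Raw into a DataFrame.
df = client.raw.rows.retrieve_dataframe(
    db_name="debug_db", table_name="transform_output", limit=None
)
# Inspect the offending instances where startTime is greater than endTime.
print(df[df["startTime"] > df["endTime"]])
```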