Thanks @HaydenH . But I have to create these derived timeseries objects within a Cognite Function. I already have a function which creates the base timeseries and then adds the datapoints to the corresponding timeseries objects. I wanted to add the new timeseries as synthetic timeseries objects, but I don't want to give any other parameters such as start, end, granularity etc. I am basically creating the base objects t1, t2, t3, t4, t5 and, in the same sequence, I would like to add the subsequent objects T1, T2, T3 etc. as per the equations I gave above.
Hi @Håkon V. Treider , the dataframe that I listed is the one from which I need to create the timeseries objects. Each of the individual columns refers to an external ID, and the values in each column correspond to the datapoints against the date in the 'date' column. Based on this construct, how do I create the timeseries, add datapoints, and link each timeseries object to a specific asset while creating it? Upsert doesn't give that option at all. Please advise.
Hi @HaydenH When I run the same piece of code in my local VS Code, it runs successfully. It fails when I run the code in Jupyter notebooks in CDF online, so I think there is a problem in the CDF Jupyter notebooks. If there were a problem in the APIs or variables, it would fail and throw the same error in VS Code as well. In local VS Code, I pass parameters such as the client secret, tenant ID etc. and create the CogniteClient object. There seems to be a problem in the Jupyter notebook, where it doesn't create the client object in the right manner for the CDF project it is associated with.
Yes @Carin Meems . Thanks @Ivar Stangeby
Could you please share a clear example with steps? I don't see any such type or setting in CDF Charts in the front-end.
Hi @Knut Vidvei I would want to show that in the same chart itself. That is what my customer is asking for. If I have to work outside Charts, I can use core Python and other visualization tools to show those details. For example, this code shows it pretty well using core Python:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate example time series data
np.random.seed(42)
n_points = 100
timestamps = pd.date_range(start='2023-01-01', periods=n_points, freq='D')
values = np.random.normal(loc=50, scale=10, size=n_points)
values[20] = 150  # Adding an outlier
data = pd.DataFrame({'Timestamp': timestamps, 'Value': values})
data['Z-Score'] = (data['Value'] - data['Value'].mean()) / data['Value'].std()
outlier_threshold = 2
outliers = data[data['Z-Score'].abs() > outlier_threshold]

# Plotting
plt.figure(figsize=(10, 6))
plt.plot(data['Timestamp'], data['Value'], label='Time Series')
plt.scatter(outliers['Timestamp'], outliers['Value'], color='red', label='Outliers')
plt.xlabel('Timestamp')
plt.ylabel('Value')
plt.legend()
plt.show()
Thanks @Knut Vidvei. Is there a way to use outlier detection and visualise it in the same timeseries? I am not able to understand how to use that in CDF Charts. It should look like a box plot and show all the normal points, with the outliers in a different colour.
Thanks @HaydenH . The customer is asking for a CSV file only. The final dataframe has a long list of columns (each column representing a specific series). They would like to fetch the CSV from CDF and then use it to analyse the data offline.
Thanks @HaydenH . I updated the requirements with version = 6.15.0 and the function ran till the end. Now I have a unique error. At the end of the function, I have a statement that ran fine in a Jupyter notebook:

df_final.to_csv("Extracted_data.csv")

This is the error I am getting. Please help: how do I store this CSV in CDF through this function? I need to store this final dataframe as a CSV.

Traceback (most recent call last):
  File "/home/site/wwwroot/function/_cognite_function_entry_point.py", line 455, in run_handle
    result = handle(*function_argument_values)
  File "/home/site/wwwroot/function/handler.py", line 57, in handle
    df_charts.to_csv('Extract_Charts.csv')
  File "/home/site/wwwroot/.python_packages/lib/site-packages/pandas/core/generic.py", line 3772, in to_csv
    return DataFrameRenderer(formatter).to_csv(
  File "/home/site/wwwroot/.python_packages/lib/site-packages/pandas/io/formats/format.py", line 1186, in to_csv
    csv_formatter.save()
  File "/home/site/wwwroot/.python_packages/lib/si
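One way to avoid writing to the function's filesystem at all is to build the CSV in memory and store it in CDF Files. A minimal sketch, assuming a live CogniteClient is available inside the function; the file name and external ID below are placeholders, and the upload call is shown as a comment:

```python
import pandas as pd

# Toy stand-in for df_final (the real frame has many series columns)
df_final = pd.DataFrame({"date": ["2023-01-01", "2023-01-02"], "t1": [1.5, 2.5]})

# Build the CSV bytes in memory instead of writing to a local path
csv_bytes = df_final.to_csv(index=False).encode("utf-8")

# With a live CogniteClient the bytes could then be stored as a CDF file
# (untested sketch; name and external_id are placeholders):
# client.files.upload_bytes(csv_bytes, name="Extracted_data.csv",
#                           external_id="extracted_data_csv", overwrite=True)
```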
Thanks @absqueued . My customer is asking for an xls so that they can download all the charts in CDF. They only want the final output values (if there are some calculations attached in the chart) so that they can be validated offline. This must be specific to a particular time frame input by the user (start = date1, end = date2). Also, I am already creating a Cognite Function and trying to work out this mechanism. This Cognite Function can be executed manually by the client by providing the start and end, and then they can derive the data. If you can take this design and incorporate it into the 'Charts' product in CDF, that would be great. Hope I am credited with this feature idea to recognize my contribution :)

import pandas as pd
from cognite.client import CogniteClient

start = pd.Timestamp("2023-01-01")
end = pd.Timestamp("2023-01-07")
client = CogniteClient()
ts_names_list = client.time_series.list(limit=None, asset_subtree_ids=[785298418492686])
time_series_data_extids = ts_names_list.as_external_ids()
Hi @HaydenH I have to create around 137 charts using some base timeseries objects. After testing each of the charts, I need to push them into production as well. So I was wondering if there is a way to mechanize this using the SDK so that, when it's all tested, I can replicate the same setup in a different project (environment) as well.
Hi @HaydenH I completed the code and ran the code block. It created all the necessary timeseries objects, and I was also able to link these objects to their corresponding assets. I have one further question: this does the job when I take one record at a time and run it through a series of computations to create the necessary timeseries objects. When I am given a range, I will now have to pick all the records in that range (say, 300+ rows) and perform the same series of computations for each row. After successfully processing and deriving the datapoints for the timeseries objects, I will have to move to the next row and continue until I finish the range. In this case, I cannot get away from loops but must iterate through each row to derive the datapoints. Each processing job takes around 90-100 seconds. Please advise if this approach is okay in CDF.
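If the per-row computations are plain arithmetic on columns, they can often be vectorized across the whole range at once instead of looping row by row. A toy sketch; the column names t1/t2 and the formula are made-up stand-ins for the real equations:

```python
import pandas as pd

# Toy frame standing in for the 300+ rows in the selected range;
# column names and the formula are placeholders for the real equations
df = pd.DataFrame({"t1": [1.0, 2.0, 3.0], "t2": [4.0, 5.0, 6.0]})

# Vectorized: derive T1 for every row in one shot instead of a
# per-row loop; pandas applies the arithmetic column-wise
df["T1"] = df["t1"] + df["t2"]
```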
Hi @Håkon V. Treider It automatically got resolved now. Now it is not throwing any error.
Hi @Håkon V. Treider

print(client.version)
6.13.1

my_projects = client.iam.token.inspect().projects
print([p.url_name for p in my_projects])

---------------------------------------------------------------------------
MissingSchema                             Traceback (most recent call last)
Cell In[33], line 1
----> 1 my_projects = client.iam.token.inspect().projects
      2 print([p.url_name for p in my_projects])

File /lib/python3.11/site-packages/cognite/client/_api/iam.py:184, in TokenAPI.inspect(self)
    168 def inspect(self) -> TokenInspection:
    169     """Inspect a token.
    170
    171     Get details about which projects it belongs to and which capabilities are granted to it.
    (...)
    182     >>> res = c.iam.token.inspect()
    183     """
--> 184     return TokenInspection._load(self._get("/api/v1/token/inspect").json())

File /lib/python3.11/site-packages/cognite/client/_api_client.py:134, in APIClient._get(self, url_path, params, headers)
    131 def _get
Hi @HaydenH I have tried to create the dataframe, and the attached CSV is the final df. (It has all the external IDs that are to be created in CDF along with the rows for the corresponding values; there is also a timestamp column.) Before I run this line to insert the datapoints:

client.time_series.data.insert_dataframe(df, external_id_headers=True)

I need to create the timeseries objects using the list of external IDs that are in this df. Can I pass the list of external IDs to the upsert operation, TimeSeriesAPI.upsert()? Also, each of the timeseries has to be linked to an asset. Can I pass that as a list into the upsert function? Please advise.
Yes @Carin Meems . I used this and added the following code to make it available as a pandas DataFrame. If CDF wants to create a new library function (to fetch the file/CSV as a DataFrame), this could be used.

from io import StringIO

fileobj = client.files.list(name='file.csv')[0]
file_content = client.files.download_bytes(id=fileobj.id)
csv = file_content.decode('utf-8')
file_df = pd.read_csv(StringIO(csv))
Thanks @HaydenH . I will adopt the vectorization construct and implement it for creating timeseries objects for each row in my final dataframe. I can try to vectorize and create a list that will contain all the external IDs for 'volc_prod' and a list of datapoints for those corresponding external IDs, like so:

list_volc_prod = [ts_<value1>_volc_prod, ts_<value2>_volc_prod, ...]
list_datapoints_volc_prod = [<value-for-prod1>, <val-for-prod2>, ...]

I am not sure how to upsert them all in one go when I have to create the objects or update them (if they already exist in CDF) using the above two lists. Also, the docs (with examples) only show individual timeseries objects, and I am not able to follow how to do a bulk upsert:

TimeSeriesAPI.upsert(item: TimeSeries | Sequence[TimeSeries], mode: Literal['patch', 'replace'] = 'patch') → TimeSeries | TimeSeriesList

Also, I need to upsert datapoints for those corresponding upserted timeseries objects.
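A sketch of what the bulk path could look like, under the assumption that the external IDs are already collected in a list: build one wide dataframe (one column per external ID), then pass a sequence to upsert and the frame to insert_dataframe. The external IDs and values below are placeholders, and the SDK calls are shown as comments since they need a live CogniteClient:

```python
import pandas as pd

# Placeholder external ids standing in for list_volc_prod from the post
external_ids = ["ts_value1_volc_prod", "ts_value2_volc_prod"]
timestamps = pd.date_range("2023-01-01", periods=3, freq="D")

# Wide frame: one column per external id -- the shape insert_dataframe expects
df = pd.DataFrame({xid: [1.0, 2.0, 3.0] for xid in external_ids}, index=timestamps)

# With a live CogniteClient, the bulk calls could look like this (untested
# sketch; upsert accepts a sequence, so no per-object loop is needed):
# from cognite.client.data_classes import TimeSeries
# ts_objs = [TimeSeries(external_id=x, name=x) for x in external_ids]
# client.time_series.upsert(ts_objs, mode="patch")
# client.time_series.data.insert_dataframe(df, external_id_headers=True)
```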
Hi @HaydenH I am implementing this in a Cognite Function. So, I would want the function to take the file, download it, and then read it with pd.read_csv(). All of this must happen in tandem. I don't want to download to my local disk and then read the file from my local machine. Since it must all happen online within Cognite itself, what is the best way to do that? Ideally, it would be apt to have a function like client.files.retrieve_dataframe(filename="myfile.csv"), but I don't have any means like this now. Please advise.
Thanks @HaydenH I tried with asset_subtree_ids. It gave the expected result.

root = client.assets.list(name='liquid_asset')
child_assets = client.assets.list(parent_ids=[root[0].id])
list_assets = []
for list_item in child_assets:
    list_assets.append(list_item.id)
time_series_obj = client.time_series.list(asset_subtree_ids=list_assets, limit=None)
df_timeobj = time_series_obj.to_pandas()
df_timeobj[df_timeobj['name'].str.contains('Product_Vol', na=False)].loc[:, ['id', 'external_id', 'name']]

It took around 30 seconds to pull the final table that I wanted. Hope that is transient and won't be a problem when I include this code block in a Cognite Function.
Thanks @Chyrus Ramesh for the response. CDF automatically takes the field as numeric (#) in the RAW table; at least that is what I observed after I uploaded from CSV. But when I run this command, fetch the table as a dataframe, and try to see the total sum of this column, it says it has str values. Ideally, if it is stored as a # column, it should be a numeric column in the dataframe too, so why does retrieve_dataframe convert the long numeric value into an object in the dataframe? Attaching the CSV for your reference. You can try it in any CDF project: retrieve the table as a dataframe in a notebook and see the result.

from cognite.client import CogniteClient
c = CogniteClient()
df = c.raw.rows.retrieve_dataframe("database-name", "table-name", limit=None)

When you check the following, it should not give any error if the column was properly retrieved as numeric:

df['0NA'].sum()
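If the column does come back as strings, pandas.to_numeric can coerce it before aggregating. A small self-contained sketch; the column name "0NA" is taken from the post above, the values are made up:

```python
import pandas as pd

# RAW rows can come back as strings even when the column looked numeric (#);
# coerce the column before aggregating
df = pd.DataFrame({"0NA": ["100", "250", "50"]})
df["0NA"] = pd.to_numeric(df["0NA"], errors="coerce")
total = df["0NA"].sum()
```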
Thanks @Håkon V. Treider . Is there a way to delete, via the SDK, the dataset I created? In the script above, I manually created a dataset from the UI and then used its ID in the asset hierarchy script. I want to delete that dataset, create a new dataset in the script itself (with name and description), and then attach all the assets to the newly created dataset. I am unable to delete the dataset that I created manually. Please help.
Thanks for the inputs @Jason Dressel . It helped! Could you please guide me on how to create the assets and hierarchy with a dataset that I already created? When I run the code, it creates the assets and hierarchy but doesn't add them to my dataset. The dataset is still empty.

# Code to create the LP classes and store the metadata
import pandas as pd
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset

csv_file = "lpm_asset_hierarchy.csv"
df = pd.read_csv(csv_file)
client = CogniteClient()

def create_asset(asset_name, description, dataset_id, lims_prefix=None, weight=None, pid=None):
    metadata = {
        "LIMS_Prefix": lims_prefix if pd.notna(lims_prefix) else 'NA',
        "Weight": weight if pd.notna(weight) else 0
    }
    asset = client.assets.create(
        Asset(name=asset_name, description=description, metadata=metadata, parent_id=pid)
    )
    return asset

dataset_id = 6572524964198398
root_asset = create_asset("LPM_YT_MODEL", "Root Asset",
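One possible explanation, offered as an assumption: the Asset data class also accepts a data_set_id argument, and the create_asset helper in the post never forwards its dataset_id parameter to Asset(). A pure-Python sketch of the kwargs the helper could build; the actual client.assets.create call is left as a comment:

```python
import pandas as pd

# Assumption: the data set stays empty because Asset() is never given a
# data_set_id. The helper name and metadata fields mirror the snippet in
# the post; the rest is illustrative.
def build_asset_kwargs(asset_name, description, dataset_id,
                       lims_prefix=None, weight=None, pid=None):
    return {
        "name": asset_name,
        "description": description,
        "parent_id": pid,
        "data_set_id": dataset_id,  # forward this so the asset lands in the data set
        "metadata": {
            "LIMS_Prefix": lims_prefix if pd.notna(lims_prefix) else "NA",
            "Weight": weight if pd.notna(weight) else 0,
        },
    }

# With cognite-sdk this would become (untested sketch):
# client.assets.create(Asset(**build_asset_kwargs(...)))
```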
Hi @Carin Meems It did help! Thanks for the inputs.
Hi @roman.chesnokov , Thanks for the inputs. Is it possible to perform the insertion inside a Cognite Function but not return any JSON as output, and still deploy the function? I think a Cognite Function expects to have a JSON output.
Hi @HaydenH Thanks for the response. I will try this option. Currently, I have decided to use the RAW tables in CDF to store all of them instead of sequences, and then use the RAW tables to fetch the dataset and do my processing.