Hi @eashwar11, yes, there are certainly examples in the docs. See here. Let us know if the examples are not clear.
Hi @Vishnuvallabha Iyengar, I have not used Google Colab with the Cognite SDK before, but assuming it works, I can help with the rest. I am a little unsure about how you would schedule rolling calculations in a Jupyter notebook, but you can use Cognite Functions to write scheduled calculations. You can even upload them from a notebook. See here for details.

I am not 100% clear on what you mean by rolling calculations, but if you just mean getting the latest average values, you could do something like:

```python
from cognite.client import CogniteClient

client = CogniteClient()
client.time_series.data.retrieve_dataframe(
    start="1d-ago",
    end="now",
    external_id=my_xid,
    aggregates=["average"],
    granularity="10m",
    include_aggregate_name=False,
)
```

Note that you can specify aggregates and granularities (coarseness) to get data in the format you want. For example, the call above gets the data for my_xid for the last 24 hours, aggregated as an average with 10 minute granularity (and also omits the aggregate name from the returned column names, since include_aggregate_name=False).
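As a loose illustration of the Cognite Functions route (not from the original post): a minimal sketch of deploying a scheduled calculation straight from a notebook, assuming a recent SDK where client.functions.create accepts a function_handle and schedules can be created by function_id. Depending on your SDK version and auth setup, schedules.create may also require client_credentials. The function name, cron expression and data payload are all placeholders.

```python
from cognite.client import CogniteClient

client = CogniteClient()

def handle(client, data):
    # Runs inside CDF on every scheduled invocation.
    df = client.time_series.data.retrieve_dataframe(
        start="1d-ago",
        end="now",
        external_id=data["external_id"],
        aggregates=["average"],
        granularity="10m",
    )
    return {"rows": len(df)}

# Upload the function directly from the notebook session...
func = client.functions.create(name="rolling-average", function_handle=handle)

# ...and have CDF run it every hour.
client.functions.schedules.create(
    name="every-hour",
    cron_expression="0 * * * *",
    function_id=func.id,
    data={"external_id": "my_xid"},
)
```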
Can you please print/report back the versions of the Cognite Python SDK for each of those environments @eashwar11? If I recall correctly, the Jupyter Notebook environment in Fusion always has the latest version of the SDK installed.
Hi @eashwar11, it appears you asked a very similar question in this issue here. In the error you quoted, the part that sticks out is the "undefined" bit (as Håkon wrote, one would expect this to be something like company-prod, company-dev, etc.):

```
MissingSchema: Invalid URL '/api/v1/projects/undefined/raw/dbs/Eashwar_MOTDB-db/tables/lp_input/cursors': No scheme supplied. Perhaps you meant https:///api/v1/projects/undefined/raw/dbs/Eashwar_MOTDB-db/tables/lp_input/cursors?
```

Can you check your project details, and if you are reading environment variables, can you check that they are being read correctly? This can happen if the variables are None or not being read properly. One possible situation is that you are using a library for reading .env files without pinning a specific version, and libraries can be subject to behavioural regressions or newly introduced bugs. So checking your requirements.txt file (if you are running this in Cognite Functions) is worth a look too.
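To make the "variables are None" failure mode concrete, here is a tiny sanity check you could run in the same environment. The variable names are examples; substitute whichever ones your client setup actually reads.

```python
import os

# A None value here is what ends up as "undefined" in the request URL.
for var in ("COGNITE_PROJECT", "COGNITE_BASE_URL", "COGNITE_CLIENT_ID"):
    value = os.environ.get(var)
    print(f"{var} = {value!r}")
    assert value is not None, f"{var} is not set in this environment"
```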
@eashwar11 You should just be able to click on the corresponding coloured square under the "style" column in your screenshot to get the same menu that @Knut Vidvei has shown.
No worries @eashwar11. Regarding the last error, what is in the dataframe? Are the columns time series datapoints, or is it tabular data? Either way, I would suggest uploading them as datapoints to time series objects or as a sequence. If you then need the results in a CSV, they can be converted: you can use .to_pandas(), and pandas has a to_csv method. In Cognite Functions you only have a read-only file system, as the error indicates.
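A minimal sketch of the suggested route, assuming the frame has a datetime index and one column per (already existing) time series external ID; the data here is made up for illustration:

```python
import pandas as pd
from cognite.client import CogniteClient

client = CogniteClient()

# Hypothetical frame: datetime index, columns named by time series external IDs.
df = pd.DataFrame(
    {"my_xid": [1.0, 2.0, 3.0]},
    index=pd.date_range("2024-01-01", periods=3, freq="1h"),
)

# Write the values to CDF as datapoints rather than to the local file system.
client.time_series.data.insert_dataframe(df)

# If a CSV is still needed downstream, keep it in memory instead of on disk.
csv_text = df.to_csv()
```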
@eashwar11, can you please check which version each of them is running? I am pretty sure "as_external_ids" was a recent addition (in v6.15). You can check this with client.version. The solution is most likely to upgrade the SDK pin in your Cognite Function's requirements file to the latest version.
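For concreteness, the check is a one-liner in each environment (the printed version string is just an example):

```python
from cognite.client import CogniteClient

client = CogniteClient()
print(client.version)  # installed SDK version, e.g. "6.15.0"
```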
Hi @eashwar11, I don’t know of any way to use the Cognite SDK to create a chart in Cognite Charts, but you can do this from the Fusion (web) interface. If you click on Explore in the top menu and then “Analyze time series”, this will take you to the charts overview page. From there, you just need to click on the new chart button, and you should be on your way to creating new plots in Cognite Charts. Hope that helps!
@eashwar11 That depends on what you mean by logging and “functions”. If you mean the kind of logging you do locally, that is not what CDF is designed for, and I would just use Python’s logging library. However, if you mean capturing the output of your function, there are multiple ways.

If you have included print statements in your Cognite Functions code, you can inspect their output in the Fusion UI or with the SDK. Unfortunately, as of writing, logging with Python’s standard logging library is not supported.

In the Fusion UI: from the home page, click “Explore” in the top bar, then “Use Cognite Functions” in the menu that appears. I have anonymised where the function name would be, but on that screen you can see all of the Cognite Functions that are running on a schedule, as well as their Responses and Logs.
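The same information is reachable from the SDK. A hedged sketch, assuming a recent SDK version; "my-function" is a placeholder external ID:

```python
from cognite.client import CogniteClient

client = CogniteClient()

func = client.functions.retrieve(external_id="my-function")
for call in client.functions.calls.list(function_id=func.id, limit=5):
    print(call.get_logs())      # the captured print output from the run
    print(call.get_response())  # whatever the handle function returned
```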
Hello @eashwar11. There are some nice examples in the docs here. As you can see, if end is not specified in the arguments, it defaults to the current time. So you can leave the end parameter as is and pass the string “1d-ago” for start, and that will query for the day previous to now. However, if you always want the window to start at 00:00 on the previous day, you can do some datetime arithmetic. Something like:

```python
from datetime import datetime, timedelta

# Use a full datetime (not a bare date) so the SDK accepts it directly.
end = datetime.combine(datetime.now().date(), datetime.min.time())  # today, 00:00
start = end - timedelta(days=1)  # yesterday, 00:00
```

would work. Hope that helps! 😀
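The computed window can then go straight into a retrieval call, since start and end accept datetime objects as well as strings (my_xid is again a placeholder external ID):

```python
df = client.time_series.data.retrieve_dataframe(
    external_id=my_xid,
    start=start,
    end=end,
    aggregates=["average"],
    granularity="10m",
)
```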
@ibrahim.alsyed Can you please elaborate a bit on your comment? Yes, there are some basic fields available, such as external ID, name, description, unit, etc. But there is also a metadata field, which takes a dictionary-like structure and can hold any custom fields you may want.
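To illustrate (all field names and values here are made up): you can set custom fields in metadata at creation time and filter on them later.

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries

client = CogniteClient()

# Create a series with custom fields in metadata...
client.time_series.create(
    TimeSeries(
        external_id="e05_wind_speed",
        name="Wind speed E05",
        unit="m/s",
        metadata={"location": "E05", "sensor_type": "wind_speed"},
    )
)

# ...and query on those same fields later.
e05_series = client.time_series.list(metadata={"location": "E05"})
```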
Great to know that you have successfully created and linked them @eashwar11. 😀

Just as an aside, in case you were not aware: another good practice I can advise is to turn the code you have written into a “bootstrap” script/program, so that you can replicate the process later if necessary. You may have both a “dev” and a “prod” environment (or perhaps more), so it is important to be able to replicate across different projects/tenants to verify that your solutions scale and that fewer bugs are pushed to prod.

Now to your actual question: it all depends. Depending on whether the cloud provider backing your project is Google or Microsoft, your Cognite Function will have either a 10 or 15 minute time limit (with inverted CPU resources, i.e., the lower time limit comes with higher resources). So if you only have a few iterations to do, you should be fine. You can of course also set up multiple schedules if there is a way to partition the work.
So, you could perform the upsert on the TimeSeries objects first. This means you need to have all the external IDs, and if you need to contextualise them (such as attaching them to an asset), you should have that data on hand as well. This will be safe and should not need any error handling, but it will be slower. If the bottlenecks you experienced before came more from the lack of vectorisation, then this might be fine for your solution. Note that the example here shows how you can create one TS object and update another in the same call:

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries

c = CogniteClient()
existing_time_series = c.time_series.retrieve(id=1)
existing_time_series.description = "New description"
new_time_series = TimeSeries(external_id="new_timeSeries", description="New timeSeries")
res = c.time_series.upsert([existing_time_series, new_time_series], mode="replace")
```

As you can see in the last line, a list of two different TimeSeries objects is passed in; the same call accepts a list of any length, so you can upsert your whole batch in one go.
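Building on that, here is a hedged sketch of the bulk pattern using two parallel lists like the ones described in the thread; the list contents and values are placeholders, and the upsert and datapoint insertion are done as two separate calls:

```python
import time

from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries

c = CogniteClient()

# Placeholder lists mirroring the ones described in the question.
list_volc_prod = ["ts_value1_volc_prod", "ts_value2_volc_prod"]
list_datapoints_volc_prod = [1.23, 4.56]

# One upsert call creates the missing objects and patches the existing ones.
ts_objects = [TimeSeries(external_id=xid, name=xid) for xid in list_volc_prod]
c.time_series.upsert(ts_objects, mode="patch")

# Datapoints go in separately: one (timestamp, value) pair per series here.
now_ms = int(time.time() * 1000)
c.time_series.data.insert_multiple(
    [
        {"externalId": xid, "datapoints": [(now_ms, value)]}
        for xid, value in zip(list_volc_prod, list_datapoints_volc_prod)
    ]
)
```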
Thanks @HaydenH. I will adopt the vectorization construct and implement it for creating time series objects for each row in my final dataframe. I can try to vectorize and create a list containing all the external IDs for the ‘volc_prod’, and a list of datapoints for those corresponding external IDs, so:

list_volc_prod = [ts_<value1>_volc_prod, ts_<value2>_volc_prod, ...]
list_datapoints_volc_prod = [<value-for-prod1>, <val-for-prod2>, ...]

I am not sure how to upsert them all in one go when I have to create the objects or update them (if they already existed in CDF) using the above two lists. Also, the docs (with examples) only show individual time series objects, and I am not able to follow how to do a bulk upsert:

TimeSeriesAPI.upsert(item: TimeSeries | Sequence[TimeSeries], mode: Literal["patch", "replace"] = "patch") → TimeSeries | TimeSeriesList

Also, I need to upsert datapoints as well for those corresponding time series objects.
Hi @eashwar11, it is great to see that your use case is progressing nicely. Here are some tips I can offer when developing on top of CDF:

- Try to minimise the number of calls to CDF where possible. One way to do this is to use a try-except block instead. You might have heard of this principle as “easier to ask forgiveness than permission”. Your current if/else block checks every single time before it creates something. Instead, put the uploading part in the try block, catch the appropriate error (make sure not to use a bare except Exception), and put the creation step in the except block; see the sketch after this list.
- Where possible, upload the whole dataframe, containing multiple datapoints for multiple time series objects, in one go.
- Iterating through a dataframe with iterrows is not very efficient. It is best to perform vectorised operations where possible. Just make sure you have the external IDs as the columns of the dataframe and the timestamps as the (datetime) index.
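Here is a minimal sketch of that try/except pattern, assuming df has a datetime index and time series external IDs as columns; using the external ID as the name on creation is just illustrative:

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import TimeSeries
from cognite.client.exceptions import CogniteNotFoundError

client = CogniteClient()

try:
    # Optimistic path: assume all the series already exist.
    client.time_series.data.insert_dataframe(df)
except CogniteNotFoundError as err:
    # Create only the series that were missing, then retry the upload.
    missing = [item["externalId"] for item in err.not_found]
    client.time_series.create([TimeSeries(external_id=x, name=x) for x in missing])
    client.time_series.data.insert_dataframe(df)
```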
Hi @eashwar11, when you list a file from CDF, you are not actually retrieving its contents; you are retrieving the file object (including the metadata set on it). If you want to download it, you can do as described in the docs here. Then you can read it with pandas using pd.read_csv("my_file.csv"). I hope that helps. 😀
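If you would rather avoid writing to disk, a small sketch that downloads the content into memory; "my_file" is a placeholder external ID for a CSV stored in CDF Files:

```python
import io

import pandas as pd
from cognite.client import CogniteClient

client = CogniteClient()

content = client.files.download_bytes(external_id="my_file")  # raw bytes
df = pd.read_csv(io.BytesIO(content))
```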
Hi @eashwar11, there are a couple of ways to do this. One such way would be to use:

```python
client.time_series.list()
```

Inspecting the docs, we can see that the list method accepts an argument, asset_subtree_external_ids, which you can use if you know the external ID (or asset_subtree_ids if you want to use the ID instead). If you then convert the result to a dataframe with the to_pandas() method, you can take the column of external IDs and retrieve the data of those time series for further processing if you wish. Let me know if that helps or if further clarification is needed.
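A sketch of that flow, assuming "my_root_asset" is the known subtree root; pulling the external IDs straight off the returned objects works just as well as going through to_pandas():

```python
from cognite.client import CogniteClient

client = CogniteClient()

# Every time series attached anywhere under the given asset subtree.
ts_list = client.time_series.list(
    asset_subtree_external_ids=["my_root_asset"], limit=None
)
xids = [ts.external_id for ts in ts_list]

# Retrieve their data for further processing (window is illustrative).
df = client.time_series.data.retrieve_dataframe(
    external_id=xids, start="7d-ago", end="now"
)
```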
Hi @Vishnuvallabha Iyengar, these are all great questions. I will try to help as best as I can.

This format for the data is fine; however, in CDF each of these time series (columns) would be its own time series object. The main point is to attach information to the objects you are interested in, so that you can query the ones you want. For example, if you wanted all the time series objects at location E05, you could add a metadata field to them so that you can query them that way. Similarly, you could add a description that would let you retrieve all of your wind speed time series. And you could attach all the time series at a given location to an asset representing a factory/site, as you have suggested.

The visualising part, however, is not something I would necessarily recommend (the Python SDK does not provide this kind of feature). You have two options here: 1. use the visualisation available in the Fusion browser (including Cognite Charts); or 2. convert the data to a pandas dataframe and plot it yourself in the notebook.
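An illustrative sketch of that structure (not from the original post): one asset per site, with each measurement as its own time series attached to it. All external IDs, names and metadata fields here are examples.

```python
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset, TimeSeries

client = CogniteClient()

# One asset per site/factory.
site = client.assets.create(Asset(external_id="site_e05", name="Site E05"))

# Each column from the original data becomes its own time series, linked
# to the site asset and tagged with queryable metadata.
client.time_series.create([
    TimeSeries(external_id="e05_wind_speed", name="Wind speed", unit="m/s",
               asset_id=site.id, metadata={"location": "E05"}),
    TimeSeries(external_id="e05_temperature", name="Temperature", unit="degC",
               asset_id=site.id, metadata={"location": "E05"}),
])
```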
Hi @roman.chesnokov, thanks for the inputs. Is it possible to perform the insertion inside a Cognite Function but not return any JSON as output, and deploy the function? I think a Cognite Function expects to have a JSON output.

Are you referring to the output from the handle function in a Cognite Function? If so, this is more for status purposes. E.g., if your handle/function was successful, you might return a dictionary as follows:

```python
return {"status": "success", "message": f"processed {len(total_data)} datapoints"}
```

This dictionary would then be viewable as JSON from the Functions page in Fusion. If you were handling a failure, you might instead write something like:

```python
return {"status": "failure", "message": "no timeseries found"}
```

The functions referred to by @roman.chesnokov do not return any JSON values though (they actually return None). Does that make sense? If you don’t want to return anything from the handle function, you could return an empty dictionary, but good practice is to return some status information.
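Putting those fragments together, a handle function with a small status payload might look like this; the body is a placeholder standing in for your own insertion logic:

```python
def handle(client, data):
    # Placeholder for your own insertion/processing logic.
    total_data = data.get("items", [])
    if not total_data:
        return {"status": "failure", "message": "no timeseries found"}
    return {"status": "success", "message": f"processed {len(total_data)} datapoints"}
```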
Hi @eashwar11, I think the best way to approach your problem is to make use of the to_pandas() method on sequences (see here). I have not used sequences much myself, but the core data resources (sequences, time series, events and such) all support a to_pandas() method. So the call for your sequence might look like this:

```python
from cognite.client import CogniteClient

client = CogniteClient()
client.sequences.data.retrieve(
    external_id=ext_id,
    start=0,
    end=None,
).to_pandas()
```

Then you have access to all your pandas functionality. After you have finished modifying/processing your dataframe, you can also make use of the functionality for uploading dataframes to a sequence (here). I hope that helps.
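For completeness, a sketch of the full round trip under those assumptions; "my_sequence" and "my_column" are placeholders, and the column names of the frame must match the sequence's column external IDs:

```python
from cognite.client import CogniteClient

client = CogniteClient()
ext_id = "my_sequence"  # placeholder external ID

# Sequence -> pandas -> modify -> back to the sequence.
df = client.sequences.data.retrieve(external_id=ext_id, start=0, end=None).to_pandas()
df["my_column"] = df["my_column"] * 2  # illustrative transformation
client.sequences.data.insert_dataframe(df, external_id=ext_id)
```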