Solved

Urgent Help Needed: Implementing Time-Series Models and Asset Hierarchy with Python SDK

  • 22 September 2023
  • 5 replies
  • 104 views


Hello esteemed members of the Cognite community,

I come to you with a sense of urgency and a deep need for expert guidance. I've been attempting to familiarize myself with Cognite for the last six months, and despite my best efforts, I find myself at an impasse with some critical aspects. My situation is time-sensitive, and I am truly hoping for your assistance to navigate this complex but fascinating journey.

My Setup:
Environment: Google Colab (Free version)
SDK: Cognite Python SDK
Assets: Wind Farm with two sub-assets (Asset A and Asset B)

My Goals:

Connect to Cognite Project via Python SDK: I need to use my client ID and client secret to establish a connection from Google Colab.

Create Hierarchical Assets: My aim is to create a parent asset, named "Wind Farm," and within this parent asset, include two child assets, Asset A and Asset B.

Upload Time-Series Data from CSV: I have CSV files containing wind-speed and wind-power data for Asset A and Asset B. I need to upload these as time-series data to the corresponding assets. (Below is a snippet of what the data might look like - the numbers are all random and made up for the sake of this query!)

time | Asset A - Wind Speed | Asset B - Wind Speed | Asset A - True Wind Power | Asset B - True Wind Power
11/1/19 00:00 | 23.105 | 23.3822 | 1 | 1
11/1/19 00:10 | 23.3516 | 23.7522 | 1 | 1
11/1/19 00:20 | 22.681 | 23.7341 | 1 | 1
11/1/19 00:30 | 22.6935 | 24.3308 | 1 | 1

The data is recorded every ten minutes. Let’s say it ranges from November 1st, 2019 to February 28th, 2020.

Apply Models in a Rolling Window Fashion and Visualize Results: This is where it gets intricate. I plan to apply statistical and machine learning models like Persistence and Random Forest to the time-series data. I need to do this using a rolling window approach with a forecast horizon of 6 hours, a forecast range of 5 days, and a slide (or roll) of 6 hours, all while dealing with 10-minute intervals in the data. Additionally, it's essential for me to visualize these time series and forecast results within Cognite, if possible. (Please note that I am familiar with applying these models - I am more interested in linking the results and visualizations to the Cognite project.)

Link Forecasts to Assets: After running these models, I must save the forecast data as new time-series and link them to Asset A and Asset B within the Cognite platform.

My ultimate aspiration is to create a 3D Digital Twin of our wind farm. The vision is to have an interactive interface where one can click on a wind turbine and instantly view the analyses and forecasts applied to that asset.

I am truly at a critical point in my project, and your expert advice could be the turning point for me. I promise to give back to this community with whatever knowledge and expertise I gain through this experience.
 

I want to clarify that my request for help doesn't stem from a lack of effort on my part. I've combed through multiple queries and even taken courses in an attempt to resolve these challenges independently. However, the complexity and the practical aspects have made it difficult for me to proceed without guidance. Please understand that I am not taking this community's expertise for granted; I am genuinely in need of assistance and am committed to learning.

Thank you immensely for taking the time to read through my detailed request. I would be incredibly grateful for any guidance or advice you can provide. If there are flaws in my approach or understanding, I welcome your corrections wholeheartedly. If you have any questions or need further clarification, please don't hesitate to ask. I am also open to receiving any code snippets or other resources that could assist me in accomplishing my goals.
 

Warm regards,
Vishnu Iyengar


Best answer by HaydenH 22 September 2023, 09:24


5 replies

Hi @Vishnuvallabha Iyengar, I have not used Google Colab with the Cognite SDK before, but assuming it works, I can help with the rest. I am a little unsure about how you would schedule rolling calculations in a Jupyter notebook, but you can use Cognite Functions to write scheduled calculations, and you can even upload a Function directly from a notebook. See here for details.
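
To illustrate, here is a rough sketch of creating and scheduling a Cognite Function straight from a notebook. The function name, external ID, cron expression and the time-series external ID in the schedule data are all placeholders, and the handle body is just dummy logic; depending on your project's auth setup you may also need to pass client_credentials when creating the schedule.

from cognite.client import CogniteClient

client = CogniteClient()

# Cognite Functions call handle(client, data) with an authenticated client.
def handle(client, data):
    # Dummy logic: fetch the last day of averages for the time series given in the schedule data.
    df = client.time_series.data.retrieve_dataframe(
        external_id=data["external_id"],
        start="1d-ago",
        end="now",
        aggregates=["average"],
        granularity="10m",
    )
    return {"mean": float(df.mean().iloc[0])}

# Upload the function straight from the notebook (name/external_id are placeholders).
fn = client.functions.create(
    name="rolling-forecast",
    external_id="rolling-forecast",
    function_handle=handle,
)

# Run it every 6 hours to mimic a 6-hour slide (cron expression is a placeholder).
client.functions.schedules.create(
    name="rolling-forecast-every-6h",
    cron_expression="0 */6 * * *",
    function_id=fn.id,
    data={"external_id": "asset_a_wind_speed"},
)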

I am not 100% clear on what you mean by rolling calculations, but if you just mean getting the latest average values, you could do something like:

from cognite.client import CogniteClient

client = CogniteClient()

# my_xid is the external ID of an existing time series in CDF
client.time_series.data.retrieve_dataframe(
    start="1d-ago",
    end="now",
    external_id=my_xid,
    aggregates=["average"],
    granularity="10m",
    include_aggregate_name=False,
)

Note that you can specify aggregates and granularities (coarseness) to get the data in the format you want. For example, the call above gets the data for my_xid for the last 24 hours, averaged with a 10-minute granularity (and without the aggregate name appended to the column names). In other words, this gives you fresh averages, but not necessarily rolling ones. Aggregates are computed by CDF as the raw data is ingested, so whenever you retrieve aggregated data from CDF it is already pre-computed. For rolling calculations, I would recommend the rolling functionality in pandas after you have retrieved your data from CDF (see here). If you already have the data for your rolling windows and need to upload it, then assuming it is in a pandas DataFrame, you can simply upload as follows:

client.time_series.data.insert_dataframe(df)

Note that this assumes the column(s) in the DataFrame already exist as time series in CDF, so if there is a chance your logic has to create these dynamically, make sure you handle that (e.g., in a try-except clause). This insert_dataframe method also works if you have loaded the data from a CSV into your notebook.
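
To make that concrete, here is a rough sketch that loads the CSV, uploads the raw 10-minute data, and computes a simple 6-hour rolling mean with pandas. The file name and the time-series external IDs are placeholders - the time series themselves must already exist in CDF (or be created first, see below):

import pandas as pd
from cognite.client import CogniteClient

client = CogniteClient()  # or reuse the client from above

# Load the CSV in the notebook; column names follow the snippet in the question.
df = pd.read_csv("wind_farm.csv", parse_dates=["time"], index_col="time")

# Rename the columns to the external IDs of the corresponding time series in CDF
# (these external IDs are made up - use whatever you created the time series with).
df = df.rename(columns={
    "Asset A - Wind Speed": "asset_a_wind_speed",
    "Asset B - Wind Speed": "asset_b_wind_speed",
    "Asset A - True Wind Power": "asset_a_true_power",
    "Asset B - True Wind Power": "asset_b_true_power",
})

# Upload the raw 10-minute data.
client.time_series.data.insert_dataframe(df)

# Example of a rolling calculation with pandas: a 6-hour rolling mean of Asset A's wind speed,
# which could be written to a separate derived/forecast time series.
rolling = df[["asset_a_wind_speed"]].rolling("6h").mean()
rolling.columns = ["asset_a_wind_speed_6h_mean"]  # placeholder external ID of a derived TS
client.time_series.data.insert_dataframe(rolling)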

Now let us say you have two situations:

  1. you have already created the TS but forgot to link them to an asset - no worries, we can fix that; or
  2. you have not created them yet but are about to (e.g., in the case above for inserting data) - then the link can be made on instantiation.

See the class-based approach here to update a TS (case 1): just replace attributes like description with asset_id and then call the .update() method on the client with the TS to be updated (a small sketch of this is included after the snippet below). If you have case 2 (which might be the easiest, i.e., delete the TS objects you have now and recreate them), then you can do the following:

from cognite.client.data_classes import TimeSeries

client.time_series.create(TimeSeries(name=my_ts, asset_id=my_asset_id, ...))
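
For case 1, a minimal sketch of the class-based update could look like this (the external ID and my_asset_id are placeholders for an existing time series and asset):

from cognite.client.data_classes import TimeSeriesUpdate

# `client` is the CogniteClient instance from above.
# Link an existing time series to an asset by setting its asset_id.
update = TimeSeriesUpdate(external_id="asset_a_wind_speed").asset_id.set(my_asset_id)
client.time_series.update(update)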

Regarding visualisation, I would recommend Cognite Charts (if you don’t want to code it up yourself). There you can visualise any TS in your CDF project, as well as perform calculations on them.

I hope that helps you on your project. If there is anything more I can help with, please let us know.


Hi,

 

I can give some guidance on ingesting data.

 

Logging in from Google Colab works fine, using the code presented here:

 

https://cognite-sdk-python.readthedocs-hosted.com/en/latest/quickstart.html#instantiate-a-new-client

 

You can run this in a previous cell to install the required cognite-sdk package:
 

!pip install cognite-sdk

 

Instead of having the secret in clear text or in an environment variable, you can use `input` in Google Colab to enter the secret at runtime:

 

client_secret = input('Enter your secret: ')
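
As a rough sketch of putting that together (this assumes an Azure AD app registration; the tenant ID, client ID, cluster and project below are placeholders you would replace with your own values):

from cognite.client import CogniteClient, ClientConfig
from cognite.client.credentials import OAuthClientCredentials

# client_secret comes from the input() call above; the rest are placeholder values.
tenant_id = "your-azure-ad-tenant-id"
client_id = "your-client-id"
cluster = "api"          # e.g. "api", "westeurope-1", ...
project = "your-project"

credentials = OAuthClientCredentials(
    token_url=f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token",
    client_id=client_id,
    client_secret=client_secret,
    scopes=[f"https://{cluster}.cognitedata.com/.default"],
)

client = CogniteClient(
    ClientConfig(
        client_name="colab-notebook",
        project=project,
        base_url=f"https://{cluster}.cognitedata.com",
        credentials=credentials,
    )
)

print(client.iam.token.inspect())  # quick sanity check that the login works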

 

 

After you have initialized the client, to create assets and an asset hierarchy, I refer you to these endpoints in our SDK:
https://cognite-sdk-python.readthedocs-hosted.com/en/latest/core_data_model.html#create-assets

https://cognite-sdk-python.readthedocs-hosted.com/en/latest/core_data_model.html#create-asset-hierarchy

There should be plenty of examples there to get you started.
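
As a rough sketch of what that could look like for your wind farm (the external IDs here are made up for illustration):

from cognite.client import CogniteClient
from cognite.client.data_classes import Asset

client = CogniteClient()

# One parent asset and two children, linked via parent_external_id.
assets = [
    Asset(external_id="wind_farm", name="Wind Farm"),
    Asset(external_id="asset_a", name="Asset A", parent_external_id="wind_farm"),
    Asset(external_id="asset_b", name="Asset B", parent_external_id="wind_farm"),
]

# create_hierarchy validates the parent/child links and creates the assets in the right order.
client.assets.create_hierarchy(assets)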

 

Regarding uploading the CSV data, you can do that using RAW:

https://docs.cognite.com/cdf/integration/guides/extraction/raw_explorer/

You can also use the SDK for that after parsing the CSV (a rough sketch follows the steps below):
https://cognite-sdk-python.readthedocs-hosted.com/en/latest/data_ingestion.html#raw

  1. Create database
  2. Create table
  3. Ingest rows
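
With the SDK, those three steps could look roughly like this (the database, table and file names are placeholders):

import pandas as pd
from cognite.client import CogniteClient

client = CogniteClient()

# 1. Create a database and 2. a table in RAW (names are placeholders).
client.raw.databases.create("wind_farm_db")
client.raw.tables.create("wind_farm_db", "measurements")

# 3. Parse the CSV and ingest the rows, using the timestamp column as the row key.
df = pd.read_csv("wind_farm.csv").set_index("time")
client.raw.rows.insert_dataframe("wind_farm_db", "measurements", df)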

 

And then write Transformations in order to:

  1. Create the time series objects
  2. Ingest datapoints into the time series

https://docs.cognite.com/cdf/integration/guides/transformation/transformations

 

Hayden covered the rest of the questions I think. 

Hope that helps, let us know if you need more information


If you have access to Cognite Academy, I would also recommend going through the courses that cover those steps in detail with hands-on exercises:

https://learn.cognite.com/page/cognite-learn-catalog#role_data-engineer


Hello @HaydenH,

I can't express how much I appreciate your comprehensive and detailed response. Your insights have illuminated several aspects that I was struggling with, particularly regarding time-series calculations and asset linking.

The code snippets you provided are particularly helpful, and I'm excited to try them out in my project. Your recommendation about using Cognite Functions for scheduled calculations and Cognite Charts for visualization has also given me new avenues to explore.

Thank you for taking the time to help me. Your expertise is invaluable, and I'm grateful for your willingness to assist. I'll be sure to reach out if I have more questions or run into further challenges.

Best regards,
Vishnuvallabha Iyengar

 


Hello @Gaetan Helness,

Thank you for your thoughtful response and for pointing me in the right direction, especially for ingesting data and asset creation. The links you shared are incredibly helpful, and I am eager to delve deeper into the SDK documentation you recommended.

Your tip about using `input` in Google Colab for entering the secret at runtime is particularly useful and something I'll be implementing right away.

I also appreciate your suggestion about the Cognite Academy courses. I'll make sure to check them out for more hands-on learning.

Your guidance is invaluable, and I'm sincerely grateful for your help. I'll keep you posted on my progress and may reach out with further questions.

Best regards,
Vishnuvallabha Iyengar
