Hey @Viswanadha Sai Akhil Pujyam , this looks like a timestamp-parsing error. The date format in the Basic Start Date and Basic Finish Date columns of the provided table differs from the format declared in the to_timestamp() function you are using.
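The fix is to make the format string passed to to_timestamp() match the data. As a Python illustration of the same kind of mismatch (the date value here is made up):

```python
from datetime import datetime

# Hypothetical source value: day.month.year
raw_value = "05.03.2023"

# Parsing with a format that doesn't match the data fails,
# just like a wrong format string in to_timestamp() does:
try:
    datetime.strptime(raw_value, "%Y-%m-%d")
except ValueError as err:
    print(f"parse failed: {err}")

# Parsing with the format that matches the data succeeds:
parsed = datetime.strptime(raw_value, "%d.%m.%Y")
print(parsed.date())  # prints 2023-03-05
```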
@eashwar11 you can only archive datasets; there is no way to delete them. This is because data access rights can be scoped to a dataset, so deleting one could leave data behind that no one has the rights to remove.
Hey @eashwar11 . Can you provide more details about attaching values to time series? Do you simply want to add a datapoint? According to the documentation, you can insert datapoints, or insert a pandas dataframe as datapoints into multiple time series simultaneously. Let me know if that helps.
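For illustration, a minimal sketch against the Python SDK (the time series external IDs and values here are hypothetical, and `client` is assumed to be a configured CogniteClient):

```python
import pandas as pd

# Hypothetical external IDs of existing time series.
EXTERNAL_IDS = ["ts_temperature", "ts_pressure"]

# Columns are time series external IDs; the index holds the timestamps.
index = pd.to_datetime(["2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00"])
df = pd.DataFrame({xid: [1.0, 2.0, 3.0] for xid in EXTERNAL_IDS}, index=index)

def upload(client):
    """`client` is assumed to be a configured CogniteClient instance."""
    # Insert individual datapoints as (timestamp, value) pairs...
    client.time_series.data.insert([(index[0], 1.0)], external_id=EXTERNAL_IDS[0])
    # ...or write to every column's time series with one dataframe call:
    client.time_series.data.insert_dataframe(df)
```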
Hey @Mohit Shakaly . I'm sorry, but extractor-utils doesn't have that feature. However, you can choose to run it continuously and set the schedule within your code. Alternatively, you can terminate the extractor after each upload and set the schedule externally using a tool like cron.
Hey @eashwar11 , I’m not sure the sequence data type fits your needs. Do you have a particular reason for using it? I would recommend using time series instead. You can easily access the values as tabular data (a pandas DataFrame):

```python
client.time_series.data.retrieve_dataframe(list_of_external_ids, start='1m-ago', end='now', aggregates=['average'], granularity='1d')
```

But if you don’t need to visualise the data, you can do the same directly from RAW:

```python
client.raw.rows.retrieve_dataframe(db_name, table_name, limit=-1)
```
@thomafred could you please create a support ticket about that problem?
@eashwar11 it depends on your setup. We usually use poetry for our local projects, so if I need to update something, I update the version in the toml file.
That’s true, I see it now. I will try to reach out to someone from the storage team on our side. What is the size of the file to be uploaded, by the way?
Hey @eashwar11 , please update the version of SDK installed locally.
Hello @Nonstad, can you provide more details? What actions did you take, what were your expectations, and what were the results? Please note that the "level" setting in the console section only applies to logs displayed in the console, not those saved to a file. As far as I can see, the debug level adds heartbeat logs every 10 minutes. Also, be aware that you need to restart the extractor after modifying the configuration.
Hello Thomas, my suggestion for resolving this issue would be to implement a retry strategy with back-off for uploading blocks. I have noticed that there is no request ID in the error logs, which makes it difficult to investigate further on our end.
@eashwar11 So, as far as I understand, you have a lot of time series and want to use only a few of them in a hierarchical structure? As I see it, that’s a perfect fit for Assets. You need to create a few additional objects, one asset per tag, but it lets you represent the hierarchical structure perfectly. You could also use Data Modeling capabilities, but that would probably be overkill for this task.
Hey @eashwar11 , it seems you just need to create an asset hierarchy using the SDK or CDF Transformations and assign your time series to the assets. Then you can use the Python SDK method .subtree() to retrieve all the time series in the hierarchy, or assets.retrieve() to get a particular asset (tag). Let me know if you have any further questions.
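A rough sketch of that flow with the Python SDK (the asset names are hypothetical, and the SDK calls are a sketch rather than a tested recipe):

```python
def hierarchy_definition(root, tags):
    """Asset definitions (as plain dicts) for one root with child tags."""
    assets = [{"external_id": root, "name": root, "parent_external_id": None}]
    assets += [
        {"external_id": t, "name": t, "parent_external_id": root} for t in tags
    ]
    return assets

def create_and_fetch(client, root, tags):
    """`client` is assumed to be a configured CogniteClient instance."""
    from cognite.client.data_classes import Asset

    client.assets.create([Asset(**a) for a in hierarchy_definition(root, tags)])
    # Everything below the root asset:
    subtree = client.assets.retrieve(external_id=root).subtree()
    # Time series attached anywhere under the root:
    return client.time_series.list(asset_subtree_external_ids=[root])
```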
Hi @adeel.ali Unfortunately, the DB extractor currently only supports running on a schedule.
Hey @Sonali Vishal Patil we have a query endpoint as described in the docs. But be aware that it has a maximum timeout of 240 seconds, the same as the preview in the UI.
@thomafred then, most probably, you need to create a blob with Put Blob as described in the documentation, then use Put Block to upload the chunks, and finally Put Block List to commit them once they are all uploaded. But this should be verified experimentally; I haven’t dealt with that particular case myself.
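To make the mechanics concrete, here is a minimal Python sketch of the bookkeeping Put Block / Put Block List expect (the HTTP calls themselves are left out, the function names are my own, and this has not been tested against the actual CDF upload URL):

```python
import base64

def block_ids_and_chunks(data: bytes, chunk_size: int):
    """Split the payload and generate the base64 block IDs Put Block expects.
    Within one blob, all block IDs must have the same length before encoding."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    ids = [base64.b64encode(f"{i:08d}".encode()).decode() for i in range(len(chunks))]
    return ids, chunks

def block_list_xml(block_ids):
    """Body of the final Put Block List request, committing the blocks in order."""
    inner = "".join(f"<Latest>{bid}</Latest>" for bid in block_ids)
    return f'<?xml version="1.0" encoding="utf-8"?><BlockList>{inner}</BlockList>'
```

Each chunk would then be sent with a `PUT` to the upload URL with `comp=block&blockid=<id>` appended, and the XML body with `comp=blocklist` to commit.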
@thomafred could you double-check that? Your IdP being Azure doesn’t necessarily mean that the CDF resources are also hosted on Azure. But if they are, then I suppose you can leverage the documentation for Azure blob storage.
Hey @thomafred , according to the documentation: "If the uploadUrl contains the string '/v1/files/gcs_proxy/', you can make a Google Cloud Storage (GCS) resumable upload request as documented in https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload." Following that link, there are instructions on uploading files in chunks. I hope this helps. Feel free to let me know if you have any additional questions.
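As a rough Python sketch of the chunked part (function names are hypothetical; it assumes the `requests` package and that every chunk except the last is a multiple of 256 KiB, as the GCS protocol requires):

```python
def content_range(offset: int, chunk_len: int, total: int) -> str:
    """Content-Range header for one chunk of a GCS resumable upload:
    'bytes <first>-<last>/<total>'."""
    return f"bytes {offset}-{offset + chunk_len - 1}/{total}"

def upload_resumable(session_url, data: bytes, chunk_size=8 * 1024 * 1024):
    """Sketch: PUT each chunk to the resumable session URL."""
    import requests  # assumed to be available

    total = len(data)
    for offset in range(0, total, chunk_size):
        chunk = data[offset:offset + chunk_size]
        resp = requests.put(
            session_url,
            data=chunk,
            headers={"Content-Range": content_range(offset, len(chunk), total)},
        )
        # 308 means "resume incomplete" (more chunks expected); 200/201 means done.
        if resp.status_code not in (200, 201, 308):
            resp.raise_for_status()
```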
Hi @Ann Sullivan Thomas , If you have any trouble with the hands-on task at the end of the course, don't worry! We've got you covered. You can find a pre-made solution file in our GitHub repository to help guide you through it. Just follow the instructions provided in the course and refer to the solution file if needed. I hope this is helpful. If you have any questions, don't hesitate to ask.
Hey @eashwar11 , the file extractor allows you to upload files to CDF, and then you can load each file with pandas and upload each sheet as a RAW table. It’s also possible to write your own service that takes the files directly and uploads them to RAW. For example, a snippet that uploads the data from each sheet into a new RAW table:

```python
import pandas as pd

def upload_xls_file(client, file_path, db_name='Test'):
    """Upload an XLS file as pandas dataframes, one per sheet, into RAW tables.

    Args:
        client (CogniteClient): a Cognite client instance
        file_path (str): the path to the XLS file
        db_name (str): the name of the CDF RAW database

    Returns:
        A dictionary where each key is a sheet name and each value is a pandas dataframe.
    """
    xls_file = pd.ExcelFile(file_path)
    dataframes = {}
    for sheet_name in xls_file.sheet_names:
        client.raw.tables.create(db_name, sheet_name)
        df = xls_file.parse(sheet_name)
        # The dataframe index is used as the RAW row keys, so make it strings.
        df.index = df.index.map(str)
        client.raw.rows.insert_dataframe(db_name, sheet_name, df)
        dataframes[sheet_name] = df
    return dataframes
```
Hey @vaibhavsancheti25 , If you are referring to the PI extractor, it is specifically designed for continuous running and streaming of data. From the documentation: "The Cognite PI extractor connects to the OSIsoft PI Data Archive and detects and streams time series into Cognite Data Fusion (CDF) in near real-time. In parallel, the extractor ingests historical data (backfill) to make all time series available in CDF. The PI points in the PI Data Archive correspond to the time series in CDF."
@Paramale Arati It's important to note that an event has both a start and an end date, so it cannot be displayed as a point marker. So, as @Ankit Kumar wrote before, you can’t directly display both time series and events in one chart in Power BI. But Power BI does let you place a Line chart and a custom timeline chart (such as a Gantt chart) one under the other and apply the same interval filters to both. This method can be effective if there are not too many events within the observed interval. And here is an example of how it looks in Cognite Charts:
@Viswanadha Sai Akhil Pujyam to insert a CSV as a RAW table, you can use the following snippet:

```python
import numpy as np
import pandas as pd
from cognite.client import CogniteClient

DATABASE_NAME = "db"
TABLE_NAME = "table"
CSV_FILENAME = "source.csv"

client = CogniteClient()
client.raw.databases.create(DATABASE_NAME)
client.raw.tables.create(DATABASE_NAME, TABLE_NAME)
df = pd.read_csv(CSV_FILENAME, index_col=0).fillna('')
client.raw.rows.insert_dataframe(DATABASE_NAME, TABLE_NAME, df)
```
@eashwar11 It strongly depends on the particular architecture and use case; that’s probably why there are no concrete examples of automation in the docs. Basically, you can just install a few extractor instances on a VM manually and forget about them, which is how it’s done in many cases.
Hey @eashwar11, for the PI extractor you need a Windows Server machine. Then you can install and configure a few extractors, one for DEV and one for PROD. Extractors usually run continuously. If you’re going to use a cloud VM, you can configure GitHub Actions to update the config files and restart the services, depending on the particular cloud provider.