Hi @Haw Keat Lim, thanks for the question. RAW (aka the staging area) is a temporary storage area for data copied from source systems. It's JSON blob storage behind the scenes. Since it's temporary storage, it's not designed to store a schema and is not optimised for direct querying or analytics. Users are expected to use transformations or functions to read data from RAW, enforce a schema, write to the data model, and then query the data. I hope that helps. Let us know if you think you need the schema retained in RAW for your use case. We would like to understand better. Cheers! Sunil
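To illustrate the "read from RAW, enforce schema, write to the data model" step described above, here is a minimal sketch in plain Python. It does not use the Cognite SDK; the column names, the `SCHEMA` mapping and the `enforce_schema` helper are all hypothetical and only show the general idea of coercing schemaless JSON rows into typed records.

```python
import json
from datetime import datetime, timezone

# Hypothetical target schema: column name -> converter function.
# These names are illustrative and not taken from any real data model.
SCHEMA = {
    "asset_id": int,
    "name": str,
    # Assumes timestamps arrive as epoch milliseconds.
    "installed_at": lambda v: datetime.fromtimestamp(int(v) / 1000, tz=timezone.utc),
}


def enforce_schema(raw_row: str) -> dict:
    """Parse one schemaless JSON row and coerce it to SCHEMA.

    Missing columns become None; columns not in SCHEMA are dropped.
    """
    columns = json.loads(raw_row)
    typed = {}
    for column, convert in SCHEMA.items():
        value = columns.get(column)
        typed[column] = None if value is None else convert(value)
    return typed


# Two staged rows, as they might look in a JSON blob store.
raw_rows = [
    '{"asset_id": "42", "name": "Pump A", "installed_at": "1700000000000"}',
    '{"asset_id": 7, "name": "Valve B", "extra": "ignored"}',
]
typed_rows = [enforce_schema(row) for row in raw_rows]
```

The point of the sketch is that the staging rows carry no type information of their own, so the typing decision lives in the step that reads them.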
Hi @Gaetan Helness, I understand the concern about the access permissions you would need to create temporary databases/tables in production environments. How about using Common Table Expressions to store the results of a subquery temporarily? Does that work for you? Thanks, Sunil
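As a rough illustration of the Common Table Expression suggestion, the sketch below uses Python's built-in `sqlite3` module as a stand-in SQL engine (Transformations run Spark SQL, which also supports `WITH`). The `readings` table and its columns are made up for the example. The `sensor_avg` CTE plays the role a temporary table would otherwise play: it exists only for the duration of the query, so nothing has to be created in the database.

```python
import sqlite3

# In-memory database used purely as a stand-in SQL engine for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.0), ("a", 3.0), ("b", 10.0), ("b", 20.0)],
)

# The CTE holds the per-sensor average for the duration of this one query.
query = """
WITH sensor_avg AS (
    SELECT sensor, AVG(value) AS avg_value
    FROM readings
    GROUP BY sensor
)
SELECT r.sensor, r.value
FROM readings AS r
JOIN sensor_avg AS s ON s.sensor = r.sensor
WHERE r.value > s.avg_value
ORDER BY r.sensor
"""
above_average = conn.execute(query).fetchall()
```

Here `above_average` contains the readings that exceed their own sensor's average, computed without any temporary table.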
Gathering Interest→Parked
Hi Gaetan, the reason we don't support deletes in Transformations is that a single run could end up performing a very large number of deletes (i.e. hundreds of thousands), which is very harmful for RAW with its current implementation. Thanks, Sunil
Hi @Ben Brandt, the fix is now deployed. Please take a look. This took a bit longer than expected; thank you for your patience :))
Hi @Ben Brandt, this is taking longer than expected because our Spark data source is hard to debug locally and because of some unfortunate sick leave due to the weather. Please be assured this is still at the top of our sprint board, along with the orchestration capability within CDF. I will get back to you once it is fixed. Sorry for the inconvenience.
Hi @Ben Brandt, we are on it this sprint. It is a bug on the transformation side: when we queried directly using the time series APIs, we got the results as expected. We will fix it shortly and will inform you once it is fixed. Apologies for the delay. :))
@Ben Brandt This is weird. I will check this tomorrow and get back. Apologies for the delay :)
Thank you for sharing, @Ben Brandt! I really appreciate it. I have shared these with the development team, which is currently evaluating orchestrators to find a good fit for our requirements and tech stack.
@Ben Brandt (My bad, please ignore my previous comments. I confused this with another Jira ticket we have in the backlog related to renaming transformations.) You can update the name and external ID of a transformation in the new UI, as shown in the image above. In the old UI we also allowed updating the name and external ID of scheduled Transformations, but we don't allow that in the new UI. During the design of the new UI we observed that scheduled Transformations are often deployed in controlled environments (production/staging) using the CLI/SDK, and allowing changes to these Transformations in the UI caused them to drift out of sync with the code/scripts used to deploy them.
@Ben Brandt Unfortunately no :) Cognite Functions is similar to Azure Functions, i.e. the lifetime of a Cognite function is too short for long-running Transformations. A stopgap for now would be to run the Python script on a VM. A proper solution, we believe, is to have an orchestration service in CDF. We are currently working on the details and designs to figure out exactly how it's going to work. I can share more once the plans are more concrete.
https://docs.cognite.com/api/v1/#tag/Transformations/operation/runTransformation
Hi @Ben Brandt, that change was intentional in the new UI. Users can run a transformation using its externalId via the API, SDK and CLI. In the past, some users updated the externalIds in the UI, which caused Transformations triggered by the SDK and CLI using the externalId to fail.
@Ben Brandt you can run ad hoc SQL using Cognite's .NET SDK and the Python SDK: https://cognite-sdk-python.readthedocs-hosted.com/en/latest/cognite.html#preview-transformations and https://github.com/cognitedata/cognite-sdk-dotnet/blob/e4e991a123c1732fd4220ba3294793e6bf69c611/CogniteSdk/src/Resources/Transformations.cs#L357 The Python SDK is our gold standard and very well supported. The .NET SDK is stable as well, but its documentation needs to be improved. :)
Hi Ben, yes, the "id" of a timeseries is needed. I have created a dev ticket to investigate the error and provide a fix. :) I would encourage you to consider using explore data or transformation previews for querying, as the Postgres Gateway is intended only for ingestion. Though querying is possible, we don't officially support it. https://docs.cognite.com/cdf/integration/guides/interfaces/postgres_gateway#when-should-you-use-the-postgresql-gateway Best regards, Sunil
Hi Anders! Thanks for reaching out. I fully agree with your suggestion: "All fields are filled except Client Secret which has **** indicating that it is set". This is top of mind for us as well. We are currently working on a new transformation UI, and we will address this along with other UX improvements. Cheers!