This request relates to: As a developer, I want to make queries on the datamodel that retain information about the connections between objects. Making large queries on the graph connected by Relationships requires the client side to reconstruct the graph in memory. There are no semantics in CDF for preserving ordering in relationship queries, nor any imputation for order preservation, meaning that for each step in the graph when making parallel queries, the graph has to be reconstructed locally. We understand that this feature is largely addressed by Flexible Data Models, but we're curious how this will resolve in the query language for FDMs, especially when we have unions of types that edges can refer to in the graph. This feature has the highest priority for the Data Modelling and App teams at Statnett.
As a developer, I want to make verbose queries on Assets, so that I have to maintain less complex code on the client side of applications. Retrieving Assets by Relationships and the Hierarchy is cumbersome and shifts complexity to the client. A verbose query language for the datamodel (like SQL, SPARQL, or what GraphQL allows) would delegate complexity back to the server, and is preferable to the in-memory rebuilding of the model that now takes place. We understand that most of this feature will be covered by Flexible Data Models, but we're curious about the graph-traversal capabilities it offers. This feature has the highest priority for App development at Statnett.
Hey! In our tooling for Power Analysts we compute synthetic time series for them, to evaluate scenarios where flow exceeds a threshold value. In the current implementation of SyntheticTimeseries, only Average and Interpolation aggregates are allowed. This leads to scenarios where zoomed-out (and down-sampled) views of data computed through the SyntheticTimeseries API display non-informative values. Consider the example below, where the top image is the un-aggregated addition of two time series, and the bottom is with the use of SyntheticTimeseries.
As a developer, I want to explore the datamodel in CDF, so that I can verify my integration and understand and traverse the datamodel. Front-end and back-end developers both depend on visually exploring the datamodel in CDF. The current tooling in Fusion is lacking in two major regards: a) When traversing instances in the graph, one easily gets lost in Fusion. A graph-exploration tool would allow managing traversals in a relational datamodel, for quickly creating and iterating on applications. b) Traversing a meta-graph (a graph describing the relations between different resources in CDF, e.g. PowerTransformer to Terminal, based on a set of Labels) would allow the developer to verify the integrity of their model, and the app developer/UX designer becomes more independent of the backend/SME expert when solving use cases. This use case has low priority for the Statnett Data Modelling team, and medium priority for the App development team.
As a developer, I want to use a streaming data connector in the data integration, so that the data pipeline codebase can be reduced in complexity. Extracting data from a stream process to CDF RAW/CLEAN is a critical piece of infrastructure maintained at Statnett. The same features as exist for SQL extraction would allow us to reduce complexity and increase the maintainability of the extraction process. This code must be executable on-prem at Statnett.
Hi! As a Data Scientist at Statnett I need the capability to sum more than 100 TimeSeries in one call to CDF. Currently, I can achieve this by building an addition "tree" in Charts, but that scales extremely poorly. I cannot solve this with Synthetic TimeSeries either, as there is a hard limit of 100 series per call. This request supports multiple use cases at Statnett, which will provide both operational and business value.
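As a stop-gap until this exists server-side, the summation can be done client-side by fetching the series in chunks of at most 100 and adding the resulting frames. A minimal sketch, in which the `fetch` callable is a placeholder for whatever SDK call returns a DataFrame of values (for example a wrapper around `client.datapoints.retrieve_dataframe`, an assumption, not part of this request):

```python
from typing import Callable, List, Optional
import pandas as pd

def sum_series(external_ids: List[str],
               fetch: Callable[[List[str]], pd.DataFrame],
               chunk_size: int = 100) -> Optional[pd.Series]:
    """Sum many time series client-side.

    Fetches the series in chunks of at most `chunk_size` ids (the
    per-call limit), sums each chunk column-wise, and accumulates the
    partial sums. Missing values are treated as 0.0 when accumulating.
    """
    total: Optional[pd.Series] = None
    for i in range(0, len(external_ids), chunk_size):
        df = fetch(external_ids[i:i + chunk_size])
        partial = df.sum(axis=1)
        total = partial if total is None else total.add(partial, fill_value=0.0)
    return total
```

This trades one server-side call for `ceil(n / 100)` calls plus a local reduction, so it is a workaround rather than a substitute for the requested feature.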
Hey, At Statnett we’ve repeatedly found the need for exploring the datamodel in CDF as a graph, visually. Similar to the now deprecated relationship view in an old version of ADI (I may be misremembering). Our app developers and designers (and users) would like a way to incrementally traverse the graph around a node (TimeSeries, Asset, Event, Sequence) of interest. Does Cognite have some tooling for this type of graph-exploration? Is something in the works?
As a developer, I want to be able to retrieve min/max values from synthetic time series, and not just the average values, from the JavaScript SDK. The custom calculation feature in tnt is complicated by not being able to retrieve min/max values from synthetic time series; we therefore have to load hourly granularity to show a representative time series. We would like synthetic time series to return the min/max values as well. This should take into account that each time series might not have its max/min values at the same time within the granularity. This has the highest priority from the Statnett App development team.
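Until synthetic time series can return these aggregates, a min/max envelope can be approximated client-side by retrieving the computed series at a fine granularity and down-sampling it locally. A sketch with pandas (the granularity string here is a pandas offset alias, not a CDF granularity):

```python
import pandas as pd

def minmax_envelope(values: pd.Series, granularity: str = "1h") -> pd.DataFrame:
    """Down-sample a (synthetic) series to a per-bucket min/max envelope.

    Note the caveat from the request: the min and the max within a
    bucket generally occur at different timestamps. Here both are
    stamped at the bucket start, mirroring how aggregate windows are
    usually labelled, so the original timestamps are not preserved.
    """
    grouped = values.resample(granularity)
    return pd.DataFrame({"min": grouped.min(), "max": grouped.max()})
```

The obvious cost is bandwidth: the fine-granularity data has to cross the wire, which is exactly what a server-side min/max aggregate would avoid.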
As a developer, I want to only interface with externalIds in CDF, so that the state between resources can be maintained consistently, and for easy state management across projects. With the internalId of resources exposed, a vulnerability in the state of CDF arises when parts of the tenant are e.g. recovered. Additionally, the management of internalIds complicates the development and maintenance of the "fusion" part of CDF: linking Events to Assets, TimeSeries to Assets, etc. We understand most of this feature to be covered by Flexible Data Models, and temporarily by a managed consistency tool in Cognite's recovery portfolio. This feature has the highest priority for the modelling of Statnett data.
Hi, when parsing a large production model, some high-level concepts we want to filter on are structured as assets, e.g. a fiscal region for power in the Statnett case. Our model is now hitting the limitations of subtree queries, with more than 100k assets per region for some regions. What is your thinking around how to handle such cases in a model? To spark the conversation, we've considered moving high-level parts of the tree to Labels, or performing more of the DB-type operations locally. However, the former demands a pipeline for moving concepts suited to a tree into a Label "just because", and the latter requires quite a lot of iron on the local instance processing the query. An instance of the SDK querying we do can be seen in the power-SDK on GitHub.
As a developer/architect, I want to version my datasources in CDF, so that we can manage compatibility between datasources (fusion layer) and compatibility internally in the datasource. This request pertains to two related challenges in CDF: a) A versioned datamodel allows a concept of a rollback of the datamodel (by enabling a concept of breaking changes inside/between datasources). It also allows for an unambiguous recovery of a tenant/project where dependencies between datasources/resources are explicitly declared. b) A versioned datamodel will detect breaking changes in the datasource, and allow the maintainer to reason about blocking a version (in case of an error), incrementing a breaking major version, etc. This feature request has medium priority from the Statnett data modelling team.
As a developer, I want to search for resources, so that the SME can access relevant equipment. The current search functionality in CDF is at a bare minimum, and getting the SME to the correct equipment is cumbersome and requires significant complexity in the client. This feature has medium priority for the App development team at Statnett.
Hey, currently it's hard to maintain a coherent versioned datamodel in CDF. With Templates, and further developments there, this becomes easier. Even with a versioned datamodel, it seems to me that there still isn't a good way to track changes in the model that are not version-breaking. In our datamodel we have two types of changes that we would like to track in a structured way: field value updates, as simple as a metadata field and as complex as the unit multiplier on a TimeSeries changing; and edge field value updates. Our model is an adaptation of an RDF representation of a Common Information Model in the Common Grid Model Exchange Specification for Norway. A field on an object can imply an edge in the datamodel, and this field can change. Currently this is solvable with Labels and Start/End time on Relationships, though it isn't obvious that we would want to introduce an Edge type to Templates? This question is part of the larger struggle of maintaining a sane structure on a time-
Hey, at Statnett we're thinking about our progress towards a world of more flexible data models. Although your introductions have been great, we're still confused about the particulars surrounding the transition between types of models. In particular, we are considering the lineage and manageability of transformations of data. In the language of your data modelling: we want the arrows between Source, Domain and Solution models to be versioned and configured as code. Is this a planned feature? If so, do you have any sketches of how this will look? Transformations might be what we call "Interfaces" in the Domain and Solution models. I.e. the physical transformer from (e.g.) Siemens is implemented by a functional 300kV-to-132kV transformer at the substation "Foo". Currently, we model this as two assets (phys. trafo and func. trafo) with a Relationship labelled "Implements". These relationships are more or less hand-crafted, but we would like to have these mappings, and links, as code. Preferra
Hey! In accordance with how you're thinking about flexible data models: has Cognite done any exploration around flexible time-series modelling? Akin to what you're doing with Charts, a fleet of sensors might be viewed as having the same transformational needs as an Asset does. At Statnett, we have "views" of data (typically some linear combination of time series). We persist these with synthetic time series as an intermediary, then upload to a new TimeSeries. Unfortunately, the data is not immune to updates or backfills, so these computed TimeSeries either have to have a pretty severe lag, or we have to re-compute at an alarming rate, wasting compute. These computed time series are further used in computations that we would like persisted. Do you envision expanding Charts to cover this functionality? I.e. having the configuration as code, with stream logic for re-computes etc.? Or do you have no plans for Managed Data Transformations? Thanks for your response :)
Hi, when developing with the Cognite Python SDK, a common pain point is the limits the API imposes on queries. Querying time series, for example, is restricted to 100 asset ids per call. I believe the Python SDK should recognize these constraint violations and batch or concurrently dispatch requests in chunks that do not violate the constraints, optionally with a performance warning to the developer. Is this sane, or do you think that the developer/customer should maintain a wrapper on the SDK for batching each endpoint (as we've currently done at Statnett)?
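For reference, the kind of wrapper described above is small and endpoint-agnostic. A sketch of what such a helper could look like; the `query` callable stands in for any list-style endpoint call (e.g. a lambda wrapping `client.time_series.list` with an `asset_ids` filter, which is an illustration of usage, not a confirmed SDK behaviour):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import chain
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")

def chunked_query(items: Sequence[T],
                  query: Callable[[List[T]], List[R]],
                  limit: int = 100,
                  max_workers: int = 8) -> List[R]:
    """Split `items` into chunks respecting the API's per-call limit,
    dispatch the chunks concurrently, and concatenate the results in
    chunk order."""
    chunks = [list(items[i:i + limit]) for i in range(0, len(items), limit)]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(chain.from_iterable(executor.map(query, chunks)))
```

Doing this inside the SDK, as the post suggests, would additionally let the library emit the performance warning and pick sensible concurrency defaults per endpoint.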
Hey, in our current workflow we're expanding the use of Functions as a tool. We're somewhat hampered by how "clumsy" it is to bundle and version-control proprietary dependencies. Is there something in the pipeline to address this issue? Kind regards, Robert
Hey! In our tooling for the power analyst we make computations using the SyntheticTimeseries API. Results from these are used in subsequent analysis, where we have identified a problem. When a synthetic timeseries is computed with a specified aggregate and resolution, that specification is returned irrespective of there being data on the originating timeseries. In our case, we compute aggregates over long stretches of time, which results in situations like the image below. In the figure, the opaque line is computed by addition of the two other signals. Spanning roughly a year, the period in the middle has a linear rise in the SyntheticTimeseries while the originals are empty. These values are of course meaningless and should be omitted in the successive steps of the analysis. Have you considered implementing a density filter or the like for these types of situations? Or do you believe this is best solved client-side by identifying the holes prior to a set of Syntheti
Hey! I have a problem with filling a gap from a source to a timeseries in CDF.

Problem description: We're filling a hole in a time series from time A to B. There are some datapoints on the edges of the interval in CDF. Data is extracted from the source, and in Python prepped for the datapoints API as a list-of-tuples payload. For an arbitrary period I extract 2976 datapoints, which I upload to CDF. Subsequently, I query the time series for the same period of time and receive 2928 datapoints. There are no NaN values in the input for either the date-time or value. The data is also hourly, so I'm wary of it just being an edge effect of poor timestamp specifications for the retrieval. What other PEBCAK things have I missed? Simplified example included below:

payload
>> [{'externalId': 'ts_externalid', 'datapoints': [...]}]
payload[0]["datapoints"][10]
>> (1617271200000, 0.0)
client.datapoints.insert_multiple(payload)
meter_data = client.datapoints.retrieve(
    start=dates[0], end=dat
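One thing worth ruling out first in a situation like this: CDF stores at most one datapoint per timestamp on a series, so duplicated timestamps in a payload collapse silently on insert (a deficit of 2976 - 2928 = 48 would then correspond to 48 collapsed duplicates, e.g. from duplicated source rows or a DST transition). A quick payload check, sketched below:

```python
from collections import Counter
from typing import Dict, List, Tuple

def duplicate_timestamps(datapoints: List[Tuple[int, float]]) -> Dict[int, int]:
    """Return the timestamps (epoch ms) that occur more than once in a
    datapoints payload, mapped to their occurrence counts. Any entry
    here means fewer points will be stored than were inserted."""
    counts = Counter(ts for ts, _ in datapoints)
    return {ts: n for ts, n in counts.items() if n > 1}
```

If the payload turns out to be clean, the retrieval window is the next suspect: it is worth double-checking whether the `end` bound of the retrieval is exclusive, which can shave points off the edge of the interval.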
When exploring the datamodel in CDF through fusion, it would be nice to add labels to the table overview, as shown below.
Hey, to what extent does CDF handle possible race-condition-triggering situations, like the update of an object from two different systems? The example we are currently considering is whether a disjoint set of metadata keys on objects (events) can be updated without regard to timing. /Robert
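For what it's worth, the answer hinges on the update semantics rather than on timing alone: a key-level metadata "add" commutes for disjoint key sets, while a full metadata "set"/replace is last-write-wins even when the key sets are disjoint. The distinction can be illustrated without the SDK (a sketch; `merge` models key-level add semantics):

```python
from itertools import permutations
from typing import Dict, List

def merge(metadata: Dict[str, object], patch: Dict[str, object]) -> Dict[str, object]:
    """Key-level update: only the keys present in `patch` change."""
    out = dict(metadata)
    out.update(patch)
    return out

def is_order_independent(initial: Dict[str, object],
                         patches: List[Dict[str, object]]) -> bool:
    """Apply the patches in every possible order under key-level merge
    and check whether the final metadata is always the same. Disjoint
    patches commute, so ordering (and hence timing) does not matter."""
    results = set()
    for order in permutations(patches):
        state = initial
        for patch in order:
            state = merge(state, patch)
        results.add(tuple(sorted(state.items())))
    return len(results) == 1
```

So if the two systems use key-level updates on disjoint key sets, the result is timing-independent; if either of them replaces the whole metadata object, it can clobber the other system's keys regardless of how the race resolves.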