Product Ideas Pipeline

Hosted Event Hub Extractor Triage / Routing to Different Datasets (Closed)

Posted for inquiry here: Cognite Hub, but I don't believe the feature currently exists. We need a way of routing time_series and datapoint items from within a Hosted Extractor configuration to different, or possibly multiple, data sets. The question below is copied from the referenced posting, to describe the requested feature:

Within our implementation we have an existing Hosted Extractor reading data from an IoT Hub that carries data for multiple sites. Our Hosted Extractor mapping template filters for events that have a particular deviceId on them, representing the location the events come from. In an effort to ingest another site's data feed, I wanted to extend the template with an ELSE IF condition holding the mapping rules for the other location, which are almost identical to the first except for the target data sets. I have since realized that the target data set is set in the Sink section of the extractor configuration. The net result is needing to create redundant Hosted Extractor configurations that change only a filter, rather than having a cascading ELSE IF ruleset that applies to the full stream.

For example, this is the pseudo-template for our existing Hosted Extractor configuration for one site:

    if (context.messageAnnotations.`iothub-connection-device-id` == "SITE_A") {
        input.map(record_unpack => {
            "type": "raw_row",
            "table": "tb_iot_test",
            "database": "db_iot_testing",
            "key": concat("TS_CO_", record_unpack.NAME),
            "deviceId": context.messageAnnotations.`iothub-connection-device-id`,
            "TAG": record_unpack.NAME,
            "IP_INPUT_VALUE": record_unpack.IP_INPUT_VALUE,
            "IP_INPUT_TIME": record_unpack.IP_INPUT_TIME,
            "IP_INPUT_QUALITY": record_unpack.IP_INPUT_QUALITY
        }).filter(item => item.IP_INPUT_QUALITY == "Good").flatmap(item => [
            {
                "type": "time_series",
                "name": item.TAG,
                "externalId": item.key,
                "metadata": { "IP21_DEVICE_ID": item.deviceId },
                "isString": false,
                "isStep": false
            },
            {
                "type": "datapoint",
                "externalId": item.key,
                "value": try_float(trim_whitespace(item.IP_INPUT_VALUE), null),
                "timestamp": to_unix_timestamp(item.IP_INPUT_TIME, "%Y-%m-%dT%H:%M:%S.%6fZ")
            }
        ])
    } else {
        []
    }

Ideally I want to add an ELSE IF condition where the only change to the outcome is that iothub-connection-device-id equals SITE_B in this case (a sketch of what we are asking for is included at the end of this post). While the record_unpack step is able to route different locations to different staging tables, I was unable to find a means of specifying an alternative target for the time_series and datapoint items, other than creating a whole new configuration with a different Sink setting for this extractor.

We have multiple sites all sharing one IoT Hub. IoT Hub has a limit on the number of consumers that can be connected, which is currently larger than our number of sites. We handle routing in the consumer logic of this stream; however, Cognite does not appear to provide this mechanism in the templating language, instead fixing the target in the Sink. This presents a configuration-scaling issue for us, given our source data and the Hosted Extractor's constraint of specifying the target per configuration.

Am I missing something, and is there a way to achieve what we're looking for, or will this require a second Hosted Extractor configuration? If we must go down the path of redundant Hosted Extractors with modified logic and a different Sink, we are going to hit an Azure IoT limit before our full scale-out. We would like to understand whether this is a feature that can be provided, or whether we should be planning otherwise.
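For illustration, here is roughly what such an extension could look like, appended to the template above in place of the plain ELSE. This is a sketch of the requested feature, not something that works today: the "dataSetExternalId" field on the time_series item is hypothetical (the data set is currently fixed once in the Sink), and the table name tb_iot_site_b and data set name ds_site_b are made up for the example:

    } else if (context.messageAnnotations.`iothub-connection-device-id` == "SITE_B") {
        input.map(record_unpack => {
            "type": "raw_row",
            "table": "tb_iot_site_b",
            "database": "db_iot_testing",
            "key": concat("TS_CO_", record_unpack.NAME),
            "deviceId": context.messageAnnotations.`iothub-connection-device-id`,
            "TAG": record_unpack.NAME,
            "IP_INPUT_VALUE": record_unpack.IP_INPUT_VALUE,
            "IP_INPUT_TIME": record_unpack.IP_INPUT_TIME,
            "IP_INPUT_QUALITY": record_unpack.IP_INPUT_QUALITY
        }).filter(item => item.IP_INPUT_QUALITY == "Good").flatmap(item => [
            {
                "type": "time_series",
                "name": item.TAG,
                "externalId": item.key,
                "metadata": { "IP21_DEVICE_ID": item.deviceId },
                "isString": false,
                "isStep": false,
                "dataSetExternalId": "ds_site_b"
            },
            {
                "type": "datapoint",
                "externalId": item.key,
                "value": try_float(trim_whitespace(item.IP_INPUT_VALUE), null),
                "timestamp": to_unix_timestamp(item.IP_INPUT_TIME, "%Y-%m-%dT%H:%M:%S.%6fZ")
            }
        ])
    } else {
        []
    }

With something like this, a single configuration (and a single IoT Hub consumer) could fan the full stream out to per-site data sets, instead of one configuration per site.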

Uploading files to Canvas: the list shows data sets I can read, not data sets I can write to (Gathering Interest)

This is the cause of a lot of frustration for our users at the moment. When trying to manually upload a file to Canvas you need to select a data set. However, the list you get is not the data sets you can actually upload a file to, but simply a list of all data sets you have read access to. This makes it really difficult to identify a data set you can actually use, as there is no indication until you click upload and an error pops up.

In general, end users of CDF at our company will not have access to upload files manually to most of our data sets in production, due to governance, which is why we have set up a dedicated data set for this purpose. So we think a better solution would be for this list to show all the data sets the user can actually upload to (all data sets the user has write access to), rather than all the data sets that will produce an error. Or at the very least, show some kind of indicator of the access level. Trial and error on the user's side is just a nuisance; we don't see the point in showing data sets that will fail without indicating that they will fail.

The intended audience for Canvas is not the developers who know all aspects of our implementation of CDF, but the domain experts who know the data. We want their experience to be as seamless as possible, which it currently is not, and we want to avoid the system even allowing these kinds of mistakes.

Markus Pettersen
Aker BP - Technical Domain Architect for CDF
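To make the request concrete, here is a rough sketch of the filtering we have in mind, written against the Python SDK. Treat it as pseudocode rather than how Canvas should be implemented: the token-inspection response shape is our assumption from the public API, and filesAcl WRITE access is used as a stand-in for "can upload here".

    from cognite.client import CogniteClient

    # Sketch: list only the data sets the current user can upload files to.
    # Assumes each capability from /api/v1/token/inspect looks roughly like
    #   {"filesAcl": {"actions": ["READ", "WRITE"],
    #                 "scope": {"datasetScope": {"ids": ["123"]}}}}
    client = CogniteClient()

    capabilities = client.get("/api/v1/token/inspect").json().get("capabilities", [])

    writable_ids = set()
    can_write_all = False
    for cap in capabilities:
        files_acl = cap.get("filesAcl")
        if not files_acl or "WRITE" not in files_acl.get("actions", []):
            continue
        scope = files_acl.get("scope", {})
        if "all" in scope:
            can_write_all = True
        else:
            ids = scope.get("datasetScope", {}).get("ids", [])
            writable_ids.update(int(i) for i in ids)

    # This, rather than everything readable, is the list we would like
    # the Canvas upload dialog to show.
    for ds in client.data_sets.list(limit=None):
        if can_write_all or ds.id in writable_ids:
            print(ds.id, ds.name)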