Closed

PI Extractor Case Sensitive Tag Feature Request

Related products: Extractors
  • May 14, 2024
  • 1 reply
  • 84 views


The Cognite PI extractor is case sensitive: it only picks up tags whose names exactly match those in the OSI PI server data source.
This causes major issues for our data ingestion process, because the tag names shared by the use case proponent are not always identical to those on the PI server itself (the capitalization of letters within the name may differ).

 

For example:
-  Shared tag name: A01AAB0A.pv
-  Tag name in PI server: a01AAB0A.PV
In this situation, the following config would fail to ingest the listed tag:
    extractor:
        include-tags:
            - "A01AAB0A.pv"

It would be great if the PI extractor had a case-sensitive: true/false flag to allow case-insensitive tag filtering for ingestion.
For example, something like this would allow the listed tag to be ingested:
    extractor:
        case-sensitive: false
        include-tags:
            - "A01AAB0A.pv"

We often fail to ingest time series because of this kind of discrepancy, which inconveniences users and consumes our time as maintainers of the Cognite PI extractor.
We believe it would be much easier if the extractor could read and ingest PI tags regardless of capitalization, or if we could add a flexibility parameter or a regular expression that matches a broader range of variants of a specific tag.

1 reply


Case insensitivity is a fairly well-known and complex problem. From an extractor perspective, the simplest (and perhaps least valid) argument against the extractor doing it is that it implies changing the source (meta)data.
 

However, how simple the problem is depends on the character systems supported by the sources and on character system support in the data stores in CDF. It’s a pretty sizable problem space, and it adds a lot of complexity for the «service in the middle» (the extractor) to try to get right in every possible edge case. And if experience is an indicator, it’s unlikely to be solved in a satisfactory manner across our current and future customer base.
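
To illustrate the kind of edge case meant here, the following is a minimal Python sketch, not extractor code. The helper names `equal_by_lower` and `equal_by_casefold` are hypothetical; they only show that two common ways of "ignoring case" can disagree once names go beyond plain ASCII.

```python
def equal_by_lower(a: str, b: str) -> bool:
    """Compare two names after simple lowercasing."""
    return a.lower() == b.lower()


def equal_by_casefold(a: str, b: str) -> bool:
    """Compare two names after Unicode case folding."""
    return a.casefold() == b.casefold()


# Plain ASCII names behave the same under both strategies.
ascii_lower = equal_by_lower("A01AAB0A.pv", "a01AAB0A.PV")
ascii_fold = equal_by_casefold("A01AAB0A.pv", "a01AAB0A.PV")

# German sharp s: lowercasing keeps it, case folding expands it to "ss",
# so the two strategies give different answers for the same pair.
sharp_lower = equal_by_lower("STRASSE", "straße")
sharp_fold = equal_by_casefold("STRASSE", "straße")

# Capital I with dot above (U+0130) lowercases to two code points,
# so it does not compare equal to a plain "i".
dotted_len = len("\u0130".lower())
dotted_equal = equal_by_lower("\u0130", "i")
```

Whichever rule a middle layer picked, some source or destination could plausibly disagree with it, which is the complexity described above.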

As a result, at this point in time, the extractor will not be adding additional functionality to ignore the case of the source data when doing matching.
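
Since matching stays exact, one possible workaround is to correct the shared tag names against the server's own names before writing them into include-tags. The sketch below is an assumption-laden illustration, not part of the extractor: `resolve_tag_names` is a hypothetical helper, and it assumes you can obtain the list of tag names from the PI server by some other means (for example, an export).

```python
from collections import defaultdict


def resolve_tag_names(shared_tags, server_tags):
    """Map shared tag names to the exact spelling used on the server.

    Returns (resolved, unmatched, ambiguous):
      resolved  - server spellings for shared tags with exactly one match
      unmatched - shared tags with no case-insensitive match
      ambiguous - shared tag -> list of server names differing only by case
    """
    # Index server names by their case-folded form.
    by_folded = defaultdict(list)
    for name in server_tags:
        by_folded[name.casefold()].append(name)

    resolved, unmatched, ambiguous = [], [], {}
    for tag in shared_tags:
        candidates = by_folded.get(tag.casefold(), [])
        if len(candidates) == 1:
            resolved.append(candidates[0])
        elif not candidates:
            unmatched.append(tag)
        else:
            # Do not guess; leave the choice to a human.
            ambiguous[tag] = candidates
    return resolved, unmatched, ambiguous


# Example using the tag names from this thread.
resolved, unmatched, ambiguous = resolve_tag_names(
    ["A01AAB0A.pv", "MISSING.pv"],
    ["a01AAB0A.PV", "B02XYZ.PV"],
)
```

The `resolved` list can then be pasted into include-tags with the server's exact capitalization, while `unmatched` and `ambiguous` point to names that need a manual check.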