P&ID Annotation Workflow using CDF Data Modeling [Cognite Official]

  • March 3, 2025

Jan Inge Bergseth

This how-to article provides a structured example template for automating P&ID annotation in Cognite Data Fusion (CDF). The process uses CDF Data Modeling and Workflows to link P&ID documents to assets and to other related files within the data model.

Overview of the Workflow

The workflow consists of two automated functions scheduled and executed using CDF Workflows:

  1. Metadata & Alias Processing: Updates or creates metadata and aliases for files and assets.
  2. P&ID Annotation Processing: Uses the generated aliases to annotate P&ID documents automatically.

The final result is populated annotations in the data model, linking P&ID files to assets and interrelated P&ID diagrams, as illustrated below:

[Image: P&ID annotations populated in the data model]

Key Features of the Workflow

1. Integrated Extraction Pipeline

  • Both functions are connected to a dedicated extraction pipeline.
  • The pipeline stores the overall documentation and configuration, and handles logging and notifications.

2. Metadata and Alias Processing

  • AI/LLM-generated metadata summaries: If no description exists, a summary is generated.
  • Tagging of documents: If a document is identified as a process diagram, the tag PID is added automatically.
  • Alias generation for filenames: Since full file names contain versions and revisions that are often absent in P&ID references, alias variations are created to improve matching.
  • Alias generation for assets/tags: System numbers are removed to enhance precision in asset matching.
  • State storage: A RAW table stores state information, preventing reprocessing of already annotated files.

Note: Naming conventions should be adjusted based on project-specific standards.
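
The two alias rules above can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: the function names, the regular expressions, and the assumed conventions (a revision suffix such as _rev03 on file names, a two-digit system number prefix such as 23- on tags) are hypothetical and should be replaced with your project's naming standard.

```python
from __future__ import annotations

import re

# Hypothetical naming conventions, chosen only for illustration.
_EXTENSION = re.compile(r"\.[A-Za-z0-9]{2,4}$")  # ".pdf", ".dwg", ...
_REVISION = re.compile(r"[-_ ](?:rev|ver|v)[-_ .]?[0-9A-Za-z]+$", re.IGNORECASE)
_SYSTEM_PREFIX = re.compile(r"^\d{2}-")  # assumed two-digit system number


def sketch_file_aliases(file_name: str) -> list[str]:
    """Return alias variations for a file name, most specific first."""
    aliases = [file_name]
    stem = _EXTENSION.sub("", file_name)  # drop the file extension
    if stem not in aliases:
        aliases.append(stem)
    base = _REVISION.sub("", stem)  # drop a trailing revision marker
    if base not in aliases:
        aliases.append(base)
    return aliases


def sketch_asset_aliases(tag: str) -> list[str]:
    """Return the tag plus a variant without the leading system number."""
    aliases = [tag]
    short = _SYSTEM_PREFIX.sub("", tag)
    if short and short not in aliases:
        aliases.append(short)
    return aliases


print(sketch_file_aliases("PH-ME-P-0156-001_rev03.pdf"))
print(sketch_asset_aliases("23-PT-92531"))
```

Keeping the original name as the first alias means an exact reference on a diagram still matches, while the shorter variants catch references that omit the revision or the system number.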

Executing the P&ID Annotation Process

The workflow configuration aligns with the data modeling structure in CDF, including:

  • Instance spaces: Where data is stored.
  • Schema spaces: Definitions for the data model.
  • External ID for views/types: For extended Cognite Asset types.
  • Search properties: Defines how matching is performed.
  • Filtering properties: List of possible values (e.g., tag lists).
  • DEBUG mode: Enables detailed logging.
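
To make the relationship between these settings concrete, the sketch below models them as a small typed structure in Python. The class names, field names, and the instance space value are hypothetical; the module's actual configuration keys are documented in its README.md and default.config.yaml. The sample values cdf_cdm, CogniteFile, CogniteAsset, aliases, and tags come from the Cognite core data model and are used here only as plausible defaults.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ViewSettings:
    """Hypothetical settings for one view (files or assets)."""

    schema_space: str    # where the view definition lives
    external_id: str     # external ID of the view/type
    version: str
    instance_space: str  # where the instances (data) are stored
    search_property: str  # property used for matching
    filter_property: Optional[str] = None  # property to filter on
    filter_values: Tuple[str, ...] = ()    # accepted values for the filter


@dataclass(frozen=True)
class AnnotationConfig:
    debug: bool
    file_view: ViewSettings
    asset_view: ViewSettings


def config_problems(config: AnnotationConfig) -> list[str]:
    """Return human-readable problems; an empty list means the config is usable."""
    problems = []
    for label, view in (("file", config.file_view), ("asset", config.asset_view)):
        for name in ("schema_space", "instance_space", "external_id", "search_property"):
            if not getattr(view, name):
                problems.append(f"{label} view: {name} is empty")
        if view.filter_values and not view.filter_property:
            problems.append(f"{label} view: filter_values given without filter_property")
    return problems


example = AnnotationConfig(
    debug=True,
    file_view=ViewSettings("cdf_cdm", "CogniteFile", "v1", "my_instance_space",
                           "aliases", "tags", ("PID",)),
    asset_view=ViewSettings("cdf_cdm", "CogniteAsset", "v1", "my_instance_space",
                            "aliases"),
)
print(config_problems(example))
```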

Delete & Cleanup Functionality

  • If thresholds for automatic annotation approvals are changed, previous annotations should be deleted.
  • The process only removes annotations generated by this workflow, leaving manual annotations intact.
  • When old annotations are not deleted, their existing external IDs ensure that no duplicate annotations are created.
  • A state store provides incremental processing, so that only new or updated files are processed.

To clean up previous annotations, set cleanOldAnnotations = True in the configuration.
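
The incremental behaviour can be pictured as a comparison between each file's last-updated time and the time recorded in the state store. The sketch below uses plain dictionaries in place of the RAW state table, and the function name is hypothetical. It also assumes that a cleanup run ignores the stored state so that every file is annotated again; check the module code for the behaviour it actually implements.

```python
from __future__ import annotations

from datetime import datetime, timezone


def select_files_to_process(
    files: dict[str, datetime],
    state: dict[str, datetime],
    clean_old_annotations: bool = False,
) -> list[str]:
    """Pick the external IDs of files that need (re)annotation.

    files: external ID -> last updated time of the file
    state: external ID -> time the file was last annotated (stand-in for RAW)
    """
    # Assumption: after a cleanup, nothing counts as already processed.
    known = {} if clean_old_annotations else state
    return [
        external_id
        for external_id, updated in files.items()
        if external_id not in known or updated > known[external_id]
    ]


t1 = datetime(2025, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2025, 2, 1, tzinfo=timezone.utc)
files = {"pid_a": t2, "pid_b": t1, "pid_c": t1}
state = {"pid_a": t1, "pid_b": t1}

print(select_files_to_process(files, state))        # updated and new files only
print(select_files_to_process(files, state, True))  # everything, after cleanup
```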

Annotation Process

  • Filters can be applied to identify relevant Files and Assets.
  • Search properties are used to match P&IDs to corresponding files/assets.
  • If a batch processing error occurs:
    • Retry up to 3 times.
    • If failures persist, switch to individual file processing.
    • If individual processing fails, log the error and skip the file.
  • Matches are stored in a RAW table for documentation.
  • A threshold-based system determines whether annotations are automatically approved or suggested.
  • Annotations are created using the CDF Data Modeling (DM) service.
  • Log status updates are stored in the extraction pipeline log.
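
The error handling and the threshold decision described above can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: the function names, the annotate callback, and the threshold values 0.85 and 0.5 are hypothetical, "retry up to 3 times" is read as one initial attempt plus three retries, and matches below the lower threshold are assumed to be discarded.

```python
from __future__ import annotations

import logging
from typing import Callable, Optional

logger = logging.getLogger("pid_annotation_sketch")

MAX_RETRIES = 3  # assumed: one initial attempt plus three retries


def annotate_with_fallback(
    files: list[str],
    annotate: Callable[[list[str]], dict[str, float]],
) -> tuple[dict[str, float], list[str]]:
    """Annotate a batch; fall back to one file at a time if the batch keeps failing.

    Returns (results, skipped), where skipped lists files that failed individually.
    """
    for attempt in range(1, MAX_RETRIES + 2):
        try:
            return annotate(files), []
        except Exception as exc:
            logger.warning("Batch attempt %d failed: %s", attempt, exc)

    results: dict[str, float] = {}
    skipped: list[str] = []
    for file_id in files:
        try:
            results.update(annotate([file_id]))
        except Exception as exc:
            # Log the error and skip the file, as described in the list above.
            logger.error("Skipping %s: %s", file_id, exc)
            skipped.append(file_id)
    return results, skipped


def annotation_status(
    confidence: float,
    approve_threshold: float = 0.85,  # hypothetical value
    suggest_threshold: float = 0.5,   # hypothetical value
) -> Optional[str]:
    """Map a match confidence to a status, or None to discard the match."""
    if confidence >= approve_threshold:
        return "Approved"
    if confidence >= suggest_threshold:
        return "Suggested"
    return None
```

Processing files individually after repeated batch failures isolates a single problematic document, so that one bad file does not block annotation of the rest of the batch.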

Output & Visualization

  • The workflow generates annotations that are displayed on P&ID diagrams as:
    • Purple boxes – Linking to assets.
    • Orange boxes – Linking to files.
  • These annotations enhance data contextualization and improve traceability between P&IDs and related assets.

[Image: P&ID diagram with purple asset boxes and orange file boxes]

How to Use the Provided Code Example

The provided example is structured as a CDF Toolkit module for governing and deploying the annotation workflow within a CDF project. The content of the toolkit module looks like this:

[Image: contents of the toolkit module]

The README.md file describes each of the resources in the module.

Setup Instructions:

  1. Ensure CDF Toolkit is set up: Follow the guide here.
  2. Download the module code from GitHub:
    • Repository: GitHub link
    • Module name: cdf_p_and_id_annotation
  3. Enable external libraries. Edit your project's cdf.toml and add:

    [alpha_flags] 
    external-libraries = true
    [library.cognite]
    url = "https://github.com/cognitedata/library/releases/download/latest/packages.zip"
    checksum = "sha256:795a1d303af6994cff10656057238e7634ebbe1cac1a5962a5c654038a88b078"

    This allows the Toolkit to retrieve official library packages.

    Optionally, to help improve the Deployment Pack, opt in to usage data collection:

    cdf collect opt-in

    Then initialize the modules:

    cdf modules init . --clean

    This opens the interactive module selection interface.

    ⚠️ Disclaimer: This command will overwrite existing modules. Commit changes before running, or use a fresh directory.

  4. Integrate the module into your CDF project:
    • Copy the module into your CDF Toolkit-based structure.
    • Update the default.config.yaml file with project-specific configurations.
    • Add the module name to the list of selected modules in your project configuration file.


  5. Deploy the module using CDF Toolkit: run cdf build followed by cdf deploy, and confirm that both complete without errors.

After deployment, you should have new resources in each of the following areas of CDF:

  • Staging (RAW database and tables)
  • Extraction Pipelines
  • Functions
  • Data Workflows
  • Access Management configurations

Now that you have a working deployment, proceed to modify the module’s configurations and scripts to align with your project requirements.

Modify the module’s default configuration to fit your project

  • In your CDF Toolkit configuration YAML file, update:
    • location_name: replace LOC with the name of the location or asset you are working on
    • source_name: replace SOURCE with the source your P&ID files are extracted from
  • Update the folder structure: rename files, replacing LOC and SOURCE where applicable.

[Image: module folder structure with LOC and SOURCE in the file names]

  • Modify the alias functions in the module if necessary:
    • Update get_file_alias_list
    • Update get_asset_alias_list

Use the Toolkit to build and deploy in order to test your modifications. You can also test locally before deploying, as described in the module's README.md file.

 

Testing & Validation in CDF

Once deployed, trigger the workflow run to validate the P&ID annotation process:

  • Use CDF Search (in Industrial Tools) to verify that annotations are correctly applied.
  • Confirm that P&ID assets are correctly linked within the data model.

Next Level

The P&ID workflow is an important building block for establishing a broader process around annotation. As shown in the illustration below, it covers:

  • Alias creation
  • Reading the sources to be annotated
  • Diagram detection, that is, annotation of the P&IDs
  • Output of the mapped documents and assets to RAW tables

Combining this with a custom application or process that offers P&ID edit functionality would provide high-quality annotations that can be used across multiple applications and use cases.

[Image: building blocks of the P&ID annotation process]