I have to build two types of extractors:
1. PI extractor - Connect to the PI server, fetch the data, and ingest it into CDF.
2. SharePoint Online - Extract data from files stored in SharePoint Online and ingest it into CDF.
I would like to understand how extractors are typically structured. In particular, I want to account for scenarios where the PI server is unavailable and the PI extractor cannot fetch data. How should these situations be handled in the extractor code? Also, how should the extractor be monitored while it runs? Are there any sample code repos I can reference to get a complete picture of how to build extractors?
We have pre-built extractors for both of these systems. You can read about them on our docs page:
The extractors themselves can be downloaded from fusion.cognite.com after you have logged in.
Thanks @mathialo for your inputs. Could you please share monitoring guidelines and ways of handling the data stream if there are any hiccups in PI server connectivity? How does the extractor resume from where it stopped or halted, and how should those scenarios be handled in the extractor scripts?
The extractor reconnects automatically on connectivity issues. It also keeps track of its extraction state so that it can resume from where it left off if it is shut down or crashes.
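For anyone who does end up writing a custom extractor, the two behaviours described above (retrying on connectivity errors and resuming from saved state) can be sketched roughly as follows. This is not the prebuilt PI extractor's implementation. The `fetch` and `ingest` callables, the JSON state file, and the `last_timestamp` key are all hypothetical stand-ins for a real source client and a real CDF ingestion call.

```python
import json
import time
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

Point = Tuple[int, float]  # (timestamp in ms, value); illustrative shape only


def load_state(path: Path) -> int:
    """Return the last ingested timestamp, or 0 if no state file exists yet."""
    if path.exists():
        return json.loads(path.read_text())["last_timestamp"]
    return 0


def save_state(path: Path, last_timestamp: int) -> None:
    """Persist the checkpoint so that a restart resumes from this point."""
    path.write_text(json.dumps({"last_timestamp": last_timestamp}))


def run_once(
    fetch: Callable[[int], Iterable[Point]],   # hypothetical source reader
    ingest: Callable[[List[Point]], None],     # hypothetical CDF writer
    state_path: Path,
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fetch points newer than the checkpoint, retrying with exponential backoff."""
    last = load_state(state_path)
    attempt = 0
    while True:
        try:
            points = list(fetch(last))
            break
        except ConnectionError:
            attempt += 1
            if attempt >= max_retries:
                raise  # give up and let the caller or service manager decide
            sleep(base_delay * 2 ** (attempt - 1))
    if points:
        ingest(points)
        # Advance the checkpoint only after ingestion has succeeded.
        last = max(ts for ts, _ in points)
        save_state(state_path, last)
    return last
```

Because the checkpoint is written only after a successful ingest, a crash between fetch and ingest leads to a re-fetch rather than lost data. Logging each retry and each checkpoint update would be a natural place to hook in monitoring.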
From the docs page:
Thanks for the inputs, @mathialo. So I don't need to invest in any custom coding when using the prebuilt extractors. I just need to configure them with the right parameters, and the data extraction can be accomplished.
Correct. For something like OSIsoft PI, it should be relatively plug-and-play. When you download the extractor from CDF, you also get an example configuration file that you can use as a starting point for your own setup.
@mathialo, I have two environments, DEV and PROD. Should I manually repeat the same extraction setup steps for PROD separately? Or can I set up extraction pipelines using GitHub Actions so that these steps are automated rather than done by hand in each environment?
@eashwar11, the PI extractor needs a Windows Server machine. Then you can install and configure separate extractor instances, one for DEV and one for PROD. Extractors usually run continuously. If you’re going to use a cloud VM, you can configure GitHub Actions to, for example, update the config files and restart the services; the details depend on the cloud provider.
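As a rough illustration of the "update the config files" step, a CI job could run a small script like the one below to render one config file per environment from a shared template. Everything here is an assumption made for the sketch: the YAML keys, project names, host names, and file names are placeholders rather than the PI extractor's actual schema, so the example configuration file shipped with the extractor should be the real starting point.

```python
from pathlib import Path
from string import Template
from typing import Dict, List

# Hypothetical template; the keys are placeholders, not the real config schema.
CONFIG_TEMPLATE = Template(
    "cognite:\n"
    "  project: $cdf_project\n"
    "pi:\n"
    "  host: $pi_host\n"
)

# Hypothetical per-environment values; in CI these would typically come
# from repository variables or secrets rather than being hard-coded.
ENVIRONMENTS: Dict[str, Dict[str, str]] = {
    "dev": {"cdf_project": "my-project-dev", "pi_host": "pi-dev.example.com"},
    "prod": {"cdf_project": "my-project-prod", "pi_host": "pi-prod.example.com"},
}


def render_configs(
    out_dir: Path,
    environments: Dict[str, Dict[str, str]] = ENVIRONMENTS,
) -> List[Path]:
    """Write one config file per environment and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, values in environments.items():
        target = out_dir / f"config.{name}.yml"
        # substitute() raises KeyError on a missing value, so the CI job
        # fails before a half-filled config reaches the extractor machine.
        target.write_text(CONFIG_TEMPLATE.substitute(values))
        written.append(target)
    return written
```

A workflow would then copy each rendered file to the matching extractor instance and restart its service; how that is done depends on the cloud provider and is not shown here.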
@roman.chesnokov, could you please share any reference material that covers the technical details (using GitHub Actions, etc.)? The main documentation site describes this only at an abstract level.
I am looking for something with the level of detail found in the boot-camp documentation.