Solved

Deploying Cognite Functions with a requirements file



Hello,

Can I get an example of a CDF function that deploys a .pickle or .joblib model into CDF, then calls the model?

Thanks,

Diana

Best answer by Everton Colling

To deploy a machine learning model with Cognite Data Fusion (CDF), you can train the model locally with a library such as scikit-learn and save it as a .pickle or .joblib file. Once the model is saved, upload the file to CDF through the Files API. Your function can then download the file at run time and use it for predictions.

Here is a minimal example of how you could proceed:

  1. Local Model Training:

    • Train your model using a relevant library, e.g., scikit-learn.
    • Save the model to a file using the .pickle or .joblib format.
      import joblib
      from sklearn.linear_model import LinearRegression
      
      # Assume X_train and y_train are your data
      model = LinearRegression()
      model.fit(X_train, y_train)
      
      # Save model to a file
      joblib.dump(model, "model.joblib")
  2. Upload Model to CDF:

    • Upload the saved model file to CDF using the Files API.
      from cognite.client import CogniteClient
      
      # Create a client instance
      client = CogniteClient()
      
      # Upload the model file
      file = client.files.upload(
          "model.joblib",
          external_id="my-amazing-model"
      )
  3. Create and deploy a Cognite Function:

    • Create a Cognite Function that downloads the model from CDF, loads it, and performs predictions. Note that the function code only has write access to the /tmp directory, so the model file must be downloaded there. The Cognite Functions documentation has more on downloading files inside a function.
      from cognite.client import CogniteClient
      
      # define the function
      def handle(client, data):
          """
          [requirements]
          joblib==1.3.2
          numpy==1.24.0
          scikit-learn==1.3.1
          [/requirements]
          """
          import joblib
          import numpy as np
          from sklearn.linear_model import LinearRegression
      
          # Download the model file
          client.files.download(
              external_id="my-amazing-model",
              directory="/tmp"
          )
      
          # Load the model
          model = joblib.load("/tmp/model.joblib")
      
          # Perform predictions
          input_data = np.array(data["input_data"]).reshape(-1, 1)
          predictions = model.predict(input_data)
      
          return {"result": predictions.tolist()}
      
      client = CogniteClient()
      
      # create a Cognite Function
      function = client.functions.create(
          external_id="my-amazing-function",
          name="my-amazing-function",
          function_handle=handle,
          runtime="py38"
      )
      
  4. Call the Function:

    • You can now call the function, passing in the input data for predictions.
      from cognite.client import CogniteClient
      
      client = CogniteClient()
      
      call = client.functions.call(
          external_id="my-amazing-function",
          data={"input_data": [5.1, 3.5, 1.4, 0.2]}
      )
      response = call.get_response()
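
Before reading the response, it can help to check that the call actually completed. The helper below is a minimal sketch, not part of the original answer: the name `call_and_get_result` is hypothetical, and it assumes the call object exposes `status` and `error` attributes and that a successful call reports the status "Completed". It reuses the `input_data` and `result` keys from the example above.

```python
def call_and_get_result(client, function_external_id, input_data):
    """Call a deployed function and return its predictions, or raise on failure.

    Hypothetical helper: assumes the call object has `status` and `error`
    attributes and that a finished call has status "Completed".
    """
    call = client.functions.call(
        external_id=function_external_id,
        data={"input_data": input_data},
    )
    if call.status != "Completed":
        raise RuntimeError(
            f"Function call ended with status {call.status}: {call.error}"
        )
    # The example function returns {"result": [...]}
    return call.get_response()["result"]
```

You would call it as `call_and_get_result(client, "my-amazing-function", [5.1, 3.5, 1.4, 0.2])`.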
      

Ensure that the environment (Python version, library versions, operating system) is consistent between the model training and prediction phases, because a pickled model may fail to load or behave differently under other library versions. When deploying with Cognite Functions, you can select the Python version and pin the requirements manually to keep the two environments aligned.
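
One way to make this check explicit is to record the environment when the model is trained and compare it when the model is loaded. The sketch below uses only the standard library; the function names `capture_environment` and `environment_mismatches` are hypothetical and not part of the Cognite SDK. It treats Python patch releases as compatible, which is an assumption you may want to tighten.

```python
import platform
from importlib import metadata


def capture_environment(packages):
    """Record the Python version and the installed versions of the given packages."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {"python": platform.python_version(), "packages": versions}


def environment_mismatches(expected, actual):
    """Return human-readable differences between two captured environments."""
    problems = []
    # Compare only major.minor for Python (assumption: patch releases are compatible)
    if expected["python"].split(".")[:2] != actual["python"].split(".")[:2]:
        problems.append(
            f"python: trained on {expected['python']}, running {actual['python']}"
        )
    for name, expected_version in expected["packages"].items():
        actual_version = actual["packages"].get(name)
        if expected_version != actual_version:
            problems.append(
                f"{name}: trained with {expected_version}, running {actual_version}"
            )
    return problems
```

At training time you could store the output of `capture_environment(["joblib", "numpy", "scikit-learn"])` as JSON next to the model file, then call `environment_mismatches` inside the function and log or raise on any difference.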

This is a minimal example, which you can extend to read data from CDF and even write predictions back to CDF directly inside the function. If the model is simple and training does not take a long time, you could also use a function to train the model and upload it to CDF files.
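
As a rough sketch of that extension, the helper below reads datapoints from one time series, runs the model on them, and writes the predictions to another time series. It is not from the original answer: the name `predict_and_write` and its parameters are hypothetical, it assumes a Cognite SDK version that exposes `client.time_series.data.retrieve` and `client.time_series.data.insert` (v5 or later), and it assumes the output time series already exists.

```python
def predict_and_write(client, model, input_external_id, output_external_id,
                      start="1h-ago", end="now"):
    """Score datapoints from one time series and write predictions to another.

    Hypothetical helper: assumes both time series exist and that the model
    was trained on a single feature.
    """
    # Read raw datapoints from the input time series
    datapoints = client.time_series.data.retrieve(
        external_id=input_external_id, start=start, end=end
    )
    timestamps = list(datapoints.timestamp)
    values = list(datapoints.value)
    if not values:
        return {"written": 0}

    # One feature per sample, matching reshape(-1, 1) in the example above
    features = [[value] for value in values]
    predictions = [float(p) for p in model.predict(features)]

    # Write predictions back as (timestamp, value) pairs
    client.time_series.data.insert(
        list(zip(timestamps, predictions)), external_id=output_external_id
    )
    return {"written": len(predictions)}
```

Inside `handle`, you would call this after loading the model with `joblib.load`, passing the `client` argument the function receives.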

For more details and options on deploying functions, you may refer to the Cognite SDK documentation.


2 replies

Everton Colling
  • October 3, 2023



Thank you!

