Solved

Deploying a model in Cognite Streamlit: How to add a '.pkl' file?

  • December 16, 2024
  • 7 replies
  • 75 views


I am developing a what-if tool for a client. I have trained a model and saved it as a pickle file.
Can someone please guide me on how to use / add a pickle file in Cognite Streamlit?
Also, can I add a ‘requirements.txt’ for my specific requirements?

Best answer by Everton Colling

Hello Ankit!

The approach suggested by ​@Lars Moastuen works perfectly fine.

Here’s a simple example in which I create a regression model, dump it to a pickle and upload it to CDF files.

from cognite.client import CogniteClient
from sklearn.linear_model import LinearRegression
from sklearn.datasets import make_regression
import pickle

# Instantiate Cognite SDK client
client = CogniteClient()

# Generate sample data
X, y = make_regression(n_samples=100, n_features=1, noise=20, random_state=42)

# Create and train the model
model = LinearRegression()
model.fit(X, y)

# Save model to disk
model_filename = "regression_model.pkl"
with open(model_filename, "wb") as file:
    pickle.dump(model, file)

print(f"Model saved to {model_filename}")

# Upload model file to CDF files
file = client.files.upload(
    path=model_filename,
    external_id="regression_model",
    name="Regression model"
)

# Retrieve the file metadata to check that the upload completed
# (FileMetadata.uploaded turns True once the content is in place)
file_meta = client.files.retrieve(external_id="regression_model")
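Note that the retrieve call above returns immediately; if you need to block until the upload is actually complete, you can poll the file's `uploaded` flag. A small sketch (the helper name and timeout are my own, not part of the SDK):

```python
import time

def wait_until_uploaded(client, external_id: str, timeout_s: float = 30.0) -> bool:
    """Poll CDF until the file's 'uploaded' flag turns True, or give up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        meta = client.files.retrieve(external_id=external_id)
        if meta is not None and meta.uploaded:
            return True
        time.sleep(1.0)
    return False
```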

After the file has been successfully uploaded to CDF files, you can load it in Streamlit as below:

import streamlit as st
from cognite.client import CogniteClient
import numpy as np
import pickle
import io

st.title("ML model example")

client = CogniteClient()

@st.cache_data
def fetch_model():
    # Download the model file as bytes
    file_content = client.files.download_bytes(
        external_id="regression_model"
    )
    # Load model from bytes
    loaded_bytes_io = io.BytesIO(file_content)
    loaded_model = pickle.load(loaded_bytes_io)
    return loaded_model

loaded_model = fetch_model()

# Make predictions with loaded model
X_new = np.array([[2.0], [3.0], [4.0]])
predictions = loaded_model.predict(X_new)

st.write("Loaded model predictions:", predictions)

The code above demonstrates this with a simple linear regression model, but the same approach will work for any pickle-serializable model (scikit-learn, XGBoost, etc.).

A few important points to keep in mind:

  • Make sure to add all required packages to the app's “Installed Packages” section. For this example, you’ll need: scikit-learn
  • Use some method of caching, like the @st.cache_data decorator in my example, to prevent the model from being downloaded repeatedly with each app interaction
  • Users accessing your app will need read permissions for the model file in CDF
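On the caching point: newer Streamlit versions also ship `st.cache_resource`, which is generally a better fit for model objects than `st.cache_data`, since the cached value is shared across reruns rather than copied. As an illustrative sketch (the factory function is my own construction, not a Cognite API):

```python
import io
import pickle

def make_cached_model_loader(st, client, external_id: str):
    """Build a loader that downloads and unpickles a model from CDF files,
    cached with st.cache_resource so the same instance is reused across reruns."""
    @st.cache_resource
    def fetch_model():
        content = client.files.download_bytes(external_id=external_id)
        return pickle.load(io.BytesIO(content))
    return fetch_model
```

In the app you would call this once at module level, e.g. `fetch_model = make_cached_model_loader(st, client, "regression_model")`.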

I hope this helps you move forward with your use case.


7 replies

Lars Moastuen
  • Seasoned Practitioner
  • 68 replies
  • December 16, 2024

Hi ​@Ankit Kothawade and thanks for reaching out.

I’m not familiar with pickle, but I think you can achieve this by storing your .pkl files using the Files API in Cognite Data Fusion. Files can be uploaded through various user interfaces in Fusion. After uploading the file, you can use the Files API to generate a signed download URL in your Streamlit app, and that URL can be used with pickle. Here’s some (untested) code that shows how to generate a download URL:

from cognite.client import CogniteClient

client = CogniteClient()

file_id = 123456789
file_url = client.files.retrieve_download_urls(id=file_id)
file_url = file_url[file_id]
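Once you have the signed URL, loading the pickle from it is straightforward with the standard library (a sketch, assuming the URL serves the raw file content):

```python
import io
import pickle
from urllib.request import urlopen

def load_model_from_url(file_url: str):
    """Fetch the pickled model from the signed download URL and deserialize it."""
    with urlopen(file_url) as resp:
        return pickle.load(io.BytesIO(resp.read()))
```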

 

See https://docs.cognite.com/cdf/streamlit/#install-third-party-packages for instructions on how to add requirements to your application, and https://api-docs.cognite.com/20230101/tag/Files for the Files API reference.

I hope this helps you resolve your issue.


APSHANKAR Sagar
Committed

Hi @Ankit Kothawade,

For integrating your requirements file, click on the settings icon when you are in your Streamlit app in Cognite.

You will then see a place where you can paste the requirements of your project. Bear in mind, you can't use all packages here: you can only use packages for which a pure Python wheel is available on PyPI.
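A quick way to check: a pure Python wheel can be recognized by its filename, which ends with the `none-any` tag (for example `requests-2.31.0-py3-none-any.whl` is pure, while a `manylinux`/`win_amd64` wheel is compiled). A tiny illustrative helper:

```python
def is_pure_wheel(filename: str) -> bool:
    """True when the wheel is platform-independent (ABI tag 'none', platform tag 'any')."""
    return filename.endswith("-none-any.whl")
```

You can eyeball the filenames on a package's PyPI "Download files" page and look for a `py3-none-any.whl` entry.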

 

 


Ankit Kothawade

Hi @Lars Moastuen, thanks for the quick response. I initially tried the Files API, but I'm not sure whether it treats my uploaded file as a pickle (a serialized file for loading the pretrained model).
Please share a reference document or an alternative method if you come across one.

Thanks!!



Ankit Kothawade

Thanks ​@Everton Colling


Ankit Kothawade

Hello @Everton Colling!

I have a further question related to this.

I use a connection.py file to connect to Cognite, and I call it as: client = __get_connection_by_token(project_name)

But I am curious: how can I use a .env file in the Streamlit app to get the TOKEN?
Following is my connection.py file:


***

from msal import PublicClientApplication
from cognite.client import CogniteClient
from cognite.client.config import ClientConfig
from cognite.client.credentials import Token
from dotenv import dotenv_values

TENANT_ID = "**********"
CLIENT_ID = "**********"

TOKEN_URL = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token"
CDF_CLUSTER = "az-pnq-gp-001"
BASE_URL = f"https://{CDF_CLUSTER}.cognitedata.com"
SCOPES = [f"{BASE_URL}/.default"]
AUTHORITY_HOST_URI = "https://login.microsoftonline.com"
AUTHORITY_URI = AUTHORITY_HOST_URI + "/" + TENANT_ID

__public_client_app = PublicClientApplication(client_id=CLIENT_ID, authority=AUTHORITY_URI)
__env_vars = dotenv_values(".env")


def get_dev_connection():
    # Note: the original called an undefined __get_connection; corrected here
    return __get_connection_by_token(project_name="my-dev")


project_name = "my-test"

def __get_connection_by_token(project_name):
    cnf = ClientConfig(
        client_name="adhoc-client",
        project=project_name,
        credentials=Token(__env_vars.get("TOKEN")),
        base_url=BASE_URL,
    )
    client = CogniteClient(cnf)
    print(client.iam.token.inspect().projects)
    return client

***


Everton Colling
  • Seasoned Practitioner
  • 163 replies
  • December 27, 2024

Hello Ankit!

When running Streamlit apps directly in Fusion, you don't need to handle the authentication logic yourself. You can simply initialize the client as:

from cognite.client import CogniteClient

client = CogniteClient()

The client will be automatically authenticated against the CDF project you're logged into in Fusion. All the OAuth token handling is managed for you behind the scenes.

The authentication code you shared in your snippet would only be needed if you were running the Streamlit app locally or deploying it outside of Fusion. But since you're using Cognite's hosted Streamlit environment, you can remove all that authentication logic and use the simplified initialization above.
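For completeness, if you do run the app locally, a common pattern is to keep the tenant/cluster values in environment variables (or a .env file) and let the SDK's interactive login handle tokens. A sketch with placeholder environment variable names; `OAuthInteractive` is the SDK's interactive credential class, and the URL-building helper is my own:

```python
import os

def build_cdf_urls(tenant_id: str, cluster: str) -> dict:
    """Assemble the OAuth authority URL, CDF base URL and scopes for a local client."""
    base_url = f"https://{cluster}.cognitedata.com"
    return {
        "authority_url": f"https://login.microsoftonline.com/{tenant_id}",
        "base_url": base_url,
        "scopes": [f"{base_url}/.default"],
    }

def make_local_client(project: str):
    """Interactive login for local development (SDK imports kept lazy on purpose)."""
    from cognite.client import CogniteClient
    from cognite.client.config import ClientConfig
    from cognite.client.credentials import OAuthInteractive

    urls = build_cdf_urls(os.environ["TENANT_ID"], os.environ["CDF_CLUSTER"])
    creds = OAuthInteractive(
        authority_url=urls["authority_url"],
        client_id=os.environ["CLIENT_ID"],
        scopes=urls["scopes"],
    )
    cfg = ClientConfig(
        client_name="local-dev",
        project=project,
        base_url=urls["base_url"],
        credentials=creds,
    )
    return CogniteClient(cfg)
```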



