Why does the PyGen-generated SDK lag in performance compared to the query method in the Cognite SDK?

Question

We have generated an SDK using pygen on our model.
After that, we requested data from a particular view (PropertyType in the example below). It contains ~138k records.

Here is the time taken to fetch the data with each SDK:

Cognite SDK - 44 secs
Pygen SDK - 82 secs

Both fetch the same data from the same Cognite project, yet the Pygen SDK takes almost twice as long. Both numbers are from my local sandbox.

Cognite SDK Code

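import time

from cognite.client import CogniteClient
from cognite.client.data_classes.data_modeling import ViewId
from cognite.client.data_classes.data_modeling.query import (
    NodeResultSetExpression,
    Query,
    Select,
    SourceSelector,
)
from cognite.client.data_classes.filters import HasData

config = {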
    "client_name": "abcd",
    "project": "slb-odf-qa",
    "base_url": "https://westeurope-1.cognitedata.com/",
    "credentials": {
        "client_credentials": {
            "client_id": "e063088ad3b4548d4911bd4a617990aa",
            "client_secret": "",
            "token_url": "https://p4d.csi.cloud.slb-ds.com/v2/token",
            "scopes": ["9237c91ce1ea434fa5a91262a5ea3646"],
        },
    },
}
cognite_client = CogniteClient.load(config)

view_id_1 = ViewId(space="slb-pdm-dm-governed", external_id="PropertyType",version="1_6")

def _get_timeseries():
    next_cursor = None  # Initialize the cursor as None at first
    all_data = []  # List to hold all results
    while True:
        # Construct the query with the current cursor
        query = Query(
            with_={
                "PropertyType": NodeResultSetExpression(
                    limit=10000,
                    filter=HasData(views=[view_id_1])
                )
            },
            select={"PropertyType": Select([SourceSelector(view_id_1, properties=['*'])])},
            cursors={"PropertyType": next_cursor}  
        )

        result = cognite_client.data_modeling.instances.query(query)

        if "PropertyType" in result.data:
            all_data.extend(result.data["PropertyType"])  
        next_cursor = result.cursors.get("PropertyType", None)

        if not next_cursor:
            return all_data

start_time = time.time()
data = _get_timeseries()
end_time = time.time()
print(len(data))
print(end_time - start_time)

##Output
# 138205
# 82.43460583686829
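
The loop above asks for pages of up to 10000 nodes (the `limit` in the query) and follows the cursor until none is returned. As a rough sketch, the minimum number of requests needed for 138205 records can be estimated as below. The `pages_needed` helper is illustrative only and is not part of the Cognite SDK; the real loop may issue one extra request if the server returns a cursor with the last page.

```python
import math


def pages_needed(total_records: int, page_size: int) -> int:
    """Minimum number of paginated requests needed to fetch total_records."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_records / page_size)


# 138205 records at 10000 per page
print(pages_needed(138205, 10000))  # 14
```

Dividing the measured wall-clock time by this count gives an approximate per-request latency, which can help when comparing two fetch paths that use different page sizes.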

Pygen SDK

import time

from cognite.client import CogniteClient

config = {
    "client_name": "abcd",
    "project": "slb-odf-qa",
    "base_url": "https://westeurope-1.cognitedata.com/",
    "credentials": {
        "client_credentials": {
            "client_id": "e063088ad3b4548d4911bd4a617990aa",
            "client_secret": "",
            "token_url": "https://p4d.csi.cloud.slb-ds.com/v2/token",
            "scopes": ["9237c91ce1ea434fa5a91262a5ea3646"],
        },
    },
}
client = CogniteClient.load(config)

from my_domain.client import MyClient
pygen_client = MyClient(client)

def _get_property():
    return pygen_client.property_type.list(limit=None, retrieve_connections='skip')

start_time = time.time()
data = _get_property()
end_time = time.time()

print(len(data))
print(end_time - start_time)

##Output
# 138205
# 82.43460583686829

This looks like an issue in the Pygen SDK. Can this be looked into?

Anders Albert · Answer

@Neerajkumar Bhatewara Thanks for testing this.

Note that when you run `pygen_client.property_type.list(…, retrieve_connections='skip')`, pygen uses the /list endpoint, while you are comparing it to the /query endpoint. In the PySDK, the equivalent would be calling `cognite_client.data_modeling.instances.list(sources=[view_id_1])`.

Still, these should have similar performance. I suspect this could have something to do with the pagination on the server side.

Can you try to run `pygen_client.property_type.list(limit=None, retrieve_connections='skip', sort_by='external_id')`?
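
To make the comparison like-for-like, both calls can be timed with the same small harness. The following is a minimal sketch, not an official benchmark: `time_call` and `compare` are hypothetical helpers, and a stand-in fetcher is used so the snippet runs without access to a CDF project. The commented-out fetchers show how the two calls from this thread could be plugged in; `limit=None` on `instances.list` is an assumption added here so that all records are returned.

```python
import time
from typing import Callable, Dict, Sequence, Tuple


def time_call(fetch: Callable[[], Sequence]) -> Tuple[int, float]:
    """Run fetch once and return (number of items, elapsed seconds)."""
    start = time.perf_counter()
    items = fetch()
    elapsed = time.perf_counter() - start
    return len(items), elapsed


def compare(fetchers: Dict[str, Callable[[], Sequence]]) -> Dict[str, Tuple[int, float]]:
    """Time each named fetcher once, in the order given."""
    return {name: time_call(fn) for name, fn in fetchers.items()}


def fake_fetch() -> list:
    """Stand-in fetcher so the sketch runs without a Cognite project."""
    return list(range(1000))


results = compare({"stand-in": fake_fetch})
for name, (count, seconds) in results.items():
    print(f"{name}: {count} items in {seconds:.3f} s")

# With real clients (assumed to be configured as in the question), the
# fetchers could instead be:
# {
#     "pysdk_list": lambda: cognite_client.data_modeling.instances.list(
#         sources=[view_id_1], limit=None
#     ),
#     "pygen_list": lambda: pygen_client.property_type.list(
#         limit=None, retrieve_connections="skip"
#     ),
# }
```

Running each fetcher several times and in alternating order would reduce the influence of network variance and caching on a single measurement.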
