Error running the Cognite replicator with Option 5

  • 17 February 2023
  • 16 replies
  • 153 views

Badge +6

I have set both environment variables, SOURCE_CLIENT_SECRET (the client secret is generated from the OID widget, as per the documentation) and DEST_CLIENT_SECRET, and I am running the replicator with the config file using option 5. I am getting the error shown in the log below.

AMAC02Z3123LVCJ:oid-replicator j.subhash.parandekar$ poetry run python3 ./oid_replicator/replicate.py 
2023-02-17 18:46:48,239 root INFO - Config file - Repeat line 5: 

2023-02-17 18:46:48,239 root INFO - Config file - Repeat line 14: 

2023-02-17 18:46:48,239 root INFO - Config file - Repeat line 23: 

Starting replication of resources
Replicating assets...
Traceback (most recent call last):
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/credentials.py", line 364, in _refresh_access_token
    token_result = self.__oauth.fetch_token(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/requests_oauthlib/oauth2_session.py", line 366, in fetch_token
    self._client.parse_request_body_response(r.text, scope=self.scope)
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 427, in parse_request_body_response
    self.token = parse_token_response(body, scope=scope)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 441, in parse_token_response
    validate_token_parameters(params)
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 448, in validate_token_parameters
    raise_from_error(params.get('error'), params)
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/oauthlib/oauth2/rfc6749/errors.py", line 399, in raise_from_error
    raise cls(**kwargs)
oauthlib.oauth2.rfc6749.errors.InvalidClientError: (invalid_client) AADSTS7000216: 'client_assertion', 'client_secret' or 'request' is required for the 'client_credentials' grant type.
Trace ID: b0528310-0b70-4ee0-9715-daca09e35200
Correlation ID: 5987d7dd-eb61-404c-9525-00954829211c
Timestamp: 2023-02-17 13:16:49Z

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/j.subhash.parandekar/oid-replicator/./oid_replicator/replicate.py", line 8, in <module>
    main()
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/replicator/__main__.py", line 401, in main
    cognite.replicator.assets.replicate(
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/replicator/assets.py", line 298, in replicate
    assets_src = client_src.assets.list(limit=None)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api/assets.py", line 295, in list
    return self._list(
           ^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 489, in _list
    for resource_list in self._list_generator(
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 395, in _list_generator
    res = self._post(url_path=url_path or resource_path + "/list", json=body, headers=headers)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 135, in _post
    return self._do_request(
           ^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 148, in _do_request
    headers = self._configure_headers(accept, additional_headers=self._config.headers.copy())
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 190, in _configure_headers
    auth_header_name, auth_header_value = self._config.credentials.authorization_header()
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/credentials.py", line 121, in authorization_header
    self.__access_token, self.__access_token_expires_at = self._refresh_access_token()
                                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/j.subhash.parandekar/Library/Caches/pypoetry/virtualenvs/oid-replicator-cwaK-6ym-py3.11/lib/python3.11/site-packages/cognite/client/credentials.py", line 372, in _refresh_access_token
    raise CogniteAuthError(
cognite.client.exceptions.CogniteAuthError: Error generating access token: invalid_client, 401, AADSTS7000216: 'client_assertion', 'client_secret' or 'request' is required for the 'client_credentials' grant type.
Trace ID: b0528310-0b70-4ee0-9715-daca09e35200
Correlation ID: 5987d7dd-eb61-404c-9525-00954829211c
Timestamp: 2023-02-17 13:16:49Z



Badge +6

Below are the contents of the YAML file:

resources: # Which resource types to replicate
- timeseries
- datapoints
- assets
- events

# OIDC PROJECTS --------------------------------------------------------------------------------------------------------
# source CDF project identity variables for Open Industrial data project
src_boolean_client_secret: True # OIDC: whether the source project is being authenticated through a client secret or not
src_TENANT_ID: 48d5043c-cf70-4c49-881c-c638f5796997 # OIDC: Azure AD tenant of the source CDF project
src_CLIENT_ID: 1b90ede3-271e-401b-81a0-a4d52bea3273 # OIDC: Azure client app registration ID of the source CDF project
src_CDF_CLUSTER: api # cluster the source CDF project is running on
src_COGNITE_PROJECT: publicdata # name of the source project
src_AUTHORITY_HOST_URI: "https://login.microsoftonline.com" # login uri for the source project

# destination CDF project variables
dst_boolean_client_secret: True # OIDC: whether the destination project is being authenticated through a client secret or not
dst_TENANT_ID: e0793d39-0939-496d-b129-198edd916feb # OIDC: Azure AD tenant of the destination CDF project
dst_CLIENT_ID: c1f94aa4-bf73-4c93-93d8-d4bc58df0627 # OIDC: without client secret a0ed92d8-dab1-4f73-acb1-3c3a0c8c7261" # Azure client app registration ID of the destination CDF project
# dst_client_secret: DEST_CLIENT_SECRET # OIDC: Name of env variable for Client secret of destination project
dst_CDF_CLUSTER: api # cluster the destination CDF project is running on
dst_COGNITE_PROJECT: accenture-tiger-training # name of the destination project
dst_AUTHORITY_HOST_URI: "https://login.microsoftonline.com" # login uri for the destination project

high_frequence_variability: false # True if there are many time series being replicated which have new datapoints coming at very different frequencies
delete_if_removed_in_source: false # Remove objects that were replicated and are now deleted in source
delete_if_not_replicated: false # Remove all objects in destination that aren't from source
batch_size: 10000 # Number of items in each batch 1-10000. Only applies to Raw, Events, Timeseries, and Files. (The SDK automatically chunks to 10000. This is used in conjunction with threads if you wanted smaller/more efficient threads for batches less than 10k. EX: 20 threads with 2000 batch sizes each.)
batch_size_datapoints: 10000 # Number of datapoints in each batch (The SDK will automatically paginate so it's generally not needed with a value here)
number_of_threads: 10 # Number of threads to use
client_timeout: 120 # Seconds for clients to timeout
client_name: cognite-replicator # Name of client
log_path: log # Folder to save logs to
log_level: INFO # Logging level
events_exclude_pattern: # Optional - Regex pattern to prevent replication of matching events. Example: ^SYN_
timeseries_exclude_pattern: # Optional - Regex pattern to prevent replication of matching timeseries. Example: ^SYN_
timeseries_exclude_fields: # Optional - List of metadata fields to exclude from the extraction
files_exclude_pattern: # Optional - Regex pattern to prevent replication of matching files. Example: ^SYN_
datapoints_start: 10d-ago # Must be an integer timestamp or a "time-ago string" on the format: <integer>(s|m|h|d|w)-ago or 'now'. E.g. '3d-ago' or '1w-ago'
datapoints_end: now # Must be an integer timestamp or a "time-ago string" on the format: <integer>(s|m|h|d|w)-ago or 'now'. E.g. '3d-ago' or '1w-ago'
value_manipulation_lambda_fnc: # "lambda x: x*0.2" # Lambda function as a string if value manipulation for datapoints is needed.
dataset_support: false # Boolean to enable or not the dataset support

 

The script I am running is:

####################

import yaml
from cognite.replicator.__main__ import main
import os

if __name__ == "__main__":  # this is necessary because of threading
    COGNITE_CONFIG_FILE = yaml.safe_load("config/oid.yml")  # parses the string itself, so this yields the path "config/oid.yml"
    os.environ["COGNITE_CONFIG_FILE"] = COGNITE_CONFIG_FILE
    main()

Badge +6

I needed to add src_client_secret: SOURCE_CLIENT_SECRET and dst_client_secret: DEST_CLIENT_SECRET to the YAML file to start the replicator without the authentication error. However, the code now hangs for a long time while replicating events, as shown in the screenshot below.

Userlevel 3

Yes, if you set src_boolean_client_secret: True, then you need to specify a client secret through an environment variable, as you did.
The script is not hanging; it is replicating in the background. Depending on the number of events in the source, this can take a considerable amount of time. In your case, there are over 40 million events in the publicdata project, which is why it takes a while.
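
The requirement described above (a client secret supplied through an environment variable whenever the boolean flag is True) can be checked before starting a long run. The following is only a minimal sketch, not part of cognite-replicator: the function name missing_secret_env_vars is hypothetical, and it assumes the YAML config has already been loaded into a dict that uses the key names shown in this thread.

```python
import os


def missing_secret_env_vars(config: dict) -> list:
    """List the client secrets the config asks for that cannot be found.

    Hypothetical helper, not part of cognite-replicator. Assumes the key
    names used in this thread: <prefix>_boolean_client_secret and
    <prefix>_client_secret, where the latter holds the NAME of an
    environment variable rather than the secret itself.
    """
    problems = []
    for prefix in ("src", "dst"):
        if not config.get(f"{prefix}_boolean_client_secret"):
            continue  # this side does not authenticate with a client secret
        env_name = config.get(f"{prefix}_client_secret")
        if not env_name:
            problems.append(f"{prefix}_client_secret is missing from the config")
        elif not os.environ.get(env_name):
            problems.append(f"environment variable {env_name} is not set")
    return problems
```

An empty list means both sides look complete. A non-empty list names what to fix before an AADSTS7000216 error can occur.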

Badge +6

The code fails during event replication with the error below:

2023-02-17 19:27:21,866 root INFO - Finished depth 12, updated 0 and posted 6 assets (total of 6 assets).
2023-02-17 19:27:21,868 root INFO - Finished copying and updating 1115 assets from source (publicdata) to destination (accenture-tiger-training).
Replicating events...
Traceback (most recent call last):
  File "/Users/j.subhash.parandekar/oid-replicator/oid_replicator/replicate.py", line 8, in <module>
    main()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/replicator/__main__.py", line 412, in main
    cognite.replicator.events.replicate(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/replicator/events.py", line 238, in replicate
    events_src = client_src.events.list(limit=None)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api/events.py", line 283, in list
    return self._list(
           ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 489, in _list
    for resource_list in self._list_generator(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 395, in _list_generator
    res = self._post(url_path=url_path or resource_path + "/list", json=body, headers=headers)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 135, in _post
    return self._do_request(
           ^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 182, in _do_request
    self._raise_api_error(res, payload=json_payload)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/client/_api_client.py", line 856, in _raise_api_error
    raise CogniteAPIError(msg, code, x_request_id, missing=missing, duplicated=duplicated, extra=extra)
cognite.client.exceptions.CogniteAPIError: Unauthorized | code: 401 | X-Request-ID: d838efa4-540a-95cb-bbee-65502e48b41a

Userlevel 3

That is strange. It looks like the client secret you are using does not have enough rights to list the events on publicdata.

I would check with Cognite support who provided you the key to make sure you have the rights to view events.

Otherwise, if you are confident using the Cognite SDK, you can perform a token inspect or try to list events manually to pinpoint the issue. 
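
A token inspect and a manual event listing could look roughly like the sketch below. It assumes a recent cognite-sdk (version 5 or later, where ClientConfig and OAuthClientCredentials are available) and the standard Azure AD v2.0 token endpoint. The function names, the client name "access-check" and the use of the SOURCE_CLIENT_SECRET environment variable are illustrative choices, not something prescribed by the replicator.

```python
import os


def azure_token_url(authority_host_uri: str, tenant_id: str) -> str:
    """Build the Azure AD v2.0 token endpoint for a tenant."""
    return f"{authority_host_uri.rstrip('/')}/{tenant_id}/oauth2/v2.0/token"


def inspect_source_access(tenant_id, client_id, cluster="api", project="publicdata"):
    """Print what the token is allowed to do, then try to list a few events."""
    # Imported lazily so azure_token_url works without cognite-sdk installed.
    from cognite.client import ClientConfig, CogniteClient
    from cognite.client.credentials import OAuthClientCredentials

    base_url = f"https://{cluster}.cognitedata.com"
    credentials = OAuthClientCredentials(
        token_url=azure_token_url("https://login.microsoftonline.com", tenant_id),
        client_id=client_id,
        client_secret=os.environ["SOURCE_CLIENT_SECRET"],  # assumed env var name
        scopes=[f"{base_url}/.default"],
    )
    client = CogniteClient(
        ClientConfig(
            client_name="access-check",  # illustrative name
            project=project,
            credentials=credentials,
            base_url=base_url,
        )
    )
    print(client.iam.token.inspect())   # projects and capabilities of the token
    print(client.events.list(limit=5))  # raises CogniteAPIError if not permitted
```

Calling inspect_source_access with the src_TENANT_ID and src_CLIENT_ID values from the config shows whether the failure is specific to events or affects the token as a whole.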

Badge +6

Sure, please let us know if I need to use a different secret.

Userlevel 3

Hi @jaydeep

Would you be able to create a ticket for Cognite Support via support@cognite.com, mentioning your issue and sharing your request ID? With that, they should be able to see if you have enough rights.


Best,

Carin 

Badge +6

I created a support request to check the access rights on the events resource type for the publicdata project:

https://cognite.zendesk.com/hc/requests/7756

Userlevel 2
Badge +1

Hi @jaydeep 

 

I’m Viraj from Cognite support, and we have received your ticket related to this issue. We will follow up via the support ticket; I hope that is okay with you.

 

Best regards, 
Viraj 

Userlevel 2
Badge

raise CogniteAPIError(msg, code, x_request_id, missing=missing, duplicated=duplicated, extra=extra)

cognite.client.exceptions.CogniteAPIError: Unauthorized | code: 401 | X-Request-ID: d838efa4-540a-95cb-bbee-65502e48b41a

Based on the error you shared above, this is not caused by insufficient access rights. It looks like the access token had expired by the time the replicator tried to replicate the events.

Since there is a huge number of assets, events, time series, etc., replicating them all in one run takes a long time, which can lead to the access token expiring. It would be better to replicate one resource type at a time to avoid token expiration.
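
One way to replicate one resource type at a time is to derive a separate config per resource type from the existing one and run the replicator once for each. The sketch below only builds the config dicts. The function name one_config_per_resource is hypothetical, and it assumes the config has been loaded into a dict with a resources list, as in the YAML posted earlier in this thread.

```python
import copy


def one_config_per_resource(config: dict) -> list:
    """Return one config dict per resource type, preserving the listed order.

    Hypothetical helper: every returned dict is a deep copy of the input
    whose "resources" list contains exactly one entry.
    """
    return [
        {**copy.deepcopy(config), "resources": [resource]}
        for resource in config.get("resources", [])
    ]
```

Each resulting dict could then be written to its own YAML file and passed to a separate run. This assumes that each new process requests a fresh token; the thread does not confirm how the replicator handles token refresh.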

Badge +6

I am now trying to replicate time series and datapoints; however, I am getting the error below:

AMAC02Z3123LVCJ:oid-replicator j.subhash.parandekar$ /usr/local/bin/python3 /Users/j.subhash.parandekar/oid-replicator/oid_replicator/replicate.py
2023-02-21 11:31:43,353 root INFO - Config file - Repeat line 5: 

2023-02-21 11:31:43,353 root INFO - Config file - Repeat line 15: 

2023-02-21 11:31:43,353 root INFO - Config file - Repeat line 24: 

Starting replication of resources
Replicating time series...
2023-02-21 11:31:53,330 root INFO - There are 408 existing time series in source (publicdata).
2023-02-21 11:31:53,330 root INFO - There are 10291 existing time series in destination (accenture-tiger-training).
2023-02-21 11:31:57,719 root INFO - If a time series asset id is one of the 1115 assets that have been replicated then it will be linked.
2023-02-21 11:31:57,719 root INFO - These copied/updated time series will have a replicated run time of: 1676959317000.
2023-02-21 11:31:57,719 root INFO - Starting to copy and update 408 time series from source (publicdata) to destination (accenture-tiger-training).
Traceback (most recent call last):
  File "/Users/j.subhash.parandekar/oid-replicator/oid_replicator/replicate.py", line 8, in <module>
    main()
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/replicator/__main__.py", line 427, in main
    cognite.replicator.time_series.replicate(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/cognite/replicator/time_series.py", line 298, in replicate
    if len(ts_src) > batch_size:
       ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>' not supported between instances of 'int' and 'dict'

Userlevel 3

What is the value of "batch_size"? It looks like it is a dictionary.
Its value is set in the config file.

len(ts_src) is an int, and its value is printed above (408).

batch_size should also be an int; it can be set to 10000, for example.
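
The type expectation described above can be made explicit with a small check. This is a sketch only: validate_batch_size is a hypothetical helper, not a function from cognite-replicator, and the 1 to 10000 range is taken from the comment in the config file posted earlier.

```python
def validate_batch_size(value) -> int:
    """Return value unchanged if it is a usable batch size, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"batch_size must be an int, got {type(value).__name__}")
    if not 1 <= value <= 10000:
        raise ValueError(f"batch_size must be between 1 and 10000, got {value}")
    return value
```

Passing a dict to this check raises a TypeError that names the offending type, which is easier to trace than the bare comparison error in the traceback.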

Userlevel 3

OK, apologies. I think you have found a bug; we will look into it and fix it today.

Userlevel 3

Could you upgrade the package to the latest version?

pip install cognite-replicator --upgrade

Version should be 1.2.3
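
To confirm which version is actually installed in the interpreter that runs the script, the standard library can be queried directly. This is a minimal sketch: the helper names are hypothetical, and version_tuple only handles plain dotted numeric versions such as 1.2.3 (a pre-release suffix would raise a ValueError).

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text: str) -> tuple:
    """Turn a plain dotted version such as '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in text.split("."))


def replicator_is_at_least(required: str = "1.2.3") -> bool:
    """Return True if cognite-replicator is installed at `required` or newer."""
    try:
        installed = version("cognite-replicator")
    except PackageNotFoundError:
        return False  # not installed in this environment
    return version_tuple(installed) >= version_tuple(required)
```

Running this with the same interpreter as the replicator matters here, because the tracebacks in this thread come from two different environments: a Poetry virtualenv and the system Python 3.11.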
 

Badge +6

Yes, after upgrading the cognite-replicator package to 1.2.3, this error is resolved and I am able to run the replication, as shown below. Thanks.

2023-02-21 16:56:01,725 root INFO - Ext id:  pi:160571 Number of datapoints: 0
2023-02-21 16:56:04,095 root INFO - Ext id:  pi:160699 Number of datapoints: 219394
2023-02-21 16:56:04,138 root INFO - Ext id:  pi:160265 Number of datapoints: 3864
2023-02-21 16:56:04,138 root INFO - Ext id:  pi:161026 Number of datapoints: 0
2023-02-21 16:56:05,471 root INFO - Ext id:  houston.ro.REMOTE_AI[4] Number of datapoints: 122735
2023-02-21 16:56:05,471 root INFO - Ext id:  pi:160551 Number of datapoints: 0
2023-02-21 16:56:05,511 root INFO - Ext id:  houston.ro.REMOTE_AI[75] Number of datapoints: 3590
2023-02-21 16:56:07,950 root INFO - Ext id:  pi:160672 Number of datapoints: 224161
2023-02-21 16:56:07,951 root INFO - Ext id:  pi:160222 Number of datapoints: 0
2023-02-21 16:56:09,931 root INFO - Ext id:  pi:160766 Number of datapoints: 182149
2023-02-21 16:56:09,966 root INFO - Ext id:  pi:160256 Number of datapoints: 3228
Ready to insert datapoints... Tue Feb 21 16:56:09 2023
DATAPOINTS INSERTED AT:  Tue Feb 21 16:57:39 2023

Userlevel 3

Great to hear. Thank you for identifying this bug.
