We are trying to ingest data from a CSV file into a staging table. Below is our config.yaml file.

logger:

  console:
    level: INFO

# Information about CDF project
cognite:
  # Read these from environment variables
  host: https://fusion.cognite.com/
  project: sandbox

  idp-authentication:
    # OIDC client ID
    client-id: abc345

    # URL to fetch OIDC tokens from
    token-url: https://login.microsoftonline.com/gvbiy76/oauth2/v2.0/token

    # Alternatively, you can specify an Azure tenant to generate token-url automatically
    #tenant: azure-tenant-uuid

    # OIDC client secret - either this or certificate is required
    secret: abcd

    # List of OIDC scopes to request
    scopes:
      - api://abc345/.default


  # Data set to attach uploaded files to. Either use CDF IDs (integers) or
  # user-given external ids (strings)
  data-set:
    id: 28834
    #external-id: File_Extractor


# (Optional) Extractor performance tuning.
extractor:

  schedule:
    type: interval
    expression: 5m

# Information about files to extract
files:
  # (Optional) A list of extensions to fetch. If included, only files matching
  # these extensions will be uploaded.
  extensions: 
  - .csv
 
  # (Optional) Include metadata in file upload
  with-metadata: true

  # (Optional) Write file metadata to CDF Raw instead of adding it to the files themselves. The with-metadata flag must be true.
  metadata-to-raw:
    # The name of the Raw database used to store file metadata.
    database: File_Extractor
    # The name of the Raw table used to store file metadata.
    table: csv

  # (Optional) Prefix added to the directory property on files in CDF.
  #directory-prefix: "/my/files"

  # Information about file provider
  file-provider:
    # Provider type. Supported types include local, sharepoint_online,
    # gcp_cloud_storage, azure_blob_storage, aws_s3, smb_protocol, ftp and sftp.
    type: local

    # For local files: Absolute or relative path to directory to watch
    path: example.csv


When we run the extractor, it fails with the error below.
2024-11-25 12:25:23.797 UTC [ERROR   ] ThreadPoolExecutor-1_0 - Job "FileExtractor (trigger: interval[0:05:00], next run at: 2024-11-25 18:00:23 IST)" raised an exception
Traceback (most recent call last):
  File "apscheduler\executors\base.py", line 125, in run_job
  File "file_extractor\extractor.py", line 155, in run_internal
  File "cognite\extractorutils\configtools\elements.py", line 404, in get_data_set
  File "cognite\client\_api\data_sets.py", line 159, in retrieve
  File "cognite\client\_api_client.py", line 384, in _retrieve_multiple
  File "cognite\client\utils\_concurrency.py", line 77, in raise_compound_exception_if_failed_tasks
  File "cognite\client\utils\_concurrency.py", line 104, in _raise_basic_api_error
cognite.client.exceptions.CogniteAPIError: <html>
<head><title>405 Not Allowed</title></head>
<body>
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx</center>
</body>
</html>
 | code: 405 | X-Request-ID: None
The API Failed to process some items.
Successful (2xx): []
Unknown (5xx): []
Failed (4xx): [{'id': 28834}, ...]

@Mayuri Bhoge the project name and the scope look incorrect. Could you please let me know the name of your CDF project? If you need further assistance, I can create a support ticket for you.


@Mithila Jayalath 
CDF Project name: sandbox
scope: api://<client-id>/.default


@Mayuri Bhoge I’m not sure that your project name is just “sandbox”. As far as I know, the project name should contain a prefix along with “sandbox”.

The scope should be in the following format: https://<my_cluster>.cognitedata.com/.default

Please refer to the documentation here for more information regarding scopes.
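
To make the scope format above concrete, here is a minimal sketch of how the cognite section of config.yaml might look. It is not a verified fix: <my_cluster> and <my_project> are placeholders to fill in with your own values. The change to host is an assumption on my part that is not confirmed in this thread: the extractor normally expects the API base URL of the cluster, whereas https://fusion.cognite.com/ is the address of the Fusion web UI, which may explain the 405 response from nginx.

```yaml
cognite:
  # Assumption: API base URL of your cluster, not the Fusion web UI address
  host: https://<my_cluster>.cognitedata.com
  # Placeholder: the full CDF project name, including any prefix
  project: <my_project>

  idp-authentication:
    # Unchanged from the original config
    client-id: abc345
    token-url: https://login.microsoftonline.com/gvbiy76/oauth2/v2.0/token
    secret: abcd

    # Scope in the cluster-based format described above
    scopes:
      - https://<my_cluster>.cognitedata.com/.default
```

The cluster name appears in both host and scopes, so the two values should agree with each other.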

 

