Solved

Issue in File Extractor

  • November 25, 2024
  • 3 replies
  • 41 views


We are trying to ingest data from a CSV file into a staging table. Below is our config.yaml file.

logger:

  console:
    level: INFO

# Information about CDF project
cognite:
  # Read these from environment variables
  host: https://fusion.cognite.com/
  project: sandbox

  idp-authentication:
    # OIDC client ID
    client-id: abc345

    # URL to fetch OIDC tokens from
    token-url: https://login.microsoftonline.com/gvbiy76/oauth2/v2.0/token

    # Alternatively, you can specify an Azure tenant to generate token-url automatically
    #tenant: azure-tenant-uuid

    # OIDC client secret - either this or certificate is required
    secret: abcd

    # List of OIDC scopes to request
    scopes:
      - api://abc345/.default


  # Data set to attach uploaded files to. Either use CDF IDs (integers) or
  # user-given external ids (strings)
  data-set:
    id: 28834
    #external-id: File_Extractor


# (Optional) Extractor performance tuning.
extractor:

  schedule:
    type: interval
    expression: 5m

# Information about files to extract
files:
  # (Optional) A list of extensions to fetch. If included, only files matching
  # these extensions will be uploaded.
  extensions: 
  - .csv
 
  # (Optional) Include metadata in file upload
  with-metadata: true

  # (Optional) Write file metadata to CDF Raw instead of adding it to the files themselves. The with-metadata flag must be true.
  metadata-to-raw:
    # The name of the Raw database used to store file metadata.
    database: File_Extractor
    # The name of the Raw table used to store file metadata.
    table: csv

  # (Optional) Prefix added to the directory property on files in CDF.
  #directory-prefix: "/my/files"

  # Information about file provider
  file-provider:
    # Provider type. Supported types include local, sharepoint_online,
    # gcp_cloud_storage, azure_blob_storage, aws_s3, smb_protocol, ftp and sftp.
    type: local

    # For local files: Absolute or relative path to directory to watch
    path: example.csv


When we run the extractor, it fails with the error below.
2024-11-25 12:25:23.797 UTC [ERROR   ] ThreadPoolExecutor-1_0 - Job "FileExtractor (trigger: interval[0:05:00], next run at: 2024-11-25 18:00:23 IST)" raised an exception
Traceback (most recent call last):
  File "apscheduler\executors\base.py", line 125, in run_job
  File "file_extractor\extractor.py", line 155, in run_internal
  File "cognite\extractorutils\configtools\elements.py", line 404, in get_data_set
  File "cognite\client\_api\data_sets.py", line 159, in retrieve
  File "cognite\client\_api_client.py", line 384, in _retrieve_multiple
  File "cognite\client\utils\_concurrency.py", line 77, in raise_compound_exception_if_failed_tasks
  File "cognite\client\utils\_concurrency.py", line 104, in _raise_basic_api_error
cognite.client.exceptions.CogniteAPIError: <html>
<head><title>405 Not Allowed</title></head>
<body>
<center><h1>405 Not Allowed</h1></center>
<hr><center>nginx</center>
</body>
</html>
 | code: 405 | X-Request-ID: None
The API Failed to process some items.
Successful (2xx): []
Unknown (5xx): []
Failed (4xx): [{'id': 28834}, ...]


Mithila Jayalath
Seasoned Practitioner

@Mayuri Bhoge the project name and the scope look incorrect. Could you please let me know the name of the CDF project? If you need further assistance, I can create a support ticket for you.


  • Author
  • Active
  • 1 reply
  • November 25, 2024

@Mithila Jayalath 
CDF Project name: sandbox
scope: api://<client-id>/.default


Mithila Jayalath
Seasoned Practitioner
  • 397 replies
  • Answer
  • November 26, 2024

@Mayuri Bhoge I’m not sure your project name is just “sandbox”. As far as I know, the project name should contain a prefix along with “sandbox”.

The scope should be in the following format: https://<my_cluster>.cognitedata.com/.default

Please refer to the documentation here for more information regarding scopes.
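
To make the scope format concrete, here is a minimal Python sketch that checks a parsed `cognite` config section against it. The helper names (`expected_scope`, `check_cognite_section`) and the cluster name `my-cluster` are hypothetical and only for illustration. The host check is an assumption that is not stated in this thread: it presumes API requests should go to the cluster URL (https://<my_cluster>.cognitedata.com) rather than the Fusion web address, which may also explain the 405 response from nginx.

```python
from urllib.parse import urlsplit


def expected_scope(host: str) -> str:
    """Build a scope of the form https://<my_cluster>.cognitedata.com/.default from a host URL."""
    parts = urlsplit(host)
    return f"{parts.scheme}://{parts.netloc}/.default"


def check_cognite_section(cognite: dict) -> list[str]:
    """Return a list of human-readable problems found in a parsed `cognite` section."""
    problems = []

    # Assumption (not from the thread): the API host is <my_cluster>.cognitedata.com.
    host = cognite.get("host", "")
    hostname = urlsplit(host).hostname or ""
    if not hostname.endswith(".cognitedata.com"):
        problems.append(f"host {host!r} is not a <my_cluster>.cognitedata.com URL")

    # Scope format as given in the answer above.
    scopes = cognite.get("idp-authentication", {}).get("scopes", [])
    for scope in scopes:
        scope_host = urlsplit(scope).hostname or ""
        if not (scope_host.endswith(".cognitedata.com") and scope.endswith("/.default")):
            problems.append(
                f"scope {scope!r} does not match https://<my_cluster>.cognitedata.com/.default"
            )
    return problems


# Values copied from the config.yaml in the question.
posted = {
    "host": "https://fusion.cognite.com/",
    "project": "sandbox",
    "idp-authentication": {"scopes": ["api://abc345/.default"]},
}

for problem in check_cognite_section(posted):
    print(problem)

# Hypothetical cluster name, shown only to illustrate the scope format.
print(expected_scope("https://my-cluster.cognitedata.com/"))
```

The check only looks at the shape of the URLs. It cannot confirm that the project name is right, so that still needs to be verified against the actual CDF project.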

 

