Currently, the SharePoint File Extractor fails with "ValueError: Couldn't obtain document library id" when attempting to extract files from a document library located within a nested subsite (a subsite of a subsite, or deeper).
The extractor successfully connects and works for standard site structures, but fails exclusively on deep hierarchies. Troubleshooting confirms this is not a permissions issue, and the URLs are fully accessible via a standard web browser.
Technical Context & Limitation:
The current Cognite SharePoint extractor expects the URL schema to follow a single-level subsite structure: https://<tenant>.sharepoint.com/sites/<parent-site-name>/<subsite-name>/library-name
When provided with a multi-level subsite URL (e.g., https://<tenant>.sharepoint.com/sites/<site-name>/<subsite1>/<subsite2>/library-name), the extractor attempts to treat the intermediate subsites as document libraries. Because the extractor currently uses a trial-and-error method to differentiate between a library and a subsite, recursively checking multi-level nested sites would generate a large number of failed API calls, potentially hitting rate limits.
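To illustrate why the ambiguity grows with nesting depth, here is a minimal sketch that enumerates every way a URL path could be split into a site path and a library path. The function name candidate_splits and the contoso tenant are hypothetical and are not taken from the extractor's code. The sketch only shows the size of the search space a trial-and-error approach would have to probe.

```python
from urllib.parse import urlparse


def candidate_splits(url: str) -> list[tuple[str, str]]:
    """Enumerate every (site_path, library_path) split of a SharePoint URL.

    Hypothetical helper: each returned pair is one guess that a
    trial-and-error resolver would need to test against the API.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[0] != "sites":
        raise ValueError(f"Unexpected SharePoint URL: {url}")

    # Everything after "sites": site collection, optional subsites, library.
    segments = parts[1:]

    # The first segment always belongs to the site path and the last
    # segment always belongs to the library path.
    return [
        ("/sites/" + "/".join(segments[:i]), "/".join(segments[i:]))
        for i in range(1, len(segments))
    ]


if __name__ == "__main__":
    url = "https://contoso.sharepoint.com/sites/site/sub1/sub2/library-name"
    for site_path, library_path in candidate_splits(url):
        print(site_path, "|", library_path)
```

For the two-level example URL this yields three candidate splits, only one of which is correct, so the number of wrong guesses increases by one with every additional subsite level.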
Requested Solution:
Enhance the SharePoint File Extractor to support multi-level subsite traversal. This may require evaluating and leveraging alternative Microsoft Graph API endpoints that can efficiently map directory structures without relying on the current trial-and-error query method, thereby avoiding API rate limit issues.
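One possible direction, sketched below under stated assumptions, is to walk the hierarchy one level at a time using the Microsoft Graph list endpoints for subsites (/sites/{site-id}/sites) and document libraries (/sites/{site-id}/drives), and to match entries by their webUrl. Every request is then a listing call that is expected to succeed, rather than a probe that is expected to fail. The function resolve_library and the injected get_json callable are hypothetical. Authentication, paging via @odata.nextLink, and throttling handling are omitted, and this is not the extractor's actual implementation.

```python
from typing import Callable
from urllib.parse import unquote, urlparse

GRAPH = "https://graph.microsoft.com/v1.0"


def _norm(url: str) -> str:
    """Normalise a web URL for comparison (decoded, lower-case, no trailing slash)."""
    return unquote(url).rstrip("/").lower()


def resolve_library(url: str, get_json: Callable[[str], dict]) -> tuple[str, str]:
    """Return (site_id, drive_id) for a library URL under nested subsites.

    get_json is an assumed callable that performs an authenticated GET
    against Microsoft Graph and returns the decoded JSON body.
    """
    parsed = urlparse(url)
    segments = [p for p in parsed.path.split("/") if p]
    if len(segments) < 3 or segments[0] != "sites":
        raise ValueError(f"Unexpected SharePoint URL: {url}")

    # Resolve the top-level site by its server-relative path.
    base = f"https://{parsed.hostname}/sites/{segments[1]}"
    site = get_json(f"{GRAPH}/sites/{parsed.hostname}:/sites/{segments[1]}")

    for segment in segments[2:]:
        target = _norm(f"{base}/{segment}")

        # Is the next path segment a subsite of the current site?
        subsites = get_json(f"{GRAPH}/sites/{site['id']}/sites")["value"]
        match = next((s for s in subsites if _norm(s["webUrl"]) == target), None)
        if match is not None:
            site, base = match, f"{base}/{segment}"
            continue

        # Otherwise it should be a document library (drive) of the current site.
        drives = get_json(f"{GRAPH}/sites/{site['id']}/drives")["value"]
        drive = next((d for d in drives if _norm(d["webUrl"]) == target), None)
        if drive is None:
            raise ValueError(f"Couldn't obtain document library id for {url}")
        return site["id"], drive["id"]

    raise ValueError(f"No document library segment found in {url}")
```

With this approach a URL with two subsite levels needs five listing calls (one site lookup, three subsite listings, one drive listing), and the cost grows linearly with depth. Whether these endpoints return all subsites and libraries for a given tenant configuration and permission set would need to be verified before relying on them.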