Solved

Retrieving file sizes for CDF files

4 months ago
21 December 2023
6 replies
170 views

Rajeev z ranjan
Active
0 replies

Hi Team,

I am trying to get details of broken files uploaded to CDF. Is there any way to query the actual file size from CDF using the Python SDK without downloading the file? Please add a feature to query file size using the Python SDK so that corrupted or broken files can be identified.

Thanks in advance.

Best answer by Dilini Fernando 24 January 2024, 11:46

6 replies

Harsha
Committed
8 replies
3 months ago
5 February 2024

It hasn't helped.

Dilini Fernando
Seasoned Practitioner
429 replies
3 months ago
24 January 2024
Answer

Hi @Rajeev z ranjan,

I hope the above helped. As of now, I’m closing this thread. Please feel free to create a new post if you have any questions.

Best regards,

Dilini

Dilini Fernando
Seasoned Practitioner
429 replies
3 months ago
11 January 2024

Hi @Rajeev z ranjan,

Did the above help you?

Br,
Dilini

Ola Liabøtrø
Practitioner
22 replies
4 months ago
3 January 2024

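I also just learned that you can send HEAD requests to the download URLs, like this:

def get_content_lengths(client, external_ids):
    # Fetch a download URL for each external ID.
    urls = client.files.retrieve_download_urls(external_id=external_ids)
    # A HEAD request returns only headers, so no file content is transferred.
    # Note: _http_client is a private SDK attribute and may change between versions.
    return {
        external_id: int(client.files._http_client.request("HEAD", urls[external_id]).headers["Content-Length"])
        for external_id in external_ids
    }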

This might be a workaround if the files for some reason are not available in the Documents API. If they are in the Documents API, querying it should require the fewest requests.

Que Tran
Practitioner
8 replies
4 months ago
3 January 2024

Hi,
You can get the size using the Documents API. List documents gives you the size (in bytes) in `sourceFile.size`.
You can also use the Python SDK. https://cognite-sdk-python.readthedocs-hosted.com/en/latest/documents.html#cognite.client._api.documents.DocumentsAPI.list
Example code:

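from cognite.client.data_classes.documents import DocumentProperty
from cognite.client.data_classes import filters

# `client` is an already-authenticated CogniteClient; replace YOUR_FILE_ID with the file's numeric ID.
file_id_filter = filters.Equals(DocumentProperty.id, YOUR_FILE_ID)
docs = client.documents.list(filter=file_id_filter)
print(docs[0].source_file.size)  # size in bytes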
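
To check many files at once, the objects returned by `client.documents.list` could be reduced to a size table and scanned for suspicious entries. The following is only a sketch: `sizes_from_documents`, `suspect_ids`, and the `min_bytes` threshold are hypothetical names introduced here, and treating a missing or zero size as "broken" is an assumption that may not match how your files were corrupted.

```python
def sizes_from_documents(documents):
    """Map document ID -> size in bytes, or None when no size is reported."""
    sizes = {}
    for doc in documents:
        # source_file may be absent on some documents, so read it defensively.
        source_file = getattr(doc, "source_file", None)
        sizes[doc.id] = getattr(source_file, "size", None)
    return sizes


def suspect_ids(sizes, min_bytes=1):
    """Return the IDs whose size is unknown or below min_bytes (assumed cutoff)."""
    return sorted(
        file_id
        for file_id, size in sizes.items()
        if size is None or size < min_bytes
    )
```

Passing the list result into `sizes_from_documents` and then into `suspect_ids` yields the IDs worth inspecting by hand.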

Ola Liabøtrø
Practitioner
22 replies
4 months ago
3 January 2024

Hi!

I can't see that the API has a way to get the file size without downloading the file.

It seems like a very useful feature to have, though.
If this is a one-time cleanup job, it might be OK, though time-consuming, to call len(client.files.download_bytes(id=file_id)) for each file ID?
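
A minimal sketch of that download-based approach, assuming `client` is an authenticated CogniteClient and `file_ids` is a list of internal file IDs. The helper name `sizes_by_download` is made up for illustration.

```python
def sizes_by_download(client, file_ids):
    """Return {file_id: size_in_bytes} by downloading each file's content."""
    sizes = {}
    for file_id in file_ids:
        # download_bytes loads the whole file into memory, so this is slow
        # and memory-hungry for large files.
        content = client.files.download_bytes(id=file_id)
        sizes[file_id] = len(content)
    return sizes
```

Because every file is transferred in full, this is probably only reasonable for a small number of files or a one-off job.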
