Solved

Retrieving file sizes for CDF files



Hi Team,

I am trying to get details of broken files uploaded to CDF. Is there any way to query the actual file size from CDF using the Python SDK without downloading the file? Please add a feature to the Python SDK for querying file size, so that corrupted/broken files can be identified.

 

Thanks in advance.

Best answer by Dilini Fernando

Hi @Rajeev z ranjan,

I hope the above helped. As of now, I’m closing this thread. Please feel free to create a new post if you have any questions.

Best regards,

Dilini

 


Hi!

I can't see that the API has a way to get the file size without downloading the file.

It seems like a very useful feature to have though. 
If this is a one-time cleanup job, it might be OK, though time-consuming, to run len(client.files.download_bytes(id=file_id)) for all the file IDs?
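To make the download-and-measure idea concrete, here is a minimal sketch. It assumes `client` is an already-configured CogniteClient and that `file_ids` holds internal file IDs; the helper name `sizes_by_download` is made up for illustration. Every file is downloaded in full, so this is only reasonable for a one-off cleanup.

```python
def sizes_by_download(client, file_ids):
    """Return {file_id: size_in_bytes} by downloading every file in full.

    `client` is assumed to be a configured CogniteClient. The helper name
    is hypothetical and not part of the SDK.
    """
    sizes = {}
    for file_id in file_ids:
        # download_bytes returns the whole file content as a bytes object,
        # so len() gives the stored size in bytes.
        content = client.files.download_bytes(id=file_id)
        sizes[file_id] = len(content)
    return sizes
```

Each file's content is dropped again after it is measured, so memory use stays bounded by the largest single file.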
 


  • Practitioner
  • January 3, 2024

Hi,
You can get the size using the Documents API. List documents gives you the size (in bytes) in `sourceFile`.`size`. 
You can also use the Python SDK. https://cognite-sdk-python.readthedocs-hosted.com/en/latest/documents.html#cognite.client._api.documents.DocumentsAPI.list
Example code:
 

from cognite.client.data_classes import filters
from cognite.client.data_classes.documents import DocumentProperty

# Assumes `client` is an already-configured CogniteClient.
file_id_filter = filters.Equals(DocumentProperty.id, YOUR_FILE_ID)
docs = client.documents.list(filter=file_id_filter)
# Size in bytes, as reported by the Documents API.
print(docs[0].source_file.size)
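For many files at once, the same lookup could be wrapped roughly as below. This is only a sketch: `sizes_from_documents` is a made-up helper name, `client` is assumed to be a configured CogniteClient, and the filter is built by the caller. Passing `limit=None` to fetch everything is an assumption about the SDK's list methods worth checking against the version you run.

```python
def sizes_from_documents(client, doc_filter, limit=None):
    """Return {document_id: size_in_bytes_or_None} for documents matching a filter.

    Hypothetical helper, not part of the SDK. `doc_filter` is a filter
    object built by the caller, for example (assuming the standard SDK
    filter classes):
        filters.In(DocumentProperty.id, list_of_file_ids)
    """
    sizes = {}
    for doc in client.documents.list(filter=doc_filter, limit=limit):
        source_file = doc.source_file
        # The size can be absent, so keep None instead of raising.
        sizes[doc.id] = source_file.size if source_file is not None else None
    return sizes
```

File IDs that are missing from the result were not found in the Documents API at all, which may itself be worth investigating.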

 


I also just now learned that you can send HEAD requests to the download urls, like this:

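def get_content_lengths(client, external_ids):
    # One request fetches all download URLs, then one HEAD request per file.
    urls = client.files.retrieve_download_urls(external_id=external_ids)
    # Note: _http_client is a private SDK attribute and may change between versions.
    # The Content-Length header value is a string; wrap it in int() if you need a number.
    return {
        external_id: client.files._http_client.request("HEAD", urls[external_id]).headers["Content-Length"]
        for external_id in external_ids
    }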

This might be a workaround if the files are, for some reason, not available in the Documents API. If they are in the Documents API, querying it should require the fewest requests.
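Once the sizes are collected by either route, flagging candidates for "broken" could look something like this. It is a sketch under assumptions: the helper name `find_suspect_files` is made up, and "broken" is taken to mean a missing size, a zero-byte file, or a size that differs from a value you already know. CDF itself does not define that rule.

```python
def find_suspect_files(sizes, expected_sizes=None):
    """Return the keys whose size is missing, zero, or different from expected.

    `sizes` maps a file identifier to its size; values may be ints, numeric
    strings (such as a Content-Length header), or None.
    `expected_sizes` is an optional mapping of identifier to known-good size.
    Both the helper and the definition of "suspect" are assumptions.
    """
    expected_sizes = expected_sizes or {}
    suspects = []
    for key, raw_size in sizes.items():
        size = None if raw_size is None else int(raw_size)
        if size is None or size == 0:
            # No size reported, or an empty file.
            suspects.append(key)
        elif key in expected_sizes and size != expected_sizes[key]:
            # Stored size does not match what was expected at upload time.
            suspects.append(key)
    return suspects
```

Files flagged this way are only candidates. A file with the right size can still be corrupt, so a checksum comparison would be needed to be sure.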


Dilini Fernando
Seasoned Practitioner

Hi @Rajeev z ranjan,

Did the above help you?

Br,
Dilini



 


  • Committed
  • February 5, 2024

It hasn't helped.

