
Upload data to EDR using SDK/API

  • June 18, 2025
  • 0 replies
  • 20 views

Hi,

I would like to upload data to the EDR. The EDR has a built-in tool for uploading data, but I would like to use the API. As I understand it, this is not officially supported, but we would like to give it a try because we have hundreds of files to upload.

I first asked my questions by e-mail, but I was directed to this hub, so here we go:

 

File upload:

I'm using the client.files.upload() function [1]. It seems to work when only the file path is provided, but I'm unsure whether this is sufficient or if I should be populating more of the optional arguments. Here's what I'm looking at:
  • path
  • external_id
  • name
  • source
  • mime_type
  • metadata
  • directory
  • asset_ids
  • source_created_time
  • source_modified_time
  • data_set_id
  • labels
  • geo_location
  • security_categories
  • recursive
  • overwrite
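
To make the questions below concrete, here is a minimal sketch of how a subset of these arguments could be collected before calling client.files.upload(). The helper build_upload_kwargs is hypothetical and not part of the SDK. Using the filename as external_id, guessing mime_type with Python's mimetypes module, and prepending a slash to directory are all assumptions based on the behaviour described in this post, not confirmed EDR conventions.

```python
import mimetypes
from pathlib import Path


def build_upload_kwargs(path, data_set_id=None, metadata=None, directory=None):
    """Collect keyword arguments for client.files.upload() (hypothetical helper)."""
    file_path = Path(path)
    kwargs = {
        "path": str(file_path),
        "name": file_path.name,         # same value the SDK appears to assign by default
        "external_id": file_path.name,  # assumption: reuse the filename as external_id
    }
    # Assumption: guess the MIME type locally; leave it out if it cannot be guessed.
    mime_type = mimetypes.guess_type(file_path.name)[0]
    if mime_type is not None:
        kwargs["mime_type"] = mime_type
    if data_set_id is not None:
        kwargs["data_set_id"] = data_set_id
    if metadata:
        # File metadata is passed as string keys and string values.
        kwargs["metadata"] = {str(k): str(v) for k, v in metadata.items()}
    if directory:
        # Only the form with a leading slash was accepted in my tests.
        kwargs["directory"] = "/" + directory.lstrip("/")
    return kwargs


kwargs = build_upload_kwargs(
    "NL_MPP_profile_20240828_084001_20240828_084505.nc",
    metadata={"season": "example"},  # placeholder value, not real data
    directory="968cade8-e20d-4350-8cbd-6f1188e45694/",
)
# With an authenticated client, the call would then be:
# client.files.upload(**kwargs)
```

Whether each of these values should be set by the caller at all, or is filled in by the EDR after upload, is exactly what the questions below are about.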
 
name: It appears this is automatically assigned based on the filename, is that okay?
directory: This seems to default to the dataset ID, but I’ve run into an error when using 968cade8-e20d-4350-8cbd-6f1188e45694/. I can only get it to work when I prepend a slash, like /968cade8-e20d-4350-8cbd-6f1188e45694/. Should I be explicitly setting the directory, and if so, what's the correct format? Note that the files already in the EDR do not have the slash in front of the dataset ID.
metadata: Should we include all available metadata (e.g., datasetId, location, season) from the NetCDF files, or will this be automatically extracted in phase 2? Additionally, should the metadata only include fields that are not already present in the EDR? For example, serial_number is not an option in the EDR, but the field season is. Both are part of the metadata for the NetCDF file.
data_set_id: I’m unsure what this should point to, could you clarify?
geo_location: This is available in the NetCDF metadata as WKT. Is it something I need to set manually, or will it be extracted from the file that is uploaded?
 
I also noticed some differences between the files I uploaded and the ones already in the EDR. The mime_type and instance_id are missing from mine. Should we be explicitly setting the mime_type, or is it automatically inferred? And what about the instance_id? The instance_id seems important and it is: NodeId(space='sp_edr_cdm', external_id='NL_MPP_profile_20240828_084001_20240828_084505.nc'). I can find the files in the EDR but not under "spr_edr_source". 
 
Lastly, we’d like to re-upload some files to include this updated metadata. However, I noticed that uploading the same file again results in multiple copies appearing in the EDR. Is there a way to overwrite files? We're incrementing the revision_number, and ideally the file should just update in place.
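
A sketch of the re-upload that might avoid the duplicates, assuming that overwrite=True only takes effect when the upload carries the same external_id as the existing file (this is my reading of the SDK documentation, not something confirmed for the EDR). The function reupload_file is a hypothetical wrapper; client is expected to be an authenticated SDK client.

```python
def reupload_file(client, path, external_id, metadata=None):
    """Re-upload a file under an existing external_id and request an overwrite.

    Assumption: the stored file is matched on external_id, so the same
    external_id must be passed on every upload of that file.
    """
    return client.files.upload(
        path=path,
        external_id=external_id,
        metadata=metadata,
        overwrite=True,
    )
```

If the first upload was made without an external_id, there may be nothing to match on, which could explain the multiple copies.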
 
Streamlit app:
Regarding the Streamlit app: although I should have access, I got this error message:
 
 
Kind regards,