Question

Memory allocation issue when deploying a Hugging Face model function



I have created a function configured with 5 GB of memory and 2 CPUs to run a Hugging Face AI model (py311). Deploying the function went fine. However, running the function throws this error:

 RuntimeError: [enforce fail at alloc_cpu.cpp:118] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 9437184 bytes. Error code 12 (Cannot allocate memory)
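For reference, the deployment itself was done roughly like this (a minimal sketch using the cognite-sdk Functions API; the folder, names and ids are illustrative):

```python
from cognite.client import CogniteClient

# Assumes credentials/project are configured elsewhere (e.g. via environment variables).
client = CogniteClient()

func = client.functions.create(
    name="hf-model-function",
    external_id="hf-model-function",
    folder="hf_function",   # local folder containing handler.py and requirements.txt
    runtime="py311",
    cpu=2.0,                # 2 CPUs
    memory=5.0,             # 5 GB of memory
)
```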

 

 

9 replies

Hi! This exception does not seem to be due to the function running out of memory. I did a quick Google search on huggingface AI, and this error is apparently known in the community. Ref this page:
https://discuss.huggingface.co/t/batch-transform-passing-entire-batch-at-once/50336


Author · Seasoned · 5 replies · March 5, 2025

It’s not running out of memory. It does not allow memory allocation to load the model in the function. It might be related to allocation limits and enablement. I was wondering if there is a way to enable allocation/swap or a workaround solution.

Total RAM: 5.00 GB
Available RAM: 4.13 GB
Used RAM: 0.55 GB
Memory used by Python script: 0.71 GB
Total Swap: 0.00 GB
Used Swap: 0.00 GB
Available Swap: 0.00 GB
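For completeness, the numbers above were collected with something like this (a sketch assuming psutil is installed in the function's requirements):

```python
import os
import psutil

GB = 1024 ** 3

vm = psutil.virtual_memory()
swap = psutil.swap_memory()
proc = psutil.Process(os.getpid())

print(f"Total RAM: {vm.total / GB:.2f} GB")
print(f"Available RAM: {vm.available / GB:.2f} GB")
print(f"Used RAM: {vm.used / GB:.2f} GB")
print(f"Memory used by Python script: {proc.memory_info().rss / GB:.2f} GB")
print(f"Total Swap: {swap.total / GB:.2f} GB")
print(f"Used Swap: {swap.used / GB:.2f} GB")
print(f"Available Swap: {swap.free / GB:.2f} GB")
```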


Hi!

I see that the link in my previous answer was related to AWS specifically and probably doesn't relate to your issue. 

I don’t see any reason why memory allocation should fail if the function indeed does have enough memory available. A couple of follow-up questions to clarify:

  • Are you certain that you are debugging the correct function? I notice you say that you are using Python 3.11, but from the stack trace, the function is apparently running on Python 3.9.
  • What CDF project is this?

Author · Seasoned · 5 replies · March 5, 2025

That is right. There were different attempts with Python 3.9 and 3.11. The snapshot shared is from the attempt with Python 3.9.

The project is cntxt-dmm-playground


Ok, I see. I don’t see any reason why memory allocation should fail. I’m not familiar with huggingface AI and PyTorch and what they do under the hood, but as far as I know, there should be nothing on the function itself blocking the allocation. Unless it tries to write to disk, which typically would fail since the function filesystem is read-only, except for the “/temp” folder, which is writeable.

As a sanity check, have you tried executing your code locally?
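If disk writes turn out to be the issue, one thing worth trying is to point the Hugging Face and PyTorch caches at the writable folder before the model is loaded. A rough sketch (the environment variables are the standard Hugging Face/PyTorch ones; the exact writable path is an assumption):

```python
import os

# Keep any downloads, caches and temp files inside the writable folder.
# The exact path ("/tmp" vs "/temp") depends on the Functions runtime.
WRITABLE = "/tmp"
os.environ["HF_HOME"] = os.path.join(WRITABLE, "hf_home")
os.environ["TRANSFORMERS_CACHE"] = os.path.join(WRITABLE, "hf_cache")  # used by older transformers versions
os.environ["TORCH_HOME"] = os.path.join(WRITABLE, "torch")
os.environ["TMPDIR"] = WRITABLE

from transformers import pipeline  # import only after the environment is set

model = pipeline("text-classification")  # illustrative; load your own model here
```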


Andre Alves
MVP

That’s a great topic, @Henrik Eiding and @Abdulmonim. My question is more related to design principles.

Should we run foundation models (e.g., Hugging Face models) using Functions inside the Cognite cluster? I remember that in the past, the cluster had some limitations, focusing more on out-of-the-box features and lightweight computations rather than ML inference. The general guidance was to run inference outside of Cognite.

Am I wrong in this understanding? Have things changed, or am I misunderstanding something?

Thanks,
André


Author · Seasoned · 5 replies · March 6, 2025

@Henrik Eiding I tested the code locally and it worked as expected with no errors.
@Andre Alves I appreciate your feedback. The model size is around 9 MB, so it is a lightweight model.


@Andre Alves If your model is able to run with the provided CPU & memory, I don’t see why you should not use Cognite Functions, unless you are hitting other limitations I’m not aware of.

@Abdulmonim Ok, good that you verified it locally. I fail to see an explanation for why it fails inside the function. Could it possibly try to write to disk under the hood?


Andre Alves
MVP

Thanks, @Henrik Eiding!

I just wanted to highlight some potential challenges, even with lightweight models, as serverless environments like Azure Functions can introduce delays when scaling up from zero. Additionally, real-time inference can be impacted by execution time limits and the need to load models into memory for each function invocation.

That said, it's great to hear that teams have successfully deployed models in serverless environments! I’d love to learn more about their experiences, especially in a production setting, as their insights and lessons learned could help us determine the best approach moving forward.
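On the point about loading the model into memory for each invocation: one common mitigation is to cache the loaded model at module scope so that warm invocations reuse it. A minimal sketch, assuming the standard Cognite Functions handler signature and an illustrative model id:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    # Loaded once per warm container; later invocations reuse the cached pipeline.
    from transformers import pipeline
    return pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",  # illustrative model id
    )


def handle(client, data, secrets, function_call_info):
    # Standard Cognite Functions entry point; only the arguments you need are required.
    model = get_model()
    return {"predictions": model(data["text"])}
```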

