Question

Memory allocation issue when deploying a Hugging Face model function



I have created a function configured with 5 GB of memory and 2 CPUs to run a Hugging Face AI model (py311). Deploying the function went fine. However, running the function throws this error:

 RuntimeError: [enforce fail at alloc_cpu.cpp:118] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 9437184 bytes. Error code 12 (Cannot allocate memory)
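For reference, the deployment itself was done roughly like this (a minimal sketch using the cognite-sdk Functions API; the folder, names and ids are illustrative):

```python
from cognite.client import CogniteClient

# Assumes credentials/project are configured elsewhere (e.g. via environment variables).
client = CogniteClient()

func = client.functions.create(
    name="hf-model-function",
    external_id="hf-model-function",
    folder="hf_function",   # local folder containing handler.py and requirements.txt
    runtime="py311",
    cpu=2.0,                # 2 CPUs
    memory=5.0,             # 5 GB of memory
)
```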

 

 

9 replies

Hi! This exception does not seem to be due to the function running out of memory. I did a quick Google search on huggingface AI, and this error is apparently known in the community. Ref this page:
https://discuss.huggingface.co/t/batch-transform-passing-entire-batch-at-once/50336


Author · Seasoned · 5 replies · March 5, 2025

It’s not running out of memory. It does not allow memory allocation to load the model in the function. It might be related to allocation limits and enablement. I was wondering if there is a way to enable allocation/swap or a workaround solution.

Total RAM: 5.00 GB
Available RAM: 4.13 GB
Used RAM: 0.55 GB
Memory used by Python script: 0.71 GB
Total Swap: 0.00 GB
Used Swap: 0.00 GB
Available Swap: 0.00 GB
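For completeness, the numbers above were collected with something like this (a sketch assuming psutil is installed in the function's requirements):

```python
import os
import psutil

GB = 1024 ** 3

vm = psutil.virtual_memory()
swap = psutil.swap_memory()
proc = psutil.Process(os.getpid())

print(f"Total RAM: {vm.total / GB:.2f} GB")
print(f"Available RAM: {vm.available / GB:.2f} GB")
print(f"Used RAM: {vm.used / GB:.2f} GB")
print(f"Memory used by Python script: {proc.memory_info().rss / GB:.2f} GB")
print(f"Total Swap: {swap.total / GB:.2f} GB")
print(f"Used Swap: {swap.used / GB:.2f} GB")
print(f"Available Swap: {swap.free / GB:.2f} GB")
```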


Hi!

I see that the link in my previous answer was related to AWS specifically and probably doesn't relate to your issue. 

I don’t see any reason why memory allocation should fail if the function indeed does have enough memory available. A couple of follow-up questions to clarify:

  • Are you certain that you are debugging the correct function? I notice you say that you are using Python 3.11, but from the stack trace, the function is apparently running on Python 3.9.
  • What CDF project is this?

Author · Seasoned · 5 replies · March 5, 2025

That is right. There were different attempts with Python 3.9 and 3.11. The snapshot shared is from the attempt with Python 3.9.

The project is cntxt-dmm-playground


Ok, I see. I don’t see any reason why memory allocation should fail. I’m not familiar with huggingface AI and PyTorch and what they do under the hood, but as far as I know, there should be nothing on the function itself blocking the allocation. Unless it tries to write to disk, which typically would fail since the function filesystem is read-only, except for the “/temp” folder, which is writeable.

As a sanity check, have you tried executing your code locally?
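If disk writes turn out to be the issue, one thing worth trying is to point the Hugging Face and PyTorch caches at the writable folder before the model is loaded. A rough sketch (the environment variables are the standard Hugging Face/PyTorch ones; the exact writable path is an assumption):

```python
import os

# Keep any downloads, caches and temp files inside the writable folder.
# The exact path ("/tmp" vs "/temp") depends on the Functions runtime.
WRITABLE = "/tmp"
os.environ["HF_HOME"] = os.path.join(WRITABLE, "hf_home")
os.environ["TRANSFORMERS_CACHE"] = os.path.join(WRITABLE, "hf_cache")  # used by older transformers versions
os.environ["TORCH_HOME"] = os.path.join(WRITABLE, "torch")
os.environ["TMPDIR"] = WRITABLE

from transformers import pipeline  # import only after the environment is set

model = pipeline("text-classification")  # illustrative; load your own model here
```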


Andre Alves
MVP

That’s a great topic, @Henrik Eiding and @Abdulmonim. My question is more related to design principles.

Should we run foundation models (e.g., Hugging Face models) using Functions inside the Cognite cluster? I remember that in the past, the cluster had some limitations, focusing more on out-of-the-box features and lightweight computations rather than ML inference. The general guidance was to run inference outside of Cognite.

Am I wrong in this understanding? Have things changed, or am I misunderstanding something?

Thanks,
André


Author · Seasoned · 5 replies · March 6, 2025

@Henrik Eiding I tested the code locally and it worked as expected with no errors.
@Andre Alves I appreciate your feedback. The model size is around 9 MB, so it is a lightweight model.


@Andre Alves If your model is able to run with the provided CPU & memory, I don’t see why you should not use Cognite Functions, unless you are hitting other limitations I’m not aware of.

@Abdulmonim Ok, good that you verified it locally. I fail to see an explanation for why it fails inside the function. Could it possibly try to write to disk under the hood?


Andre Alves
MVP

Thanks, @Henrik Eiding!

I just wanted to highlight some potential challenges, even with lightweight models, as serverless environments like Azure Functions can introduce delays when scaling up from zero. Additionally, real-time inference can be impacted by execution time limits and the need to load models into memory for each function invocation.

That said, it's great to hear that teams have successfully deployed models in serverless environments! I’d love to learn more about their experiences, especially in a production setting, as their insights and lessons learned could help us determine the best approach moving forward.
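On the point about loading the model into memory for each invocation: one common mitigation is to cache the loaded model at module scope so that warm invocations reuse it. A minimal sketch, assuming the standard Cognite Functions handler signature and an illustrative model id:

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model():
    # Loaded once per warm container; later invocations reuse the cached pipeline.
    from transformers import pipeline
    return pipeline(
        "text-classification",
        model="distilbert-base-uncased-finetuned-sst-2-english",  # illustrative model id
    )


def handle(client, data, secrets, function_call_info):
    # Standard Cognite Functions entry point; only the arguments you need are required.
    model = get_model()
    return {"predictions": model(data["text"])}
```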

