neuroglancer v2.41.2 Path Traversal leads to Arbitrary File Read

Introduction

Neuroglancer is an advanced WebGL-based visualization platform designed for large-scale, 3-D volumetric datasets, primarily utilized in fields like connectomics and neuroimaging. It serves as a high-performance tool for researchers to interactively explore, segment, and annotate massive scientific volumes—ranging from gigabytes to petabytes—directly within a web browser, facilitating the collaborative analysis of complex biological structures across different research environments.

The architecture employs a distributed client-server model where a lightweight, WebGL-driven client handles high-performance rendering, while a backend (implemented in Python/Tornado or Go) serves data, meshes, and skeletons on demand. By leveraging efficient chunked-data streaming and a powerful Python API, Neuroglancer enables seamless programmatic manipulation of the viewer state and real-time visualization of local or remote data sources, making massive datasets accessible without the need for specialized high-end local workstations.

Vulnerability Description

The Neuroglancer Python backend is vulnerable to a Path Traversal attack when a custom file-based static content source is configured. An attacker can craft a URL using directory traversal sequences (../) to access sensitive files on the host system that are outside the intended directory.

Insufficient Regex Validation

In python/neuroglancer/server.py, the STATIC_PATH_REGEX (line 47) is used to capture the requested resource path:

STATIC_PATH_REGEX = r"^/v/(?P<viewer_token>[^/]+)/(?P<path>(?:[@a-zA-Z0-9_\-][@a-zA-Z0-9_\-./]*)?)$"

The regex requires the first character of the path to be a letter, a digit, or one of @, _ and -, which prevents paths from starting directly with `..`. However, it permits dots and slashes anywhere after that first character. This allows a Regex Bypass using a dummy prefix (e.g., a/../../).
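The bypass can be checked in isolation. The sketch below copies the pattern quoted above and tests two request paths against it; the helper name captured_path and the placeholder token TOKEN are illustrative and do not come from the Neuroglancer source.

```python
import re

# Pattern as quoted in this report (python/neuroglancer/server.py).
STATIC_PATH_REGEX = r"^/v/(?P<viewer_token>[^/]+)/(?P<path>(?:[@a-zA-Z0-9_\-][@a-zA-Z0-9_\-./]*)?)$"


def captured_path(url_path):
    """Return the 'path' group if the URL matches the route, else None.

    Hypothetical helper, written only for this illustration.
    """
    match = re.match(STATIC_PATH_REGEX, url_path)
    return match.group("path") if match else None


# "TOKEN" is a placeholder viewer token, not a real value.
# A path that starts with ".." is rejected by the first-character rule.
print(captured_path("/v/TOKEN/../etc/passwd"))

# A path with a leading directory name is accepted with ".." segments intact.
print(captured_path("/v/TOKEN/a/../../etc/passwd"))
```

The first call yields no match, while the second returns the captured path with its traversal sequences unchanged, which is what is later passed on as name.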

Unsafe Path Joining

In python/neuroglancer/static/__init__.py, the FileSource.get_content method (line 81) processes the captured name (the URL path):

def get_content(self, name):
    full_path = os.path.join(self.file_path, name)
    with self.file_open(full_path, "rb") as f:
        return f.read()

The function uses os.path.join without performing any path normalization or verification that the resulting full_path still resides under self.file_path. os.path.join only concatenates its arguments and leaves any ../ segments in place; the operating system resolves them when the file is opened, so the final path can land outside the base directory.
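The behaviour can be reproduced without Neuroglancer. The following is a minimal sketch, assuming a POSIX system and using the built-in open() in place of self.file_open; the directory layout, file names, and the demo function are hypothetical.

```python
import os
import tempfile


def get_content(file_path, name):
    # Same logic as the snippet above, with open() standing in for
    # self.file_open (an assumption made for this sketch).
    full_path = os.path.join(file_path, name)
    with open(full_path, "rb") as f:
        return f.read()


def demo():
    """Build a throwaway directory tree and read a file outside the base."""
    with tempfile.TemporaryDirectory() as root:
        base = os.path.join(root, "static")
        # The prefix directory "a" is created here because POSIX path
        # resolution walks every component when the file is opened.
        os.makedirs(os.path.join(base, "a"))
        with open(os.path.join(root, "secret.txt"), "wb") as f:
            f.write(b"outside-base")

        name = "a/../../secret.txt"
        joined = os.path.join(base, name)
        # os.path.join keeps the "../" segments verbatim.
        segments_kept = joined.endswith(name)
        return segments_kept, get_content(base, name)


if __name__ == "__main__":
    print(demo())
```

In this sketch the joined string still ends with the traversal sequence, and the read returns the contents of a file that sits next to the base directory rather than inside it.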

Note that two separate devices were used to simulate a victim and an attacker realistically, and that victim interaction is required, directly or indirectly, for example by changing the static content source configuration.

Impact

The impact of this vulnerability is significant because it enables unauthorized access to sensitive system and application files. Once a file-based static content source is configured, an attacker can escape the intended directory and exfiltrate critical data such as environment files (.env), internal project configurations, and system-level files. The exposure is not limited to the repository folder: it extends to every file readable by the user account running the Neuroglancer process, which can amount to a full loss of confidentiality for data on the host.

Furthermore, the vulnerability highlights a critical lack of input validation within the application's core data-serving logic. While certain web frameworks, HTTP clients, or regex patterns may provide a superficial layer of protection through URI normalization, the underlying reliance on unvalidated os.path.join operations creates a brittle security posture. This "security by coincidence" is easily circumvented if the server is deployed behind a misconfigured proxy or if the routing logic is modified, turning a dormant issue into an immediate, high-severity threat to the confidentiality of files on the host system.
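The URI normalization mentioned above can be illustrated on the client side. This is a minimal sketch using urllib.parse.urljoin from the Python standard library, which removes dot segments in line with RFC 3986; the host name and the token are placeholders, and the two helper functions are hypothetical.

```python
from urllib.parse import urljoin

# Placeholder values, not taken from the original report.
BASE = "http://neuroglancer.example/v/TOKEN/"
SUFFIX = "a/../../etc/passwd"


def normalized_url(base, suffix):
    # urljoin resolves "." and ".." segments before the URL is ever sent.
    return urljoin(base, suffix)


def raw_url(base, suffix):
    # Plain concatenation keeps the traversal sequence as written.
    return base + suffix


print(normalized_url(BASE, SUFFIX))
print(raw_url(BASE, SUFFIX))
```

A client that normalizes in this way never delivers the traversal sequence to the server, which is why the issue can look dormant in casual testing. A client that sends the path verbatim, for example curl with its --path-as-is option, does deliver it.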

Exploitation Scenario

A realistic exploitation scenario begins with an attacker leveraging the Python-driven nature of Neuroglancer's research workflow. In many scientific environments, users share initialization scripts, data-processing pipelines, or Jupyter notebooks to set up their visualization sessions. An attacker could distribute a seemingly harmless configuration snippet—perhaps under the guise of optimizing local data loading—that executes neuroglancer.set_static_content_source(path='/'). By inducing the victim to run this command, the attacker successfully transitions the server from the secure default state to a vulnerable configuration that exposes the entire filesystem via the embedded web server.

Once the vulnerable configuration is active and the server is bound to an accessible network interface (a common requirement for remote collaboration), the attacker can begin remote reconnaissance and data exfiltration. By sending crafted HTTP GET requests that use the regex bypass sequence (e.g., /v/[token]/a/../../etc/passwd), the attacker circumvents the path filter to probe for sensitive system files or credentials. The process can be automated to walk the host's directory structure systematically and collect private research data or system secrets, and it is unlikely to be noticed unless request paths are monitored for traversal sequences.

Step-by-Step Reproduction

Victim User - IP Address: `192.168.1.90`

Step No. 1. The victim user, in their normal workflow, clones the repository onto a device with a UNIX operating system (in my case, a Raspberry Pi), checks its IP address, and starts Neuroglancer so that it can be interacted with.

Attacker User - IP Address: `192.168.1.88`

Step No. 2. The attacker then requests .env to exfiltrate the Gemini API key, and also reads /etc/passwd.

Executive Summary: Path Traversal and Arbitrary File Read in `Neuroglancer`

The research identifies a significant Path Traversal vulnerability in the Python backend of Neuroglancer v2.41.2.

Root Cause: The STATIC_PATH_REGEX in server.py uses a flawed regular expression that, while preventing paths starting with .., permits traversal sequences later in the string. Furthermore, the FileSource.get_content method uses os.path.join to process user-supplied paths without any normalization or boundary verification.
Exploitation Mechanism: By crafting a URL with a dummy prefix (e.g., /v/[token]/a/../../etc/passwd), an attacker bypasses the regex filter. os.path.join then passes the traversal sequences through unchanged, and the operating system resolves them when the file is opened, so the server reads files outside the intended base directory.

Analysis of Scope and Security Implications

This vulnerability is of significant severity because it exposes the underlying host filesystem to any client that can reach the server over HTTP.

1. Infection Scenarios

Configuration Poisoning: An attacker can socially engineer a victim into executing neuroglancer.set_static_content_source(path='/') in their research scripts (e.g., Jupyter notebooks), which effectively exposes the entire filesystem to the web server.
Sensitive Data Exfiltration: Once the server is configured to serve a broad directory, an attacker can systematically browse and exfiltrate sensitive files, including environment variables (.env), API keys (e.g., Gemini API keys), and internal project configurations.

2. Factors Exacerbating Risk

Regex Bypass: The application relies on "security by coincidence"—specifically a brittle regex pattern that is easily circumvented by using a dummy directory prefix (e.g., a/../../).
Unvalidated Path Joining: The backend's reliance on unvalidated os.path.join operations to resolve file paths creates a high-risk sink. Standard web framework normalization is insufficient here, as the application logic itself is fundamentally insecure.
Deployment Context: Neuroglancer is often deployed in collaborative scientific environments where remote access is a requirement, making the exposed server a persistent target for attackers.

Conclusion and Recommendation

This is a significant-severity vulnerability. The combination of flawed regex filtering and insecure path joining makes the application vulnerable to full directory traversal.

Suggested actions for the development team:

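Implement Path Normalization: Before opening any file, resolve the path with os.path.abspath() or os.path.realpath() (the latter also follows symbolic links) and verify that the result is still inside the intended base directory. Compare whole path components, for example with os.path.commonpath(), rather than using a plain string-prefix check, which would also accept a sibling such as /data-private when the base is /data.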
Robust Regex/Input Validation: Update the STATIC_PATH_REGEX to strictly disallow directory traversal sequences (..) regardless of their position in the string, or use a secure path-parsing library.
Principle of Least Privilege: Advise users to configure static content sources only for specific, non-sensitive directories and never set the root directory (/) as a source.
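The first two suggestions could be combined roughly as follows. This is a sketch under stated assumptions, not the project's actual fix: PathTraversalError, has_dotdot_segment, and resolve_under are hypothetical names, and the built-in open() again stands in for self.file_open.

```python
import os


class PathTraversalError(ValueError):
    """Raised when a requested name would resolve outside the base directory."""


def has_dotdot_segment(name):
    # True if any "/"-separated segment is exactly "..", at any position.
    return ".." in name.split("/")


def resolve_under(base_dir, name):
    """Return the real path of name under base_dir, or raise PathTraversalError."""
    if has_dotdot_segment(name):
        raise PathTraversalError(f"traversal segment in {name!r}")
    base = os.path.realpath(base_dir)
    # realpath also follows symlinks, so a link inside the base directory
    # that points elsewhere is caught by the check below.
    candidate = os.path.realpath(os.path.join(base, name))
    # commonpath compares whole components, unlike str.startswith.
    if os.path.commonpath([base, candidate]) != base:
        raise PathTraversalError(f"{name!r} resolves outside the base directory")
    return candidate


def get_content(base_dir, name):
    # Hardened variant of the get_content logic shown earlier in this report.
    with open(resolve_under(base_dir, name), "rb") as f:
        return f.read()
```

The segment check and the containment check overlap on purpose: the first rejects the request early at the input layer, while the second still holds if the routing pattern changes or a symbolic link is placed inside the served directory.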