A critical remote code execution (RCE) vulnerability, identified as CVE-2025-31045, has been discovered in the popular PyTorch-GPU machine learning library. The flaw, which carries a CVSS score of 9.8 (Critical), allows an unauthenticated attacker to execute arbitrary code on a vulnerable system by tricking an application into loading a maliciously crafted model file. The vulnerability was officially disclosed on January 22, 2025, and affects a wide range of AI and machine learning applications that rely on GPU acceleration through the library.
Security researchers from the AI security firm ‘Cerebral Defense’ are credited with the discovery. They found that specific CUDA-optimized tensor operations within the library fail to properly sanitize input dimensions when loading models in the TorchScript format. This oversight can lead to a buffer overflow, giving a remote attacker the ability to hijack the instruction pointer and achieve code execution with the permissions of the running application.
Technical Details and Impact
The core of CVE-2025-31045 lies in the `torch.jit.load()` function when used on systems with NVIDIA GPUs. An attacker can embed a malicious payload within the metadata of a seemingly benign model file. When an application loads this file to perform inference or continue training, the flawed deserialization process triggers the overflow, allowing the attacker’s code to run on the server’s GPU and then pivot to the CPU. The attack requires no user interaction beyond the application’s standard function of loading a model file, making it particularly dangerous for MLOps platforms, cloud-based AI inference APIs, and services that accept user-submitted models.
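The exposed code path described above typically looks like the following. This is an illustrative sketch only; the upload directory and function name are hypothetical, not part of PyTorch or any specific product.

```python
# Illustrative sketch of the exposed code path: a service that accepts a
# user-submitted TorchScript archive and loads it directly onto the GPU.
# MODEL_DIR and handle_upload are hypothetical names, not part of PyTorch.
import os

import torch

MODEL_DIR = "/srv/models"  # hypothetical directory for uploaded model files

def handle_upload(filename: str) -> torch.jit.ScriptModule:
    path = os.path.join(MODEL_DIR, filename)
    # torch.jit.load() deserializes the TorchScript archive; per the advisory,
    # crafted tensor metadata processed during this step is what triggers the
    # overflow on affected versions.
    model = torch.jit.load(path, map_location="cuda")
    model.eval()
    return model
```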
Affected Versions:
- PyTorch-GPU versions 2.5.0 up to, but not including, 2.7.1
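A quick way to confirm whether a given environment falls inside this range is to compare the installed version against those bounds. A minimal sketch, assuming the library is importable as `torch` and that the `packaging` package is available:

```python
# Check the installed build against the affected range quoted above
# (>= 2.5.0 and < 2.7.1). Assumes the library imports as `torch`.
import torch
from packaging.version import Version

installed = Version(torch.__version__.split("+")[0])  # drop local tags such as "+cu121"
vulnerable = Version("2.5.0") <= installed < Version("2.7.1")
print(f"torch {installed}: {'in affected range' if vulnerable else 'not in affected range'}")
```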
The impact is severe, as a successful exploit grants the attacker full control over the underlying server. This can lead to wholesale data exfiltration, ransomware deployment, or the conscription of compromised high-performance hardware into botnets for cryptomining or further attacks. Given PyTorch’s widespread use in both academic research and commercial production environments, the potential attack surface is enormous.
Mitigation and Recommendations
The PyTorch development team has responded swiftly by releasing a patched version, and all users are strongly urged to act immediately. The primary mitigation is to upgrade to a fixed version of the library.
Immediate Actions:
- Upgrade PyTorch-GPU: Administrators should immediately update their environments to PyTorch-GPU version 2.7.1 or newer. The patch, detailed in the official PyTorch security advisory, resolves the buffer overflow by implementing stricter validation on tensor metadata during the model loading process.
- Scan for Vulnerable Instances: Organizations should use dependency scanning tools to identify every project and server running a vulnerable version of PyTorch-GPU; a simple sketch of such a check appears after this list.
- Restrict Untrusted Models: As a best practice, avoid loading model files from untrusted or unverified sources, and implement a security gateway that scans all incoming models for malicious signatures before the application processes them; a minimal example of such a gate also follows this list.
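As a complement to full dependency scanners, a small script can flag pinned versions inside the affected range across a codebase. This is a hypothetical helper, not an official tool; the file pattern and package names it matches are assumptions about how projects typically pin the dependency.

```python
# Hypothetical helper: walk a checkout and flag requirements files that pin
# torch / pytorch-gpu inside the affected range (>= 2.5.0, < 2.7.1).
import re
from pathlib import Path

from packaging.version import InvalidVersion, Version

LOW, HIGH = Version("2.5.0"), Version("2.7.1")
PIN = re.compile(r"^(torch|pytorch-gpu)==([\w.+]+)", re.IGNORECASE)

def flag_vulnerable_pins(root: str) -> list[tuple[Path, str]]:
    hits = []
    for req in Path(root).rglob("requirements*.txt"):
        for line in req.read_text().splitlines():
            match = PIN.match(line.strip())
            if not match:
                continue
            try:
                pinned = Version(match.group(2).split("+")[0])
            except InvalidVersion:
                continue
            if LOW <= pinned < HIGH:
                hits.append((req, line.strip()))
    return hits

if __name__ == "__main__":
    for path, pin in flag_vulnerable_pins("."):
        print(f"{path}: {pin}")
```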
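One simple form of such a gate is an allowlist of known-good file hashes checked before a model ever reaches `torch.jit.load()`. A minimal sketch; the allowlist contents and helper name are hypothetical.

```python
# Minimal allowlist gate: only load TorchScript files whose SHA-256 digest
# matches a known-good entry. The allowlist contents are hypothetical.
import hashlib

import torch

TRUSTED_SHA256 = {
    # "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
}

def load_trusted_model(path: str) -> torch.jit.ScriptModule:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() not in TRUSTED_SHA256:
        raise ValueError(f"Refusing to load unverified model file: {path}")
    return torch.jit.load(path, map_location="cpu")
```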
For systems that cannot be immediately patched, a temporary workaround involves disabling JIT compilation for models from untrusted sources, though this may come with a significant performance penalty. This critical vulnerability underscores the growing need for robust security practices in the AI/ML software supply chain.
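A related hardening pattern, distinct from the JIT workaround mentioned above and only applicable when plain weights are available, is to avoid deserializing full TorchScript programs from untrusted sources altogether and instead load only tensor state dicts with `torch.load(..., weights_only=True)` into an architecture the application defines itself. A rough sketch, with `SmallClassifier` as a hypothetical placeholder for the application's own model class:

```python
# Hardening sketch: load only tensor weights from an untrusted file into a
# locally defined architecture, instead of deserializing a full TorchScript
# program with torch.jit.load(). SmallClassifier is a hypothetical class.
import torch
import torch.nn as nn

class SmallClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

    def forward(self, x):
        return self.net(x)

model = SmallClassifier()
# weights_only=True restricts deserialization to tensors and simple containers,
# rejecting arbitrary pickled objects embedded in the file.
state = torch.load("untrusted_weights.pt", map_location="cpu", weights_only=True)
model.load_state_dict(state)
model.eval()
```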