The recent discovery of critical vulnerabilities in Ollama, a popular open-source framework for running large language models locally, has drawn wide attention across the security community. The disclosure, code-named Bleeding Llama, highlights the potential risks associated with this widely used platform.
The Bleeding Llama: A Critical Memory Leak
At its core, the Bleeding Llama vulnerability is an out-of-bounds read flaw that, if exploited, could allow remote attackers to leak sensitive data from Ollama's process memory. With over 300,000 servers potentially impacted, this vulnerability is a significant concern.
The problem arises from Ollama's use of Go's unsafe package when creating models from GGUF files. The affected function, WriteTo(), relies on unsafe operations that bypass Go's memory safety guarantees, so a tensor whose declared size exceeds the data actually present can cause the process to read past the end of its buffer and leak adjacent memory.
In a hypothetical attack scenario, an attacker could send a specially crafted GGUF file with an inflated tensor shape to an exposed Ollama server. This would trigger an out-of-bounds heap read during model creation, potentially exposing sensitive information such as environment variables, API keys, and user data.
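The kind of check that defeats an inflated tensor shape can be sketched in a few lines. This is a minimal illustration, not Ollama's actual fix: the function name tensor_fits, the ELEMENT_SIZE table, and the parameter names are all assumptions made for the example. The idea is simply to compare the number of bytes a tensor claims to need against the bytes that were really supplied before any read happens.

```python
import math

# Hypothetical element sizes in bytes for a few tensor types (illustrative only,
# not the full GGUF type table).
ELEMENT_SIZE = {"F32": 4, "F16": 2, "I8": 1}


def tensor_fits(shape, dtype, offset, data_len):
    """Return True only if the declared tensor lies fully inside the data buffer."""
    if dtype not in ELEMENT_SIZE or offset < 0:
        return False
    # A zero or negative dimension is treated as malformed in this sketch.
    if any(dim <= 0 for dim in shape):
        return False
    # Python integers do not overflow, so the product is exact even for huge shapes.
    needed = math.prod(shape) * ELEMENT_SIZE[dtype]
    return offset + needed <= data_len


data = bytes(64)  # stand-in for the tensor data section of an uploaded file
print(tensor_fits([4, 4], "F32", 0, len(data)))        # True: 64 bytes needed, 64 present
print(tensor_fits([1024, 1024], "F32", 0, len(data)))  # False: shape is inflated
```

In a language with fixed-width integers, the multiplication itself would also need an overflow check, since an attacker can pick dimensions whose product wraps around to a small number.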
What makes this particularly fascinating is the potential impact on organizations. As Dor Attias, a security researcher at Cyera, puts it, "An attacker can learn basically anything about the organization from your AI inference." This includes proprietary code, customer contracts, and more.
Persistent Code Execution: A Double Whammy
Adding fuel to the fire, researchers at Striga have detailed two additional vulnerabilities in Ollama's Windows update mechanism that can be chained for persistent code execution. These flaws, which remain unpatched, allow attackers to execute arbitrary code at every login.
The first vulnerability, CVE-2026-42248, is a missing signature verification issue. Unlike its macOS counterpart, the Windows version of Ollama does not verify update binaries, leaving it open to potential exploitation.
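To make the missing step concrete, here is a simplified stand-in for update verification. It is not a signature check: real code signing verifies a publisher's signature against a trusted certificate, whereas this sketch only compares the download against a SHA-256 digest that is assumed to have been obtained through a separate trusted channel. The function name digest_matches and its parameters are hypothetical.

```python
import hashlib
import hmac


def digest_matches(installer_bytes, expected_sha256_hex):
    """Compare a downloaded installer against a pinned SHA-256 digest.

    Assumption: expected_sha256_hex comes from a trusted source, not from
    the same server response that delivered the installer.
    """
    actual = hashlib.sha256(installer_bytes).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256_hex.lower())


payload = b"pretend this is an installer"
pinned = hashlib.sha256(payload).hexdigest()
print(digest_matches(payload, pinned))                # True
print(digest_matches(payload + b" tampered", pinned)) # False
```

The point of the sketch is the ordering: the binary is checked before it is ever executed, and a mismatch means the update is discarded.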
The second vulnerability, CVE-2026-42249, is a path traversal issue. The Windows updater creates the local path for the installer's staging directory directly from HTTP response headers, without sanitization, allowing attackers to write arbitrary executables into the Windows Startup folder.
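A defensive version of that path construction might look like the sketch below. It is an illustration under assumptions, not the Ollama updater's code: safe_staging_path, staging_dir, and header_value are hypothetical names, and the source does not say which header is involved. The sketch accepts only a bare file name and rejects anything containing directory components, a drive prefix, or parent-directory references.

```python
from pathlib import PureWindowsPath


def safe_staging_path(staging_dir, header_value):
    """Return a path inside staging_dir, or raise ValueError.

    header_value is untrusted text taken from an HTTP response header.
    """
    name = PureWindowsPath(header_value).name
    # If stripping directories and drives changes the value, it was not a bare
    # file name. "." and ".." are rejected explicitly as well.
    if not name or name != header_value or name in (".", ".."):
        raise ValueError(f"unsafe file name from server: {header_value!r}")
    return PureWindowsPath(staging_dir) / name


print(safe_staging_path(r"C:\Staging", "OllamaSetup.exe"))
try:
    safe_staging_path(r"C:\Staging", r"..\..\Startup\evil.exe")
except ValueError as err:
    print("rejected:", err)
```

PureWindowsPath is used so that the example handles backslashes and drive letters the same way on any operating system. A production check would also need to consider reserved device names and alternate data streams, which this sketch does not cover.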
Together, these vulnerabilities create a powerful attack chain. By controlling an update server and exploiting these flaws, attackers can achieve persistent, silent code execution at the user's privilege level. This opens the door to a range of malicious activities, from reverse shells to info-stealers and droppers.
Implications and Mitigation
The implications of these vulnerabilities are far-reaching. Given Ollama's popularity and the number of servers reachable from the internet, a single exploit could affect a large number of deployments. Organizations should act now to reduce their exposure.
Users are advised to apply the latest fixes, limit network access, and audit running instances for internet exposure. It's crucial to isolate and secure Ollama instances behind firewalls and consider deploying authentication proxies or API gateways.
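One simple way to start an exposure audit is to test whether the Ollama port answers from a machine outside the trusted network. The sketch below assumes the instance listens on Ollama's default port, 11434; the function name is_port_reachable and the example address are hypothetical. It only checks that a TCP connection can be opened, not that the service behind it is Ollama.

```python
import socket

OLLAMA_DEFAULT_PORT = 11434  # Ollama's default listening port


def is_port_reachable(host, port=OLLAMA_DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace with the public address of the server being audited.
    host = "127.0.0.1"
    state = "reachable" if is_port_reachable(host) else "not reachable"
    print(f"{host}:{OLLAMA_DEFAULT_PORT} is {state}")
```

The result depends on where the script runs. A check from the same host or subnet says little about internet exposure, so it is most useful when run from an external vantage point against the server's public address.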
Because the Windows updater flaws remain unpatched, Windows users should also turn off automatic updates and remove Ollama shortcuts from the Startup folder, which closes off the silent on-login execution path.
A Wake-Up Call for AI Security
Bleeding Llama and the persistent code execution flaws are a stark reminder of how much security matters in the rapidly evolving world of AI. As we embrace the power of large language models, we must also prioritize their secure implementation and deployment.
From my perspective, this incident highlights the need for a comprehensive security strategy in the AI space. It's not enough to focus solely on the benefits of these technologies; we must also address their potential risks and vulnerabilities.
As we continue to explore the capabilities of AI, let's ensure that security remains a top priority. After all, a secure foundation is essential for building a robust and trustworthy AI ecosystem.