BeyondTrust’s data engineering team uncovered a vulnerability within Databricks audit logs that can create new pathways for compromise. This blog explores the details of this vulnerability and provides actionable steps for mitigation.
Author:
Charles Du
Director of Engineering
The Hidden Backdoor in Your Databricks Audit Logs
Databricks Audit Logs Vulnerability Explained
Cloud practitioners and security professionals are familiar with Databricks’ fantastic auditing capabilities, which offer deep visibility into user activity. These audit trails are especially useful for analysis and can be instrumental in understanding your Databricks environment’s usage patterns, as well as for discovering any unwanted activities. However, BeyondTrust’s data engineering team uncovered a vulnerability within these logs that could allow an attacker to turn this defensive asset into an offensive capability.
This blog will explore the details of this vulnerability, demonstrate how it can be exploited, and provide actionable steps for mitigation.
The Critical Databricks Flaw: Unredacted Session IDs
At the center of this issue is the “system.access.audit” table, a Unity Catalog system table that provides an in-depth record of user activities and events. It shows how your Databricks environment is being used and helps security professionals spot unusual or potentially malicious actions. Although the table is meant to be a comprehensive record of events, under certain circumstances it can also log sensitive information in some fields, such as the “session_id” field.
Figure 1 - List of session ids within the “system.access.audit” table
The “session_id” field typically holds the identifier of the Databricks session from which the audited request originated, allowing security teams to track activity across a user session. While Databricks redacts sensitive information in many areas, we found that session IDs prefixed with “auth” are not consistently redacted. These aren’t just internal tracking identifiers; they are users' raw browser session tokens.
Exposing a session token is a serious security risk. If an attacker gets their hands on this token, they can skip the login process entirely, impersonate the user, and gain access to everything the user is authorized to see or do. This is why securing cloud access governance is critical.
Exploitation: From Audit Log to Session Hijacking
Exposed Databricks session tokens can leave a user vulnerable to session hijacking, a tactic that gives the attacker full control of the user’s session until the token is revoked or expires.
Step 1: Discovering Exposed Tokens
An attacker with read access to the “system.access.audit” table can exploit this flaw methodically.
Figure 3 - Exposed session token
To find exposed tokens, an attacker only needs a SQL query that selects every “session_id” value matching the pattern “auth-%”.
Step 2: Reformatting the Token
To use the token, the attacker must first reformat it. A Python command is enough to convert the session_id value into the required format.
Step 3: Inject the Cookie to Hijack the Session
Once the token has been reformatted, an attacker can use it to log in as another user. To hijack the user’s session, the attacker sets the JSESSIONID cookie to the formatted token value, which authenticates them to the Databricks Workspace.
Databricks' rich web platform serves as an excellent developer environment, but here that strength becomes a liability. Because nearly all Databricks operations, including most administrative tasks, can be performed through the web browser, a compromised session cookie gives an attacker complete access until the cookie is revoked or the session expires.
Top 3 Mitigation and Prevention Strategies
BeyondTrust immediately reported the vulnerability to Databricks, which promptly patched the oversight. Session tokens are now correctly redacted as “REDACTED_JSESSIONID”. We still recommend validating historical records in the system audit tables and in any third-party systems that consume these audit tables.
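One way to approach that validation is sketched below in Python. It is an illustration under assumptions, not a Databricks-provided procedure: the helper names `is_exposed_session_id` and `historical_exposure_query` are hypothetical, the “auth-” pattern comes from the description earlier in this post, the `event_date` column is assumed to be present in the audit table, and running the query assumes a Databricks notebook where a `spark` session is available.

```python
EXPOSED_PREFIX = "auth-"  # pattern of unredacted session IDs named in this post


def is_exposed_session_id(session_id):
    """Return True if a session_id value looks like an unredacted token.

    Intended for scanning copies of the audit log held in third-party systems.
    """
    if not session_id:
        return False
    return session_id.startswith(EXPOSED_PREFIX)


def historical_exposure_query(table="system.access.audit"):
    """Build a SQL query that counts, per day, the historical rows whose
    session_id still matches the 'auth-%' pattern.

    Assumption: the table exposes an `event_date` column.
    """
    return (
        "SELECT event_date, COUNT(*) AS exposed_rows "
        f"FROM {table} "
        "WHERE session_id LIKE 'auth-%' "
        "GROUP BY event_date "
        "ORDER BY event_date"
    )


# In a Databricks notebook (assumption: `spark` is predefined there):
# spark.sql(historical_exposure_query()).show()
```

An empty result suggests no matching rows remain in that table; downstream copies of the logs would need the same check, for which the `is_exposed_session_id` helper can be applied row by row.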
1. Restrict Access with the Principle of Least Privilege
Fortunately, Databricks, by default, only allows account admins to access this audit trail. Nonetheless, restrict access strictly to essential personnel and automated processes by implementing modern privileged access management. If your organization leverages this data for additional analytics, investigate who has access to this table and restrict access as necessary.
Use the following command to review current permissions:
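A minimal sketch of such a review, assuming Databricks SQL's `SHOW GRANTS` statement and a notebook-provided `spark` session. The helper name `grant_review_statements` is hypothetical, and whether each statement succeeds depends on the privileges of the user running it.

```python
def grant_review_statements():
    """Return SHOW GRANTS statements for the audit table and its parents.

    In Unity Catalog, privileges granted on a catalog or schema are inherited
    by the objects below them, so all three levels are worth reviewing.
    """
    return [
        "SHOW GRANTS ON TABLE system.access.audit",
        "SHOW GRANTS ON SCHEMA system.access",
        "SHOW GRANTS ON CATALOG system",
    ]


# In a Databricks notebook (assumption: `spark` is predefined there):
# for statement in grant_review_statements():
#     spark.sql(statement).show(truncate=False)
```

Any principal in the output that is not an essential person or automated process is a candidate for having its access revoked.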
2. Create Secure Views for Broader Access
If other teams or services require audit log data, avoid granting direct table access. Instead, provide access through a secure view that hashes sensitive fields like session_id to prevent token leakage. In our implementation, we maintain a dedicated location where analysts can access these protected views.
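The shape of such a view can be sketched as follows. This is an assumption-laden example rather than BeyondTrust's actual implementation: the view name `analytics.secure_views.audit_hashed` and both helper names are hypothetical, and the SQL relies on the Databricks `sha2` function and the `SELECT * EXCEPT (...)` syntax.

```python
import hashlib


def hashed_audit_view_sql(view_name):
    """Build a CREATE VIEW statement that replaces session_id with its
    SHA-256 hash and passes every other audit column through unchanged."""
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS "
        "SELECT * EXCEPT (session_id), "
        "sha2(session_id, 256) AS session_id_hash "
        "FROM system.access.audit"
    )


def sha256_hex(value):
    """Pure-Python counterpart of SQL sha2(value, 256), handy for spot checks."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical view location; replace with your own catalog and schema.
view_sql = hashed_audit_view_sql("analytics.secure_views.audit_hashed")

# In a Databricks notebook (assumption: `spark` is predefined there):
# spark.sql(view_sql)
```

Hashing rather than dropping the column keeps the ability to group activity by session, while the raw token value never reaches consumers of the view.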
3. Implement Proactive Monitoring and Detection
Establish comprehensive proactive identity security monitoring to identify potential security threats in your Databricks Workspace environment. Focus on detecting anomalous behaviors such as:
Unusual login patterns
Unexpected data access or administrative actions
Session activities from unfamiliar locations
Bulk operations or rapid API calls
Configure automated alerts for these indicators and maintain detailed audit logs to support incident investigation.
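As one illustration of the indicators above, the Python sketch below flags sessions that were used from more than one source IP address, which can point to a stolen session token being replayed elsewhere. It is a simplified example, not a complete detection rule: the function name is hypothetical, and the `session_id` and `source_ip_address` keys are assumed to mirror the audit log columns of the same names.

```python
from collections import defaultdict


def sessions_seen_from_multiple_ips(events):
    """Map each session_id to its sorted source IPs when more than one IP used it.

    `events` is an iterable of dicts with 'session_id' and
    'source_ip_address' keys (assumed to mirror audit log columns).
    Events missing either value are ignored.
    """
    ips_by_session = defaultdict(set)
    for event in events:
        session_id = event.get("session_id")
        ip = event.get("source_ip_address")
        if session_id and ip:
            ips_by_session[session_id].add(ip)
    return {
        session_id: sorted(ips)
        for session_id, ips in ips_by_session.items()
        if len(ips) > 1
    }


# Illustrative sample data (not real audit records).
sample_events = [
    {"session_id": "s1", "source_ip_address": "10.0.0.5"},
    {"session_id": "s1", "source_ip_address": "203.0.113.9"},
    {"session_id": "s2", "source_ip_address": "10.0.0.7"},
    {"session_id": "s2", "source_ip_address": "10.0.0.7"},
]
flagged = sessions_seen_from_multiple_ips(sample_events)
```

Legitimate users can change IP addresses too, for example when moving between networks, so a result like this is better treated as a signal to investigate than as proof of compromise.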
From Audit Trail to Attack Path
While Databricks has patched this vulnerability, the incident demonstrates how security tools can become attack vectors when sensitive data isn't properly handled. This case highlights the broader challenge organizations face: comprehensive audit logging is essential for security, yet these same logs can expose critical information if not carefully managed. The unredacted session tokens reveal how seemingly protective features can inadvertently create new pathways for compromise, emphasizing the need for continuous security review of even the most trusted defensive capabilities.
FAQs
Audit logs provide visibility into user and system activity, helping organizations detect unusual behavior, investigate incidents, and meet compliance requirements. However, if misconfigured or improperly managed, logs can also expose sensitive data that attackers may exploit.
Session tokens are temporary credentials that keep users authenticated. In Databricks, if these tokens are exposed in Unity Catalog audit logs, attackers can reuse them to impersonate legitimate users and gain unauthorized access.
Session hijacking occurs when an attacker steals a valid Databricks session token, such as a JSESSIONID, and uses it to bypass authentication. This attack gives the attacker full control of the user’s session until the token is revoked or expires.
Signs of session hijacking in Databricks include unexpected logins from unfamiliar IPs or locations, unexplained data access, bulk queries or rapid API calls, and account activity that doesn’t match the user’s normal behavior. Monitoring audit logs for anomalies and setting alerts on unusual session patterns can help detect hijacking attempts quickly.
A critical Databricks weakness was the exposure of unredacted session IDs in the Unity Catalog audit logs. These values contained raw JSESSIONID tokens that attackers could extract, reformat, and inject to hijack user sessions. With a valid token, an attacker could impersonate the user, bypass login, and gain full access to Databricks workspaces and administrative functions until the session expired or was revoked.
Firewalls block unauthorized network traffic but do not prevent session hijacking if an attacker already has a valid session token. Protecting Databricks environments requires securing audit logs, redacting sensitive session IDs, enforcing strong IAM policies, and monitoring for abnormal user behavior.
The principle of least privilege ensures users and systems only have the minimum access required to perform their tasks. Restricting access to sensitive logs and creating secure views for broader use can significantly reduce the risk of data exposure.
Organizations should regularly review permissions to audit logs, implement secure views to redact or hash sensitive fields, and configure proactive monitoring for suspicious activity. Continuous security reviews of logging and monitoring tools are essential.
The Databricks audit log vulnerability highlights that even trusted security tools can inadvertently introduce identity risks. Strong IAM policies, regular audits, and continuous monitoring are critical to ensure cloud environments remain secure.
About the Author
Charles Du
Director of Engineering
Charles Du is a Director of Data at BeyondTrust. While his primary duties are focused around management and development of our multi-petabyte DataLake, he enjoys diving into different systems to understand how things tick behind the scenes. Once in a while, he discovers cool stuff whilst cosplaying as a security researcher!