Security researchers have uncovered a severe vulnerability in Apache Parquet, a widely adopted columnar storage format, affecting all versions up to and including 1.15.0. The vulnerability, tracked as CVE-2025-30065, has received the highest possible CVSS score of 10.0, indicating an urgent security risk that requires immediate attention from organizations using this data format.
Understanding the Technical Impact
The critical flaw resides in the parquet-avro module’s data deserialization process, enabling remote code execution (RCE) through specially crafted Parquet files. This vulnerability is particularly concerning given Apache Parquet’s extensive adoption in modern big data architectures and cloud-native applications. When exploited, the vulnerability allows attackers to execute arbitrary code on affected systems, potentially compromising entire data processing pipelines.
Systems Processing Untrusted Parquet Files from External Sources
Any organization using Apache Parquet Java versions up to and including 1.15.0 is at risk. The exposure spans deployments on major cloud platforms and in enterprise environments, including AWS, Google Cloud, and Microsoft Azure. Potentially affected deployments include data lakes, ETL pipelines, and analytics platforms built on Parquet, such as those run by Netflix, Uber, Airbnb, and LinkedIn. Any system that ingests externally supplied Parquet files is particularly exposed.
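One way to get a first impression of exposure is to look for parquet-avro jar files on disk and compare their versions against 1.15.0. The sketch below is illustrative only: the function names are hypothetical, and it assumes jars follow the usual `parquet-avro-<version>.jar` naming. It will not find copies bundled inside shaded or "uber" jars, and it treats a qualifier such as `-SNAPSHOT` as the plain release number, so a dependency report from your build tool remains the more reliable source.

```python
import re
from pathlib import Path

# Last affected release according to the advisory text above.
LAST_VULNERABLE = (1, 15, 0)

# Assumed naming convention: parquet-avro-<major>.<minor>.<patch>[-qualifier].jar
JAR_PATTERN = re.compile(r"^parquet-avro-(\d+)\.(\d+)\.(\d+)(?:[-.].*)?\.jar$")


def parse_version(jar_name):
    """Return (major, minor, patch) from a jar file name, or None if it does not match."""
    match = JAR_PATTERN.match(jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version):
    """True when the version is at or below the last affected release."""
    return version <= LAST_VULNERABLE


def find_vulnerable_jars(root):
    """Walk `root` recursively and return parquet-avro jars with an affected version."""
    hits = []
    for path in Path(root).rglob("parquet-avro-*.jar"):
        version = parse_version(path.name)
        if version is not None and is_vulnerable(version):
            hits.append(path)
    return sorted(hits)
```

Calling `find_vulnerable_jars("/opt/app")` (the path is a placeholder) returns the jars that still need upgrading; an empty list only means no affected jar was found under that naming convention, not that the system is safe.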
Attack Vectors and Security Implications
Successful exploitation requires an attacker to introduce a maliciously crafted Parquet file into the target system’s data processing workflow. Once a vulnerable reader processes that file, the attacker can:
- Execute arbitrary commands with the privileges of the process that reads the file
- Exfiltrate sensitive data from compromised systems
- Disrupt critical data processing operations
- Deploy additional malicious payloads, including ransomware
Immediate Actions Required
- Upgrade immediately to Apache Parquet version 1.15.1 or later; this is the only complete fix
- Audit all systems and applications that process Parquet files to identify exposure to the vulnerable parquet-avro module
- Implement strict input validation for all Parquet files ingested from external or untrusted sources
- Add file integrity checks (hash verification) before processing any Parquet files entering your pipeline
- Restrict which systems can accept Parquet files from external sources until patching is complete
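The validation and integrity steps above can be sketched as a small pre-processing gate. This is a minimal example under stated assumptions, not a complete defense: the function names and the manifest (a mapping of file name to expected SHA-256 digest, obtained from the trusted producer over a separate channel) are hypothetical. The magic-byte check relies on the Parquet format's `PAR1` marker at the start and end of a file, so it rejects files written with an encrypted footer, and it is only a structural sanity check. Neither check detects a malicious payload inside an otherwise well-formed file.

```python
import hashlib
import hmac
import os
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file


def has_parquet_magic(path):
    """Cheap structural check: the file starts and ends with the PAR1 marker."""
    # Leading magic + 4-byte footer length + trailing magic is the bare minimum.
    if os.path.getsize(path) < 12:
        return False
    with open(path, "rb") as handle:
        head = handle.read(4)
        handle.seek(-4, os.SEEK_END)
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash the file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_file(path, manifest):
    """True only if the file name is in the manifest and its digest matches."""
    expected = manifest.get(Path(path).name)
    if expected is None:
        return False  # unknown files are rejected by default
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

A pipeline could call both functions before handing a file to any Parquet reader and reject the file if either returns `False`. Rejecting files that are missing from the manifest keeps the gate fail-closed.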
The discovery of CVE-2025-30065 highlights the critical importance of maintaining robust security practices in big data environments. Organizations that cannot patch immediately should treat all externally sourced Parquet files as untrusted and quarantine them for inspection before processing.
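The quarantine step can be sketched as a small routine that moves every file not explicitly trusted out of the ingestion directory before any reader touches it. The directory layout, the `.parquet` extension filter, and the `is_trusted` callback are assumptions made for illustration; in practice the callback might wrap a source allowlist or a hash check.

```python
import shutil
from pathlib import Path


def quarantine_untrusted(incoming_dir, quarantine_dir, is_trusted):
    """Move files that fail `is_trusted` into the quarantine directory.

    Returns the new paths of the quarantined files.
    """
    quarantine = Path(quarantine_dir)
    quarantine.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing first, so moving files is safe.
    for path in sorted(Path(incoming_dir).glob("*.parquet")):
        if not is_trusted(path):
            # Note: an existing file with the same name may be overwritten
            # or cause an error, depending on the platform.
            target = quarantine / path.name
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

The routine never opens the files it moves, which matters here because the flaw is triggered when a vulnerable reader parses the file.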