
International Journal on Science and Technology
E-ISSN: 2229-7677

Parquet’s Columnar Storage Advantage: A Case Study in Big Data Analytics
Author(s) | Pradeep Bhosale |
---|---|
Country | United States |
Abstract | As enterprises increasingly rely on large-scale analytics to extract insights from data lakes and data warehouses, the choice of storage format has a profound impact on query performance, cost, and resource utilization. Apache Parquet, a popular columnar storage format, has gained widespread adoption in the big data ecosystem due to its efficient compression, encoding, and predicate pushdown capabilities. By storing data column-wise, Parquet reduces I/O, network transfer, and CPU overhead when analyzing selective subsets of large datasets. This paper provides a comprehensive examination of Parquet’s columnar architecture, comparing it to row-based formats and highlighting its benefits in terms of query acceleration, storage optimization, and seamless integration with modern analytical engines. Through architectural explanations, benchmarking results, code snippets, and real-world case studies, we illustrate how Parquet’s design principles translate into tangible performance gains in analytical workloads. We also present emerging best practices, discuss integration with query engines like Spark, Trino, and Presto, and consider future directions in columnar format evolution. By understanding Parquet’s advantages and applying its features judiciously, data engineers and architects can unlock faster, cheaper, and more flexible big data analytics. |
Keywords | Apache Parquet, Columnar Storage, Big Data Analytics, Data Lakes, Predicate Pushdown, Data Compression, Spark, Trino, Presto |
Field | Engineering |
Published In | Volume 15, Issue 2, April-June 2024 |
Published On | 2024-04-10 |
Cite This | Parquet’s Columnar Storage Advantage: A Case Study in Big Data Analytics - Pradeep Bhosale - IJSAT Volume 15, Issue 2, April-June 2024. DOI 10.5281/zenodo.14631461 |
DOI | https://doi.org/10.5281/zenodo.14631461 |
Short DOI | https://doi.org/g8zdqv |
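
The abstract's central claim, that column-wise storage combined with predicate pushdown reduces the amount of data a selective query must read, can be illustrated with a small sketch. The code below is a toy in-memory model and an assumption of this illustration, not Parquet's actual file format or API: the names `ColumnChunk`, `build_row_groups`, and `scan` are hypothetical, and the sample data is invented. It stores each column contiguously inside fixed-size row groups, keeps min/max statistics per column chunk, and counts how many values a query touches.

```python
from dataclasses import dataclass


@dataclass
class ColumnChunk:
    """One column's values within a row group, plus min/max statistics."""
    values: list
    min_value: object
    max_value: object


def build_row_groups(rows, columns, group_size):
    """Split rows into row groups and store each column contiguously."""
    groups = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        group = {}
        for idx, name in enumerate(columns):
            vals = [row[idx] for row in chunk]
            group[name] = ColumnChunk(vals, min(vals), max(vals))
        groups.append(group)
    return groups


def scan(groups, select, filter_col, lower_bound):
    """Return the selected columns for rows where filter_col >= lower_bound.

    Also returns the number of values actually read, so the effect of
    column pruning and row-group skipping is visible.
    """
    values_read = 0
    out = []
    # Only the filter column and the projected columns are ever touched.
    needed = list(dict.fromkeys([filter_col, *select]))
    for group in groups:
        stats = group[filter_col]
        if stats.max_value < lower_bound:
            continue  # whole row group skipped using statistics alone
        values_read += sum(len(group[name].values) for name in needed)
        for i, value in enumerate(stats.values):
            if value >= lower_bound:
                out.append(tuple(group[name].values[i] for name in select))
    return out, values_read


# Hypothetical sample data: 100 rows, 4 columns, 25 rows per row group.
columns = ["id", "user", "amount", "payload"]
rows = [(i, f"user{i % 5}", i * 1.5, "x" * 20) for i in range(100)]
groups = build_row_groups(rows, columns, group_size=25)

# Equivalent of: SELECT amount FROM t WHERE id >= 90
result, values_read = scan(groups, ["amount"], "id", 90)

# A row-oriented scan would touch every value of every row.
row_store_values = len(rows) * len(columns)
print(f"columnar scan read {values_read} of {row_store_values} values")
```

In this model the query touches only the `id` and `amount` columns of the last row group, so it reads 50 values where a full row-oriented scan would read 400. Real Parquet readers apply the same two ideas (projection of needed columns, and skipping row groups via stored statistics) at the file level, with encoding and compression layered on top.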
CrossRef DOI is assigned to each research paper published in our journal. IJSAT DOI prefix is 10.71097/IJSAT.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
