drill-dev mailing list archives

From rahul challapalli <challapallira...@gmail.com>
Subject Compressing the metadata cache file
Date Mon, 28 Sep 2015 20:54:26 GMT
I have 10k complex parquet files with large footers. The schema for all
these files is the same. Drill generated a metadata cache file that is
2.26 GB. Now a simple count(*) query run from sqlline hangs and never
returns.

In this specific case, I compared the footers of 2 files and found that
many parts are identical. Would it make sense to store the common
information once and record only the file-specific details as overrides?
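
To make the idea concrete, here is a minimal sketch of "store the common part
once, override per file". It is not Drill's actual cache format or code: the
footer dictionaries, their keys (`schema`, `created_by`, `row_count`), the file
names, and the helpers `split_common` and `rebuild` are all hypothetical, and
only top-level keys are compared.

```python
def split_common(footers):
    """Split per-file footers into (common, overrides).

    `footers` maps a file path to a flat dict of footer fields
    (a hypothetical stand-in for real Parquet footer metadata).
    `common` holds every key/value pair shared by all files;
    `overrides` holds, per file, only the fields not in `common`.
    """
    if not footers:
        return {}, {}
    paths = list(footers)
    first = footers[paths[0]]
    common = {
        key: value
        for key, value in first.items()
        if all(key in footers[p] and footers[p][key] == value for p in paths)
    }
    overrides = {
        p: {k: v for k, v in footers[p].items() if k not in common}
        for p in paths
    }
    return common, overrides


def rebuild(common, overrides, path):
    """Reconstruct the full footer for one file from the shared part and its overrides."""
    return {**common, **overrides[path]}


# Hypothetical example data: two files with the same schema, different row counts.
shared_schema = ["id:int64", "name:binary"]
footers = {
    "a.parquet": {"schema": shared_schema, "created_by": "writer-x", "row_count": 100},
    "b.parquet": {"schema": shared_schema, "created_by": "writer-x", "row_count": 250},
}
common, overrides = split_common(footers)
```

With this layout the schema is written once rather than once per file, and
each file's entry shrinks to the fields that actually differ. How much this
would save on the 2.26 GB cache depends on how much of each footer is really
shared, which this sketch does not measure.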

- Rahul
