hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: Parquet vs HFile
Date Tue, 03 Mar 2020 21:58:40 GMT
On Mon, Mar 2, 2020 at 10:27 AM Burd, Roni <roniburd@amazon.com.invalid> wrote:

> Has anyone looked at leveraging Parquet files to replace HFiles? I
> recognize that HFiles may be more advanced for the hbase case, but my
> assumption is that Parquet can be evolved as well.
> This would also help hfiles align better with a more widely adopted
> industry standard.
> Thoughts?

I'd think the mismatch between the formats would make this expensive for
little benefit other than 'industry standard', unless work was done to teach
hbase about columns at least as far up as the hbase 'block', as in the
'Ressi data layout' described in [1].

 1. https://dl.acm.org/doi/pdf/10.1145/3035918.3056103
