hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HFile V2 vs HFile V3
Date Fri, 13 Jun 2014 16:19:22 GMT
If you read down through that JIRA, you'll have the answer to that
question: The results were inconclusive and the changes in that patch broke
thread safety.

I also suggest returning to the bottom of the Cassandra wiki page you
mentioned and following the link to the JIRA. Cassandra appears not to have
actually tested a Dremel-style storage format, but rather to have modified
their existing file format, inspired in limited ways by concepts from the
Dremel paper.

In the Apache ecosystem, we have Parquet, which I would say is a faithful
implementation of the ideas in the Dremel paper; see http://parquet.io/

I encourage you to look into the details of HFile and Parquet, and to learn
more about the inner workings of HBase, to understand why using a
Dremel-style columnar storage format with HBase might not be an easy
undertaking. Abstractly speaking, it would be interesting to consider; it
could be nice to provide support for bulk ingest of Parquet files, perhaps
for immutable data. The next question is who would volunteer to do that.

On Fri, Jun 13, 2014 at 9:15 AM, abhishek1015 <abhishek1015@gmail.com> wrote:

> Thanks, Ted, for providing the link to HBase-5313. Apparently no one seems
> to be working on this, which is strange.
> Abhishek
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HFile-V2-vs-HFile-V3-tp4060405p4060418.html
> Sent from the HBase User mailing list archive at Nabble.com.

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
