accumulo-user mailing list archives

From Josh Elser <els...@apache.org>
Subject Re: Accumulo as a Column Storage
Date Thu, 19 Oct 2017 22:06:01 GMT
Yup, that's the intended use case. You have the flexibility to determine 
which column families make sense to group together. Your only "cost" in 
changing your mind is the time it takes to re-compact your data, since a 
new grouping only applies to files written after the change.
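
As a rough illustration of the one-group-per-family layout being discussed, here is a minimal sketch in plain Java that builds such a mapping and renders it as Accumulo shell commands. The table name "people", the family names, and the "lg_" group-name prefix are made up for the example. The same kind of mapping could presumably be passed to TableOperations.setLocalityGroups() from the Java client instead, with the families wrapped in Text objects.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

class LocalityGroupPlan {

    // Put every column family in its own locality group.
    // The "lg_" prefix is an arbitrary naming choice for this sketch.
    static Map<String, Set<String>> onePerFamily(List<String> families) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        for (String family : families) {
            groups.put("lg_" + family, Collections.singleton(family));
        }
        return groups;
    }

    // Render the mapping as a "setgroups" command for the Accumulo shell.
    static String toSetGroupsCommand(String table, Map<String, Set<String>> groups) {
        StringJoiner cmd = new StringJoiner(" ");
        cmd.add("setgroups");
        for (Map.Entry<String, Set<String>> e : groups.entrySet()) {
            cmd.add(e.getKey() + "=" + String.join(",", e.getValue()));
        }
        cmd.add("-t");
        cmd.add(table);
        return cmd.toString();
    }

    public static void main(String[] args) {
        // Hypothetical table and families, for illustration only.
        Map<String, Set<String>> groups =
            onePerFamily(Arrays.asList("name", "age", "city"));
        System.out.println(toSetGroupsCommand("people", groups));
        // Existing files keep their old layout until they are rewritten,
        // so follow up with a compaction (-w waits for it to finish).
        System.out.println("compact -t people -w");
    }
}
```

Regrouping later means generating a different mapping and compacting again, which is the "cost" mentioned above.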

There is one concern which comes to mind. Though making many locality 
groups does increase the speed at which you can read from specific 
columns, it decreases the speed at which you can read from _all_ 
columns. So, you can use this trick to make Accumulo act more like a 
columnar database, but beware that you will pay a read-performance 
penalty if you still have a use-case where you read more than just one 
or two columns at a time.
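
To make the trade-off concrete, here is a toy model in plain Java, not Accumulo code. It assumes each locality group is stored as its own sorted run of entries, so reading a whole row back in order means merging one stream per group. The class name and the "row family" string encoding of keys are invented for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class LocalityGroupMergeSketch {

    // Merge k sorted runs (one per locality group) into one sorted stream.
    // Each heap element is {groupIndex, positionInGroup}.
    static List<String> mergeAll(List<List<String>> groups) {
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            (a, b) -> groups.get(a[0]).get(a[1])
                            .compareTo(groups.get(b[0]).get(b[1])));
        for (int g = 0; g < groups.size(); g++) {
            if (!groups.get(g).isEmpty()) {
                heap.add(new int[] {g, 0});
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();
            out.add(groups.get(top[0]).get(top[1]));
            int next = top[1] + 1;
            if (next < groups.get(top[0]).size()) {
                heap.add(new int[] {top[0], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two groups, each holding one family for rows r1 and r2.
        List<List<String>> groups = Arrays.asList(
            Arrays.asList("r1 age", "r2 age"),
            Arrays.asList("r1 name", "r2 name"));
        System.out.println(mergeAll(groups));
    }
}
```

A scan that touches a single family reads one run and skips the heap entirely, while a scan over every family pays for one stream per group, so the overhead grows with the number of groups.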

Does that make sense?

On 10/19/17 5:50 PM, Mohammad Kargar wrote:
> AFAIK in Accumulo we can use "locality groups" to group sets of columns 
> together on disk, which would make it more like a column-oriented 
> database. Considering that "locality groups" are defined per column 
> family, I was wondering: what if we treat column families like column 
> qualifiers (creating one column family per qualifier) and assign each 
> to a different locality group? This way all the data in a given column 
> will be stored next to each other on disk, which makes it easier for 
> analytical applications to query the data.
> 
> Any thoughts?
> 
> Thanks,
> Mohammad
> 
