accumulo-user mailing list archives

From Andrew Wells
Subject Re: Question about best practices on column names
Date Wed, 27 May 2015 13:17:48 GMT
On the surface, the column family adds an additional level of specification/grouping.

The potential benefit in Accumulo is that, in addition to all entries with
the same row ID being guaranteed to be in the same tablet, you can use
locality groups to store specific column families together on disk as
well. That gives faster scans when you are looking for a specific column
family.
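
To make the locality group point concrete, here is a minimal sketch in plain Java. It is a toy model, not the Accumulo API: the class LocalityGroupSketch, its methods, and the sample rows and families are all made up for illustration. It only shows why a scan for one column family touches fewer entries when that family is stored in its own group. In Accumulo itself the corresponding calls would be TableOperations.setLocalityGroups and Scanner.fetchColumnFamily.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only (hypothetical class, NOT the Accumulo API).
// Each "group" is a sorted map standing in for a separately stored section.
final class LocalityGroupSketch {
    private static final String DEFAULT_GROUP = "default";
    private static final String SEP = "\u0000";

    // column family -> group name; unassigned families use the default group
    private final Map<String, String> familyToGroup = new HashMap<>();
    // group name -> entries sorted by row, then family, then qualifier
    private final Map<String, TreeMap<String, String>> groups = new HashMap<>();
    // number of entries visited by scans, as a rough stand-in for I/O
    long entriesRead = 0;

    void assignGroup(String groupName, String... families) {
        for (String family : families) {
            familyToGroup.put(family, groupName);
        }
    }

    void put(String row, String family, String qualifier, String value) {
        String group = familyToGroup.getOrDefault(family, DEFAULT_GROUP);
        groups.computeIfAbsent(group, g -> new TreeMap<>())
              .put(row + SEP + family + SEP + qualifier, value);
    }

    // Returns the values of one column family, in key order.
    // Only the group holding that family is read; entries of other
    // families in the same group are visited and skipped.
    List<String> scanFamily(String family) {
        String group = familyToGroup.getOrDefault(family, DEFAULT_GROUP);
        TreeMap<String, String> section = groups.getOrDefault(group, new TreeMap<>());
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, String> entry : section.entrySet()) {
            entriesRead++;
            String[] parts = entry.getKey().split(SEP, -1);
            if (parts[1].equals(family)) {
                values.add(entry.getValue());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        LocalityGroupSketch plain = new LocalityGroupSketch();
        LocalityGroupSketch grouped = new LocalityGroupSketch();
        grouped.assignGroup("metadata", "meta"); // "meta" gets its own group

        for (LocalityGroupSketch table : new LocalityGroupSketch[] {plain, grouped}) {
            table.put("row1", "meta", "size", "10");
            table.put("row1", "content", "", "blob1"); // empty qualifier
            table.put("row2", "meta", "size", "20");
            table.put("row2", "content", "", "blob2");
        }

        plain.scanFamily("meta");
        grouped.scanFamily("meta");
        System.out.println("entries read without groups: " + plain.entriesRead);
        System.out.println("entries read with groups:    " + grouped.entriesRead);
    }
}
```

In this sketch both tables return the same values for the "meta" family, but the grouped one visits only the entries in the "metadata" group. Real scans are more involved than counting entries, so treat the numbers as an illustration of the idea only.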

On Wed, May 27, 2015 at 9:05 AM, David Patterson wrote:

> I've been trying to understand the difference between the two column name
> parts -- column family and column qualifier. I don't understand the value
> of using the column family for the column name and an "empty text" (new
> Text(new byte[0])) field for the column qualifier vs. using a non-unique
> name for the column family and the distinct column name in the column
> qualifier position.
> I can sort of understand the distinction if I have multiple distinct kinds
> of data in my data collection. I could use the column family part to
> determine how to interpret the rest of the data (what columns I can expect,
> etc.). But, that kind of data could also be handled with multiple databases.
> Any guidance would be appreciated.
> Thanks.
> David Patterson

*Andrew George Wells*
*Software Engineer*
