accumulo-user mailing list archives

From: Billie Rinaldi <billie.rina...@gmail.com>
Subject: Re: unique list of columns
Date: Wed, 09 Apr 2014 00:21:59 GMT

Does this imply that the histogram option works in 1.5.0 as long as you
spell it "historgram"?


On Tue, Apr 8, 2014 at 4:54 PM, Josh Elser <josh.elser@gmail.com> wrote:

> Arshak,
>
> Looks like that was a bug against 1.5.0 and fixed in 1.5.1.
>
> https://issues.apache.org/jira/browse/ACCUMULO-1571
>
>
> On 4/8/14, 7:24 PM, Arshak Navruzyan wrote:
>
>> I am trying to print the histogram with that command but get the
>> usage message instead. The --dump option works fine. I'm on Accumulo
>> 1.5.0.
>>
>> PACKAGE=org.apache.accumulo.core.file.rfile
>> bin/accumulo $PACKAGE.PrintInfo --histogram /accumulo/tables/53/t-0003371/A0003jbg.rf
>>
>> Usage: org.apache.accumulo.core.file.rfile.PrintInfo [options]  <file> { <file> ... }
>>
>>    Options:
>>
>>      -d, --dump
>>
>>         dump the key/value pairs
>>
>>         Default: false
>>
>>      -h, -?, --help, -help
>>
>>         Default: false
>>
>>          --historgram
>>
>>         print a histogram of the key-value sizes
>>
>>         Default: false
>>
>>
>> Unknown option: --histogram
>>
>>
>>
>> On Sat, Feb 22, 2014 at 8:47 AM, Mike Drob <madrob@cloudera.com> wrote:
>>
>>     There's not a single good way that I am aware of, but there are a
>>     couple ways that will get you close.
>>
>>     First, you can use the SortedKeyIterator to truncate values and
>>     potentially save yourself a lot of data transfer.
>>     Second, each RFile header block will track the columns contained, up
>>     to 1000 (possibly configurable). Check out PrintInfo[1].
>>
>>     Mike
>>
>>     [1]: https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/file/rfile/PrintInfo.java
>>
>>
>>     On Sat, Feb 22, 2014 at 11:25 AM, Arshak Navruzyan <arshakn@gmail.com> wrote:
>>
>>         I don't know the inner workings of RFiles well enough, but I was
>>         wondering if there is a faster way to get a unique list of
>>         columns in Accumulo (short of doing a full MapReduce). Is there
>>         some way to skip past all the values and just get to the next
>>         column?
>>
>>         Thanks
>>
>>
>>
>>
