accumulo-user mailing list archives

From David Medinets <david.medin...@gmail.com>
Subject Re: List of unique qualifiers [SEC=UNOFFICIAL]
Date Thu, 16 Jan 2014 21:54:58 GMT
Clone the table, take the clone offline, run the map-reduce job against
it, then delete the clone. All of these steps can be done through the Java
API, which is convenient if you'll be running the job more than once.
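
The ordering of those steps, and the cleanup if the job fails, can be sketched without a cluster. This is only an illustration under stated assumptions: `TableOps`, `Job`, `runOnOfflineClone`, the `_mr_clone` suffix and the `people` table name are hypothetical stand-ins, not Accumulo classes. In Accumulo itself the corresponding clone, offline and delete operations are exposed through `TableOperations`, whose exact signatures should be checked for the release in use.

```java
import java.util.ArrayList;
import java.util.List;

class OfflineCloneJob {

    /** Hypothetical stand-in for the table operations used here; not the Accumulo interface. */
    interface TableOps {
        void cloneTable(String source, String clone);
        void takeOffline(String table);
        void deleteTable(String table);
    }

    /** Hypothetical stand-in for submitting the map-reduce job against a table. */
    interface Job {
        void runAgainst(String table);
    }

    /** Clone, take the clone offline, run the job, and always delete the clone afterwards. */
    static void runOnOfflineClone(TableOps ops, Job job, String source) {
        String clone = source + "_mr_clone"; // assumed naming scheme
        ops.cloneTable(source, clone);
        try {
            ops.takeOffline(clone);
            job.runAgainst(clone);
        } finally {
            ops.deleteTable(clone); // clean up even if the job throws
        }
    }

    /** Records each step in a list so the ordering can be inspected. */
    static List<String> demo() {
        final List<String> log = new ArrayList<String>();
        TableOps ops = new TableOps() {
            public void cloneTable(String s, String c) { log.add("clone " + s + " -> " + c); }
            public void takeOffline(String t) { log.add("offline " + t); }
            public void deleteTable(String t) { log.add("delete " + t); }
        };
        Job job = new Job() {
            public void runAgainst(String t) { log.add("run " + t); }
        };
        runOnOfflineClone(ops, job, "people"); // "people" is a made-up table name
        return log;
    }

    public static void main(String[] args) {
        for (String step : demo()) {
            System.out.println(step);
        }
    }
}
```

Putting the delete in a `finally` block means a failed job does not leave the clone behind, which matters when the job is run repeatedly.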


On Wed, Jan 15, 2014 at 8:27 PM, Corey Nolet <cjnolet@gmail.com> wrote:

> Matt,
>
> This should help:
>
> // Fetch only the "cityofbirth" family; a null qualifier selects the whole family.
> Collection<Pair<Text,Text>> cols = Collections.singleton(
>     new Pair<Text,Text>(new Text("cityofbirth"), null));
> AccumuloInputFormat.fetchColumns(job, cols);
>
>
>
> On Wed, Jan 15, 2014 at 7:29 PM, Dickson, Matt MR <
> matt.dickson@defence.gov.au> wrote:
>
>>  *UNOFFICIAL*
>> Thanks Keith.  I've run a simple MR job based on the UniqueColumns
>> example, but due to the size of the table it is taking a very long time.
>> Is it possible to pre-filter the data that goes to the MR job based on
>> family, e.g. only run the MR job on columns with a specific column family of
>> 'cityofbirth'?  I am currently going through every column in the table and
>> checking the column family in the mapper, which is slow.
>>
>>
>>
>>  ------------------------------
>> *From:* Keith Turner [mailto:keith@deenlo.com]
>> *Sent:* Wednesday, 15 January 2014 12:06
>> *To:* user@accumulo.apache.org
>>
>> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>>
>>
>>
>>
>> On Tue, Jan 14, 2014 at 6:06 PM, Dickson, Matt MR <
>> matt.dickson@defence.gov.au> wrote:
>>
>>>  *UNOFFICIAL*
>>> Just for simplicity, this is a one-off request for management so I was
>>> hoping to just scan via the shell and output to a file.
>>>
>>> If I need to do it via an MR job I can do it that way and would be keen
>>> to hear any suggestions.
>>>
>>
>> You could modify the following example in 1.4 to suit your needs.
>>
>>
>> src/examples/simple/src/main/java/org/apache/accumulo/examples/simple/mapreduce/UniqueColumns.java
>>
>>
>>>
>>>  ------------------------------
>>> *From:* David Medinets [mailto:david.medinets@gmail.com]
>>> *Sent:* Wednesday, 15 January 2014 09:36
>>> *To:* accumulo-user
>>> *Subject:* Re: List of unique qualifiers [SEC=UNOFFICIAL]
>>>
>>>   Why the restriction to the shell environment? A nice map-reduce job
>>> would be ideal for this task.
>>>
>>>
>>> On Tue, Jan 14, 2014 at 5:30 PM, Dickson, Matt MR <
>>> matt.dickson@defence.gov.au> wrote:
>>>
>>>>  *UNOFFICIAL*
>>>> Hi,
>>>>
>>>> I need to extract a list of unique qualifier values on a table from the
>>>> Accumulo shell.  For every column there is a column family that identifies
>>>> a specific qualifier, e.g. 'cityofbirth'.  I would like to get a unique list
>>>> of all cities that are listed in the qualifier against 'cityofbirth' for
>>>> all rows.
>>>>
>>>> e.g., if I had a table with
>>>>
>>>> Rowid                Family            Qual
>>>> 123                   cityofbirth         LosAngeles
>>>> 133                   cityofbirth         Brisbane
>>>> 222                   cityofbirth         London
>>>> 124                   cityofbirth         London
>>>> 124                   cityofbirth         London
>>>>
>>>> I want a list that is just:
>>>> LosAngeles
>>>> London
>>>> Brisbane
>>>>
>>>> Any suggestions on how to achieve this from the shell would be great.
>>>>
>>>> Thanks in advance.
>>>> Matt
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
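
The result asked for in the original message (the distinct qualifiers stored under one column family) amounts to a filter on the family followed by de-duplication. A minimal plain-Java sketch of that logic on in-memory rows, using the sample table from the thread, might look like the following. It does not use the Accumulo API: the `Entry` class, `uniqueQualifiers` and `sampleRows` are hypothetical names introduced only for illustration, and a real table of the size discussed here would still need the map-reduce approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UniqueQualifiers {

    /** Hypothetical stand-in for one table entry: row id, column family, column qualifier. */
    static final class Entry {
        final String row;
        final String family;
        final String qualifier;

        Entry(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /** Returns the distinct qualifiers found under the given family, in first-seen order. */
    static List<String> uniqueQualifiers(List<Entry> entries, String family) {
        Set<String> seen = new LinkedHashSet<String>();
        for (Entry e : entries) {
            if (e.family.equals(family)) { // family names are compared case-sensitively
                seen.add(e.qualifier);     // a set silently drops repeats
            }
        }
        return new ArrayList<String>(seen);
    }

    /** The example rows from the original message. */
    static List<Entry> sampleRows() {
        return Arrays.asList(
            new Entry("123", "cityofbirth", "LosAngeles"),
            new Entry("133", "cityofbirth", "Brisbane"),
            new Entry("222", "cityofbirth", "London"),
            new Entry("124", "cityofbirth", "London"),
            new Entry("124", "cityofbirth", "London"));
    }

    public static void main(String[] args) {
        for (String city : uniqueQualifiers(sampleRows(), "cityofbirth")) {
            System.out.println(city);
        }
    }
}
```

A map-reduce job does the same thing at scale: the mapper emits each qualifier as a key and the reducer writes each key once.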
