accumulo-user mailing list archives

From William Slacum <wilhelm.von.cl...@accumulo.net>
Subject Re: Mappers for Accumulo
Date Tue, 12 Mar 2013 17:21:07 GMT
Depending on the size of the tablet, you can lower the split threshold
and/or set new split points on the table.
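
If changing the table's splits is not an option, the same idea can be applied
on the client: take the requested range and cut it at a set of chosen boundary
rows, which yields the List<Range> asked for in the quoted message below. What
follows is only a minimal sketch in plain Java. RangeSplitter and RowSpan are
hypothetical names, not part of Accumulo; rows are assumed to be ASCII strings
so that String ordering matches Accumulo's byte ordering; and startRow is
assumed to sort before endRow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical helper, not part of the Accumulo API.
final class RangeSplitter {

    // Half-open row interval [startRow, endRow); a stand-in for an Accumulo Range.
    static final class RowSpan {
        final String startRow; // inclusive
        final String endRow;   // exclusive

        RowSpan(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + ", " + endRow + ")";
        }
    }

    // Cuts [startRow, endRow) at every boundary that falls strictly inside it.
    // Boundaries outside the interval, or equal to startRow, are ignored.
    static List<RowSpan> split(String startRow, String endRow,
                               SortedSet<String> boundaries) {
        List<RowSpan> spans = new ArrayList<>();
        String cursor = startRow;
        // subSet is inclusive of startRow and exclusive of endRow.
        for (String boundary : boundaries.subSet(startRow, endRow)) {
            if (boundary.compareTo(cursor) > 0) {
                spans.add(new RowSpan(cursor, boundary));
                cursor = boundary;
            }
        }
        // The last piece runs from the final boundary to the end of the range.
        spans.add(new RowSpan(cursor, endRow));
        return spans;
    }
}
```

For the "2 to 5" example further down the thread, calling split("2", "6", ...)
with boundaries "3", "4" and "5" gives four spans, one per integer. Each span
would then be turned into an Accumulo Range (inclusive start, exclusive end)
and the list passed to AccumuloInputFormat.setRanges, with auto-adjusting
disabled if the ranges should not be regrouped per tablet.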

On Mon, Mar 11, 2013 at 5:39 PM, Aji Janis <aji1705@gmail.com> wrote:

> So we realized that all my data for the table of interest fits onto one
> tablet (a HUGE tablet, isn't it?), i.e. we always had ONE mapper. So we said
> let's split the table by range so that we can have more mappers. The next
> problem is: what if someone puts in the first row as the start of the range
> and the last row as the end? Now I am back to one mapper. So what I need is
> some way to take in a range and split it into a List<Range>.
>
>
>
> On Mon, Mar 11, 2013 at 5:13 PM, William Slacum <
> wilhelm.von.cloud@accumulo.net> wrote:
>
>> So you want both auto adjusting and not auto adjusting depending on the
>> size of a range? I suppose you could lift the code for doing the adjusting,
>> and do some introspection on the ranges (such as "how many tablets do I have
>> in this range?") and apply as necessary.
>>
>>
>> On Mon, Mar 11, 2013 at 4:47 PM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> So it looks like a List<Range> is what I need, so that I can have a
>>> mapper per range. However, a more interesting scenario is when, given a
>>> big range, I want to split it into multiple ranges. In other words, say my
>>> rowids were 1_hello, 2_hello, ..., 9_hello, 10_hello and the range given was
>>> 2 to 5, but I want one mapper per integer, so 4 mappers in this case. Any
>>> ideas on how I can accomplish that?
>>>
>>>
>>> Thanks all for suggestions.
>>>
>>>
>>> On Fri, Mar 8, 2013 at 7:02 PM, Keith Turner <keith@deenlo.com> wrote:
>>>
>>>> On Fri, Mar 8, 2013 at 4:17 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> > Thank you. Follow up question.
>>>> >
>>>> > Would this enforce one mapper per range even if all the data (from
>>>> > three ranges) is on one node/tablet?
>>>>
>>>> Look at disableAutoAdjustRanges(). This determines whether it creates a
>>>> mapper per tablet per range OR per range.
>>>>
>>>>
>>>> >
>>>> >
>>>> >
>>>> > On Fri, Mar 8, 2013 at 1:17 PM, Mike Hugo <mike@piragua.com> wrote:
>>>> >>
>>>> >> See AccumuloInputFormat
>>>> >>
>>>> >> ArrayList<Range> ranges = new ArrayList<Range>();
>>>> >> // populate array list of row ranges ...
>>>> >> AccumuloInputFormat.setRanges(job, ranges);
>>>> >>
>>>> >>
>>>> >> You should get one mapper per range.
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Fri, Mar 8, 2013 at 12:11 PM, Aji Janis <aji1705@gmail.com> wrote:
>>>> >>>
>>>> >>> Hello,
>>>> >>>
>>>> >>> I am trying to figure out how I can configure the number of mappers
>>>> >>> (if it's even possible) based on an Accumulo row range. My Accumulo
>>>> >>> rowid uses the format:
>>>> >>>
>>>> >>> abc/1
>>>> >>> abc/2
>>>> >>> ...
>>>> >>> def/3
>>>> >>> ....
>>>> >>> xyz/13...
>>>> >>>
>>>> >>> If I want to specify three ranges, [abc/1 to abc/3], [def/1 to
>>>> >>> def/5], and [jkl/13 to klm/15], and have one mapper work on each
>>>> >>> range, is there a way I can do this? How do I even set up my
>>>> >>> MapReduce job to accept these ranges? Thank you for all feedback.
>>>> >>>
>>>> >>>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>
