hbase-user mailing list archives

From Arnaud Le-roy <sdnetw...@gmail.com>
Subject Re: hbase map/reduce questions
Date Thu, 05 Apr 2012 13:44:34 GMT
Yes, I know, but it was just an example; we could make the same example with one billion rows. Admittedly, in that case you could tell me the rows would be stored on all the nodes.

Maybe it's not possible to distribute the tasks manually across the cluster? And maybe it's not a good idea, but I would like to know, in order to design the best schema for my data.
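To illustrate the default behavior Doug describes below, here is a toy sketch of locality-aware task placement. This is plain Python, not the HBase or Hadoop API; all node and host names are made up for illustration. The idea: by default each split advertises the host where its region lives, and the scheduler prefers those hosts, so making more, smaller splits of the same two regions still sends all the work to the same two nodes.

```python
# Toy model of locality-aware task placement (NOT HBase code; names are
# invented). By default, TableInputFormat produces one split per region,
# and each split advertises the RegionServer hosting that region.

def assign_tasks(splits, nodes):
    """Place each split on its advertised host when that host is in the
    cluster; return the set of nodes that end up doing any work."""
    busy = set()
    for preferred_host in splits:
        if preferred_host in nodes:
            busy.add(preferred_host)
    return busy

nodes = [f"node{i}" for i in range(1, 11)]   # a 10-node cluster

# Suppose the table's regions live on only two RegionServers. Cutting
# the table into more, smaller splits (say, 100 rows each) does not
# change the advertised locations: every split still points at the
# same two hosts, so only those two nodes run map tasks.
region_hosts = ["node1", "node2"]
splits = region_hosts * 5                    # 10 splits, but only 2 hosts

print(sorted(assign_tasks(splits, nodes)))   # only node1 and node2 work
```

To spread the work over all ten nodes, the splits themselves would have to advertise other locations (losing data locality), which is essentially what the question above is asking about.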

On 5 April 2012 at 15:08, Doug Meil <doug.meil@explorysmedical.com> wrote:
>
> If you only have 1000 rows, why use MapReduce?
>
>
>
>
>
> On 4/5/12 6:37 AM, "Arnaud Le-roy" <sdnetwork@gmail.com> wrote:
>
>>But do you think I can change the default behavior?
>>
>>For example: I have ten nodes in my cluster, and my table, which has
>>1000 rows, is stored on only two of them. With the default behavior,
>>only those two nodes will work on a map/reduce task, won't they?
>>
>>If I write a custom InputFormat that splits the table into chunks of
>>100 rows, can I manually distribute each part to a node, regardless of
>>where the data is?
>>
>>On 5 April 2012 at 00:36, Doug Meil <doug.meil@explorysmedical.com> wrote:
>>>
>>> The default behavior is that the input splits are where the data is stored.
>>>
>>>
>>>
>>>
>>> On 4/4/12 5:24 PM, "sdnetwork" <sdnetwork@gmail.com> wrote:
>>>
>>>>OK, thanks,
>>>>
>>>>But I can't find the information telling me how the results of the
>>>>split are distributed across the different nodes of the cluster:
>>>>
>>>>1) randomly?
>>>>2) where the data is stored?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
>
