hbase-dev mailing list archives

From Li Li <fancye...@gmail.com>
Subject Re: how to do parallel scanning in map reduce using hbase as input?
Date Fri, 25 Jul 2014 02:57:31 GMT
That's great. What's the status of these issues? Is a patch available now?

On Fri, Jul 25, 2014 at 4:23 AM, Vladimir Rodionov
<vladrodionov@gmail.com> wrote:
> I am working on improving inter-region scan performance and already have a
> patch. The patch will be committed as soon as all tests are done. This
> should improve M/R over HBase performance because you will be able to
> create input splits at a granularity finer than a region without loss of
> performance.
>
> See :
>
> https://issues.apache.org/jira/browse/HBASE-7336
> https://issues.apache.org/jira/browse/HBASE-5979
>
> for more information on the subject.
>
> -Vladimir Rodionov
>
>
>
> On Tue, Jul 22, 2014 at 3:31 PM, Stack <stack@duboce.net> wrote:
>
>> On Mon, Jul 21, 2014 at 11:11 PM, Li Li <fancyerii@gmail.com> wrote:
>>
>> > On Tue, Jul 22, 2014 at 1:57 PM, Stack <stack@duboce.net> wrote:
>> > > On Mon, Jul 21, 2014 at 10:53 PM, Li Li <fancyerii@gmail.com> wrote:
>> > >
>> > >> Sorry, I hit tab and it sent my unfinished post. See the following
>> > >> mail for answers to the other questions.
>> > >>
>> > >> I forgot the exception's details. It threw the exception in the terminal.
>> > >
>> > >
>> > > What exception is thrown?
>> > I forgot it. Maybe I can retry with the 8-mapper configuration. It
>> > seemed like an out-of-memory exception.
>> >
>>
>>
>> Who OOME'd?  The map task or hbase?
>>
>>
>>
>> > >
>> > >
>> > >
>> > >> The
>> > >> default io.sort.mb is 100 and I set it to 500 to speed up the reducer.
>> > >
>> > >
>> > > Do you have to have a reducer?  If you could skip the shuffle...
>> > I have 8 reducers
>> >
>>
>>
>> Do you have to reduce?
>>
>> Would more reducers make your job run faster?
>>
>>
>>
>> > >
>> > >
>> > >
>> > >> So
>> > >> I set mapred.child.java.opts to 1g
>> > >> The datanode/regionserver has 16GB memory but free memory
>> > >
>> > >
>> > > Does the RS use the 16G?
>> > The RS uses 8G, and there are also a datanode and a tasktracker on this machine.
>> > >
>> >
>>
>>
>> How much for DN and TT?  They don't need much usually.
>>
>>
>>
>> > >
>> > >
>> > >> for
>> > >> map-reduce is about 5gb. So I can't add more mappers
>> > >>
>> > >>
>> > >> How much RAM in these machines?
>> > 16GB
>>
>>
>>
>> These your machines or EC2?  Can you get bigger machines if EC2?
>>
>> St.Ack
>>

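To make the "input splits finer than a region" idea above concrete, here is a
minimal sketch of cutting one region's key range into several contiguous
sub-ranges. It is not the patch from HBASE-7336 or HBASE-5979 and it does not
use any HBase API; the class name SubRegionSplits and the method split are
hypothetical. It assumes both boundary keys are non-empty and treats keys as
unsigned big-endian numbers, padding the shorter key with trailing zero bytes.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of HBase.
final class SubRegionSplits {

    // Splits [start, end) into at most n contiguous, non-empty sub-ranges.
    // Each element of the result is a {startKey, endKey} pair.
    static List<byte[][]> split(byte[] start, byte[] end, int n) {
        int len = Math.max(start.length, end.length);
        // Pad with trailing zeros so both keys have the same width, then
        // read them as unsigned integers (signum 1 means non-negative).
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, len));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, len));
        if (n < 1 || lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("need n >= 1 and start < end");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[][]> ranges = new ArrayList<>();
        byte[] prev = start;
        BigInteger prevVal = lo;
        for (int i = 1; i <= n; i++) {
            // Evenly spaced boundary; the last boundary is exactly the end key.
            BigInteger nextVal = (i == n)
                    ? hi
                    : lo.add(span.multiply(BigInteger.valueOf(i))
                                 .divide(BigInteger.valueOf(n)));
            if (nextVal.equals(prevVal)) {
                continue; // key range too narrow for n pieces; skip empty range
            }
            byte[] next = (i == n) ? end : toKey(nextVal, len);
            ranges.add(new byte[][] {prev, next});
            prev = next;
            prevVal = nextVal;
        }
        return ranges;
    }

    // Converts a non-negative value back to a fixed-width big-endian key.
    private static byte[] toKey(BigInteger v, int len) {
        byte[] raw = v.toByteArray(); // may carry a leading sign byte or be short
        byte[] out = new byte[len];
        int copy = Math.min(raw.length, len);
        System.arraycopy(raw, raw.length - copy, out, len - copy, copy);
        return out;
    }
}
```

Each returned pair could then serve as the start and stop row of one mapper's
scan, so a single region would feed several mappers. Even spacing in key space
does not guarantee even spacing in data, so skewed keys would still produce
uneven splits under this sketch.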