hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: M/R scan problem
Date Mon, 04 Jul 2011 17:13:59 GMT
From the master UI, click 'zk dump'.
:60010/zk.jsp will show you the active connections. See if the count
reaches 300 when map tasks run.
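[For reference, the connection cap discussed in this thread is set in hbase-site.xml on the ZooKeeper quorum hosts. A minimal sketch follows; the value 1000 is only an illustrative choice, not a recommendation made in the thread:]

```xml
<!-- hbase-site.xml (sketch): raise the per-client ZooKeeper connection cap.
     1000 is a hypothetical value; pick one above your observed peak in 'zk dump'.
     Restart the ZooKeeper quorum for the change to take effect. -->
<property>
  <name>hbase.zookeeper.property.maxClientCnxns</name>
  <value>1000</value>
</property>
```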

On Mon, Jul 4, 2011 at 10:12 AM, Ted Yu <yuzhihong@gmail.com> wrote:

> The reason I asked about HBaseURLsDaysAggregator.java was that I see no
> HBase (client) code in call stack.
> I have little clue for the problem you experienced.
>
> There may be more than one connection to zookeeper from one map task.
> So it doesn't hurt if you increase hbase.zookeeper.property.maxClientCnxns
>
> Cheers
>
>
> On Mon, Jul 4, 2011 at 9:47 AM, Lior Schachter <liors@infolinks.com> wrote:
>
>> 1. HBaseURLsDaysAggregator.java:124, HBaseURLsDaysAggregator.java:131 are
>> not important, since even when I removed all my map code the tasks got
>> stuck (but the thread dumps were generated after I revived the code). If
>> you think it's important I'll remove the map code again and re-generate
>> the thread dumps...
>>
>> 2. 82 maps were launched but only 36 ran simultaneously.
>>
>> 3. hbase.zookeeper.property.maxClientCnxns = 300. Should I increase it ?
>>
>> Thanks,
>> Lior
>>
>>
>> On Mon, Jul 4, 2011 at 7:33 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>> > In the future, provide the full dump via pastebin.com and put only a
>> > snippet of the log in the email.
>> >
>> > Can you tell us what the following lines are about ?
>> > HBaseURLsDaysAggregator.java:124
>> > HBaseURLsDaysAggregator.java:131
>> >
>> > How many mappers were launched ?
>> >
>> > What value is used for hbase.zookeeper.property.maxClientCnxns ?
>> > You may need to increase the value for above setting.
>> >
>> > On Mon, Jul 4, 2011 at 9:26 AM, Lior Schachter <liors@infolinks.com>
>> > wrote:
>> >
>> > > I used kill -3; the thread dump follows:
>> > >
>> > > ...
>> > >
>> > >
>> > > On Mon, Jul 4, 2011 at 6:22 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > >
>> > > > I wasn't clear in my previous email.
>> > > > It was not answer to why map tasks got stuck.
>> > > > TableInputFormatBase.getSplits() is being called already.
>> > > >
>> > > > Can you try getting jstack of one of the map tasks before task
>> tracker
>> > > > kills
>> > > > it ?
>> > > >
>> > > > Thanks
>> > > >
>> > > > On Mon, Jul 4, 2011 at 8:15 AM, Lior Schachter
>> > > > <liors@infolinks.com> wrote:
>> > > >
>> > > > > 1. Currently every map gets one region. So I don't understand what
>> > > > > difference it will make using the splits.
>> > > > > 2. How should I use TableInputFormatBase.getSplits() ? Could not
>> > > > > find examples for that.
>> > > > >
>> > > > > Thanks,
>> > > > > Lior
>> > > > >
>> > > > >
>> > > > > On Mon, Jul 4, 2011 at 5:55 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > > > >
>> > > > > > For #2, see TableInputFormatBase.getSplits():
>> > > > > >   * Calculates the splits that will serve as input for the
>> > > > > >   * map tasks. The number of splits matches the number of
>> > > > > >   * regions in a table.
>> > > > > >
>> > > > > >
>> > > > > > On Mon, Jul 4, 2011 at 7:37 AM, Lior Schachter
>> > > > > > <liors@infolinks.com> wrote:
>> > > > > >
>> > > > > > > 1. yes - I configure my job using this line:
>> > > > > > >
>> > > > > > > TableMapReduceUtil.initTableMapperJob(HBaseConsts.URLS_TABLE_NAME,
>> > > > > > >     scan, ScanMapper.class, Text.class, MapWritable.class, job)
>> > > > > > >
>> > > > > > > which internally uses TableInputFormat.class
>> > > > > > >
>> > > > > > > 2. One split per region ? What do you mean ? How do I do that ?
>> > > > > > >
>> > > > > > > 3. hbase version 0.90.2
>> > > > > > >
>> > > > > > > 4. no exceptions. the logs are very clean.
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Mon, Jul 4, 2011 at 5:22 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Do you use TableInputFormat ?
>> > > > > > > > To scan a large number of rows, it would be better to
>> > > > > > > > produce one Split per region.
>> > > > > > > >
>> > > > > > > > What HBase version do you use ?
>> > > > > > > > Do you find any exception in master / region server logs
>> > > > > > > > around the moment of timeout ?
>> > > > > > > >
>> > > > > > > > Cheers
>> > > > > > > >
>> > > > > > > > On Mon, Jul 4, 2011 at 4:48 AM, Lior Schachter
>> > > > > > > > <liors@infolinks.com> wrote:
>> > > > > > > >
>> > > > > > > > > Hi all,
>> > > > > > > > > I'm running a scan using the M/R framework.
>> > > > > > > > > My table contains hundreds of millions of rows, and I'm
>> > > > > > > > > scanning about 50 million of them using a start/stop key.
>> > > > > > > > >
>> > > > > > > > > The problem is that some map tasks get stuck and the
>> > > > > > > > > task manager kills these maps after 600 seconds. When
>> > > > > > > > > retrying the task everything works fine (sometimes).
>> > > > > > > > >
>> > > > > > > > > To verify that the problem is in hbase (and not in the
>> > > > > > > > > map code) I removed all the code from my map function,
>> > > > > > > > > so it looks like this:
>> > > > > > > > > public void map(ImmutableBytesWritable key, Result value,
>> > > > > > > > >     Context context) throws IOException,
>> > > > > > > > >     InterruptedException {
>> > > > > > > > > }
>> > > > > > > > >
>> > > > > > > > > Also, when the map got stuck on a region, I tried to
>> > > > > > > > > scan this region (using a simple scan from a Java main)
>> > > > > > > > > and it worked fine.
>> > > > > > > > >
>> > > > > > > > > Any ideas ?
>> > > > > > > > >
>> > > > > > > > > Thanks,
>> > > > > > > > > Lior
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
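[Archive note: the 600-second kill described above corresponds to Hadoop's mapred.task.timeout, which defaults to 600000 ms in Hadoop of this era (a task that neither reports status nor finishes within that window is killed by the TaskTracker). A sketch of raising it while the root cause is investigated; the doubled value is an assumption for illustration, not advice from the thread:]

```xml
<!-- mapred-site.xml (sketch), or set per-job on the Job's Configuration.
     Default is 600000 ms (600 s); 1200000 is a hypothetical doubled value. -->
<property>
  <name>mapred.task.timeout</name>
  <value>1200000</value>
</property>
```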
