hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Task attempt failed to report status
Date Sat, 06 Mar 2010 22:27:03 GMT
On Fri, Mar 5, 2010 at 1:12 AM, steven zhuang
<steven.zhuang.1984@gmail.com> wrote:
>     when I import data into the HTable with a Map/Reduce job, the task runs
> smoothly until the last reducer failed 6 times to report its status.

How many reducers?  All completed except this last one, and it failed
in spite of 6 attempts?

Perhaps it's a really fat Put that is holding things up?  Can you add
logging of put sizes or some such to see if it's an anomalous record
that is causing the non-reporting after ten minutes?
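One way to do that kind of put-size logging, sketched in plain Java (the class, threshold, and names below are invented for illustration, not from the thread): accumulate the byte size of each batch as cells are added, so an anomalously fat batch shows up in the task logs before it is collected.

```java
// Sketch: track per-batch byte counts so a fat batch can be logged
// at each flush point. Names and the 10 MB threshold are hypothetical.
public class BatchSizeTracker {
    private long bytes = 0;
    private int cells = 0;

    // Count one cell's key + value bytes toward the current batch.
    public void add(byte[] key, byte[] value) {
        bytes += key.length + value.length;
        cells++;
    }

    public long bytes() { return bytes; }
    public int cells() { return cells; }

    // Arbitrary example threshold: flag batches over 10 MB.
    public boolean isFat() { return bytes > 10L * 1024 * 1024; }

    public void reset() { bytes = 0; cells = 0; }

    public static void main(String[] args) {
        BatchSizeTracker t = new BatchSizeTracker();
        t.add(new byte[8], new byte[100]);
        t.add(new byte[8], new byte[200]);
        System.out.println(t.cells() + " cells, " + t.bytes() + " bytes, fat=" + t.isFat());
    }
}
```

In the reducer you would call `add(...)` next to each `bu.put(...)` and print `bytes()` just before each `output.collect(...)`, resetting afterward.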

>     In my program I use BatchUpdate to collect every 1000 cells, and
> update the status. I don't think a normal insert will cost 10 minutes,
> because the first 99% of the job went smoothly; only the very last reducer
> gets the "fail to report status" error.

Can you add logging to your reducers?  Log each put?  Try and see
where it's hanging for > 10 minutes?

>     I suspect the problem is that a regionserver is way too busy, which
> causes the "output.collect(k, bu);" call to take too much time to return,
> but I am not sure because I don't know which regionserver is actually
> committing the update.
>    So which log should I dig into? Any hint is appreciated.

Well, a reducer is usually responsible for a portion of the rows only.
MR is sorting on row?  So what arrives at the reducer is sorted?  When
this last reducer is running, look at the UI?  It's probably going to
one regionserver only?  If you emit what's being inserted, perhaps you
can see from the row which region it's trying to go to...  See where
it's hosted and look at that regionserver's logs?

>    My code's submitting portion is as follows (just copied from some online
> source and changed a little):
>        public void reduce(ImmutableBytesWritable k,
>                Iterator<HbaseMapWritable<byte[], byte[]>> v,
>                OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
>                Reporter r) throws IOException {
>            while (v.hasNext()) {
>                r.setStatus("Reducer begin committing row: "
>                        + new String(k.get(), HConstants.UTF8_ENCODING)
>                        + "  Time:" + new Date());
>                BatchUpdate bu = new BatchUpdate(k.get());
>                int cellCnt = 0;
>                while (v.hasNext()) {
>                    HbaseMapWritable<byte[], byte[]> hmw = v.next();
>                    Iterator<Entry<byte[], byte[]>> iter = hmw.entrySet().iterator();
>                    while (iter.hasNext()) {
>                        Entry<byte[], byte[]> e = iter.next();
>                        bu.put(e.getKey(), e.getValue());
>                        //System.err.println("now add cell: " + e + " cell count: " + cellCnt + new Date());
>                        if (++cellCnt > 1000) {
>                            output.collect(k, bu); // this line causes the timeout
>                            r.setStatus("Reducer done committing "
>                                    + new String(e.getKey(), HConstants.UTF8_ENCODING) + ":"
>                                    + new String(e.getValue(), HConstants.UTF8_ENCODING)
>                                    + "  Time:" + new Date());
>                            bu = new BatchUpdate(k.get());
>                            cellCnt = 0;
>                        }
>                    }
>                }
>                if (cellCnt > 0) {
>                    output.collect(k, bu);
>                }
>            }

Try calling output.collect every ten cells?
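A toy back-of-the-envelope sketch (class and numbers invented here, not from the thread) of why a smaller threshold helps: each flush is also a chance to update the Reporter, so shrinking the batch from 1000 cells to 10 gives the task two orders of magnitude more opportunities to report progress.

```java
// Toy sketch: how many flush points (i.e. chances to report status)
// a reducer gets for a given cell count and batch threshold.
public class FlushPoints {
    static int flushes(int totalCells, int batchSize) {
        // one flush each time the counter fills a batch,
        // plus one final flush for any remainder
        int full = totalCells / batchSize;
        return (totalCells % batchSize == 0) ? full : full + 1;
    }

    public static void main(String[] args) {
        System.out.println("batch=1000: " + flushes(100000, 1000) + " flushes");
        System.out.println("batch=10:   " + flushes(100000, 10) + " flushes");
    }
}
```

The trade-off is more, smaller RPCs to the regionserver, but for diagnosing a non-reporting task that is usually acceptable.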

You are using TableOutputFormat?  It's buffering inserts to the table?
If so, configure it to not buffer so much?
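If the underlying HTable client is buffering, one knob to try is the client write-buffer size; this is a sketch only (the value is an arbitrary example, and the key is only honored if your client version supports it), e.g. in hbase-site.xml:

```xml
<!-- example only: shrink the client write buffer so puts flush sooner -->
<property>
  <name>hbase.client.write.buffer</name>
  <value>262144</value>
</property>
```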

