hbase-user mailing list archives

From Vladimir Rodionov <vladrodio...@gmail.com>
Subject Re: Query on OutOfOrderScannerNextException
Date Sat, 06 Jun 2015 22:18:39 GMT
The scanner fails at the very beginning. The reason is that the client needs
very few rows from a large file, and HBase needs to fill the RPC buffer
(which is 100 rows, yes?) before it can return the first batch. This takes
more than 60 sec, so the scanner fails (do not ask me why it's not a
timeout exception).

1. HBASE-13090 will help (it can be backported, I presume, to 1.0 and 0.98.x)
2. A smaller region size will help
3. A smaller hbase.client.scanner.caching will help
4. A larger hbase.client.scanner.timeout.period will help
5. Better data store design (rowkeys) is preferred.

Too many options to choose from.
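For options 3 and 4, a minimal sketch of the client-side settings in hbase-site.xml (or the job Configuration); the values below are illustrative examples, not recommendations:

```xml
<!-- Client-side hbase-site.xml. Values are illustrative, not tuned. -->
<property>
  <!-- Fewer rows per RPC batch, so the server returns each batch sooner -->
  <name>hbase.client.scanner.caching</name>
  <value>10</value>
</property>
<property>
  <!-- More time (in ms) before the client-side scanner lease expires -->
  <name>hbase.client.scanner.timeout.period</name>
  <value>180000</value>
</property>
```

Caching can also be lowered per scan instead of globally, via Scan#setCaching(int) in the Java client or CACHING => 10 in the shell's scan command.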

-Vlad


On Sat, Jun 6, 2015 at 3:04 PM, Arun Mishra <arunmishra@me.com> wrote:

> Thanks Ted.
>
> Regards,
> Arun.
>
> > On Jun 6, 2015, at 2:34 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > HBASE-13090 'Progress heartbeats for long running scanners' solves the
> > problem you faced.
> >
> > It is in the 1.1.0 release.
> >
> > FYI
> >
> >> On Sat, Jun 6, 2015 at 12:54 PM, Arun Mishra <arunmishra@me.com> wrote:
> >>
> >> Hello,
> >>
> >> I have a query on OutOfOrderScannerNextException. I am using hbase 0.98.6
> >> with 45 nodes.
> >>
> >> I have a mapreduce job which scans 1 table for the last 1 day's worth of
> >> data using a timerange. It has been running fine for months without any
> >> failure. But for the last couple of days it has been failing with the
> >> below exception. I have traced the failure to a single region. This
> >> region has 1 store and 1 hfile of 5+GB. What we realized was that we
> >> were writing some bulk data, which used to land on this region. After we
> >> stopped writing this data, this region has been receiving very few
> >> writes per day.
> >>
> >> When the mapreduce job runs, it creates a map task for this region and
> >> that task fails with OutOfOrderScannerNextException. I was able to
> >> reproduce this error by running a scan command with the same start/stop
> >> row and timerange options. Finally, we split this region to be small
> >> enough for the scan command to work.
> >>
> >> My query is whether there is any option, apart from increasing the
> >> timeout, which can solve this use case? I am thinking of a use case
> >> where data comes in for 3 days a week in bulk and then nothing for the
> >> next 3 days, kind of creating a data hole in the region.
> >> My understanding is that I am hit with this error because I have big
> >> store files and the timerange scan is reading the entire file even
> >> though it contains very few rowkeys for that timerange.
> >>
> >> hbase.client.scanner.caching = 100
> >> hbase.client.scanner.timeout.period = 60s
> >>
> >> scan 'dummytable',{ STARTROW=>'dummyrowkey-start',
> >> STOPROW=>'dummyrowkey-end', LIMIT=>1000,
> >> TIMERANGE=>[1433462400000,1433548800000]}
> >> ROW                                           COLUMN+CELL
> >>
> >> ERROR: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
> >> Expected nextCallSeq: 1 But the nextCallSeq got from client: 0;
> >> request=scanner_id: 33648 number_of_rows: 100 close_scanner: false
> >> next_call_seq: 0
> >> at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3193)
> >> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29587)
> >> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2031)
> >> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:108)
> >> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:114)
> >> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:94)
> >> at java.lang.Thread.run(Thread.java:745)
> >>
> >>
> >> Regards,
> >> Arun
>
