hbase-dev mailing list archives

From Yu Li <car...@gmail.com>
Subject Re: Is HBase RPC-Handling idempotent for reads?
Date Mon, 10 Apr 2017 04:14:55 GMT
Correct me if I'm wrong, but I think we should consider only the single
operation, in isolation, when checking whether it's idempotent. By the same
reasoning as the Wikipedia example
<https://en.wikipedia.org/wiki/Idempotence#Examples>: "A function
looking up a customer's name and address in a database
<https://en.wikipedia.org/wiki/Database> is typically idempotent, since
this will not cause the database to change", I think all Get/MultiGet/Scan
operations in HBase are idempotent.
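
A minimal sketch of that definition, using a toy in-memory store rather than
HBase code (`ToyStore` and its methods are hypothetical names): the read is
idempotent in the sense that repeating it leaves the stored state unchanged.

```python
class ToyStore:
    """Toy key-value store; stands in for a table, not the HBase API."""

    def __init__(self):
        self._rows = {}

    def put(self, row, value):
        self._rows[row] = value

    def get(self, row):
        # A read only inspects state; it never modifies it.
        return self._rows.get(row)

    def snapshot(self):
        # Copy of the stored state, used to compare before/after.
        return dict(self._rows)


store = ToyStore()
store.put("row1", "v1")

before = store.snapshot()
first = store.get("row1")
second = store.get("row1")  # the same read, repeated
after = store.snapshot()

# Same result and, more importantly, no change to the stored state.
reads_are_idempotent = (first == second) and (before == after)
```

Note that this only covers the operation in isolation: a concurrent put
between the two reads could still change what the second read returns.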

About "speculative rpc handling", I doubt it would help in HBase.
Normally, if a request has already arrived at the server side but executes
slowly, the problem is likely one of:
1. The server is too busy and requests get queued
2. The processing itself is slow due to the request pattern or some
hardware failure
I don't think speculatively re-executing the request would help in either
case. It's different from speculative task execution in MR: there we can
choose another node to execute the task, while here we have no choice.
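
A back-of-the-envelope sketch of that argument, assuming the retry runs on
the same server, costs the same service time, and discards the work done
before the abort (all function and parameter names here are hypothetical):

```python
def finish_if_left_running(start, service_time):
    # The call simply runs to completion on the handler that picked it up.
    return start + service_time


def finish_if_requeued(start, service_time, abort_after, queue_wait):
    # Work done before the abort is thrown away; the call waits in the
    # queue again and then restarts from scratch on the same server.
    return start + abort_after + queue_wait + service_time


left_running = finish_if_left_running(start=0.0, service_time=4.0)
requeued = finish_if_requeued(
    start=0.0, service_time=4.0, abort_after=2.0, queue_wait=1.0
)
```

Under these assumptions the requeued call can never finish earlier than the
one left running, because there is no second node to make the retry cheaper.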

OTOH, we already have timeout mechanisms to make sure server resources
aren't wasted:
1. For scan
    - When request handling times out, the server stops further
processing; refer to RSRpcServices#getTimeLimit and
ScannerContext#checkTimeLimit
    - If the client goes away during processing, the server also stops
processing; see the SimpleRpcServer#disconnectSince and
RegionScannerImpl#nextInternal methods for more details.

2. For single Get
    - Controlled by the rpc and operation timeouts

3. For MultiGet
    - I think this is something we could improve. On the client side we
have a timeout mechanism, but on the server side there seems to be no
corresponding interrupt logic.
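
To illustrate the kind of server-side check meant here, a minimal sketch of a
batch-read loop that tests a deadline and the client connection between rows.
This is not HBase code: `multi_get`, `DeadlineExceeded`, `client_connected`,
and `clock` are hypothetical names, and the store is any object with a
`get` method.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when the batch runs past its time limit."""


def multi_get(store, rows, timeout_s,
              client_connected=lambda: True, clock=time.monotonic):
    deadline = clock() + timeout_s
    results = {}
    for row in rows:
        # Check the time limit between rows, so a long batch can stop early.
        if clock() >= deadline:
            raise DeadlineExceeded(
                f"stopped after {len(results)} of {len(rows)} rows"
            )
        # Stop quietly if nobody is waiting for the answer any more.
        if not client_connected():
            break
        results[row] = store.get(row)
    return results
```

The checks are cooperative: the loop only notices the timeout or disconnect
at row boundaries, so a single slow row still runs to completion.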


Best Regards,
Yu

On 10 April 2017 at 11:12, Jerry He <jerryjch@gmail.com> wrote:

> Again, it depends on how you abort, and 'idempotent' can have different
> definitions.
>
> For example, even if you are only concerned about reads,
> there are resources on the HRegion that the read touches or acquires
> (scanner, lock, mvcc etc.) that hopefully will be cleaned up/released on
> abort.
> Otherwise you may leave it in a bad/inconsistent state.
>
> Thanks.
>
> Jerry
>
>
> On Sun, Apr 9, 2017 at 7:14 PM, 张铎(Duo Zhang) <palomino219@gmail.com>
> wrote:
>
> > I think this depends on how you model the problem. At server side, if you
> > re-execute a read operation with a new mvcc, then you may read a value
> > that should not be visible if you use the old mvcc. If you define this as
> > an error then I think there will be conflicts.
> >
> > But on the client side, there is no guarantee that the request you send
> > first will be executed first. So as long as the read request has not
> > returned, I think it is OK to read a value written by a write request
> > that was sent after the read request?
> >
> > Thanks.
> >
> > 2017-04-10 9:52 GMT+08:00 杨苏立 Yang Su Li <yangsuli@gmail.com>:
> >
> > > We are only concerned about read operations here. Are you suggesting
> > > they are completely idempotent?
> > > Are there any read-after-write conflicts?
> > >
> > > Thanks
> > >
> > > Suli
> > >
> > > On Sun, Apr 9, 2017 at 8:48 PM, 张铎(Duo Zhang) <palomino219@gmail.com>
> > > wrote:
> > >
> > > > It depends on how you abort the rpc request. For hbase, there will be
> > > > no write conflict, but a write operation can only finish once all
> > > > the write operations with a lower mvcc number have finished. So if
> > > > you just stop a write operation without recovering the mvcc (I do not
> > > > know how to recover, but I think you need to do something...) then
> > > > the writes will be stuck.
> > > >
> > > > And one more thing: for a read operation you may interrupt it at any
> > > > time, but for a write operation, I do not think you can re-execute it
> > > > with a new mvcc number if the WAL entry has already been flushed out.
> > > > That means the re-execution process will be different if you abort
> > > > the write operation at different stages.
> > > >
> > > > Thanks.
> > > >
> > > > 2017-04-10 6:47 GMT+08:00 杨苏立 Yang Su Li <yangsuli@gmail.com>:
> > > >
> > > > > We are trying to implement speculative rpc handling for our
> > > > > workloads. So we want to allow an RPC handler to stop executing an
> > > > > RPC call, put it back in the queue, and later re-execute it.
> > > > >
> > > > > Suppose at time t1 we execute an RPC call halfway, abort, and put
> > > > > the call back in the queue.
> > > > > Then at time t2 another RPC handler picks up the call and
> > > > > re-executes it.
> > > > > I understand that we might get a different mvcc number and
> > > > > different results at t2 compared to executing it at t1.
> > > > > My question is: would this situation be any different from the
> > > > > situation where the call was never executed at t1, and is executed
> > > > > at t2 for the first time?
> > > > >
> > > > >
> > > > > My guess is that since at t1 we may have already gotten an mvcc
> > > > > number, it might potentially cause some write conflicts and
> > > > > certain write operations to retry. But correctness-wise, is there
> > > > > any difference?
> > > > >
> > > > > Thanks a lot!
> > > > >
> > > > > Suli
> > > > >
> > > > >
> > > > > > On Sun, Apr 9, 2017 at 5:14 PM, Jerry He <jerryjch@gmail.com>
> > > > > > wrote:
> > > > >
> > > > > > > I don't know what your intention and your context are.
> > > > > > >
> > > > > > > You may get a different mvcc number and get different results
> > > > > > > next time around if there are concurrent writes.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jerry
> > > > > >
> > > > > > > On Sun, Apr 9, 2017 at 12:48 PM 杨苏立 Yang Su Li
> > > > > > > <yangsuli@gmail.com> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > > I am wondering, for read requests like Get/MultiGet/Scan, is
> > > > > > > > the RPC handling idempotent in HBase?
> > > > > > > >
> > > > > > > > More specifically, if in the middle of RPC handling we stop
> > > > > > > > the handling thread, put the RPC call back in the queue, and
> > > > > > > > later another RPC handler picks up this call and starts all
> > > > > > > > over again, will the result be the same as if this call were
> > > > > > > > being handled for the first time now? Or are there any
> > > > > > > > unexpected side effects?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Suli
> > > > > > >
> > > > > > > --
> > > > > > > Suli Yang
> > > > > > >
> > > > > > > Department of Physics
> > > > > > > University of Wisconsin Madison
> > > > > > >
> > > > > > > 4257 Chamberlin Hall
> > > > > > > Madison WI 53703
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Suli Yang
> > > > >
> > > > > Department of Physics
> > > > > University of Wisconsin Madison
> > > > >
> > > > > 4257 Chamberlin Hall
> > > > > Madison WI 53703
> > > > >
> > > >
> > >
> >
>
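
For reference, the two mvcc points raised in the quoted messages can be
sketched with a toy model. This is an illustration only, not HBase's MVCC
implementation; `ToyMvcc`, `begin`, `complete`, and `read` are hypothetical
names, and versions are assumed to be stored in increasing mvcc order.

```python
class ToyMvcc:
    """Toy write-number tracker; not the HBase implementation."""

    def __init__(self):
        self.next_write = 1
        self.read_point = 0
        self.pending = set()  # write numbers begun but not yet completed

    def begin(self):
        n = self.next_write
        self.next_write += 1
        self.pending.add(n)
        return n

    def complete(self, n):
        self.pending.discard(n)
        # The read point may only advance to just below the lowest write
        # that is still pending; an abandoned write blocks everything above.
        if self.pending:
            self.read_point = min(self.pending) - 1
        else:
            self.read_point = self.next_write - 1


def read(versions, read_point):
    # versions: list of (mvcc_number, value) in increasing mvcc order.
    visible = [value for number, value in versions if number <= read_point]
    return visible[-1] if visible else None


# 1. A write with a lower number that never completes stalls later writes.
mvcc = ToyMvcc()
w1 = mvcc.begin()
w2 = mvcc.begin()
mvcc.complete(w2)
stalled_read_point = mvcc.read_point    # w1 is still pending
mvcc.complete(w1)
recovered_read_point = mvcc.read_point  # both writes now visible

# 2. A re-executed read with a newer read point may see a newer value.
versions = [(1, "old")]
first_attempt = read(versions, read_point=1)
versions.append((2, "new"))
second_attempt = read(versions, read_point=2)
```

In this model an aborted write has to call `complete` (or some equivalent
cleanup) before later writes become visible, which matches the concern about
writes getting stuck.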
