hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15214) Valid mutate Ops fail with RPC Codec in use and region moves across
Date Sat, 06 Feb 2016 00:52:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135365#comment-15135365 ]

Hudson commented on HBASE-15214:
--------------------------------

FAILURE: Integrated in HBase-Trunk_matrix #685 (See [https://builds.apache.org/job/HBase-Trunk_matrix/685/])
HBASE-15214 Valid mutate Ops fail with RPC Codec in use and region moves (anoopsamjohn: rev 7239056c78cc6eb2867c8865ab45821d3e51328a)
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RSRpcServices.java


> Valid mutate Ops fail with RPC Codec in use and region moves across
> -------------------------------------------------------------------
>
>                 Key: HBASE-15214
>                 URL: https://issues.apache.org/jira/browse/HBASE-15214
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4, 0.98.18
>
>         Attachments: HBASE-15214-branch-1.patch, HBASE-15214.patch, HBASE-15214_V2.patch,
HBASE-15214_V3.patch
>
>
> Test failures in HBASE-15198 led to this bug. Until now we have not used cell blocks (codec
usage) for write requests (client -> server). Once we enabled Codec usage by default,
we saw this issue.
> A multi request arrived at a RegionServer (RS) with mutations for different regions. One of
the regions that had been on this RS was no longer available. In RSRpcServices#multi, we fail
the entire RegionAction (with its N mutations) in that MultiRequest, then continue with the
remaining RegionActions, whose regions might still be available. (The client retries the failed
RegionAction after fetching the latest region location.) This all works fine in a pure PB
request world. When a Codec is used, we do not convert the Mutation Cells to PB Cells and pack
them into the PB message. Instead we pass all Cells serialized into one byte[] cellblock, and
the server iterates over these cells using a Decoder. Each Mutation PB knows only the number of
cells associated with it. In the case above, when an entire RegionAction is skipped, the N
Mutations under it may have corresponding Cells in the cellblock, but we do not skip those Cells
in the iterator. As a result, the later Mutations (for other regions) refer to the wrong Cells
and try to put them into a different region. HRegion#checkRow() then throws
WrongRegionException, which is treated as a sanity check failure, so a DoNotRetryIOException
(DNRIOE) is thrown back to the client and the op fails for the user code.
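
The misalignment described above can be illustrated with a small, self-contained sketch. It does not use HBase code: Cell, RegionAction, process, and the skipCellsOfFailedActions flag are hypothetical stand-ins, and the real fix in RSRpcServices and ProtobufUtil may be structured differently. The sketch only assumes what the description states: all cells travel in one shared block, and each mutation carries nothing but its cell count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not HBase classes.
class CellBlockSkipSketch {

    static final class Cell {
        final String region; // region the cell's row actually belongs to
        final String row;
        Cell(String region, String row) { this.region = region; this.row = row; }
    }

    static final class RegionAction {
        final String region;
        final boolean regionOnline;
        final int[] cellCounts; // one entry per mutation: its number of cells
        RegionAction(String region, boolean regionOnline, int... cellCounts) {
            this.region = region;
            this.regionOnline = regionOnline;
            this.cellCounts = cellCounts;
        }
    }

    // Walks the shared cell block once, in request order, as a server-side decoder would.
    static List<String> process(List<RegionAction> actions, List<Cell> cellBlock,
                                boolean skipCellsOfFailedActions) {
        Iterator<Cell> cells = cellBlock.iterator();
        List<String> log = new ArrayList<>();
        for (RegionAction action : actions) {
            if (!action.regionOnline) {
                log.add(action.region + ": region unavailable, action failed");
                if (skipCellsOfFailedActions) {
                    // Advance past the cells owned by the failed action's mutations.
                    for (int count : action.cellCounts) {
                        for (int i = 0; i < count; i++) cells.next();
                    }
                }
                continue;
            }
            for (int count : action.cellCounts) {
                for (int i = 0; i < count; i++) {
                    Cell c = cells.next();
                    if (c.region.equals(action.region)) {
                        log.add(action.region + ": applied " + c.row);
                    } else {
                        // Analogous to the row check failing with a wrong-region error.
                        log.add(action.region + ": wrong region for " + c.row);
                    }
                }
            }
        }
        return log;
    }

    // regionA (offline) owns the first two cells; regionB (online) owns the third.
    static List<String> demo(boolean skipCellsOfFailedActions) {
        List<Cell> cellBlock = Arrays.asList(
            new Cell("regionA", "row-a1"),
            new Cell("regionA", "row-a2"),
            new Cell("regionB", "row-b1"));
        List<RegionAction> actions = Arrays.asList(
            new RegionAction("regionA", false, 2),
            new RegionAction("regionB", true, 1));
        return process(actions, cellBlock, skipCellsOfFailedActions);
    }

    public static void main(String[] args) {
        System.out.println("without skip: " + demo(false));
        System.out.println("with skip:    " + demo(true));
    }
}
```

In this sketch, running without the skip makes regionB's mutation read row-a1, which belongs to regionA, and report a wrong-region error. With the skip, the iterator first moves past regionA's two cells, so regionB applies row-b1 as intended.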



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
