hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13374) Small scanners (with particular configurations) do not return all rows
Date Sat, 04 Apr 2015 01:12:33 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14395441#comment-14395441 ]

Hadoop QA commented on HBASE-13374:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  against master branch at commit 6c22333599b9910314f57d0b6a580fb69eb7aa2b.
  ATTACHMENT ID: 12709332

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 4 new or modified

    {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions
(2.4.1 2.5.2 2.6.0)

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:green}+1 protoc{color}.  The applied patch does not increase the total number of
protoc compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 checkstyle{color}.  The applied patch does not increase the total number
of checkstyle errors

    {color:green}+1 findbugs{color}.  The patch does not introduce any  new Findbugs (version
2.0.3) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

  {color:green}+1 site{color}.  The mvn site goal succeeds with this patch.

     {color:red}-1 core tests{color}.  The patch failed these unit tests:

     {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):
	at org.apache.activemq.tests.integration.cluster.distribution.ClusterTestBase.stopServers(ClusterTestBase.java:2231)
	at org.apache.activemq.tests.integration.cluster.distribution.SymmetricClusterWithBackupTest.stopServers(SymmetricClusterWithBackupTest.java:571)
	at org.apache.activemq.tests.integration.cluster.distribution.SymmetricClusterTest.tearDown(SymmetricClusterTest.java:49)

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/13561//testReport/
Release Findbugs (version 2.0.3) warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/13561//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/13561//artifact/patchprocess/checkstyle-aggregate.html

  Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/13561//console

This message is automatically generated.

> Small scanners (with particular configurations) do not return all rows
> ----------------------------------------------------------------------
>                 Key: HBASE-13374
>                 URL: https://issues.apache.org/jira/browse/HBASE-13374
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.0.0, 2.0.0, 1.1.0, 0.98.13
>            Reporter: Jonathan Lawlor
>            Assignee: Jonathan Lawlor
>            Priority: Blocker
>             Fix For: 2.0.0, 1.0.1, 1.1.0, 0.98.13
>         Attachments: HBASE-13374-v1.patch, HBASE-13374-v1.patch, HBASE-13374-v1.patch,
HBASE-13374-v1.patch, small-scanner-data-loss-tests-0.98.patch, small-scanner-data-loss-tests-branch-1.0+.patch
> I recently ran into a couple of data loss issues with small scans. Similar to HBASE-13262,
these issues only appear when scans are configured in such a way that the max result size
limit is reached before the caching limit is reached. As far as I can tell, these issues affect
branches 0.98+.
> I should note that after investigation it looks like the root cause of these issues is
not the same as HBASE-13262. Rather, these issues are caused by errors in the small scanner
logic (I will explain in more depth below).
> Furthermore, I do know that the solution from HBASE-13262 has not made its way into small
scanners (it is being addressed in HBASE-13335). As a result I made sure to test these issues
with the patch from HBASE-13335 applied and I saw that they were still present.
> The following two issues have been observed (both lead to data loss):
> 1. When a small scan is configured with a caching value of Integer.MAX_VALUE, and a maxResultSize
limit that is reached before the region is exhausted, integer overflow will occur. This eventually
leads to a preemptive skip of the regions.
> 2. When a small scan is configured with a maxResultSize that is smaller than the size
of a single row, the small scanner will jump between regions preemptively. This issue seems
to be because small scanners assume that, unless a region is exhausted, at least 2 rows will
be returned from the server. This assumption isn't clearly stated in the small scanners but
is implied through the use of {{skipRowOfFirstResult}}.
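The integer overflow described in issue 1 can be illustrated in isolation. The following is a minimal sketch, not the HBase source: the class and method names are hypothetical, and the assumption is that the client adds one to the caching value before sending the request. It only shows what Java does when a caching value of Integer.MAX_VALUE is incremented, along with two ways such an increment could be guarded.

```java
// Hypothetical sketch, NOT HBase source code. All names here are illustrative.
final class CachingOverflowSketch {

    // Naive increment: Java int arithmetic wraps silently, so
    // Integer.MAX_VALUE + 1 becomes Integer.MIN_VALUE (a negative row count).
    static int naiveRowsToRequest(int caching) {
        return caching + 1;
    }

    // Guarded increment: saturate at Integer.MAX_VALUE instead of wrapping.
    static int guardedRowsToRequest(int caching) {
        return caching == Integer.MAX_VALUE ? caching : caching + 1;
    }

    public static void main(String[] args) {
        int caching = Integer.MAX_VALUE;

        // Wraps around to a negative value.
        System.out.println(naiveRowsToRequest(caching) == Integer.MIN_VALUE);

        // Stays at the maximum.
        System.out.println(guardedRowsToRequest(caching) == Integer.MAX_VALUE);

        // Math.addExact (Java 8+) throws instead of wrapping.
        try {
            Math.addExact(caching, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Saturating keeps the request valid for the extreme configuration, while Math.addExact is the stricter choice when an overflow should be surfaced as an error rather than hidden.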
> Again, I would like to stress that the root cause of these issues is *NOT* related to
the cause of HBASE-13262. These issues occur because of inappropriate assumptions made in the
small scanner logic. The inappropriate assumptions are:
> 1. Integer overflow will not occur when incrementing caching
> 2. At least 2 rows will be returned from the server unless the region has been exhausted
> I am attaching a patch that contains tests to display these issues. If these issues should
be split into separate JIRAs please let me know.
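The second assumption can be modeled with a toy scanner. This is a simplified sketch under stated assumptions, not the HBase client code: regions are plain lists of row keys, a maxResultSize smaller than one row is modeled as the server returning a single row per call, and the client is assumed to resume from the last row it saw (inclusive) and then drop that row from the next response. Only the name skipRowOfFirstResult comes from the issue description; everything else is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model, NOT HBase source code. Regions are sorted lists of row keys.
final class SmallScanSkipSketch {

    // Simulated server call: returns at most maxRowsPerCall rows of the region,
    // starting at startRow (inclusive), or at the first row if startRow is null.
    static List<String> fetch(List<String> region, String startRow, int maxRowsPerCall) {
        List<String> out = new ArrayList<>();
        for (String row : region) {
            if (startRow != null && row.compareTo(startRow) < 0) {
                continue; // before the resume point
            }
            if (out.size() == maxRowsPerCall) {
                break; // result size limit reached
            }
            out.add(row);
        }
        return out;
    }

    // Simulated client: resumes at the last seen row and discards it from the
    // next batch. An empty batch after the discard is read as "region exhausted".
    static List<String> scan(List<List<String>> regions, int maxRowsPerCall) {
        List<String> seen = new ArrayList<>();
        for (List<String> region : regions) {
            String skipRowOfFirstResult = null;
            while (true) {
                List<String> batch = fetch(region, skipRowOfFirstResult, maxRowsPerCall);
                if (skipRowOfFirstResult != null && !batch.isEmpty()
                        && batch.get(0).equals(skipRowOfFirstResult)) {
                    batch = batch.subList(1, batch.size());
                }
                if (batch.isEmpty()) {
                    break; // moves on to the next region
                }
                seen.addAll(batch);
                skipRowOfFirstResult = batch.get(batch.size() - 1);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d", "e", "f"));
        System.out.println(scan(regions, 2)); // two rows per call
        System.out.println(scan(regions, 1)); // one row per call
    }
}
```

In this model, two rows per call returns all six rows, while one row per call returns only the first row of each region: the single row that comes back is the one being skipped, so the client sees an empty batch and leaves the region early.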

This message was sent by Atlassian JIRA
