phoenix-dev mailing list archives

From "Hadoop QA (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-4018) HashJoin may produce nulls for LHS table columns
Date Fri, 14 Jul 2017 02:30:00 GMT


Hadoop QA commented on PHOENIX-4018:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  against master branch at commit 8a34de7a4c74584f5fb3a7ab9aaa64fd6269717b.
  ATTACHMENT ID: 12877200

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of
javac compiler warnings.

    {color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 52 warning messages.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number
of release audit warnings.

    {color:green}+1 lineLengths{color}.  The patch does not introduce lines longer than 100

     {color:red}-1 core tests{color}.  The patch failed these unit tests:

     {color:red}-1 core zombie tests{color}.  There are 5 zombie test(s):
	at org.apache.hadoop.hbase.regionserver.TestHRegion.testgetHDFSBlocksDistribution(
	at org.apache.hadoop.hbase.regionserver.TestPerColumnFamilyFlush.testCompareStoreFileCount(
	at org.apache.hadoop.hbase.regionserver.TestRegionReplicas.testRefresStoreFiles(

Test results:
Javadoc warnings:
Console output:

This message is automatically generated.

> HashJoin may produce nulls for LHS table columns
> ------------------------------------------------
>                 Key: PHOENIX-4018
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.11.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>            Priority: Critical
>         Attachments: PHOENIX-4018-1.patch
> Here is the problem: in HashJoinRegionScanner methods (nextRow, for example) we use
the same scanner context that was created in RSRpcServices. It has limits (e.g. a 2 MB size limit).
Say we have a 3 MB region and the only key that matches the join condition is located
at the end of the region. In HashJoinRegionScanner#nextRow, as we iterate through the region
rows, once we reach the 2 MB limit every region scanner nextRow call returns a single
cell and the scanner context has the SIZE_LIMIT_REACHED_MID_ROW state. But we have no
logic that checks for that, so this single cell is treated as a complete row with all nulls
except one column.
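The failure mode described above can be reproduced on a toy model. The sketch below is not HBase or Phoenix code: `ScannerContext`, `next_row`, and `naive_join` are hypothetical stand-ins, sizes are in arbitrary units rather than megabytes, and the `mid_row` flag only imitates the SIZE_LIMIT_REACHED_MID_ROW state.

```python
from dataclasses import dataclass


@dataclass
class ScannerContext:
    """Toy stand-in for a size-limited scanner context (hypothetical)."""
    size_limit: int
    size_used: int = 0
    mid_row: bool = False  # imitates SIZE_LIMIT_REACHED_MID_ROW


def next_row(cells, ctx):
    """Read one row; stop early if the size limit is hit before the last cell.

    cells is a list of (column, value, size) tuples.
    """
    out = []
    ctx.mid_row = False
    for i, (column, value, size) in enumerate(cells):
        out.append((column, value))
        ctx.size_used += size
        if ctx.size_used >= ctx.size_limit and i < len(cells) - 1:
            ctx.mid_row = True  # partial row: remaining cells are not returned
            break
    return out


def naive_join(rows, ctx, wanted_key, columns):
    """Join that never checks ctx.mid_row, so a partial row looks complete."""
    joined = []
    for key, cells in rows:
        got = dict(next_row(cells, ctx))
        if key == wanted_key:
            # Columns missing from the partial row silently become None.
            joined.append(tuple(got.get(c) for c in columns))
    return joined


# Three rows of two cells each (6 units total) against a limit of 4 units.
rows = [
    ("k1", [("a", "a1", 1), ("b", "b1", 1)]),
    ("k2", [("a", "a2", 1), ("b", "b2", 1)]),
    ("k3", [("a", "a3", 1), ("b", "b3", 1)]),
]
ctx = ScannerContext(size_limit=4)
result = naive_join(rows, ctx, "k3", ["a", "b"])
print(result)  # [('a3', None)]
```

The only matching key sits past the limit, so its row is cut after the first cell and column `b` comes back as a null even though the stored value is `b3`.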
> How to fix it: 
> 1. For the region scanner we may provide NoLimitScannerContext, so we will never get a partial row.
> 2. We need to update the scanner context that we got from RSRpcServices with the real
data, based on the size of the results we are going to return.
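A minimal sketch of how the two proposed steps could fit together, again on a toy model rather than the real HBase classes. `ScannerContext`, `next_row`, and `join_with_fixes` are hypothetical names, and an infinite `size_limit` is used here only to imitate what NoLimitScannerContext is meant to achieve; this is not the attached patch.

```python
from dataclasses import dataclass


@dataclass
class ScannerContext:
    """Toy stand-in for a scanner context with a size limit (hypothetical)."""
    size_limit: float
    size_used: int = 0
    mid_row: bool = False  # imitates SIZE_LIMIT_REACHED_MID_ROW


def next_row(cells, ctx):
    """Read one row of (column, value, size) cells, honoring ctx.size_limit."""
    out = []
    ctx.mid_row = False
    for i, (column, value, size) in enumerate(cells):
        out.append((column, value))
        ctx.size_used += size
        if ctx.size_used >= ctx.size_limit and i < len(cells) - 1:
            ctx.mid_row = True
            break
    return out


def join_with_fixes(rows, rpc_ctx, wanted_key, columns):
    # Step 1: the inner region scan uses a context without limits,
    # so next_row never returns a partial row.
    no_limit = ScannerContext(size_limit=float("inf"))
    joined = []
    for key, cells in rows:
        got = dict(next_row(cells, no_limit))
        if key == wanted_key:
            joined.append(tuple(got.get(c) for c in columns))
            # Step 2: charge the RPC-level context only for the data
            # that is actually going to be returned.
            rpc_ctx.size_used += sum(size for _, _, size in cells)
    return joined


rows = [
    ("k1", [("a", "a1", 1), ("b", "b1", 1)]),
    ("k2", [("a", "a2", 1), ("b", "b2", 1)]),
    ("k3", [("a", "a3", 1), ("b", "b3", 1)]),
]
rpc_ctx = ScannerContext(size_limit=4)
print(join_with_fixes(rows, rpc_ctx, "k3", ["a", "b"]))  # [('a3', 'b3')]
print(rpc_ctx.size_used)  # 2
```

In this model the context from the RPC layer ends up reflecting the two units returned to the client, not the six units scanned to find them.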

This message was sent by Atlassian JIRA
