phoenix-dev mailing list archives

From "wuchengzhi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1570) Data missing when using local index
Date Mon, 05 Jan 2015 08:23:34 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14264331#comment-14264331 ]

wuchengzhi commented on PHOENIX-1570:
-------------------------------------

I think I have found the cause, but I am not able to fix it myself. Please help fix it as soon as possible. Thanks.

For the query
select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where q = '102218';
the field count is 20 (18 from the primary table, plus a rowKey and index_rowkey from the index table), so the valueBitSet bytes are 0, 0, 0, 0, 0, 3, -9, -1, 0, 1. The long value is 260095, which is the correct response from the server.
But on the client side, the RowProjector returned by org.apache.phoenix.compile.ProjectionCompiler.compile(StatementContext, SelectStatement, GroupBy, List<? extends PDatum>) is incorrect.
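
To double-check the byte arithmetic above, here is a minimal standalone sketch. It does not use any Phoenix or HBase classes; it only assumes that the serialized bit set is big-endian, with one 8-byte long followed by a 2-byte count, as the bytes quoted above suggest. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;

final class ValueBitSetBytesSketch {
    // The serialized valueBitSet bytes quoted in this comment.
    static final byte[] SERIALIZED = {0, 0, 0, 0, 0, 3, -9, -1, 0, 1};

    // The trailing big-endian short (read as the number of longs in the variable-length form).
    static short trailingShort(byte[] buf) {
        return ByteBuffer.wrap(buf).getShort(buf.length - Short.BYTES);
    }

    // The big-endian long stored immediately before the trailing short.
    static long longBeforeShort(byte[] buf) {
        return ByteBuffer.wrap(buf).getLong(buf.length - Short.BYTES - Long.BYTES);
    }

    public static void main(String[] args) {
        System.out.println("trailing short = " + trailingShort(SERIALIZED));
        System.out.println("long value     = " + longBeforeShort(SERIALIZED));
        System.out.println("binary         = " + Long.toBinaryString(longBeforeShort(SERIALIZED)));
    }
}
```

The trailing short is 1 and the long is 0x3F7FF = 260095, which matches the value stated above.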

For each expression in RowProjector.columnProjectors, the FieldCount of its schema (fieldIndexByPosition.length) grows by one element, e.g.:
Expression(B)   fieldIndexByPosition is  [0]   (1)
Expression(C)   fieldIndexByPosition is  [0, 1]   (2)
Expression(D)   fieldIndexByPosition is  [0, 1, 1]   (3)
....................
Expression(S)   fieldIndexByPosition is  [0, 1, 1, 1, 1, 2, 2, 3, 4, 5, 6, 6, 7, 7, 8, 9, 10]   (17)
Expression(T)   fieldIndexByPosition is  [0, 1, 1, 1, 1, 2, 2, 3, 4, 5, 6, 6, 7, 7, 8, 9, 10, 11]   (18)

Then expression.evaluate() is called to check whether the column contains a value. It runs:

bitSet.or(ptr);

public void or(ImmutableBytesWritable ptr) {
    if (schema == null) {
        return;
    }
    if (isVarLength()) {
        // Variable-length form: the trailing short holds the number of longs that precede it.
        int offset = ptr.getOffset() + ptr.getLength() - Bytes.SIZEOF_SHORT;
        short nLongs = Bytes.toShort(ptr.get(), offset);
        offset -= nLongs * Bytes.SIZEOF_LONG;
        for (int i = 0; i < nLongs; i++) {
            bits[i] |= Bytes.toLong(ptr.get(), offset);
            offset += Bytes.SIZEOF_LONG;
        }
        maxSetBit = Math.max(maxSetBit, nLongs * BITS_PER_LONG - 1);
    } else {
        // Short form: only the trailing short is read as the bit set.
        long l = Bytes.toShort(ptr.get(), ptr.getOffset() + ptr.getLength() - Bytes.SIZEOF_SHORT);
        bits[0] |= l;
        maxSetBit = Math.max(maxSetBit, BITS_PER_SHORT - 1);
    }
}

private boolean isVarLength() {
    // The form is chosen from the schema's own field count, not from the serialized bytes.
    return schema == null ? false : schema.getFieldCount() - schema.getMinNullable() > BITS_PER_SHORT;
}


Then:
Expression(B).evaluate()
 the bitSet's bits[0] = Bytes.toShort({0, 1}) = 1, so (1 & (1 << 0)) != 0 is true, hasValue = true.
Expression(C).evaluate()
 the bitSet's bits[0] = Bytes.toShort({0, 1}) = 1, so (1 & (1 << 1)) != 0 is false, hasValue = false.
Expression(D).evaluate()
 the bitSet's bits[0] = Bytes.toShort({0, 1}) = 1, so (1 & (1 << 2)) != 0 is false, hasValue = false.
......

Expression(S).evaluate()
 the bitSet's bits[0] = Bytes.toLong({0, 0, 0, 0, 0, 3, -9, -1}) = 260095, so (260095 & (1 << 16)) != 0 is true, hasValue = true.

Expression(T).evaluate()
 the bitSet's bits[0] = Bytes.toLong({0, 0, 0, 0, 0, 3, -9, -1}) = 260095, so (260095 & (1 << 17)) != 0 is true, hasValue = true.

This explains why only B, S and T contain values while the other columns do not: only the schemas of S (17 fields) and T (18 fields) exceed BITS_PER_SHORT, so only they read the long; every other expression reads just the trailing short {0, 1}. It also shows why [ select a,b,c,d,e,f,g,h,i,j,k,l,n,o,p,q,r,s from Miss_data_table where q = '102218' ] returns correct results.
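
The walk-through above can be reproduced with a small standalone sketch. It is not the Phoenix code: it only mimics the short-versus-long choice made by the quoted or() and isVarLength(), and it assumes that getMinNullable() is 0, that the bit for each expression equals its position, and that exactly one long is serialized. The class name, method names, and the column label string are hypothetical.

```java
import java.nio.ByteBuffer;

final class BitSetWidthMismatchSketch {
    static final int BITS_PER_SHORT = 16;
    // The serialized valueBitSet bytes quoted in this comment.
    static final byte[] SERIALIZED = {0, 0, 0, 0, 0, 3, -9, -1, 0, 1};
    // The 18 primary-table columns in projection order (assumption: A and Q come from the index).
    static final String COLUMNS = "BCDEFGHIJKLMNOPRST";

    // Decode the bit set the way the client would for a schema with the given field count.
    static long decode(byte[] buf, int fieldCount) {
        ByteBuffer bb = ByteBuffer.wrap(buf); // big-endian by default
        int shortOffset = buf.length - Short.BYTES;
        if (fieldCount > BITS_PER_SHORT) {
            // Variable-length form: the trailing short is the number of longs before it.
            short nLongs = bb.getShort(shortOffset);
            return bb.getLong(shortOffset - nLongs * Long.BYTES); // first long only
        }
        // Short form: the trailing short itself is treated as the bit set.
        return bb.getShort(shortOffset);
    }

    // True if the bit for this position is set in the decoded value.
    static boolean hasValue(byte[] buf, int fieldCount, int position) {
        return (decode(buf, fieldCount) & (1L << position)) != 0;
    }

    public static void main(String[] args) {
        for (int position = 0; position < COLUMNS.length(); position++) {
            int fieldCount = position + 1; // each schema is one field longer than the previous one
            System.out.println(COLUMNS.charAt(position) + " (fieldCount=" + fieldCount
                    + ") hasValue=" + hasValue(SERIALIZED, fieldCount, position));
        }
    }
}
```

Under these assumptions only B, S and T report hasValue=true, which is the same pattern as in the query result.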

  
This is my analysis; please check it.

> Data missing when using local index
> -----------------------------------
>
>                 Key: PHOENIX-1570
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1570
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.2.2
>         Environment: ubuntu 
> HBase 0.98.7
> Hadoop 2.5.1
> OS: ubuntu
>            Reporter: wuchengzhi
>            Priority: Critical
>
> 1. create a table with the schema below:
> CREATE TABLE IF NOT EXISTS Miss_data_table(
> a BIGINT NOT NULL,
> b VARCHAR,
> c INTEGER,
> d INTEGER,
> e INTEGER,
> f INTEGER,
> g VARCHAR,
> h VARCHAR,
> i INTEGER,
> j VARCHAR,
> k INTEGER,
> l VARCHAR,
> m VARCHAR,
> n INTEGER,
> o INTEGER,
> p VARCHAR,
> q VARCHAR,
> r INTEGER,
> s BIGINT,
> t VARCHAR CONSTRAINT pk PRIMARY KEY(a))
> 2. create a local index on the table for column q:
> create local index idx_q on Miss_data_table (q);
> 3. upsert data into the table:
> upsert into Miss_data_table values(96660688,'hello/TEST-0',156,-1,-1,0,'2013-02-14 18:34:05.0','TEST-1',0,'495839182',0,'50','',0,0,'1818378','102218',0,26,'20141201')
> 4. execute the queries:
> select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where q = '102218';
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> | A        | B            | C    | D    | E    | F    | G    | H    | I    | J    | K    | L    | M    | N    | O    | P    | Q      | R    | S    | T        |
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> | 96660688 | hello/TEST-0 | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | 102218 | NULL | 26   | 20141201 |
> +----------+--------------+------+------+------+------+------+------+------+------+------+------+------+------+------+------+--------+------+------+----------+
> select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where a=96660688;
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> | A        | B            | C    | D    | E    | F    | G                     | H      | I    | J         | K    | L    | M    | N    | O    | P       | Q      | R    | S    | T        |
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> | 96660688 | hello/TEST-0 | 156  | -1   | -1   | 0    | 2013-02-14 18:34:05.0 | TEST-1 | 0    | 495839182 | 0    | 50   | NULL | 0    | 0    | 1818378 | 102218 | 0    | 26   | 20141201 |
> +----------+--------------+------+------+------+------+-----------------------+--------+------+-----------+------+------+------+------+------+---------+--------+------+------+----------+
> // Show the query plan; it confirms that the data is fetched through the local index.
> explain select a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t from Miss_data_table where q = '102218';
> +------------------------------------------+
> |                   PLAN                   |
> +------------------------------------------+
> | CLIENT 1-CHUNK PARALLEL 1-WAY RANGE SCAN OVER _LOCAL_IDX_TEST.MISS_DATA_TABLE [-32768,'102218'] |
> | CLIENT MERGE SORT                        |
> +------------------------------------------+



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
