phoenix-dev mailing list archives

From "Sergey Soldatov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-3112) Partial row scan not handled correctly
Date Tue, 03 Oct 2017 06:27:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Soldatov updated PHOENIX-3112:
-------------------------------------
    Attachment: PHOENIX-3112-v6.patch

Final version that includes both test cases. I had to split it into two files because one (for
the join) depends on the server-side configuration. It can be used for other server-side problems
with partial results (such as the local index scanner during compaction, statistics scanners, etc.).
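
To illustrate what "partial result" means here, the following is a minimal, purely conceptual sketch and not Phoenix or HBase code. It assumes a scanner may hand back the cells of one row in several chunks, each flagged when more cells of the same row are still to come. The chunk layout, the `more_cells_in_row` flag, and the `assemble_rows` helper are hypothetical names chosen for this example. A consumer that treats every chunk as a whole row would see missing columns, whereas merging chunks by row key restores the full row.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Hypothetical chunk shape: (row_key, {column: value}, more_cells_in_row).
Chunk = Tuple[str, Dict[str, str], bool]
Row = Tuple[str, Dict[str, str]]


def assemble_rows(chunks: Iterable[Chunk]) -> Iterator[Row]:
    """Merge consecutive chunks that share a row key into one complete row."""
    pending_key: Optional[str] = None
    pending_cells: Dict[str, str] = {}
    for row_key, cells, more_cells_in_row in chunks:
        if pending_key is not None and row_key != pending_key:
            # The row key changed before the row was closed: flush what we have.
            yield pending_key, pending_cells
            pending_cells = {}
        pending_key = row_key
        # Build a new dict so rows already yielded are never mutated.
        pending_cells = {**pending_cells, **cells}
        if not more_cells_in_row:
            yield pending_key, pending_cells
            pending_key, pending_cells = None, {}
    if pending_key is not None:
        # Input ended while a row was still open.
        yield pending_key, pending_cells


if __name__ == "__main__":
    sample = [
        ("r996", {"AV": "1788000996", "MI": "9961181000"}, False),
        ("r997", {"AV": "1788000997"}, True),   # first part of the row
        ("r997", {"MI": "9971181000"}, False),  # remainder of the same row
    ]
    # Naive handling: each chunk is taken as a full row, so "r997" lacks MI.
    for key, cells, _ in sample:
        print("naive    ", key, cells)
    # Merged handling: one row per key with all of its cells.
    for key, cells in assemble_rows(sample):
        print("assembled", key, cells)
```

In this sketch the naive loop reports three "rows", one of them with a column missing, while `assemble_rows` reports two complete rows.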



> Partial row scan not handled correctly
> --------------------------------------
>
>                 Key: PHOENIX-3112
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3112
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Pierre Lacave
>            Assignee: Sergey Soldatov
>            Priority: Blocker
>             Fix For: 4.12.0
>
>         Attachments: PHOENIX-3112-1.patch, PHOENIX-3112-ssa-v4.patch, PHOENIX-3112-ssa-v5.patch, PHOENIX-3112_v3.patch, PHOENIX-3112-v6.patch, PHOENIX-3112_wip2.patch
>
>
> When selecting from a relatively large table (a few thousand rows), some rows are returned
> with some of their column values missing.
> When the filter is narrowed to return only those specific rows, the values appear as expected.
> {noformat}
> CREATE TABLE IF NOT EXISTS TEST (
>         BUCKET VARCHAR,
>         TIMESTAMP_DATE TIMESTAMP,
>         TIMESTAMP UNSIGNED_LONG NOT NULL,
>         SRC VARCHAR,
>         DST VARCHAR,
>         ID VARCHAR,
>         ION VARCHAR,
>         IC BOOLEAN NOT NULL,
>         MI UNSIGNED_LONG,
>         AV UNSIGNED_LONG,
>         MA UNSIGNED_LONG,
>         CNT UNSIGNED_LONG,
>         DUMMY VARCHAR
>     CONSTRAINT pk PRIMARY KEY (BUCKET, TIMESTAMP DESC, SRC, DST, ID, ION, IC)
> );{noformat}
> Using a Python script to generate a CSV with 5000 rows:
> {noformat}
> for i in xrange(5000):
>     print "5SEC,2016-07-21 07:25:35.{i},146908593500{i},WWWWWWWW,AAA,BBBB,CCCCCCCC,false,{i}1181000,1788000{i},2497001{i},{i},aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{i}".format(i=i)
> {noformat}
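
The generator above uses Python 2 syntax (`xrange` and the `print` statement). For anyone reproducing this on Python 3, an equivalent sketch might look like the following. The names `ROW_TEMPLATE`, `generate_rows`, and `write_csv` are assumptions made for illustration; the row format is copied from the script above, and the output file name matches the `large.csv` used in the load step.

```python
# Python 3 sketch of the CSV generator from the issue description.
# Function and constant names are hypothetical; the row format is unchanged.

ROW_TEMPLATE = (
    "5SEC,2016-07-21 07:25:35.{i},146908593500{i},WWWWWWWW,AAA,BBBB,CCCCCCCC,false,"
    "{i}1181000,1788000{i},2497001{i},{i},"
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{i}"
)


def generate_rows(count=5000):
    """Yield one CSV line per row index, in the same format as the original script."""
    for i in range(count):
        yield ROW_TEMPLATE.format(i=i)


def write_csv(path, count=5000):
    """Write the generated rows to a file, one row per line."""
    with open(path, "w") as out:
        for row in generate_rows(count):
            out.write(row + "\n")


if __name__ == "__main__":
    write_csv("large.csv")
```

Running the file directly writes `large.csv` to the current directory, so no shell redirection is needed.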
> Bulk inserting the CSV into the table:
> {noformat}
> phoenix/bin/psql.py localhost -t TEST large.csv
> {noformat}
> Here we can see one row that contains no TIMESTAMP_DATE and null values in MI and MA:
> {noformat}
> 0: jdbc:phoenix:localhost:2181> select * from TEST 
> ....
> +---------+--------------------------+-------------------+-----------+------+-------+-----------+--------+--------------+--------------+--------------+-------+----------------------------------------------------------------------------+
> | BUCKET  |      TIMESTAMP_DATE      |     TIMESTAMP     |    SRC    | DST  |  ID   |    ION    |   IC   |      MI      |      AV      |      MA      |  CNT  |                                   DUMMY                                    |
> +---------+--------------------------+-------------------+-----------+------+-------+-----------+--------+--------------+--------------+--------------+-------+----------------------------------------------------------------------------+
> | 5SEC    | 2016-07-21 07:25:35.100  | 1469085935001000  | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 10001181000  | 17880001000  | 24970011000  | 1000  | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa1000  |
> | 5SEC    | 2016-07-21 07:25:35.999  | 146908593500999   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9991181000   | 1788000999   | 2497001999   | 999   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa999   |
> | 5SEC    | 2016-07-21 07:25:35.998  | 146908593500998   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9981181000   | 1788000998   | 2497001998   | 998   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa998   |
> | 5SEC    |                          | 146908593500997   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | null         | 1788000997   | null         | 997   |                                                                            |
> | 5SEC    | 2016-07-21 07:25:35.996  | 146908593500996   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9961181000   | 1788000996   | 2497001996   | 996   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa996   |
> | 5SEC    | 2016-07-21 07:25:35.995  | 146908593500995   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9951181000   | 1788000995   | 2497001995   | 995   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa995   |
> | 5SEC    | 2016-07-21 07:25:35.994  | 146908593500994   | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9941181000   | 1788000994   | 2497001994   | 994   | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa994   |
> ....
> {noformat}
> But when selecting that row specifically, the values are correct:
> {noformat}
> 0: jdbc:phoenix:localhost:2181> select * from TEST where timestamp = 146908593500997;
> +---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
> | BUCKET  |      TIMESTAMP_DATE      |    TIMESTAMP     |    SRC    | DST  |  ID   |    ION    |   IC   |     MI      |     AV      |     MA      | CNT  |                                   DUMMY                                   |
> +---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
> | 5SEC    | 2016-07-21 07:25:35.997  | 146908593500997  | WWWWWWWW  | AAA  | BBBB  | CCCCCCCC  | false  | 9971181000  | 1788000997  | 2497001997  | 997  | aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa997  |
> +---------+--------------------------+------------------+-----------+------+-------+-----------+--------+-------------+-------------+-------------+------+---------------------------------------------------------------------------+
> 1 row selected (0.159 seconds){noformat}
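
Because every row in the generated CSV populates every column, one way to spot affected rows after a full scan is to flag any fetched row that contains an empty value. The sketch below is a hypothetical helper and not part of Phoenix: it assumes rows arrive as plain tuples (for instance, as returned by a DB-API cursor's `fetchall()`) in the column order of the table definition, and the names `COLUMNS` and `find_incomplete_rows` are made up for this example.

```python
# Column order taken from the CREATE TABLE statement in the issue description.
COLUMNS = (
    "BUCKET", "TIMESTAMP_DATE", "TIMESTAMP", "SRC", "DST", "ID", "ION",
    "IC", "MI", "AV", "MA", "CNT", "DUMMY",
)


def find_incomplete_rows(rows, columns=COLUMNS):
    """Return (row, missing_column_names) for every row with a None or empty-string value."""
    incomplete = []
    for row in rows:
        missing = [
            name
            for name, value in zip(columns, row)
            if value is None or value == ""
        ]
        if missing:
            incomplete.append((row, missing))
    return incomplete
```

Applied to the full-scan output quoted above, such a check would be expected to flag the row with TIMESTAMP 146908593500997 and leave its neighbours alone.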



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
