phoenix-dev mailing list archives

From "Thomas D'Silva (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3755) Duplicate rows
Date Sun, 09 Apr 2017 04:55:41 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15962038#comment-15962038 ]

Thomas D'Silva commented on PHOENIX-3755:
-----------------------------------------

Can you provide us with some DML statements to reproduce the issue?
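
As a starting point, a reproduction script could take roughly the following shape. This is only a sketch: it reuses the table name and key values from the report quoted below, the UPSERT is an assumption about how the row was written, and it is not known whether these statements actually trigger the duplicates.

```sql
-- Hypothetical write: the report does not say how rows were loaded,
-- so a single-row UPSERT is assumed here.
UPSERT INTO ANALYTIC_ITEM_URL_V2 (DOMAIN_ID, ITEM, URL)
VALUES (17, '435bbf4da995a9b618b3d00d536ba730',
        '/nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn');

-- The point lookup from the report; a unique primary key should yield one row.
SELECT * FROM ANALYTIC_ITEM_URL_V2
WHERE DOMAIN_ID = 17 AND ITEM = '435bbf4da995a9b618b3d00d536ba730';

-- List every primary key that is returned more than once.
SELECT DOMAIN_ID, ITEM, COUNT(*) AS N
FROM ANALYTIC_ITEM_URL_V2
GROUP BY DOMAIN_ID, ITEM
HAVING COUNT(*) > 1;
```

The last query would show whether the duplication is limited to a few keys or spread across the table.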

> Duplicate rows
> --------------
>
>                 Key: PHOENIX-3755
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3755
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>         Environment: Ubuntu 16.04
> HBase version 1.2.2
>            Reporter: Viet Nguyen
>
> I have hit what looks like a major bug in Apache Phoenix version 4.9.0.
> The table was created with the following statement:
> CREATE TABLE ANALYTIC_ITEM_URL_V2
> (
>    DOMAIN_ID INTEGER NOT NULL ,
>    ITEM VARCHAR(40) NOT NULL ,
>    URL VARCHAR(500),
>    CONSTRAINT PK PRIMARY KEY (DOMAIN_ID,ITEM )
> ) SALT_BUCKETS=4, COMPRESSION='SNAPPY', IMMUTABLE_ROWS=true;
> The table's primary key consists of two columns, DOMAIN_ID and ITEM. I executed the query:
> "select * from ANALYTIC_ITEM_URL_V2  where domain_id=17 and item='435bbf4da995a9b618b3d00d536ba730'"
> and was quite surprised by the result:
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> | DOMAIN_ID  |               ITEM                |                                                       URL                                                       |
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> | 17         | 435bbf4da995a9b618b3d00d536ba730  | /nhom-nghi-si-thieu-so-bi-an-va-quyen-luc-khien-tong-thong-trump-tham-bai-truoc-obamacare-2017032819455659.chn  |
> +------------+-----------------------------------+-----------------------------------------------------------------------------------------------------------------+
> As you can see, there are 3 rows with the same primary key. I also ran two counts:
>  - select count(*) from ANALYTIC_ITEM_URL_V2;     in Phoenix
>  - count 'ANALYTIC_ITEM_URL_V2'      in HBase
> The result is about 20M records in HBase but about 35M records in Phoenix. So I think
> Phoenix has a bug in its metadata store. Can anyone help me explain this?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
