phoenix-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-5055) Split mutations batches probably affects correctness of index data
Date Tue, 11 Dec 2018 09:29:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-5055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16716652#comment-16716652 ]

Hadoop QA commented on PHOENIX-5055:
------------------------------------

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12951315/PHOENIX-5055-4.x-HBase-1.4-v4.patch
  against 4.x-HBase-1.4 branch at commit f0881a137c9b1a020d807f4d1651ca139ee1a7be.
  ATTACHMENT ID: 12951315

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 3 new or modified tests.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:red}-1 release audit{color}.  The applied patch generated 1 release audit warnings (more than the master's current 0 warnings).

    {color:red}-1 lineLengths{color}.  The patch introduces the following lines longer than 100:
    +        try (PhoenixConnection conn = DriverManager.getConnection(getUrl(), props).unwrap(PhoenixConnection.class)) {
+            conn.createStatement().executeUpdate("CREATE INDEX " + indexName + " on "  + tableName + " (C) INCLUDE(D)");
+            conn.createStatement().executeUpdate("UPSERT INTO "  + tableName + "(A,B,C,D) VALUES ('A2','B2','C2','D2')");
+            conn.createStatement().executeUpdate("UPSERT INTO "  + tableName + "(A,B,C,D) VALUES ('A3','B3', 'C3', null)");
+                        assertEquals("(" + cell.toString() + ") has different ts", ts, cell.getTimestamp());
+        // set the batch size (rows) to 2 since three are at least 2 mutations when updates a single row
+     * Split the list of mutations into multiple lists. since a single row update can contain multiple mutations,
+    public static List<List<Mutation>> getMutationBatchList(long batchSize, long batchSizeBytes, List<Mutation> allMutationList) {
+                "Mutation types are put or delete, for one row all mutations must be in one batch.");
+            List<Mutation> list = ImmutableList.of(new Put(r3), new Put(r1), new Delete(r1), new Put(r2), new Put(r4), new Delete(r4));

    {color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//testReport/
Release audit warnings: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//artifact/patchprocess/patchReleaseAuditWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-PHOENIX-Build/2202//console

This message is automatically generated.

> Split mutations batches probably affects correctness of index data
> ------------------------------------------------------------------
>
>                 Key: PHOENIX-5055
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5055
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.0.0, 4.14.1
>            Reporter: Jaanai
>            Assignee: Jaanai
>            Priority: Critical
>             Fix For: 5.1.0
>
>         Attachments: ConcurrentTest.java, PHOENIX-5055-4.x-HBase-1.4-v2.patch, PHOENIX-5055-4.x-HBase-1.4-v3.patch,
PHOENIX-5055-4.x-HBase-1.4-v4.patch, PHOENIX-5055-v4.x-HBase-1.4.patch
>
>
> To improve performance, we split the list of mutations into multiple batches in MutationState. A single UPSERT statement that contains null values produces two types of KeyValues (Put and DeleteColumn). These KeyValues should have the same timestamp so that the update to the corresponding row key remains atomic.
> [^ConcurrentTest.java] generates random UPSERT/DELETE statements and executes them concurrently. Some of the generated statements:
> {code:java}
> 1149:UPSERT INTO ConcurrentReadWritTest(A,C,E,F,G) VALUES ('3826','2563','3052','3170','3767');
> 1864:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,E,F,G) VALUES ('2563','4926','3526','678',null,null,'1617');
> 2332:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,E,F,G) VALUES ('1052','2563','1120','2314','1456',null,null);
> 2846:UPSERT INTO ConcurrentReadWritTest(A,B,C,D,G) VALUES ('1922','146',null,'469','2563');
> 2847:DELETE FROM ConcurrentReadWritTest WHERE A = '2563';
> {code}
> Querying through sqlline showed incorrect data in the index tables.
> !https://gw.alicdn.com/tfscom/TB1nSDqpxTpK1RjSZFGXXcHqFXa.png|width=665,height=400!
> Debugging the mutation batches on the server side showed that the DeleteColumns and Puts of a single upsert were split into different batches, and that the DeleteFamily was executed by another thread. As a result, under multiple threads the DeleteColumn's timestamp is larger than the DeleteFamily's.
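> One way to avoid separating a row's Put and Delete is to close a batch only at a row boundary. The following is a minimal sketch under stated assumptions, not the Phoenix implementation: RowMutation and RowAwareBatcher are hypothetical stand-ins for the HBase Mutation type and the batching code, and mutations for the same row are assumed to be adjacent in the input list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an HBase mutation: only row key and type are modeled. */
final class RowMutation {
    final String row;
    final String type; // "Put" or "Delete"

    RowMutation(String row, String type) {
        this.row = row;
        this.type = type;
    }
}

/** Hypothetical helper, for illustration only. */
final class RowAwareBatcher {
    /**
     * Splits mutations into batches holding at most maxRows distinct rows.
     * Assumes mutations for the same row are adjacent in the input, so a
     * batch is only ever closed when a new row starts.
     */
    static List<List<RowMutation>> split(List<RowMutation> mutations, int maxRows) {
        if (maxRows < 1) {
            throw new IllegalArgumentException("maxRows must be >= 1");
        }
        List<List<RowMutation>> batches = new ArrayList<>();
        List<RowMutation> current = new ArrayList<>();
        int rowsInCurrent = 0;
        String previousRow = null;
        for (RowMutation m : mutations) {
            boolean newRow = !m.row.equals(previousRow);
            if (newRow && rowsInCurrent == maxRows) {
                // The current batch is full: close it before the new row starts.
                batches.add(current);
                current = new ArrayList<>();
                rowsInCurrent = 0;
            }
            if (newRow) {
                rowsInCurrent++;
            }
            current.add(m);
            previousRow = m.row;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<RowMutation> all = Arrays.asList(
            new RowMutation("A2", "Put"),
            new RowMutation("A3", "Put"),
            new RowMutation("A3", "Delete"),
            new RowMutation("A4", "Put"));
        for (List<RowMutation> batch : split(all, 2)) {
            StringBuilder sb = new StringBuilder();
            for (RowMutation m : batch) {
                sb.append(m.row).append(':').append(m.type).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

> Splitting the same four mutations by mutation count with a limit of 2 would place the Put and the Delete for row A3 in different batches, which is the situation described above.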
> !https://gw.alicdn.com/tfscom/TB1frHmpCrqK1RjSZK9XXXyypXa.png|width=901,height=120!
>  
> Running the following:
> {code:java}
> conn.createStatement().executeUpdate("CREATE TABLE " + tableName + " (" + "A VARCHAR NOT NULL PRIMARY KEY," + "B VARCHAR," + "C VARCHAR," + "D VARCHAR) COLUMN_ENCODED_BYTES = 0");
> conn.createStatement().executeUpdate("CREATE INDEX " + indexName + " on " + tableName + " (C) INCLUDE(D)");
> conn.createStatement().executeUpdate("UPSERT INTO " + tableName + "(A,B,C,D) VALUES ('A2','B2','C2','D2')");
> conn.createStatement().executeUpdate("UPSERT INTO " + tableName + "(A,B,C,D) VALUES ('A3','B3', 'C3', null)");
> {code}
> Dump of the IndexMemStore:
> {code:java}
> hbase.index.covered.data.IndexMemStore(117): Inserting:\x01A3/0:D/1542190446218/DeleteColumn/vlen=0/seqid=0/value=
> phoenix.hbase.index.covered.data.IndexMemStore(133): Current kv state:
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: \x01A3/0:B/1542190446167/Put/vlen=2/seqid=5/value=B3
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: \x01A3/0:C/1542190446167/Put/vlen=2/seqid=5/value=C3
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: \x01A3/0:D/1542190446218/DeleteColumn/vlen=0/seqid=0/value=
> phoenix.hbase.index.covered.data.IndexMemStore(135): KV: \x01A3/0:_0/1542190446167/Put/vlen=1/seqid=5/value=x
> phoenix.hbase.index.covered.data.IndexMemStore(137): ========== END MemStore Dump ==================
> {code}
>  
> The DeleteColumn's timestamp is larger than the timestamps of the other mutations for the same row.
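> The mismatch can also be detected mechanically by grouping the dumped cells by row key and collecting their timestamps. This is a rough sketch, assuming each cell string has the form row/family:qualifier/timestamp/type/... as in the dump and that the row key contains no '/'; TimestampCheck is a hypothetical helper and not part of Phoenix.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical helper, for illustration only. */
final class TimestampCheck {
    /**
     * Groups cells by row key and collects the distinct timestamps per row.
     * Assumes the layout row/family:qualifier/timestamp/type/... and a row
     * key without '/' characters.
     */
    static Map<String, TreeSet<Long>> timestampsByRow(List<String> cells) {
        Map<String, TreeSet<Long>> result = new LinkedHashMap<>();
        for (String cell : cells) {
            String[] parts = cell.split("/");
            // parts[0] is the row key, parts[2] the timestamp.
            result.computeIfAbsent(parts[0], k -> new TreeSet<>())
                  .add(Long.parseLong(parts[2]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Cell strings copied from the IndexMemStore dump.
        List<String> cells = Arrays.asList(
            "\\x01A3/0:B/1542190446167/Put/vlen=2/seqid=5/value=B3",
            "\\x01A3/0:C/1542190446167/Put/vlen=2/seqid=5/value=C3",
            "\\x01A3/0:D/1542190446218/DeleteColumn/vlen=0/seqid=0/value=",
            "\\x01A3/0:_0/1542190446167/Put/vlen=1/seqid=5/value=x");
        for (Map.Entry<String, TreeSet<Long>> e : timestampsByRow(cells).entrySet()) {
            if (e.getValue().size() > 1) {
                System.out.println("Row " + e.getKey() + " has mixed timestamps: " + e.getValue());
            }
        }
    }
}
```

> A row written by a single upsert is expected to report exactly one timestamp here; more than one indicates that its mutations were stamped separately.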
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
