impala-reviews mailing list archives

From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] Updates to DML statements for Impala + Kudu
Date Wed, 25 Jan 2017 20:06:06 GMT
Matthew Jacobs has posted comments on this change.

Change subject: Updates to DML statements for Impala + Kudu
......................................................................


Patch Set 4:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/5646/3/docs/topics/impala_delete.xml
File docs/topics/impala_delete.xml:

PS3, Line 89: There
Typo: should be "the".


http://gerrit.cloudera.org:8080/#/c/5646/4/docs/topics/impala_delete.xml
File docs/topics/impala_delete.xml:

PS4, Line 79:     <p>
            :       The conditions in the <codeph>WHERE</codeph> clause can refer to
            :       any combination of primary key columns or other columns.
            :     </p>
Maybe worth mentioning that predicates on the PK will be faster. This statement still has
a scan in it, so we push some predicates to the scan (that's described somewhere else).


PS4, Line 89: There
Typo: should be "The".


PS4, Line 88: 
            :       There <codeph>WHERE</codeph> clause can refer to any combination of columns,
            :       regardless of whether the columns are part of the primary key.
This seems to duplicate the statement two paragraphs above.


PS4, Line 93:     <p>
            :       If some rows cannot be deleted because their
            :       some primary key columns are not found, due to their being deleted
            :       by a concurrent <codeph>DELETE</codeph> operation,
            :       the statement succeeds but returns a warning.
            :     </p>
            : 
            :     <p>
            :       After the statement finishes, there might be more or fewer rows than expected in the table,
            :       due to other <codeph>INSERT</codeph>, <codeph>DELETE</codeph>, <codeph>UPDATE</codeph>,
            :       or <codeph>UPSERT</codeph> statements running concurrently on the same table.
            :     </p>
These could be combined, I think, and made clearer, e.g.:

Because DML statements may conflict with one another (ref consistency?), a DELETE statement
may attempt to delete rows that have already been deleted, in which case the statement succeeds
but a warning is returned. A DELETE statement may also conflict with an INSERT statement, resulting
in more rows than expected in the target table.
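
A toy sketch of the behavior that wording describes, in Python. This is not Impala or Kudu
code: a dict stands in for the table, delete_rows is a hypothetical name, and it is an
assumption that rows already removed by a concurrent DELETE would be counted as row errors.

```python
# Toy model only: a dict keyed by primary key stands in for a Kudu table.
# Nothing here calls Impala or Kudu; all names are hypothetical.

def delete_rows(table, keys):
    """Delete each key; count keys that are already gone as row errors."""
    modified = 0
    row_errors = 0
    for key in keys:
        if key in table:
            del table[key]
            modified += 1
        else:
            # Assumed case: a concurrent DELETE removed the row after the scan.
            row_errors += 1
    return modified, row_errors


table = {0: 0, 1: 1, 2: 0, 3: 1}
# The scan phase picks the rows matching "id < 3".
scanned = [k for k in table if k < 3]
# Before the deletes are applied, a concurrent DELETE removes id = 1
# and a concurrent INSERT adds id = 9.
del table[1]
table[9] = 1
modified, row_errors = delete_rows(table, scanned)
print(f"Modified {modified} row(s), {row_errors} row error(s)")
```

The statement still succeeds, one row is reported as an error, and the table ends up with a
row (id = 9) that the DELETE never saw.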


PS4, Line 108:       No message or return value indicates how many rows were deleted by the statement.
This is not true: we show it in the shell and in the profile (but not via *DBC/HS2).


Query: select * from t
Query submitted at: 2017-01-25 11:09:23 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=c410195daa4fa5e:aa39998900000000
+----+---------+
| id | int_col |
+----+---------+
| 1  | 1       |
| 5  | 1       |
| 6  | 0       |
| 7  | 1       |
| 0  | 0       |
| 2  | 0       |
| 4  | 0       |
| 3  | 1       |
+----+---------+
Fetched 8 row(s) in 5.56s
[localhost:21000] > delete t where id < 3;
Query: delete t where id < 3
Query submitted at: 2017-01-25 11:11:04 (Coordinator: http://mj-desktop.ca.cloudera.com:25000)
Query progress can be monitored at: http://mj-desktop.ca.cloudera.com:25000/query_plan?query_id=5b47f095b366a4ac:c972171800000000
Modified 3 row(s), 0 row error(s) in 0.13s
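
For a client that only has the shell output, the summary line above can be parsed to recover
the counts. A minimal Python sketch, assuming the line keeps exactly the format shown in this
transcript; parse_dml_summary is a hypothetical helper, not part of impala-shell.

```python
import re

# Matches the summary line exactly as it appears in the transcript above.
SUMMARY_RE = re.compile(
    r"Modified (\d+) row\(s\), (\d+) row error\(s\) in ([\d.]+)s"
)


def parse_dml_summary(shell_output):
    """Return (modified, row_errors, seconds), or None if no summary line."""
    match = SUMMARY_RE.search(shell_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


output = """Query: delete t where id < 3
Modified 3 row(s), 0 row error(s) in 0.13s"""
print(parse_dml_summary(output))  # prints (3, 0, 0.13)
```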


PS4, Line 140: DELETE FROM time_series WHERE
             :   year = 2016 AND month IN (11,12) AND day > 15;
Maybe worth mentioning that this one would be fastest, assuming year, month, and day are the PK.
In the examples above it, we cannot push anything with "OR" to the scan.


http://gerrit.cloudera.org:8080/#/c/5646/4/docs/topics/impala_update.xml
File docs/topics/impala_update.xml:

PS4, Line 62:       The conditions in the <codeph>WHERE</codeph> clause are the same ones allowed
            :       for the <codeph>SELECT</codeph> statement.
Same comment as in the DELETE case: worth mentioning that predicates on PKs will be faster.


PS4, Line 77: their
            :       some
Should be "the".


PS4, Line 77:  If some rows cannot be updated because their
            :       some primary key columns are not found, due to their being deleted
            :       by a concurrent <codeph>DELETE</codeph> operation,
            :       the statement succeeds but returns a warning.
            :     </p>
            : 
            :     <p>
            :       The result set of this statement is always the empty set (zero rows).
            :       No message or return value indicates how many rows were deleted by the statement.
Same comment as in DELETE about combining these.


PS4, Line 84:       The result set of this statement is always the empty set (zero rows).
            :       No message or return value indicates how many rows were deleted by the statement.
Same as DELETE: this should return a message in the shell but not via *DBC/HS2. It is in the
profile too. This holds for all DML.


PS4, Line 85: deleted
This should be "updated".


PS4, Line 98: <p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
As we discussed in the meeting, this probably doesn't apply to DML (please update the other DML
statements as well).


PS4, Line 144: but more efficient.
Note that this is still not pushed down to the scan.


http://gerrit.cloudera.org:8080/#/c/5646/4/docs/topics/impala_upsert.xml
File docs/topics/impala_upsert.xml:

PS4, Line 41: <indexterm audience="hidden">UPSERT statement</indexterm>
            :       Acts as a combination of the <codeph>INSERT</codeph>
            :       and <codeph>UPDATE</codeph> statements.
Not sure if we should state this in the docs.


PS4, Line 78: (Note: the square brackets are part of the syntax.)
This ends up formatted oddly in the PDF; maybe move it to the next line or out of the code block.


PS4, Line 104:     <p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
Same as the other statements: this probably doesn't apply to DML.


-- 
To view, visit http://gerrit.cloudera.org:8080/5646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I60512b7957fb53d86d3123a4f1d46fbb355f4665
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Ambreen Kazi <ambreen.kazi@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcryans@apache.org>
Gerrit-Reviewer: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <todd@apache.org>
Gerrit-HasComments: Yes
