ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexander Paschenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-6022) SQL: add native batch execution support for DML statements
Date Fri, 12 Jan 2018 10:43:00 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-6022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323815#comment-16323815
] 

Alexander Paschenko commented on IGNITE-6022:
---------------------------------------------

[~rkondakov]
Hi, my comments:

1. Do we really need additional ctor in {{SqlFieldsQueryEx}}? Args field is a list, not an
array, and we have null check at {{addBatchedArgs}}, so nothing prevents us from using old
ctor and adding as much args as we like. I don't think that pre-allocating list of given size
justifies existence of an additional ctor we clearly can live without.
2. {{DmlStatementsProcessor.doInsertBatched}}: why do we not fail fast on unexpected exceptions
and instead try to convert non {{IgniteSQLException}} s to {{SQLException}} s? [~vozerov]
what could be correct behavior here, how do you think? I believe we should handle only {{IgniteSQLException}}
s at this point.
3. {{DmlUtils.isBatched}} can be greatly simplified and turned into a one-liner, please do
so ({{return (instanceof && ((QueryEx)isBatched)}})
4. {{SqlFieldsQueryEx.isBatched}} - please use {{F.isEmpty}}
5. {{JdbcRequestHandler.executeBatchedQuery}}: you don't need {{return}} in {{catch}} clause,
instead please move everything after {{catch}} into {{try}}. Local var {{qryRes}} will be
declared inside {{try}} then too.
6. Why does {{UpdatePlanBuilder.checkPlanCanBeDistributed}} count {{DmlUtils.isBatched}} unconditionally?
Probably there are some cases when batch can be executed in distributed manner?
7. {{DmlBatchSender.add}}: you can simplify code and get rid of duplicate code at the end
of this method by rewriting condition to {{if (batch.put(...) != null || batch.size() >=
size)}}
8. {{DmlBatchSender.processPage}}: this constant copying of maps on each page looks quite
suspicious to me. To avoid this, just keep two maps instead of one where needed: {{<Object,
Integer>}} and {{Object, EntryProcessor}} (same sets of keys) and pass both to {{processPage}}.
Method {{countAllRows}} will get simpler too (it will need only values of the first map).
9. Multiple violations of coding conventions - please don't put closing curly brace and anything
else on the same line - like this: {{} catch {}}, instead you should move {{catch}} to the
next line.
10. In the cases where there are maps or lists with contents that are hard to understand intuitively
I would write concise comments about what those tons of generic args mean, or what are those
lists of lists of lists.
11. {{UpdatePlan}}: {{createRows(Object[])}} and {{extractArgsValues}} contain parts of clearly
copy-pasted code. Can't we unite those? Probably {{createRows(Object[])}} should just call
{{extractArgsValues}}?

> SQL: add native batch execution support for DML statements
> ----------------------------------------------------------
>
>                 Key: IGNITE-6022
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6022
>             Project: Ignite
>          Issue Type: Task
>          Components: sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>            Assignee: Roman Kondakov
>              Labels: iep-1, performance
>             Fix For: 2.4
>
>
> We have batch execution support for JDBC and ODBC drivers. This decreases number of network
hops. However, we do not have any batch execution support on the server side. It means that
for batch of N similar statements, every statement will go through the whole execution chain
- parsing, splitting, communication with servers. And while parsing and splitting might be
avoided with help of statement cache, the most heavy part - network communication - is still
there.
> We need to investigate how to optimize the flow for batch updates. Possible improvements:
> 1) Execute statements with certain degree of parallelism;
> 2) Send several query execution requests to the server at once;
> 3) Ensure that caches are used properly for batching - we should not parse the same request
multiple times.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message