ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Denis Magda <dma...@apache.org>
Subject Re: SQL on PARTITIONED vs REPLICATED cache
Date Wed, 12 Apr 2017 21:20:38 GMT
Dmitriy,

No, I think that Sergi supposed a type of cache which reference is used for a query execution.
In my example

>> 2. Error-prone scenario - *replicatedCache*.query(“SELECT * FROM
>> partitionedCache … JOIN replicatedCache …”);

*replicatedCache* reference is used for the query execution and, as I understand, this causes
the issue.

Sergi, please clarify.

—
Denis

> On Apr 12, 2017, at 1:51 PM, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> 
> Denis, I think that you meant selecting from replicated cache first as an
> invalid scenario, but provided the wrong example. Here is the correct
> example for the invalid query:
> 
> SELECT * FROM replicatedCache … JOIN partitionedCache …”
> 
> I do agree, we should make the change, as long as we keep the flag to
> enable the old behavior.
> 
> D.
> 
> On Wed, Apr 12, 2017 at 12:50 PM, Denis Magda <dmagda@apache.org> wrote:
> 
>> Sergi,
>> 
>> As far as I understand you’re considering an example below:
>> 
>> IgniteCache partitioneCache = ...;
>> IgniteCache replicatedCache = …;
>> 
>> 1. Valid scenario - *partitionedCache*.query(“SELECT * FROM
>> partitionedCache … JOIN replicatedCache …”);
>> 2. Error-prone scenario - *replicatedCache*.query(“SELECT * FROM
>> partitionedCache … JOIN replicatedCache …”);
>> 
>> Do you mean 2. as the issue? If it’s so then can’t we just detect on our
>> own that all the caches are replicated and execute a query more optimal?
>> This should omit necessity to add isReplicatedOnly()?
>> 
>> —
>> Denis
>> 
>>> On Apr 12, 2017, at 7:07 AM, Andrey Mashenkov <
>> andrey.mashenkov@gmail.com> wrote:
>>> 
>>> Yes, it's reasonable.
>>> 
>>> On Wed, Apr 12, 2017 at 3:23 PM, Sergi Vladykin <
>> sergi.vladykin@gmail.com>
>>> wrote:
>>> 
>>>> Good point, but I'm not sure. The difference is that on client node you
>>>> should not be able to enable isLocal, while isReplicatedOnly is
>> perfectly
>>>> valid. What do you think?
>>>> 
>>>> Sergi
>>>> 
>>>> 2017-04-12 15:18 GMT+03:00 Andrey Mashenkov <andrey.mashenkov@gmail.com
>>> :
>>>> 
>>>>> Sergi,
>>>>> 
>>>>> Got it.
>>>>> 
>>>>> Does query execution way and results will be same for isReplicatedOnly
>>>> flag
>>>>> and for isLocal flag turned on?
>>>>> If my understanding is correct, we will get same results and there is
>> no
>>>>> need to introduce a new flag.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Apr 12, 2017 at 2:54 PM, Sergi Vladykin <
>>>> sergi.vladykin@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Ok, let it be an exception. I'm just saying that the thing does not
>>>> work
>>>>>> now.
>>>>>> 
>>>>>> Sergi
>>>>>> 
>>>>>> 2017-04-12 14:50 GMT+03:00 Andrey Mashenkov <
>>>> andrey.mashenkov@gmail.com
>>>>>> :
>>>>>> 
>>>>>>> Sergi,
>>>>>>> 
>>>>>>> I wounder how it is possible?
>>>>>>> 
>>>>>>> Looks like it is impossible to run query on replicated cache,
but
>>>>> select
>>>>>>> data from a
>>>>>>> partitioned table. It will result with IlleagalStateException
on
>>>> stable
>>>>>>> topology or
>>>>>>> IgniteCacheException on unstable topology.
>>>>>>> See ReduceQueryExecutor.stableDataNodes() and
>>>>>>> replicatedUnstableDataNodes()
>>>>>>> methods.
>>>>>>> 
>>>>>>> BTW, IlleagalStateException with no message is confusing.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, Apr 12, 2017 at 2:36 PM, Sergi Vladykin <
>>>>>> sergi.vladykin@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Andrey,
>>>>>>>> 
>>>>>>>> Because if you run query on replicated cache, but select
data from
>>>> a
>>>>>>>> partitioned table, you will get only a part of the result.
>>>>>>>> 
>>>>>>>> Igor,
>>>>>>>> 
>>>>>>>> You are mostly right, but
>>>>>>>> 
>>>>>>>> 1. Performance characteristics may change.
>>>>>>>> 2. Ignite SQL processing pipeline may not support all the
stuff in
>>>> H2
>>>>>> SQL
>>>>>>>> and fail in some case where it worked previously.
>>>>>>>> 
>>>>>>>> Because of this the change may affect existing applications
and I
>>>>> want
>>>>>> to
>>>>>>>> have it in 2.0 to make it legal.
>>>>>>>> 
>>>>>>>> Sergi
>>>>>>>> 
>>>>>>>> 2017-04-12 14:10 GMT+03:00 Igor Sapego <isapego@gridgain.com>:
>>>>>>>> 
>>>>>>>>> Also, is it really a breaking change if the results are
wrong?
>>>>>>>>> To me it looks more like a bugfix, i.e. you can't break
something
>>>>>>>>> that does not work properly.
>>>>>>>>> 
>>>>>>>>> Best Regards,
>>>>>>>>> Igor
>>>>>>>>> 
>>>>>>>>> On Wed, Apr 12, 2017 at 2:04 PM, Andrey Mashenkov <
>>>>>>>>> andrey.mashenkov@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> Sergi,
>>>>>>>>>> 
>>>>>>>>>> How can query to replicated cache leads to to wrong
results?
>>>>>>>>>> Is it due to we can read backup entries?
>>>>>>>>>> 
>>>>>>>>>> On Wed, Apr 12, 2017 at 12:31 PM, Sergi Vladykin
<
>>>>>>>>> sergi.vladykin@gmail.com
>>>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Guys,
>>>>>>>>>>> 
>>>>>>>>>>> I want to introduce another breaking change for
2.0.
>>>>>>>>>>> 
>>>>>>>>>>> Currently SQL is being processed differently
when we call
>>>>> method
>>>>>>>>> `query`
>>>>>>>>>> on
>>>>>>>>>>> partitioned cache and on replicated: on replicated
cache we
>>>> do
>>>>>> not
>>>>>>> do
>>>>>>>>> any
>>>>>>>>>>> extra processing and execute the query as is
on current node.
>>>>>>>>>>> 
>>>>>>>>>>> This behavior historically existed for performance
reasons.
>>>> But
>>>>>> it
>>>>>>> is
>>>>>>>>> not
>>>>>>>>>>> obvious and leads to wrong query results. This
issue becomes
>>>>> even
>>>>>>>> more
>>>>>>>>>>> creepy with JDBC and ODBC drivers.
>>>>>>>>>>> 
>>>>>>>>>>> In 2.0 I want to execute all the SQL queries
the same way
>>>>> through
>>>>>>> the
>>>>>>>>>> whole
>>>>>>>>>>> processing pipeline to guaranty the correct result
>>>>> irrespectively
>>>>>>> to
>>>>>>>>> the
>>>>>>>>>>> cache that was the query originator.
>>>>>>>>>>> 
>>>>>>>>>>> To be able to have the old behavior (skip all
the
>>>> preprocessing
>>>>>> and
>>>>>>>> run
>>>>>>>>>>> query on current node) add a flag isReplicatedOnly()
on
>>>>> SqlQuery
>>>>>>> and
>>>>>>>>>>> SqlFieldsQuery. It will be disabled by default
and if one
>>>> knows
>>>>>>> that
>>>>>>>>> the
>>>>>>>>>>> only replicated tables participate in a query,
then he can
>>>>> enable
>>>>>>> it
>>>>>>>>> for
>>>>>>>>>>> better performance.
>>>>>>>>>>> 
>>>>>>>>>>> Sergi
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Andrey V. Mashenkov
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrey V. Mashenkov
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Best regards,
>>>>> Andrey V. Mashenkov
>>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> Andrey V. Mashenkov
>> 
>> 


Mime
View raw message