ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Igor Tanackovic <igor.tanacko...@gmail.com>
Subject Re: SQL Engine
Date Wed, 17 Oct 2018 23:42:06 GMT
Dmitriy,

Correct me if I’m wrong, but the concept is to store everything off heap - which is perfectly
fine :). So, the question is how SQL engine actually works. Analyzing profiler object generation
and sizes on heap, I’ve learned that for each query entire rows in question are copied on
heap which, I’d say is a little bit inefficient (more or less depending on domain object).
Maybe an example will help to further elaborate on the issue.

Lets say we have an object ORGANIZATION which has only two query enabled fields (annotated
with @QuerySqlField) and 20 more value fields (text/blobs, …) which are not used in queries
(not annotated with @QuerySqlField and thus not visible by SQL engine). So, how COUNT works?
Will SQL engine copy on heap just a part of ORGANIZATION rows (2 fields that might be part
of COUNT query) or entire rows (sql enabled fields + text/blob fields…)?

From what I saw during profiling, I’d say entire rows will be copied on heap. Just to compare,
this is our, real world example:

Object A (~300k rows, each row has 5 columns, all sql enabled), COUNT(*) takes ~2MB on heap
Object B (~25k rows, each row has 16 columns, 5 sql enabled fields and 11 text value fields),
COUNT(*) takes ~60MB on heap - nearly 30 times more


Regards,
Igor
 
    

> On Oct 18, 2018, at 12:24 AM, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> 
> I do not understand - why are we copying values on-heap? If we must copy
> something, why not copy rows into some off-heap space and read what we have
> from there? At least this way we will not burden the JVM with extra GC
> pressure.
> 
> D.
> 
> On Wed, Oct 17, 2018 at 10:33 AM Igor Tanackovic <igor.tanackovic@gmail.com>
> wrote:
> 
>> Vladimir,
>> 
>> Thanks for explanation... that’s true - ignite never deserialize values
>> during query execution. The point here is why copying fields to heap that
>> sql engine could not use (not annotated as QuerySqlField)?
>> SQL count is a perfect example where you can benefit (heap space and gc)
>> copying only field sql engine could use.
>> 
>> Regards,
>> Igor
>> 
>> On Wed, Oct 17, 2018 at 18:47 Vladimir Ozerov <vozerov@gridgain.com>
>> wrote:
>> 
>>> Hi Igor,
>>> 
>>> We never deserialize values during query execution. Instead, we copy the
>>> row to heap and extract fields as needed. In general case it is
>> impossible
>>> to avoid reading the whole row because we do not know whether this is
>>> COUNT(*) or COUNT(*) WHERE <something> or COUNT(<something>) WHERE
>>> <something>.
>>> We do handle plain COUNT(*) as speical case and iterate over index only,
>>> but event this simple query cannot avoid row reads in general case when
>> new
>>> snapshot mode is enabled, because an entry in the index may be not
>> visible
>>> to current transaction.
>>> 
>>> On Wed, Oct 17, 2018 at 7:19 PM igor.tanackovic <
>> igor.tanackovic@gmail.com
>>>> 
>>> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> Seems that SQL engine always deserialize whole objects instead of using
>>>> just
>>>> SQL enabled fields (annotated with @QuerySqlField). This may have a
>> huge
>>>> impact on Ignite heap usage and GC overhead as well.
>>>> 
>>>> For example, we have a cache holding big objects but with only two sql
>>>> query
>>>> fields which for each query execution (SELECT COUNT(*) FROM 'cache')
>>>> consumes large amount on heap memory (~300MB). As a proof of concept,
>> we
>>>> divided the same cache to *index* cache with only sql query field and a
>>>> *data* holding whole object for materialization. The same query (SELECT
>>>> COUNT(*) FROM 'index-cache') consumes ~25 time less memory! The same is
>>>> true
>>>> for all other queries.
>>>> 
>>>> The obvious workaround would be to always have separated regions for
>>>> indexes
>>>> (sql query enabled region) and a data/value region for materialization,
>>> but
>>>> it might be a good idea to fix this in a systematic way during off heap
>>>> deserialization.
>>>> 
>>>> Regards,
>>>> Igor
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>>>> 
>>> 
>> 


Mime
View raw message