ignite-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Semen Boikov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (IGNITE-2921) ScanQueries over local partitions are not optimal
Date Fri, 29 Apr 2016 12:19:13 GMT

    [ https://issues.apache.org/jira/browse/IGNITE-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263970#comment-15263970

Semen Boikov commented on IGNITE-2921:

Hi Artem,

Reviewed your fix, looks good, have some minor comments:
- please change CacheWeakQueryIteratorsHolder to avoid extra AutoCloseable creation (CacheIteratorConverter
is already AutoCloseable, CacheQueryFuture also can implement it). 
- scan iterator is wrapped twice in GrdiCacheAdaper.igniteIterator: first in CacheWeakQueryIteratorsHolder
and then in GrdiCacheAdaper.igniteIterator, let's keep only one wrapper.
- do we really always need copy nodes collection inside 'nodes(final GridCacheContext<?,
?> cctx, @Nullable final ClusterGroup prj, @Nullable final Integer part)'?
- please take a look at this place in PeekValueExpiryAwareIterator: we call 'unwrapBinaryIfNeeded'
for key and value, then it possible that 'unwrapBinaryIfNeeded' is called one more time inside
'checkPredicate'. Can we get rid of one call?

> ScanQueries over local partitions are not optimal
> -------------------------------------------------
>                 Key: IGNITE-2921
>                 URL: https://issues.apache.org/jira/browse/IGNITE-2921
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 1.5.0.final
>            Reporter: Denis Magda
>            Assignee: Artem Shutak
>            Priority: Blocker
>              Labels: community, important, performance
>             Fix For: 1.6
>         Attachments: LocalIteratorStuff.java, ScanQueryStuff.java
> Presently scan queries over local partitions are not executed optimally. 
> If to run a scan query over a specific partition (by setting {{query.setPartition(...)}}
parameter and or {{query.setLocal(true)}}) and start iterating over entries we will see that
the Thread, that iterates over the data, waits for some event to happen.
> In fact the Thread waits while a system pool's thread prepares an iterator with entries
for it and only after that iterates over the returned result set. The flow looks this way:
> - {{GridCacheLocalQueryFuture}} is created;
> - when {{QueryCursor.iterator().next}} is called from the app thread (the Thread above),
{{GridCacheLocalQueryFuture.execute()}} methods puts closure that will prepare content for
the iterator in the system pool.
> -  a system Thread execute {{GridCacheQueryManager.runQuery()}} reading all the entries
from partition and passing them back to the Thread at line 1553 by calling {{onPageReady(...)}}
> The other bottleneck is that a system thread gets all the entries and passes them to
the Thread which will lead to more garbaged Java heap especially if cache is {{OFFHEAP_TIRED}}.
> Run attached test ({{ScanQueryStuff}}) and you will see with Visual VM that most of the
time the test spends executing the code from system threads.
> Finally, what have to be done:
> - if ScanQuery is supposed to be executed locally (setPartition() refers to local partition
or setLocal is set to true) then the calling application thread has to iterate over the data
avoiding usage of the system pool;
> - internal code mustn't read all entries from a partition initially. The iterator has
to get one entry next after another. This will be a memory backpressure mechanism especially
> My assumption is that the fixed version has to work in a similar way to iteration over
local entries - {{cache.localEntries(CachePeekMode.PRIMARY);}}. Run attached {{LocalIteratorStuff}}
to see with Visual VM that the application thread is fully utilized and system threads are

This message was sent by Atlassian JIRA

View raw message