ignite-user mailing list archives

From Alexei Scherbakov <alexey.scherbak...@gmail.com>
Subject Re: off heap indexes - setSqlOnheapRowCacheSize - how does it improve efficiency ?
Date Thu, 26 May 2016 16:46:35 GMT
You should measure performance on real-life cases and see if it's
enough for you.
Ignite performs well in both modes.
If you really want to use ONHEAP_TIERED, you must tune GC and heap size as
described here [1].
Make sure you have enough memory for your dataset.
The goal is to avoid long GC pauses.

[1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
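
To make "measure and watch for GC pauses" concrete, here is a minimal sketch of a timing harness. It is not part of Ignite: the class name QueryBenchmark and its methods are hypothetical, and the workload in main is only a stand-in for your real query (for example a JDBC query plus result-set traversal). It runs a few warm-up iterations, averages the measured runs, and reports how much collection time the JVM accumulated meanwhile, using only the standard java.lang.management API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Callable;

/** Hypothetical helper (not an Ignite API): times a workload and reports GC time spent meanwhile. */
final class QueryBenchmark {
    /** Mean wall-clock time per measured run and total GC time accumulated over those runs. */
    static final class Result {
        public final double avgMillis;
        public final long gcMillis;

        Result(double avgMillis, long gcMillis) {
            this.avgMillis = avgMillis;
            this.gcMillis = gcMillis;
        }
    }

    /** Sums the accumulated collection time of all collectors in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the workload warmupRuns times unmeasured, then measuredRuns times measured. */
    static Result measure(Callable<?> workload, int warmupRuns, int measuredRuns) throws Exception {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            workload.call();
        }
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            workload.call();
        }
        long elapsedNanos = System.nanoTime() - start;
        long gcAfter = totalGcMillis();
        return new Result(elapsedNanos / 1_000_000.0 / measuredRuns, gcAfter - gcBefore);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; replace it with the real query and result traversal.
        Result r = measure(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        }, 3, 10);
        System.out.printf("avg = %.3f ms, GC time during measured runs = %d ms%n",
            r.avgMillis, r.gcMillis);
    }
}
```

This only sees GC activity in the JVM where it runs (the client side). Pauses on the server node still have to be read from the server's GC log. A GC time that is a large share of the measured time suggests heap pressure rather than a slow query engine.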

2016-05-26 19:40 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:

> OK, I will try it. However, why OFFHEAP_TIERED? It seems to be not as fast
> as ONHEAP_TIERED.
>
> 2016-05-26 18:32 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
>
>> We are talking about count(*) query performance, right ?
>> WriteBehind is for writing to CacheStore in the async mode.
>>
>> If yes, do the following:
>>
>> 1) Set OFFHEAP_TIERED mode and reduce max heap memory, for example to 4 GB.
>> 2) Update to Ignite 1.6.
>> 3) Measure query performance. Run the query several times and use the average
>> value as the estimate.
>> 4) If it's not as expected, show me the GC logs.
>>
>>
>>
>> 2016-05-26 18:28 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
>>
>>> No, I am using ONHEAP_TIERED.
>>>
>>> Maybe WriteBehind should be turned on?
>>> My app does exactly one thing: it initializes hot loading.
>>>
>>> As for the JDBC client, I showed a fragment of the code in my previous
>>> post.
>>>
>>> 2016-05-26 16:15 GMT+02:00 Alexei Scherbakov <
>>> alexey.scherbakoff@gmail.com>:
>>>
>>>> I see long pauses in your GC log (> 3 seconds).
>>>> This means your app puts high pressure on the heap.
>>>> It's hard to tell why without knowing what your app is doing.
>>>>
>>>> Are you using OFFHEAP_TIERED?
>>>> If yes, try to reduce sqlOnheapRowCacheSize value.
>>>>
>>>>
>>>>
>>>>
>>>> 2016-05-26 14:57 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
>>>>
>>>>> OK,
>>>>> I am going to add new machines to the Ignite cluster. First, please look
>>>>> at my GC log file in the previous message.
>>>>>
>>>>> 2016-05-26 13:39 GMT+02:00 Alexei Scherbakov <
>>>>> alexey.scherbakoff@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> The initial question was about setSqlOnheapRowCacheSize and I think
>>>>>> now it is clear how to improve SQL performance using this parameter.
>>>>>>
>>>>>> If you are dissatisfied with Ignite performance, I suggest you
>>>>>> start a new thread on this,
>>>>>> providing detailed info about your performance test, such as
>>>>>> cluster configuration, server GC settings, and test sources.
>>>>>>
>>>>>> As already mentioned, the Ignite SQL engine (H2) has the same or slightly
>>>>>> lower performance than PostgreSQL.
>>>>>> Ignite really starts to shine when used as a distributed data grid
>>>>>> holding a large amount of data in memory on several nodes.
>>>>>>
>>>>>> SELECT count(*) from table is not a very good test query.
>>>>>> Postgres may have the result cached, whereas Ignite always does a
>>>>>> full table traversal.
>>>>>> Recently I implemented an improvement for this case.
>>>>>> See https://issues.apache.org/jira/browse/IGNITE-2751 for details.
>>>>>>
>>>>>> I strongly recommend testing Ignite performance on a real case.
>>>>>> Don't forget to configure GC properly [1].
>>>>>>
>>>>>> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2016-05-26 2:09 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
>>>>>>
>>>>>>> | Also it would be interesting to see result of
>>>>>>> | SELECT count(*) from the query above in both cases.
>>>>>>> (number of rows = 2 798 685)
>>>>>>> SELECT count(*) FROM postgresTable;
>>>>>>>  456 ms
>>>>>>> SELECT count(*) FROM postgresTable;
>>>>>>> 314 ms
>>>>>>>
>>>>>>> SELECT count(*) FROM igniteTable;
>>>>>>> 9746 ms
>>>>>>> SELECT count(*) FROM igniteTable;
>>>>>>> 9664 ms
>>>>>>>
>>>>>>>
>>>>>>> Code of the JDBC client (the same code for Ignite and PostgreSQL; the
>>>>>>> connection URL is given on the command line):
>>>>>>> http://pastebin.com/mYDSjziN
>>>>>>> My start sh file:
>>>>>>> http://pastebin.com/VmRM2sPQ
>>>>>>>
>>>>>>> My GC log file (following Magda's hint):
>>>>>>> (file generated during hot loading and querying via JDBC).
>>>>>>> http://pastebin.com/XicnNczV
>>>>>>>
>>>>>>>
>>>>>>> If you would like to see something else let me know.
>>>>>>>
>>>>>>> PS: How do I launch the H2 debug console? I followed the docs, but it
>>>>>>> doesn't help.
>>>>>>> I set the environment variable:
>>>>>>> echo $IGNITE_H2_DEBUG_CONSOLE
>>>>>>> true
>>>>>>> now, ./ignite.sh conf.xml
>>>>>>>
>>>>>>> sudo netstat -tulpn | grep 61214
>>>>>>> No opened ports.
>>>>>>>
>>>>>>> BTW, during startup Ignite gives me this information:
>>>>>>> [01:03:02]  Performance suggestions for grid 'turbines_table_cluster' (fix if possible)
>>>>>>> [01:03:02] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
>>>>>>> [01:03:02]   ^-- Disable grid events (remove 'includeEventTypes' from configuration)
>>>>>>> [01:03:02]   ^-- Enable ATOMIC mode if not using transactions (set 'atomicityMode' to ATOMIC)
>>>>>>> [01:03:02]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
>>>>>>>
>>>>>>>
>>>>>>> 2016-05-25 12:23 GMT+02:00 Alexei Scherbakov <
>>>>>>> alexey.scherbakoff@gmail.com>:
>>>>>>>
>>>>>>>> For postgres test I mean initial jdbc query and result
>>>>>>>> set traversal.
>>>>>>>> For Ignite I mean sql query and iterator traversal.
>>>>>>>> Also it would be interesting to see result of
>>>>>>>> *SELECT count(*) from the query above in both cases.*
>>>>>>>>
>>>>>>>> 2016-05-25 12:00 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
>>>>>>>>
>>>>>>>>> [image: Inline image 1]
>>>>>>>>>
>>>>>>>>> What code do you mean ? JDBC client ?
>>>>>>>>>
>>>>>>>>> 2016-05-25 10:25 GMT+02:00 Alexei Scherbakov <
>>>>>>>>> alexey.scherbakoff@gmail.com>:
>>>>>>>>>
>>>>>>>>>> What's the batch size for PostgreSQL?
>>>>>>>>>> What's the size of one entry?
>>>>>>>>>> Could you provide the test code for both Postgres and Ignite
>>>>>>>>>> (just the query + read with the time estimation)?
>>>>>>>>>>
>>>>>>>>>> 2016-05-25 11:13 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
>>>>>>>>>>
>>>>>>>>>>> | How many entries are downloaded to the client in both cases?
>>>>>>>>>>> 3 000 000
>>>>>>>>>>>
>>>>>>>>>>> | Do the both queries involve network I/O ?
>>>>>>>>>>> No, I have only one local server (for testing purposes).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2016-05-25 9:59 GMT+02:00 Alexei Scherbakov <
>>>>>>>>>>> alexey.scherbakoff@gmail.com>:
>>>>>>>>>>>
>>>>>>>>>>>> SELECT * is not really a good test query.
>>>>>>>>>>>> Its result can be affected by more than just engine performance.
>>>>>>>>>>>>
>>>>>>>>>>>> How many entries are downloaded to the client in both cases?
>>>>>>>>>>>> Do both queries involve network I/O?
>>>>>>>>>>>>
>>>>>>>>>>>> 2016-05-25 7:58 GMT+03:00 Denis Magda <dmagda@gridgain.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> In general, Ignite is designed to be used in a distributed
>>>>>>>>>>>>> environment where gigabytes or terabytes of data are spread across many
>>>>>>>>>>>>> cluster nodes. SQL queries executed across the cluster should complete
>>>>>>>>>>>>> faster because the resources of all the machines are used. In your
>>>>>>>>>>>>> scenario you have only a single cluster node, so you are in fact comparing
>>>>>>>>>>>>> the performance of PostgreSQL and H2 (the engine used by Ignite SQL). I can
>>>>>>>>>>>>> accept that Ignite SQL may work slightly slower there, but this is not the
>>>>>>>>>>>>> intended Ignite usage scenario.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, if you create a cluster of several nodes
>>>>>>>>>>>>> running on different physical machines, pre-load gigabytes of data there,
>>>>>>>>>>>>> and compare Ignite SQL and PostgreSQL, you should see performance
>>>>>>>>>>>>> improvements on the Ignite side.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case, taking into account the advice above, do the
>>>>>>>>>>>>> following:
>>>>>>>>>>>>> - execute an “EXPLAIN” query to check that the index is chosen
>>>>>>>>>>>>> properly [1];
>>>>>>>>>>>>> - use the H2 console to see how fast the query is
>>>>>>>>>>>>> executed on a single node without several Ignite layers [2];
>>>>>>>>>>>>> - check whether you have any GC pauses during query execution, since
>>>>>>>>>>>>> they can affect execution time [3].
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also share the objects you use as keys and values.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1]
>>>>>>>>>>>>> https://apacheignite.readme.io/docs/sql-queries#using-explain
>>>>>>>>>>>>> [2]
>>>>>>>>>>>>> https://apacheignite.readme.io/docs/sql-queries#using-h2-debug-console
>>>>>>>>>>>>> [3]
>>>>>>>>>>>>> https://apacheignite.readme.io/v1.6/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
>>>>>>>>>>>>>
>>>>>>>>>>>>> —
>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>
>>>>>>>>>>>>> On May 25, 2016, at 3:23 AM, Tomek W <rrrtomtomrrr@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> +==============================================================================================+
>>>>>>>>>>>>> |     Node ID8(@), IP      | CPUs | Heap Used | CPU Load |   Up Time   |  Size   | Hi/Mi/Rd/Wr |
>>>>>>>>>>>>> +==============================================================================================+
>>>>>>>>>>>>> | 0F0AAF99(@n0), 127.0.0.1 | 8    | 54.50 %   | 3.23 %   | 00:13:13:49 | 3000000 | Hi: 0       |
>>>>>>>>>>>>> |                          |      |           |          |             |         | Mi: 0       |
>>>>>>>>>>>>> |                          |      |           |          |             |         | Rd: 0       |
>>>>>>>>>>>>> |                          |      |           |          |             |         | Wr: 0       |
>>>>>>>>>>>>> +----------------------------------------------------------------------------------------------+
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I followed your hints. The client no longer requires as much
>>>>>>>>>>>>> memory as before, thanks for that.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> As for the server configuration, I also followed your
>>>>>>>>>>>>> hints. Results:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Querying is done by a JDBC client. In Ignite and PostgreSQL I
>>>>>>>>>>>>> have a single index on column A.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ignite: SELECT * FROM table WHERE A > 1345 takes 6s.
>>>>>>>>>>>>> Postgres: SELECT * FROM table WHERE A > 1345 takes 4s.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As you can see, Postgres is still better than Ignite. Here are
>>>>>>>>>>>>> the significant fragments of my configuration:
>>>>>>>>>>>>> http://pastebin.com/EQC4JPWR
>>>>>>>>>>>>>
>>>>>>>>>>>>> And the XML file for the server:
>>>>>>>>>>>>> http://pastebin.com/enR9h5J4
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please help me understand why PostgreSQL is still faster.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Alexei Scherbakov
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Alexei Scherbakov
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Alexei Scherbakov
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Best regards,
>>>>>> Alexei Scherbakov
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Best regards,
>>>> Alexei Scherbakov
>>>>
>>>
>>>
>>
>>
>> --
>>
>> Best regards,
>> Alexei Scherbakov
>>
>
>


-- 

Best regards,
Alexei Scherbakov
