ignite-user mailing list archives

From Denis Magda <dma...@gridgain.com>
Subject Re: off heap indexes - setSqlOnheapRowCacheSize - how does it improve efficiency ?
Date Thu, 26 May 2016 20:07:52 GMT
Here is a link with a rough estimate:
https://apacheignite.readme.io/docs/capacity-planning
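
To illustrate the kind of estimate that page walks through, here is a minimal sketch: multiply the entry count by the average entry size, account for backup copies, and add headroom for indexes. The class name, the 100-byte entry size, and the 30% index overhead are assumptions made up for this example, not figures from this thread; replace them with the values from the guide and your own measurements.

```java
/** Back-of-the-envelope memory estimate; every factor here is an illustrative assumption. */
final class CapacityEstimate {
    /**
     * @param entries       number of cache entries
     * @param bytesPerEntry average size of one key + value pair, in bytes
     * @param backups       number of backup copies per entry (0 = primary copy only)
     * @param indexOverhead fraction added for indexes, e.g. 0.30 (assumed, not measured)
     * @return estimated number of bytes needed across the whole cluster
     */
    static long estimateBytes(long entries, long bytesPerEntry, int backups, double indexOverhead) {
        long primary = entries * bytesPerEntry;
        long withBackups = primary * (1L + backups);
        return Math.round(withBackups * (1.0 + indexOverhead));
    }

    public static void main(String[] args) {
        // 3,000,000 entries as mentioned in this thread; 100 bytes per entry is hypothetical.
        long bytes = estimateBytes(3_000_000L, 100L, 0, 0.30);
        System.out.printf("approximately %d MB%n", bytes / (1024 * 1024));
    }
}
```

The result is only a lower bound on what a node needs: per-cache and per-node internal overhead, and the JVM's own working memory, come on top of it.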

—
Denis

> On May 26, 2016, at 8:09 PM, Tomek W <rrrtomtomrrr@gmail.com> wrote:
> 
> | Make sure you have enough memory for your dataset.
> How can I check that?
> 
> 2016-05-26 18:46 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> You should measure performance on real-life cases and see whether it is enough for you.
> Ignite performs well in both modes.
> If you really want to use ONHEAP_TIERED, you must tune GC and heap size, as described here [1].
> Make sure you have enough memory for your dataset.
> The goal is to avoid long GC pauses.
> 
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
> 
> 2016-05-26 19:40 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> OK, I will try it. However, why OFFHEAP_TIERED? It does not seem to be as fast as on-heap.
> 
> 2016-05-26 18:32 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> We are talking about count(*) query performance, right?
> WriteBehind is for writing to the CacheStore asynchronously.
> 
> If yes, do the following:
> 
> 1) Set OFFHEAP_TIERED mode and reduce the max heap size, for example to 4 GB.
> 2) Update to Ignite 1.6.
> 3) Measure query performance. Run the query several times and use the average as the estimate.
> 4) If it's not as expected, show me the GC logs.
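
Step 3 can be sketched as a small timing helper: run the same task several times, record each duration, and report the mean. The class and method names below are hypothetical, and the workload in `main` is a placeholder; in a real test the task would execute the JDBC query and iterate over the whole result set.

```java
final class QueryTimer {
    /** Arithmetic mean of the given samples. */
    static double mean(long[] samples) {
        if (samples.length == 0) throw new IllegalArgumentException("no samples");
        long total = 0;
        for (long s : samples) total += s;
        return (double) total / samples.length;
    }

    /** Runs the task the given number of times and returns the mean duration in milliseconds. */
    static double averageMillis(Runnable task, int runs) {
        if (runs <= 0) throw new IllegalArgumentException("runs must be positive");
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            nanos[i] = System.nanoTime() - start;
        }
        return mean(nanos) / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for "execute query + traverse results".
        Runnable query = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) sum += i;
            if (sum < 0) throw new IllegalStateException("unreachable");
        };
        averageMillis(query, 3); // warm-up runs, result discarded
        System.out.printf("average: %.1f ms%n", averageMillis(query, 10));
    }
}
```

Discarding a few warm-up runs before measuring keeps JIT compilation and cold caches from skewing the average.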
> 
> 
> 
> 2016-05-26 18:28 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> No, I am using ONHEAP_TIERED.
> 
> Maybe WriteBehind should be turned on?
> My app does exactly one thing: it initializes hot loading.
> 
> As for the JDBC client, I showed a fragment of the code in a previous post.
> 
> 2016-05-26 16:15 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> I see long pauses in your GC log (> 3 seconds).
> This means your app puts high pressure on the heap.
> It's hard to tell why without knowing what your app is doing.
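
One way to spot such pauses without reading the log by eye is to scan it for wall-clock times above a threshold. The sketch below assumes HotSpot-style entries ending in `[Times: user=... sys=..., real=... secs]`, as printed with `-XX:+PrintGCDetails`; the log in this thread is only linked, so its exact format is an assumption, and the class name is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcPauseScanner {
    // Wall-clock field of "[Times: user=... sys=..., real=... secs]" (assumed log format).
    private static final Pattern REAL = Pattern.compile("real=(\\d+(?:[.,]\\d+)?) secs");

    /** Returns every "real" duration, in seconds, that exceeds the threshold. */
    static List<Double> pausesLongerThan(List<String> logLines, double thresholdSeconds) {
        List<Double> longPauses = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = REAL.matcher(line);
            while (m.find()) {
                // Some locales print a decimal comma; normalise before parsing.
                double secs = Double.parseDouble(m.group(1).replace(',', '.'));
                if (secs > thresholdSeconds) longPauses.add(secs);
            }
        }
        return longPauses;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GcPauseScanner <gc-log-file>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("pauses over 3 s: " + pausesLongerThan(lines, 3.0));
    }
}
```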
> 
> Are you using OFFHEAP_TIERED?
> If yes, try to reduce sqlOnheapRowCacheSize value.
> 
> 
> 
> 
> 2016-05-26 14:57 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> OK,
> I am going to add new machines to the Ignite cluster. First, please look at my GC log file in the previous message.
> 
> 2016-05-26 13:39 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> Hi,
> 
> The initial question was about setSqlOnheapRowCacheSize, and I think
> it is now clear how to improve SQL performance using this parameter.
> 
> If you are dissatisfied with Ignite performance, I suggest you start a new thread on this,
> providing detailed info about your performance test, such as
> cluster configuration, server GC settings, and test sources.
> 
> As already mentioned, the Ignite SQL engine (H2) has the same or slightly lower performance than PostgreSQL.
> Ignite really starts to shine when used as a distributed data grid holding a large amount of data in memory on several nodes.
> 
> SELECT count(*) FROM table is not a very good test query.
> Postgres may have the result cached, whereas Ignite always does a full table traversal.
> I recently implemented an improvement for this case.
> See https://issues.apache.org/jira/browse/IGNITE-2751 for details.
> 
> I strongly recommend testing Ignite performance on a real case.
> Don't forget to configure GC properly [1].
> 
> [1] https://apacheignite.readme.io/docs/jvm-and-system-tuning
> 
> 
> 
> 
> 
> 
> 2016-05-26 2:09 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> | Also it would be interesting to see result of
> | SELECT count(*) from the query above in both cases.
> (number of rows = 2 798 685)
> SELECT count(*) FROM postgresTable;
> 456 ms
> SELECT count(*) FROM postgresTable;
> 314 ms
> 
> SELECT count(*) FROM igniteTable;
> 9746 ms
> SELECT count(*) FROM igniteTable;
> 9664 ms
> 
> 
> Code of the JDBC client (the same code for Ignite and PostgreSQL; the connection URL is given on the command line):
> http://pastebin.com/mYDSjziN
> My start sh file:
> http://pastebin.com/VmRM2sPQ
> 
> My GC log file (following Magda's hint),
> generated during hot loading and querying via JDBC:
> http://pastebin.com/XicnNczV
> 
> 
> If you would like to see something else let me know.
> 
> PS: How do I launch the H2 debug console? I followed the docs, but it doesn't help.
> I set the environment variable:
> echo $IGNITE_H2_DEBUG_CONSOLE
> true
> and then ran: ./ignite.sh conf.xml
> 
> sudo netstat -tulpn | grep 61214
> No opened ports.
> 
> BTW, while starting, Ignite gives me this information:
> [01:03:02]  Performance suggestions for grid 'turbines_table_cluster' (fix if possible)
> [01:03:02] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
> [01:03:02]   ^-- Disable grid events (remove 'includeEventTypes' from configuration)
> [01:03:02]   ^-- Enable ATOMIC mode if not using transactions (set 'atomicityMode' to ATOMIC)
> [01:03:02]   ^-- Enable write-behind to persistent store (set 'writeBehindEnabled' to true)
> 
> 
> 2016-05-25 12:23 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> For the Postgres test I mean the initial JDBC query and the result set traversal.
> For Ignite I mean the SQL query and the iterator traversal.
> It would also be interesting to see the result of
> SELECT count(*) from the query above in both cases.
> 
> 2016-05-25 12:00 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> <image.png>
> 
> What code do you mean? The JDBC client?
> 
> 2016-05-25 10:25 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> What's the batch size for PostgreSQL?
> What's the size of one entry?
> Could you provide the test code for both Postgres and Ignite (just the query plus the read, with the time estimation)?
> 
> 2016-05-25 11:13 GMT+03:00 Tomek W <rrrtomtomrrr@gmail.com>:
> | How many entries are downloaded to the client in both cases?
> 3 000 000
> 
> | Do both queries involve network I/O?
> No, I have only one local server (for testing purposes).
> 
> 
> 2016-05-25 9:59 GMT+02:00 Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
> SELECT * is not really a good test query.
> Its result can be affected by more than just engine performance.
> 
> How many entries are downloaded to the client in both cases?
> Do both queries involve network I/O?
> 
> 2016-05-25 7:58 GMT+03:00 Denis Magda <dmagda@gridgain.com>:
> In general, Ignite is designed to be used in a distributed environment where gigabytes
> or terabytes of data are spread across many cluster nodes. SQL queries executed across
> the cluster should complete faster because the resources of all the machines are used.
> In your scenario you have only a single cluster node, so you are in fact comparing the
> performance of PostgreSQL and H2 (the engine used by Ignite SQL). I can accept that
> Ignite SQL may be slightly slower here, but this is not the usage scenario Ignite targets.
> 
> However, if you create a cluster of several nodes running on different physical
> machines, pre-load gigabytes of data there, and compare Ignite SQL with PostgreSQL, you should
> see performance improvements on the Ignite side.
> 
> In any case, taking the advice above into account, do the following:
> - execute an “EXPLAIN” query to check that the index is chosen properly [1];
> - use the H2 console to see how fast the query currently executes on a single node, bypassing several Ignite layers [2];
> - check whether you have any GC pauses during query execution, since they can affect execution time [3].
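
The first item in the list above could look roughly like this from a plain JDBC client. This is only a sketch under assumptions: it takes the JDBC URL from the command line, as the test client in this thread is described as doing, it uses the `igniteTable` name and the `A > 1345` predicate quoted elsewhere in the thread, and it assumes the driver returns the plan as text in the first column. The class and method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class ExplainRunner {
    /** Prefixes a query with EXPLAIN so the engine reports its plan instead of rows. */
    static String explainSql(String query) {
        return "EXPLAIN " + query.trim();
    }

    /** Prints each row of the plan; the plan text is assumed to be in column 1. */
    static void printPlan(Connection conn, String query) throws SQLException {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(explainSql(query))) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 1) {
            System.err.println("usage: java ExplainRunner <jdbc-url>");
            return;
        }
        // The matching JDBC driver must be on the classpath.
        try (Connection conn = DriverManager.getConnection(args[0])) {
            printPlan(conn, "SELECT * FROM igniteTable WHERE A > 1345");
        }
    }
}
```

If the printed plan shows a full scan rather than the index on column A, the index is not being used for this predicate.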
> 
> Also share the objects you use as keys and values.
> 
> [1] https://apacheignite.readme.io/docs/sql-queries#using-explain
> [2] https://apacheignite.readme.io/docs/sql-queries#using-h2-debug-console
> [3] https://apacheignite.readme.io/v1.6/docs/jvm-and-system-tuning#section-detailed-garbage-collection-stats
> 
> —
> Denis
> 
>> On May 25, 2016, at 3:23 AM, Tomek W <rrrtomtomrrr@gmail.com> wrote:
>> 
>> +==============================================================================================+
>> |     Node ID8(@), IP      | CPUs | Heap Used | CPU Load |   Up Time   |  Size   | Hi/Mi/Rd/Wr |
>> +==============================================================================================+
>> | 0F0AAF99(@n0), 127.0.0.1 | 8    | 54.50 %   | 3.23 %   | 00:13:13:49 | 3000000 | Hi: 0       |
>> |                          |      |           |          |             |         | Mi: 0       |
>> |                          |      |           |          |             |         | Rd: 0       |
>> |                          |      |           |          |             |         | Wr: 0       |
>> +----------------------------------------------------------------------------------------------+
>> 
>> 
>> I followed your hints. The client indeed no longer requires as much memory as before; thanks for that.
>> 
>> 
>> As for the server configuration, I also followed your hints. Results:
>> 
>> Querying is done by a JDBC client. In both Ignite and PostgreSQL I have a single index on column A.
>> 
>> Ignite: SELECT * FROM table WHERE A > 1345 takes 6s.
>> Postgres: SELECT * FROM table WHERE A > 1345 takes 4s.
>> 
>> As you can see, Postgres is still better than Ignite. Here are the significant fragments of my configuration:
>> http://pastebin.com/EQC4JPWR
>> 
>> And the XML file for the server:
>> http://pastebin.com/enR9h5J4
>> 
>> 
>> Could you please consider why PostgreSQL is still faster?
>> 
> 
> 
> 
> 
> -- 
> 
> Best regards,
> Alexei Scherbakov

