ignite-user mailing list archives

From Yakov Zhdanov <yzhda...@apache.org>
Subject Re: Ignite performance
Date Wed, 03 Aug 2016 14:28:13 GMT
How many concurrent queries do you have? In other words, how many
threads does your executor have? If it has several, then I understand
why CPU load goes up to the maximum.

I am also not sure about the measurements. All your queries get scheduled
immediately and initialize startTime when they are scheduled, so the time
they spend queued is also counted in the measured latency. Correct? If so,
I would suggest you rewrite your benchmark: just start N threads and have
each thread submit your query and measure the time it takes.
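
A minimal sketch of such a benchmark follows, assuming plain Java threads
and a placeholder workload instead of the real query. The class name
ClosedLoopBenchmark and the methods run and percentile are hypothetical
names used only for illustration; in the real test the Runnable would call
cache.query(query).getAll(). The clock starts only when a thread actually
issues the query, so time spent waiting in an executor queue is not counted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

public class ClosedLoopBenchmark {

    /** Runs the query from N threads, several times per thread; returns latencies in ms, ascending. */
    static List<Long> run(int threads, int iterations, Runnable query) {
        ConcurrentLinkedQueue<Long> timings = new ConcurrentLinkedQueue<>();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    for (int i = 0; i < iterations; i++) {
                        long start = System.nanoTime(); // start the clock when the query is issued
                        query.run();
                        timings.add((System.nanoTime() - start) / 1_000_000);
                    }
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try {
            done.await(); // wait until every worker thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        List<Long> sorted = new ArrayList<>(timings);
        Collections.sort(sorted);
        return sorted;
    }

    /** Nearest-rank percentile of a non-empty list sorted in ascending order. */
    static long percentile(List<Long> sorted, double p) {
        int idx = (int) Math.ceil(p / 100.0 * sorted.size()) - 1;
        return sorted.get(Math.max(0, Math.min(idx, sorted.size() - 1)));
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): sleeps 5 ms instead of running the Ignite query.
        Runnable query = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Long> sorted = run(4, 50, query);
        System.out.println("samples=" + sorted.size()
            + " median=" + percentile(sorted, 50) + "ms"
            + " p90=" + percentile(sorted, 90) + "ms");
    }
}
```

With this shape the number of threads is the number of concurrent queries,
so latency under a given concurrency can be read directly instead of being
mixed with queueing delay.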

Please set "peerClassLoadingEnabled" to false. Please also provide the
cache configuration and MyObject.java.

--Yakov

2016-08-03 16:47 GMT+03:00 Piubelli, Manuel <manuel.piubelli@citi.com>:

> Hi Yakov,
>
>
>
> Thank you very much for your reply, addressing a few questions:
>
>
>
> > “Is it correct that you run your query in a loop”
>
>
>
> The query is run asynchronously, as I want to simulate multiple clients
> hitting the cluster at the same time (see queries/second), so it's
> technically not a loop but just scheduled callables.
>
>
>
> > “giving enough time for the whole cluster to warmup and only then take
> the final measurements?”
>
>
>
> I have a non-measured warmup round firing queries for 5 minutes before
> starting the measurements.
>
>
>
> > I also do not understand why CPU load is 400% which may be interpreted
> as full (correct?). This means that at least 4 threads are busy on each
> node, but when you broadcast your query it is processed with only 1 thread
> on each node.
>
>
>
> Yes, I noticed *up to* 400%, which is full (my box has 4 cores). I would
> explain this by the fact that the Ignite cluster is hit with other
> requests while it is processing the first. Would that explain it?
>
>
>
> >Having this in mind you can try launching 1 node per 1 core on each
> server - this will split your data set and will lower the amount of work
> for each node.
>
>
>
> Would this mean that instances compete more for the same cores in a high
> throughput scenario? Is there a way to have one node restricted to one
> process – or should I lower the size of the thread pool?
>
>
>
> Code (SQL example) :
>
> //Async Test Runner
> final Queue<Integer> timings = new LinkedBlockingQueue<Integer>();
> for (long i = 0; i < requestsPerSecond * testTime; i++) {
>     PerformanceTest test = PerformanceTestFactory.getIgnitePerformanceTest(ignite, testName, timings);
>     executor.schedule(test, 0, TimeUnit.SECONDS);
>     Thread.sleep(1000 / requestsPerSecond);
> }
> executor.shutdownNow();
>
>
>
> //Runnable class
> public abstract class PerformanceTest implements Runnable {
>
>     protected Ignite ignite;
>     private Queue<Integer> timings;
>     private long startTime;
>     protected IgniteCache<String, BinaryObject> cache;
>
>     public PerformanceTest(Ignite ignite, Queue<Integer> timings) {
>         super();
>         this.ignite = ignite;
>         this.timings = timings;
>         this.cache = ignite.cache("MyObjCache").withKeepBinary();
>         this.startTime = System.currentTimeMillis();
>     }
>
>     @Override
>     public void run() {
>         runTest();
>         this.timings.add((int) (System.currentTimeMillis() - this.startTime));
>     }
>
>     public abstract void runTest();
> }
>
> //Runnable subclass i.e. Test
> public class SQLQueryPerformanceTest extends PerformanceTest {
>
>     private static final String queryString =
>         "select SUM(field1 * field2)/SUM(field2) as perf from MyObj ";
>
>     private final SqlFieldsQuery query;
>
>     public SQLQueryPerformanceTest(Ignite ignite, Queue<Integer> timings) {
>         super(ignite, timings);
>         this.query = new SqlFieldsQuery(queryString);
>     }
>
>     @Override
>     public void runTest() {
>         this.cache.query(query).getAll();
>     }
> }
>
>
>
> Ignite configuration:
>
> <bean abstract="true" id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
>     <!-- Set to true to enable distributed class loading for examples, default is false. -->
>     <property name="peerClassLoadingEnabled" value="true"/>
>
>     <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
>     <property name="discoverySpi">
>         <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>             <property name="ipFinder">
>                 <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
>                     <property name="addresses">
>                         <list>
>                             <!-- In distributed environment, replace with actual host IP address. -->
>                             <value>127.0.0.1:47500..47509</value>
>                         </list>
>                     </property>
>                 </bean>
>             </property>
>         </bean>
>     </property>
> </bean>
>
>
>
>
>
> *From:* Yakov Zhdanov [mailto:yzhdanov@apache.org]
> *Sent:* 03 August 2016 13:37
> *To:* user@ignite.apache.org
> *Subject:* Re: Ignite performance
>
>
>
> Manuel,
>
>
>
> The numbers you are seeing are pretty strange to me.
>
>
>
> Is it correct that you run your query in a loop giving enough time for the
> whole cluster to warmup and only then take the final measurements?
>
> I also do not understand why CPU load is 400% which may be interpreted as
> full (correct?). This means that at least 4 threads are busy on each node,
> but when you broadcast your query it is processed with only 1 thread on
> each node. Having this in mind you can try launching 1 node per 1 core on
> each server - this will split your data set and will lower the amount of
> work for each node. However, the question of high CPU utilization is still
> open. Can you please provide stack traces for those threads if they are
> Ignite threads? You can follow these instructions:
> https://blogs.oracle.com/jiechen/entry/analysis_against_jvm_thread_dump
>
>
>
> Please tell me what machines you are running this test on. I would ask you
> to do all measurements on hardware machines (not virtual) giving all
> resources to Ignite.
>
>
>
> Please also share your code and configuration for cluster nodes.
>
>
> --Yakov
>
>
>
> 2016-08-03 12:49 GMT+03:00 Piubelli, Manuel <manuel.piubelli@citi.com>:
>
> Hello,
>
> I am currently benchmarking Apache Ignite for a near real-time application
> and simple operations seem to be excessively slow for a relatively small
> sample size. *The following is giving the setup details and timings -
> please see 2 questions at the bottom.*
>
> Setup:
>
> · Cache mode: Partitioned
> · Number of server nodes: 3
> · CPUs: 4 per node (12 total)
> · Heap size: 2GB per node (6GB total)
>
> The first use case is computing the weighted average over two fields of
> the object at different rates.
>
> First method is to run a SQL style query:
>
> ...
>
> query = new SqlFieldsQuery("select SUM(field1*field2)/SUM(field2) from
> MyObject");
>
> cache.query(query).getAll();
>
> ....
>
> The observed timings are:
>
> Cache: 500,000 Queries/second: 10
> Median: 428ms, 90th percentile: 13,929ms
>
> Cache: 500,000 Queries/second: 50
> Median: 191,465ms, 90th percentile: 402,285ms
>
> Clearly this is queuing up with an enormous latency (>400 ms); a simple
> weighted average computation on a single JVM (4 cores) takes 6 ms.
>
> The second approach is to use IgniteCompute to broadcast Callables
> across nodes, compute the weighted average on each node, and reduce at the
> caller. Latency is only marginally better; throughput improves but is
> still at unusable levels.
>
> Cache: 500,000 Queries/second: 10
> Median: 408ms, 90th percentile: 507ms
>
> Cache: 500,000 Queries/second: 50
> Median: 114,155ms, 90th percentile: 237,521ms
>
> A few things I noticed during the experiment:
>
> · No disk swapping is happening
> · CPUs run at up to 400%
> · The query is split up into two different weighted averages (map-reduce)
> · Entries are evenly split across the nodes
> · No garbage collections are triggered, with each heap at around 500MB
>
> To my questions:
>
> 1.    *Are these timings expected or is there some obvious setting I am
> missing? I could not find benchmarks on similar operations.*
>
> 2.    *What is the advised method to run fork-join style computations on
> Ignite without moving data?*
>
> Thank you
>
> Manuel
>
>
>
