ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sergi Vladykin <sergi.vlady...@gmail.com>
Subject Re: Performance of join SQL queries
Date Mon, 10 Aug 2015 07:29:26 GMT
Alexey,

Sure, we should document this better.

Sergi

2015-08-09 19:22 GMT+03:00 Alexey Kuznetsov <akuznetsov@gridgain.com>:

> Sergi, I think your answer (with appropriate changes) should be added to
> "Performance and Usability Considerations" in documentation
> https://apacheignite.readme.io/docs/cache-queries
> Users should know about EXPLAIN and group indexes from documentation.
>
> On Sun, Aug 9, 2015 at 11:08 PM, Sergi Vladykin <sergi.vladykin@gmail.com>
> wrote:
>
>> Hi!
>>
>> Ignite does support "EXPLAIN ..." syntax in SQL queries and reading
>> execution plans is a main way to analyze query performance in Ignite. Note
>> that plan cursor will contain multiple rows: the first one will contain
>> query for reducing node, others are for map nodes.
>>
>> I think in your case SQL engine chooses wrong index in a second query.
>> The problem on Ignite side is that we don't collect selectivity for each
>> index (all the single field indexes look equal to query optimizer).
>> The problem on your side is that you've created many single field indexes
>> instead of creating a smaller number of group indexes (see
>> @QuerySqlField.Group).
>>
>> For example for both queries it will be enough to have one group index on
>> CachedShipment (status, carrierType) instead of having two separate indexes
>> on (status) and (carrierType). Or as I see you already have startSiteName
>> in CachedShipment, if it has the same meaning as in Segment, then you
>> should add it to this group index as well like (startSiteName, status,
>> carrierType) and use CachedShipment.startSiteName instead of
>> Segment.startSiteName in these queries.
>>
>> Note that if you have an index on (startSiteName, status, carrierType)
>> query engine will be able to use it if query will contain conditions like:
>> 1. startSiteName = ?
>> 2. startSiteName = ? and status = ?
>> 3. startSiteName = ? and status = ? and carrierType = ?
>> 4. startSiteName = ? and carrierType = ? /** this one is Ok as well, but
>> for the index it looks no better than the first one. **/
>>
>> If the query contains condition like `status = ? and carrierType = ?` but
>> does not contain condition `startSiteName = ?` you need to have a separate
>> index, so choose order of fields in index wisely.
>>
>> Sergi
>>
>>
>> 2015-08-09 14:13 GMT+03:00 alex.glau <alex.glau@gmail.com>:
>>
>>> I prepared test with smaller number of data:
>>> I have two Ignite caches: "ShipmentCache" including 200000 records and
>>> "ShipmentSegmentCache" including 600500 records. These two caches
>>> (tables)
>>> are related by "shipmentId" field.
>>> We run this test on single Ignite node. The first query (see
>>> countJoinQuery() method) that has to criteria takes ~900msec but second
>>> one
>>> (see countJoinQuery_2()) that has one additional criteria took
>>> ~2900000msec
>>> !!!. So we want to understand execution plan. Is it possible to request
>>> execution plan as it supported by H2 database. Please see relevant code
>>> below.
>>>
>>> //  Set Cache for shipments
>>> CacheConfiguration<?, ?> shipmentCacheConfig = new
>>> CacheConfiguration<>();
>>> shipmentCacheConfig.setName(SHIPMENT_CACHE);
>>> shipmentCacheConfig.setCacheMode(CacheMode.PARTITIONED);
>>> shipmentCacheConfig.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>>> shipmentCacheConfig.setIndexedTypes(UUID.class, CachedShipment.class);
>>> //  Set Cache for shipment segments
>>> CacheConfiguration<?, ?> shipmentSegmentCacheConfig = new
>>> CacheConfiguration<>();
>>> shipmentSegmentCacheConfig.setName(SHIPMENT_SEGMENT_CACHE);
>>> shipmentSegmentCacheConfig.setCacheMode(CacheMode.PARTITIONED);
>>> shipmentSegmentCacheConfig.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>>> shipmentSegmentCacheConfig.setIndexedTypes(AffinityKey.class,
>>> CachedShipmentSegment.class);
>>>
>>> // Ignite configuration
>>> IgniteConfiguration igniteConfig = new IgniteConfiguration();
>>> igniteConfig.setCacheConfiguration(shipmentCacheConfig,
>>> shipmentSegmentCacheConfig);
>>>
>>> Ignite ignite = Ignition.start(igniteConfig);
>>>
>>> private void countJoinQuery() throws Exception {
>>>         IgniteCache<AffinityKey&lt;UUID>, CachedShipment>
>>> shipmentSegmentCache =
>>> ignite.cache(SHIPMENT_SEGMENT_CACHE);
>>>         String joinSql =
>>>                 "select count(*) " +
>>>                 "from CachedShipmentSegment as segment, \"" +
>>> SHIPMENT_CACHE +
>>> "\".CachedShipment as shipment " +
>>>                 "where segment.shipmentId = shipment.id and " +
>>>                 "segment.startSiteName = ? and shipment.status = ?";
>>>         QueryCursor<List&lt;?>> results = shipmentSegmentCache.query(new
>>> SqlFieldsQuery(joinSql).setArgs("Beta", ShipmentStatus.CLOSED));
>>>         System.out.println("\nSuccessful get count(*) : " +
>>> results.getAll().get(0).get(0));
>>> }
>>>
>>> private void countJoinQuery_2() throws Exception {
>>>         IgniteCache<AffinityKey&lt;UUID>, CachedShipment>
>>> shipmentSegmentCache =
>>> ignite.cache(SHIPMENT_SEGMENT_CACHE);
>>>         String joinSql =
>>>                 "select count(*) " +
>>>                 "from CachedShipmentSegment as segment, \"" +
>>> SHIPMENT_CACHE +
>>> "\".CachedShipment as shipment " +
>>>                 "where segment.shipmentId = shipment.id and " +
>>>                 "shipment.status = ? and segment.startSiteName = ? and
>>> shipment.carrierType = ?";
>>>
>>>         // Execute queries for find employees for different
>>> organizations.
>>>         QueryCursor<List&lt;?>> results = shipmentSegmentCache.query(new
>>> SqlFieldsQuery(joinSql).setArgs(ShipmentStatus.CLOSED, "Beta",
>>> CarrierType.AVIA));
>>>         System.out.println("\nSuccessful get count(*): " +
>>> results.getAll().get(0).get(0));
>>> }
>>>
>>> public class CachedShipment implements Externalizable {
>>>         @QuerySqlField(index = true)
>>>         private UUID id;
>>>         @QuerySqlField(index = true)
>>>         private String name;
>>>         @QuerySqlField
>>>         private String description;
>>>         @QuerySqlField(index=true)
>>>         private String startSiteName;
>>>         @QuerySqlField(index=true)
>>>         private Date startDate;
>>>         @QuerySqlField(index = true)
>>>         private ShipmentStatus status;
>>>         @QuerySqlField(index = true)
>>>         private String destOrgName;
>>>         @QuerySqlField(index = true)
>>>         private CarrierType carrierType;
>>>
>>> //      constructor & getter/setter...
>>> }
>>>
>>> public class CachedShipmentSegment implements Externalizable {
>>>
>>>         private UUID id;
>>>         @QuerySqlField(index = true)
>>>         private UUID shipmentId;
>>>         @QuerySqlField(index = true)
>>>         private String startSiteName;
>>>         @QuerySqlField(index = true)
>>>         private String endSiteName;
>>>
>>>     public AffinityKey<UUID> key() {
>>>         if (key == null)
>>>             key = new AffinityKey<>(id, shipmentId);
>>>         return key;
>>>     }
>>>
>>> //      constructor & getter/setter...
>>> }
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-ignite-users.70518.x6.nabble.com/Performance-of-join-SQL-queries-tp845p870.html
>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>>
>>
>>
>
>
> --
> Alexey Kuznetsov
> GridGain Systems
> www.gridgain.com
>

Mime
View raw message