ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rajesh Kishore <rajesh10si...@gmail.com>
Subject Re: slow query performance against berkley db
Date Fri, 09 Feb 2018 04:08:52 GMT
Igniters any pointers pls.

Regards,
Rajesh

On Wed, Feb 7, 2018 at 9:15 AM, Rajesh Kishore <rajesh10sinha@gmail.com>
wrote:

> Hi Dmitry,
>
> Thanks a ton.
>
> What is not convincing to me is with just *.1 M  in main table and* *2 M
> records in other table * , sql query is taking around 24 sec, that is
> worrisome.
> In local cache mode , I tried both using partitioned and non partitioned
> mode , the result is same.
> All I wanted to know , is my approach is wrong  somewhere? I am sure
> igniters would correct me with my approach used.
>
> Regards,
> -Rajesh
>
> On Wed, Feb 7, 2018 at 8:23 AM, Dmitriy Setrakyan <dsetrakyan@apache.org>
> wrote:
>
>> Hi Rajesh,
>>
>> Please allow the community some time to test your code.
>>
>> As far as testing single node vs. distributed, when you have more than
>> one node, Ignite will split your data set evenly across multiple nodes.
>> This means that when running the query, it will be executed on each node on
>> smaller data sets in parallel, which should provide better performance. If
>> your query does some level of scanning, then the more nodes you add, the
>> faster it will get.
>>
>> D.
>>
>> On Tue, Feb 6, 2018 at 5:02 PM, Rajesh Kishore <rajesh10sinha@gmail.com>
>> wrote:
>>
>>> Hi All
>>> Please help me in getting the pointers, this is deciding factor for us
>>> to further evaluate ignite. Somehow we are not convinced with just  . 1 m
>>> records it's not responsive as that of Berkley db.
>>> Let me know the strategy to be adopted, pointers where I am doing wrong.
>>>
>>> Thanks
>>> Rajesh
>>>
>>> On 6 Feb 2018 6:11 p.m., "Rajesh Kishore" <rajesh10sinha@gmail.com>
>>> wrote:
>>>
>>>> Further to this,
>>>>
>>>> I am re-framing what I have , pls correct me if my approach is correct
>>>> or not.
>>>>
>>>> As of now, using only node as local cache and using native persistence
>>>> file system. The system has less number of records around *.1 M *in
>>>> main table and 2 M in supporting table.
>>>>
>>>> Using sql to retrieve the records using join , the sql used is
>>>> -----------------------------------------------------------------------
>>>>  final String query1 = "SELECT "
>>>>             + "f.entryID,f.attrName,f.attrValue, "
>>>>             + "f.attrsType "
>>>>             + "FROM "
>>>>             +"( select st.entryID,st.attrName,st.attrValue,
>>>> st.attrsType from "
>>>>             +"(SELECT at1.entryID FROM \"objectclass\".Ignite_ObjectC
>>>> lass"
>>>>                 + " at1 WHERE "
>>>>                 + " at1.attrValue= ? )  t"
>>>>             +" INNER JOIN \"Ignite_DSAttributeStore\".IGNITE_DSATTRIBUTESTORE
>>>> st ON st.entryID = t.entryID "
>>>>             + " WHERE st.attrKind IN ('u','o') "
>>>>             +" ) f "
>>>>             + " INNER JOIN "
>>>>             + " ( "
>>>>             +" SELECT entryID from \"dn\".Ignite_DN where parentDN like
>>>> ? "
>>>>              +")  "
>>>>             +" dnt"
>>>>             + " ON f.entryID = dnt.entryID"
>>>>             + " order by f.entryID";
>>>>
>>>>         String queryWithType = query1;
>>>>         QueryCursor<List<?>> cursor = cache.query(new SqlFieldsQuery(
>>>>             queryWithType).setEnforceJoinOrder(true).setArgs("person",
>>>> "dc=ignite,%"));
>>>>         System.out.println("SUBTREE "+cursor.getAll() );
>>>>
>>>>
>>>> -----------------------------------------------------------------------
>>>>
>>>> The corresponding EXPLAIN plan is
>>>> ----------------------------------------------------
>>>>
>>>> [[SELECT
>>>>     F.ENTRYID,
>>>>     F.ATTRNAME,
>>>>     F.ATTRVALUE,
>>>>     F.ATTRSTYPE
>>>> FROM (
>>>>     SELECT
>>>>         ST.ENTRYID,
>>>>         ST.ATTRNAME,
>>>>         ST.ATTRVALUE,
>>>>         ST.ATTRSTYPE
>>>>     FROM (
>>>>         SELECT
>>>>             AT1.ENTRYID
>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1
>>>>         WHERE AT1.ATTRVALUE = ?1
>>>>     ) T
>>>>     INNER JOIN "Ignite_DSAttributeStore".IGNITE_DSATTRIBUTESTORE ST
>>>>         ON 1=1
>>>>     WHERE (ST.ATTRKIND IN('u', 'o'))
>>>>         AND (ST.ENTRYID = T.ENTRYID)
>>>> ) F
>>>>     /* SELECT
>>>>         ST.ENTRYID,
>>>>         ST.ATTRNAME,
>>>>         ST.ATTRVALUE,
>>>>         ST.ATTRSTYPE
>>>>     FROM (
>>>>         SELECT
>>>>             AT1.ENTRYID
>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1
>>>>         WHERE AT1.ATTRVALUE = ?1
>>>>     ) T
>>>>         /++ SELECT
>>>>             AT1.ENTRYID
>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1
>>>>             /++ "objectclass".OBJECTCLASSNDEXED_ATTRVAL_IDX: ATTRVALUE
>>>> = ?1 ++/
>>>>         WHERE AT1.ATTRVALUE = ?1
>>>>          ++/
>>>>     INNER JOIN "Ignite_DSAttributeStore".IGNITE_DSATTRIBUTESTORE ST
>>>>         /++ "Ignite_DSAttributeStore".IGNI
>>>> TE_DSATTRIBUTESTORE_ENTRYID_IDX: ENTRYID = T.ENTRYID ++/
>>>>         ON 1=1
>>>>     WHERE (ST.ATTRKIND IN('u', 'o'))
>>>>         AND (ST.ENTRYID = T.ENTRYID)
>>>>      */
>>>> INNER JOIN (
>>>>     SELECT
>>>>         ENTRYID
>>>>     FROM "dn".IGNITE_DN
>>>>     WHERE PARENTDN LIKE ?2
>>>> ) DNT
>>>>     /* SELECT
>>>>         ENTRYID
>>>>     FROM "dn".IGNITE_DN
>>>>         /++ "dn".EP_DN_IDX: ENTRYID IS ?3 ++/
>>>>     WHERE (ENTRYID IS ?3)
>>>>         AND (PARENTDN LIKE ?2): ENTRYID = F.ENTRYID
>>>>         AND ENTRYID = F.ENTRYID
>>>>      */
>>>>     ON 1=1
>>>> WHERE F.ENTRYID = DNT.ENTRYID
>>>> ORDER BY 1]]
>>>> -----------------------------------------------------
>>>>
>>>> The above query takes *24 sec* to retrieve the records which we feel
>>>> defeats the purpose , our application existing berkley db can retrieve this
>>>> faster.
>>>>
>>>> Question is -
>>>> a) I have attached my application models & client code , am I doing
>>>> something wrong in defining the models and cache configuration. Right now,
>>>> not considering distributed as I have less number of records.. What is
>>>> recommended?
>>>> b) What is the best memory requirement of Ignite/H2 , is 16g machine
>>>> not good enough for the records I have as of now?
>>>> c) does it create  performance overhead when using sql
>>>>
>>>> Please guide.
>>>>
>>>> Thanks,
>>>> Rajesh
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Tue, Feb 6, 2018 at 9:30 AM, Rajesh Kishore <rajesh10sinha@gmail.com
>>>> > wrote:
>>>>
>>>>> Hi Michael
>>>>>
>>>>> Pls find my response
>>>>>
>>>>>
>>>>> Does that mean Ignite cannot scale well against Berkley dB Incase of
>>>>> single node?
>>>>> Could you please clarify, what your question means?
>>>>>
>>>>>
>>>>> (Rajesh) Our application currently uses Berkley dB and we are using it
>>>>> as key value dB ie storing object as value as bytes, we are using our
own
>>>>> logic in application for replication.
>>>>>
>>>>>
>>>>>    The comparison is  being done based on one node as of now.
>>>>>
>>>>>  now as a poc I have considered my model to be fit in sql dB of ignite
>>>>>
>>>>> What I am realizing, I get the faster result in Berkley dB against
>>>>> ignite in just
>>>>> .1 m records.
>>>>> I understand that ignite is distributed system, but with just   . 1 m
>>>>> records it's result is not comparable with Berkley dB?
>>>>>
>>>>> Any pointers?
>>>>>
>>>>> Regards
>>>>> Rajesh
>>>>> On 6 Feb 2018 8:35 a.m., "Michael Cherkasov" <
>>>>> michael.cherkasov@gmail.com> wrote:
>>>>>
>>>>> Rajesh,
>>>>>
>>>>> >Does that mean Ignite cannot scale well against Berkley dB Incase
of
>>>>> single node?
>>>>> Could you please clarify, what your question means?
>>>>>
>>>>>
>>>>> (Rajesh) Our application currently uses Berkley dB and we are using it
>>>>> as key value dB ie storing object as value as bytes, we are using our
own
>>>>> logic in application for replication.
>>>>>
>>>>>
>>>>>    The comparison is  being done based on one node as of now.
>>>>>
>>>>>  now as a poc I have considered my model to be fit in sql dB of ignite
>>>>>
>>>>> What I am realizing, I get the faster result in Berkley dB against
>>>>> ignite in just
>>>>> .1 m records.
>>>>> I understand that ignite is distributed system, but with just   . 1 m
>>>>> records it's result is not comparable with Berkley dB?
>>>>>
>>>>> Any pointers?
>>>>>
>>>>>
>>>>> Ignite can scale from a single node to hundreds(or even thousands, I
>>>>> have seen the only cluster of 300 nodes, but this definitely not a limit).
>>>>> It was designed to work as a distrebuted grid. So I think if you will
>>>>> try to compare one node of Ignite with one node of SomeDB, ignite will
lose.
>>>>>
>>>>> But you can run 10 ignite nodes and they will be faster then 10 nodes
>>>>> of somedb, furthermore, you can kill nodes and ignite will continue to
>>>>> work,
>>>>> what will happen if a host with Berkley DB crashes?
>>>>> So in case of crash can you transparently switch to other Berkley DB
>>>>> node and continue to work?
>>>>>
>>>>> Ignite is not just SQL DB, Ignite is a distributed data grid, it's
>>>>> strongly consistent and HA database,
>>>>> please make this into account when comparing it with other solutions.
>>>>>
>>>>> Thanks,
>>>>> Mike.
>>>>>
>>>>>
>>>>>
>>>>> 2018-02-05 9:23 GMT-08:00 Rajesh Kishore <rajesh10sinha@gmail.com>:
>>>>>
>>>>>> Hi Christos
>>>>>>
>>>>>> Does that mean Ignite cannot scale well against Berkley dB Incase
of
>>>>>> single node?
>>>>>>
>>>>>> Regards
>>>>>> Rajesh
>>>>>>
>>>>>> On 5 Feb 2018 10:08 p.m., "Christos Erotocritou" <
>>>>>> christos@gridgain.com> wrote:
>>>>>>
>>>>>>> Hi Rajesh,
>>>>>>>
>>>>>>> Ignite is a distributed system, testing with one node is really
not
>>>>>>> the way.
>>>>>>>
>>>>>>> You need to consider having multiple nodes and portion and collocate
>>>>>>> your data before.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> C
>>>>>>>
>>>>>>> On 5 Feb 2018, at 16:36, Rajesh Kishore <rajesh10sinha@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> We are in the process of evaluating Ignite native persistence
>>>>>>> against berkely db. For some reason Ignite query does not seem
to be
>>>>>>> performant the way application code behaves against berkley db
>>>>>>>
>>>>>>> Background:
>>>>>>> Berkley db - As of now, we have berkley db for our application
and
>>>>>>> the data is stored as name value pair as byte stream in the berkley
db's
>>>>>>> native file system.
>>>>>>>
>>>>>>> Ignite DB - We are using Ignite DB's native persistence file
system.
>>>>>>> Created appropriate index and retrieving data using SQL involving
multiple
>>>>>>> joins.
>>>>>>>
>>>>>>> Ignite configuration : with native persistence enabled , only
one
>>>>>>> node
>>>>>>>
>>>>>>> Data: As of now in the main table we have only *.1 M records
*and
>>>>>>> in supporting tables we have around 2 million records
>>>>>>>
>>>>>>> Ignite sql query used
>>>>>>>
>>>>>>> SELECT f.entryID,f.attrName,f.attrValue, f.attrsType FROM
>>>>>>>     ( select st.entryID,st.attrName,st.attrValue, st.attrsType
from
>>>>>>>             (SELECT at1.entryID FROM "objectclass".Ignite_ObjectClass
>>>>>>> at1 WHERE  at1.attrValue= ? ) t
>>>>>>>             INNER JOIN
>>>>>>>             "Ignite_DSAttributeStore".IGNITE_DSATTRIBUTESTORE
st ON
>>>>>>> st.entryID = t.entryID  WHERE st.attrKind IN ('u','o')
>>>>>>>     ) f
>>>>>>>    INNER JOIN  (SELECT entryID from "dn".Ignite_DN where parentDN
>>>>>>> like ? ) dnt ON f.entryID = dnt.entry
>>>>>>>
>>>>>>> The corresponding EXPLAIN PLAN
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> [[SELECT
>>>>>>>     F__Z3.ENTRYID AS __C0_0,
>>>>>>>     F__Z3.ATTRNAME AS __C0_1,
>>>>>>>     F__Z3.ATTRVALUE AS __C0_2,
>>>>>>>     F__Z3.ATTRSTYPE AS __C0_3
>>>>>>> FROM (
>>>>>>>     SELECT
>>>>>>>         ST__Z2.ENTRYID,
>>>>>>>         ST__Z2.ATTRNAME,
>>>>>>>         ST__Z2.ATTRVALUE,
>>>>>>>         ST__Z2.ATTRSTYPE
>>>>>>>     FROM (
>>>>>>>         SELECT
>>>>>>>             AT1__Z0.ENTRYID
>>>>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1__Z0
>>>>>>>         WHERE AT1__Z0.ATTRVALUE = ?1
>>>>>>>     ) T__Z1
>>>>>>>     INNER JOIN "Ignite_DSAttributeStore".IGNITE_DSATTRIBUTESTORE
>>>>>>> ST__Z2
>>>>>>>         ON 1=1
>>>>>>>     WHERE (ST__Z2.ATTRKIND IN('u', 'o'))
>>>>>>>         AND (ST__Z2.ENTRYID = T__Z1.ENTRYID)
>>>>>>> ) F__Z3
>>>>>>>     /* SELECT
>>>>>>>         ST__Z2.ENTRYID,
>>>>>>>         ST__Z2.ATTRNAME,
>>>>>>>         ST__Z2.ATTRVALUE,
>>>>>>>         ST__Z2.ATTRSTYPE
>>>>>>>     FROM (
>>>>>>>         SELECT
>>>>>>>             AT1__Z0.ENTRYID
>>>>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1__Z0
>>>>>>>         WHERE AT1__Z0.ATTRVALUE = ?1
>>>>>>>     ) T__Z1
>>>>>>>         /++ SELECT
>>>>>>>             AT1__Z0.ENTRYID
>>>>>>>         FROM "objectclass".IGNITE_OBJECTCLASS AT1__Z0
>>>>>>>             /++ "objectclass".OBJECTCLASSNDEXED_ATTRVAL_IDX:
>>>>>>> ATTRVALUE = ?1 ++/
>>>>>>>         WHERE AT1__Z0.ATTRVALUE = ?1
>>>>>>>          ++/
>>>>>>>     INNER JOIN "Ignite_DSAttributeStore".IGNITE_DSATTRIBUTESTORE
>>>>>>> ST__Z2
>>>>>>>         /++ "Ignite_DSAttributeStore".IGNI
>>>>>>> TE_DSATTRIBUTESTORE_ENTRYID_IDX: ENTRYID = T__Z1.ENTRYID ++/
>>>>>>>         ON 1=1
>>>>>>>     WHERE (ST__Z2.ATTRKIND IN('u', 'o'))
>>>>>>>         AND (ST__Z2.ENTRYID = T__Z1.ENTRYID)
>>>>>>>      */
>>>>>>> INNER JOIN (
>>>>>>>     SELECT
>>>>>>>         __Z4.ENTRYID
>>>>>>>     FROM "dn".IGNITE_DN __Z4
>>>>>>>     WHERE __Z4.PARENTDN LIKE ?2
>>>>>>> ) DNT__Z5
>>>>>>>     /* SELECT
>>>>>>>         __Z4.ENTRYID
>>>>>>>     FROM "dn".IGNITE_DN __Z4
>>>>>>>         /++ "dn".EP_DN_IDX: ENTRYID IS ?3 ++/
>>>>>>>     WHERE (__Z4.ENTRYID IS ?3)
>>>>>>>         AND (__Z4.PARENTDN LIKE ?2): ENTRYID = F__Z3.ENTRYID
>>>>>>>      */
>>>>>>>     ON 1=1
>>>>>>> WHERE F__Z3.ENTRYID = DNT__Z5.ENTRYID
>>>>>>> ORDER BY 1], [SELECT
>>>>>>>     __C0_0 AS ENTRYID,
>>>>>>>     __C0_1 AS ATTRNAME,
>>>>>>>     __C0_2 AS ATTRVALUE,
>>>>>>>     __C0_3 AS ATTRSTYPE
>>>>>>> FROM PUBLIC.__T0
>>>>>>>     /* "Ignite_DSAttributeStore"."merge_sorted" */
>>>>>>> ORDER BY 1
>>>>>>> /* index sorted */]]
>>>>>>>
>>>>>>>
>>>>>>> Any pointers , how should I proceed , Following is the JFR report
>>>>>>> for the code used
>>>>>>> cursor = cache.query(new SqlFieldsQuery(query).setEnfor
>>>>>>> ceJoinOrder(true);
>>>>>>> cursor.getAll();
>>>>>>>
>>>>>>>
>>>>>>> <image.png>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Rajesh
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>

Mime
View raw message