ignite-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrey Gura <ag...@gridgain.com>
Subject Re: The result of cross-cache SQL joins is incomplete.
Date Mon, 09 Nov 2015 12:24:22 GMT
Hi,

Replicated vs partititoned is desicion that depends on many factors: data
set size, growth rate, reliability, colocation requirements.

I don't know any details about your domain. So I can only make some
assumptions.

1. "good" and "good_type" tables are dictionary tables. Usually tables like
this are not big. So we can represent they as replicated caches. It means
that all rows will be replicated on each cluster node. It leads to usefull
effect: this tables don't require any colocation.

2. "day_report" table isn't big also. If I understand correctly this table
contains one data row per day. So this table can b erepresented as
replicated cache.

3. "user", "user_order" and "order_good" tables can have a lot of data and
growth rate can be big enough. It makes sense to represent this tables as
partitioned cache.

4. If you have partitioned caches that involved into crosscache queries
then you should colocate data from this caches in order to provide correct
joining and good performance. Thus data about user orders should be
colocated with users (i.e. all user order entries should be on the same
nodes where this user entries are placed) and data about goods in orders
should be colocated with orders (i.e. all "order_good" rows for specific
order should be placed on the same nodes where corresponding order entry is
placed).

About your questions:

1. Some tables do not have primary key, should I need to add primary key on
> them first before importing them to Ignite?


Yes, you should. Any entry in cache must have unique key.

2. Which tables should be setting as Replicated cache mode? And which table
> should be setting as partition cache mode?


See my analysis above.

3. Should I need to set any AffinityKey?


Yes, you should. For example, in order to colocate user orders with users
you should use AffinityKey for "user_order" table: AffinityKey<Integer>(
user_order.id, user_order.usr_id).

4. Some join connections are not connected to primary key(such as
> day_report.report_date = user_order.create_time where create_time is not a
> primary key), Does it have any affect?
>

Primary key is a just unique not null key. Join can be executed on any
field, but it would be better if this field is indexed. Probably in your
case make sense to truncate user_order.create_time field to date type
(without info about hours, minutes, etc).

On Sat, Nov 7, 2015 at 5:22 PM, iceguo <guoht@ffcs.cn> wrote:

> Thank you for your reply, but I am still not very clear about the rules of
> setting a table for Replicated cache mode or Partition cache mode.
> I have another a little more complicated example, could you help to
> analysis
> it?
>
> In this example, I have 6 tables
>
> /*Table structure for table `day_report` */
> CREATE TABLE `day_report` (
>   `report_date` date DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   `people` int(11) DEFAULT NULL
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `good` */
>
> CREATE TABLE `good` (
>   `id` int(11) NOT NULL,
>   `name` varchar(200) DEFAULT NULL,
>   `type` int(11) DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `good_type` */
>
> CREATE TABLE `good_type` (
>   `id` int(11) NOT NULL,
>   `name` varchar(200) DEFAULT NULL,
>   `pid` int(11) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `order_good` */
>
> CREATE TABLE `order_good` (
>   `order_id` int(11) DEFAULT NULL,
>   `good_id` int(11) DEFAULT NULL
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `user` */
>
> CREATE TABLE `user` (
>   `id` int(11) NOT NULL,
>   `name` varchar(50) DEFAULT NULL,
>   PRIMARY KEY (`id`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
>
> /*Table structure for table `user_order` */
>
> CREATE TABLE `user_order` (
>   `id` int(11) NOT NULL AUTO_INCREMENT,
>   `usr_id` int(11) DEFAULT NULL,
>   `create_time` date DEFAULT NULL,
>   `money` decimal(10,0) DEFAULT NULL,
>   KEY `id` (`id`)
> ) ENGINE=InnoDB AUTO_INCREMENT=500000 DEFAULT CHARSET=utf8;
>
> The join connections of them as following:
> day_report.report_date = user_order.create_time
> user_order.type         = good_type.pid
> user_order.id = order_good.order_id
> good.id = order_good.good_id
>
> My questions are:
> 1. Some tables do not have primary key, should I need to add primary key on
> them first before importing them to Ignite?
> 2. Which tables should be setting as Replicated cache mode? And which table
> should be setting as partition cache mode?
> 3. Should I need to set any AffinityKey?
> 4. Some join connections are not connected to primary key(such as
> day_report.report_date = user_order.create_time where create_time is not a
> primary key), Does it have any affect?
>
>
>
> --
> View this message in context:
> http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1880.html
> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>



-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com

Mime
View raw message