Mailing-List: contact user-help@kudu.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@kudu.incubator.apache.org
MIME-Version: 1.0
Date: Thu, 12 May 2016 15:37:43 -0700
Message-ID: <CADrk5qPZSZ3jmsC=_3+-7+ACdQP1XQ32naVFm0ELjbqLTY+EmA@mail.gmail.com>
Subject: Re: Partition and Split rows
From: Sand Stone <sand.m.stone@gmail.com>
To: user@kudu.incubator.apache.org
Content-Type: multipart/alternative; boundary=001a113d05f083cffe0532acceff
archived-at: Thu, 12 May 2016 22:37:59 -0000

--001a113d05f083cffe0532acceff
Content-Type: text/plain; charset=UTF-8

Cool, I will study it.

On Thu, May 12, 2016 at 2:17 PM, Dan Burkert <dan@cloudera.com> wrote:

>
>
> On Thu, May 12, 2016 at 2:04 PM, Sand Stone <sand.m.stone@gmail.com>
> wrote:
>
>> >Instead, take advantage of the index capability of Primary Keys.
>> Currently I did make the "5-min" field a part of the primary key as well.
>> I am most likely overdoing it. I will play around with the schema and use
>> cases around it.
>>
>
> Definitely take a look at the data model in Kudu TS, it has extremely
> efficient scan semantics (all scans retrieve only the necessary data by
> using the primary key, no client *or* server side filtering), and it works
> with arbitrarily large time range partitions.
>
>
>> >since each tablet server should only have on the order of 10-20 tablets.
>> Where does this 10-20 heuristic come from? Is it based on a certain machine
>> profile? Or on some default parameters in the code/config?
>>
>
> 10-20 isn't a hard and fast rule, and it is very dependent on the dataset
> and the machine size.  We routinely run tablet servers with 300+ tablets,
> so it's definitely not a hard limitation.  My intention was to stress that,
> rather than pursuing fine-grained partitions as a method of limiting the
> size of scans, you should take advantage of the primary key indexing that
> Kudu provides, just as you might in a traditional relational database.
> Partitions are better suited for ensuring that inserts and large scans can
> be parallelized across multiple machines.
>
> - Dan
>
>
>>
>> On Thu, May 12, 2016 at 1:45 PM, Dan Burkert <dan@cloudera.com> wrote:
>>
>>>
>>> On Thu, May 12, 2016 at 11:39 AM, Sand Stone <sand.m.stone@gmail.com>
>>> wrote:
>>>
>>>> I don't know how Kudu load-balances the data across the tablet servers.
>>>>
>>>
>>> Individual tablets are replicated and balanced across all available
>>> tablet servers; for more on that, see
>>> http://getkudu.io/docs/schema_design.html#data-distribution.
>>>
>>>
>>>
>>>> For example, do I need to pre-calculate, every day, a list of timestamps
>>>> 5 minutes apart at table creation? [Assume I have to create a new table
>>>> every day.]
>>>>
>>>
>>> If you wish to range partition on the time column, then yes, currently
>>> you must specify the splits upfront during table creation (but this will
>>> change with the non-covering range partitions work).
>>>
>>>
>>>>
>>>> My hope is that, by adding the 5-min column and using it as the range
>>>> partition column, I could spread the data evenly across the tablet
>>>> servers.
>>>>
>>>
>>> I don't think this is meaningfully different than range partitioning on
>>> the full time column with splits every 5 minutes.
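
To make "splits every 5 minutes" concrete, here is a minimal sketch that only computes the boundary timestamps for one day. It does not call any Kudu client API; the function name split_points and the chosen date are assumptions for illustration, and the resulting values would still have to be handed to whichever client creates the table.

```python
from datetime import datetime, timedelta


def split_points(day_start, interval):
    """Return the interior boundaries that cut one day into interval-sized ranges."""
    points = []
    t = day_start + interval
    end = day_start + timedelta(days=1)
    while t < end:
        points.append(t)
        t += interval
    return points


# Hypothetical example day; 5-minute ranges as discussed in the thread.
splits = split_points(datetime(2016, 5, 12), timedelta(minutes=5))
print(len(splits), "split rows ->", len(splits) + 1, "range partitions")
```

Each split row adds one range partition, so a 5-minute granularity multiplies the tablet count quickly, which is the concern raised in the reply about fine-grained partitions.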
>>>
>>>
>>>> Also, since 5-min interval data are always colocated, the read query
>>>> could be efficient too.
>>>>
>>>
>>> Data colocation is a function of the partitioning and indexing.  As I
>>> mentioned before, if you have timestamp as part of your primary key then
>>> you can guarantee that scans specifying a time range are efficient. Overall
>>> it sounds like you are attempting to get fast scans by creating many fine
>>> grained partitions, as you might with Parquet.  This won't be an efficient
>>> strategy in Kudu, since each tablet server should only have on the order of
>>> 10-20 tablets.  Instead, take advantage of the index capability of Primary
>>> Keys.
>>>
>>> - Dan
>>>
>>>
>>>> On Thu, May 12, 2016 at 11:13 AM, Dan Burkert <dan@cloudera.com> wrote:
>>>>
>>>>> Forgot to add the PK specification to the CREATE TABLE, it should have
>>>>> read as follows:
>>>>>
>>>>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE)
>>>>> PRIMARY KEY (metric, time);
>>>>>
>>>>> - Dan
>>>>>
>>>>>
>>>>> On Thu, May 12, 2016 at 11:12 AM, Dan Burkert <dan@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> On Thu, May 12, 2016 at 11:05 AM, Sand Stone <sand.m.stone@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> > Is the requirement to pre-aggregate by time window?
>>>>>>> No, I am thinking of creating a column, say "minute". It's basically
>>>>>>> the minute field of the timestamp column (or even rounded to a 5-min
>>>>>>> bucket, depending on the needs). So it's a computed column filled in at
>>>>>>> data ingestion. My goal is that this field would help with data
>>>>>>> filtering at read/query time, say selecting a certain projection at
>>>>>>> minutes 10-15, to speed up the read queries.
>>>>>>>
>>>>>>
>>>>>> In many cases, Kudu can do this for you without having to add special
>>>>>> columns.  The requirements are that the timestamp is part of the primary
>>>>>> key, and any columns that come before the timestamp in the primary key (if
>>>>>> it's a compound PK), have equality predicates.  So for instance, if you
>>>>>> create a table such as:
>>>>>>
>>>>>> CREATE TABLE metrics (metric STRING, time TIMESTAMP, value DOUBLE);
>>>>>>
>>>>>> then for queries such as
>>>>>>
>>>>>> SELECT time, value FROM metrics WHERE metric = "my-metric" AND time >
>>>>>> 2016-05-01T00:00 AND time < 2016-05-01T00:05
>>>>>>
>>>>>> only the data for that 5 minute time window will be read from
>>>>>> disk.  If the query didn't have the equality predicate on the 'metric'
>>>>>> column, then it would do a much bigger scan + filter operation.  If you
>>>>>> want more background on how this is achieved, check out the partition
>>>>>> pruning design doc:
>>>>>> https://github.com/apache/incubator-kudu/blob/master/docs/design-docs/scan-optimization-partition-pruning.md
>>>>>> .
>>>>>>
>>>>>> - Dan
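
The effect described above (an equality predicate on the leading key column plus a range on the time column selects one contiguous run of rows) can be sketched with a sorted in-memory list. This is only an illustration under stated assumptions: the rows, the scan helper, and the use of bisect are hypothetical stand-ins, not Kudu's storage format or API.

```python
import bisect
from datetime import datetime, timedelta

# Hypothetical rows kept in primary-key order (metric, time), one per minute.
start = datetime(2016, 5, 1)
rows = sorted(
    (metric, start + timedelta(minutes=m), float(m))
    for metric in ("cpu", "my-metric")
    for m in range(60)
)
keys = [(metric, ts) for metric, ts, _ in rows]


def scan(metric, lo, hi):
    """Return rows for one metric with lo < time < hi using two binary searches."""
    first = bisect.bisect_right(keys, (metric, lo))
    last = bisect.bisect_left(keys, (metric, hi))
    return rows[first:last]


window = scan("my-metric", datetime(2016, 5, 1, 0, 0), datetime(2016, 5, 1, 0, 5))
print(len(window), "rows read")
```

Without the equality predicate on metric, the matching rows are no longer adjacent in key order, so a scan would have to visit every metric's rows and filter on time.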
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks for the info. I will follow up on it.
>>>>>>>
>>>>>>> On Thu, May 12, 2016 at 10:50 AM, Dan Burkert <dan@cloudera.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey Sand,
>>>>>>>>
>>>>>>>> Sorry for the delayed response.  I'm not quite following your use
>>>>>>>> case.  Is the requirement to pre-aggregate by time window? I don't think
>>>>>>>> Kudu can help you directly with that (nothing built in), but you could
>>>>>>>> always create a separate table to store the pre-aggregated values.  As far
>>>>>>>> as applying functions to do row splits, that is an interesting idea, but I
>>>>>>>> think once Kudu has support for range bounds (the non-covering range
>>>>>>>> partition design doc linked above), you can simply create the bounds where
>>>>>>>> the function would have put them.  For example, if you want a partition for
>>>>>>>> every five minutes, you can create the bounds accordingly.
>>>>>>>>
>>>>>>>> Earlier this week I gave a talk on timeseries in Kudu, I've
>>>>>>>> included some slides that may be interesting to you.  Additionally, you may
>>>>>>>> want to check out https://github.com/danburkert/kudu-ts; it's a
>>>>>>>> very young (not feature complete) metrics layer on top of Kudu, and it
>>>>>>>> may give you some ideas.
>>>>>>>>
>>>>>>>> - Dan
>>>>>>>>
>>>>>>>> On Sat, May 7, 2016 at 1:28 PM, Sand Stone <sand.m.stone@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for sharing, Dan. The diagrams explained clearly how the
>>>>>>>>> current system works.
>>>>>>>>>
>>>>>>>>> As for what I have in mind: take the schema of
>>>>>>>>> <host,metric,time,...>. Say I am interested in data for the past 5 mins,
>>>>>>>>> 10 mins, etc., or in aggregates at 5-min intervals for the past 3 days,
>>>>>>>>> 7 days, ... It looks like I need to introduce a special 5-min bar column
>>>>>>>>> and use that column for range partitioning, to spread data across the
>>>>>>>>> tablet servers so that I could leverage parallel filtering.
>>>>>>>>>
>>>>>>>>> The cost of this extra column (INT8) is not ideal but not too bad
>>>>>>>>> either (storage-cost-wise, compression should do wonders). So I am
>>>>>>>>> wondering whether it would be better to accept "functions" as row splits
>>>>>>>>> instead of only constants. Of course, if the business requires dropping
>>>>>>>>> down to a 1-min bar, the data has to be re-sharded. So a more
>>>>>>>>> cost-effective way of doing this on a production cluster would be good.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, May 7, 2016 at 8:50 AM, Dan Burkert <dan@cloudera.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Sand,
>>>>>>>>>>
>>>>>>>>>> I've been working on some diagrams to help explain some of the
>>>>>>>>>> more advanced partitioning types, it's attached.   Still pretty rough at
>>>>>>>>>> this point, but the goal is to clean it up and move it into the Kudu
>>>>>>>>>> documentation proper.  I'm interested to hear what kind of time series you
>>>>>>>>>> are interested in using Kudu for.  I'm tasked with improving Kudu for time
>>>>>>>>>> series; you can follow progress here
>>>>>>>>>> <https://issues.apache.org/jira/browse/KUDU-1306>. If you have
>>>>>>>>>> any additional ideas I'd love to hear them.  You may also be interested in
>>>>>>>>>> a small project that J-D and I have been working on in the past week to
>>>>>>>>>> build an OpenTSDB style store on top of Kudu, you can find it
>>>>>>>>>> here <https://github.com/danburkert/kudu-ts>.  Still quite
>>>>>>>>>> feature limited at this point.
>>>>>>>>>>
>>>>>>>>>> - Dan
>>>>>>>>>>
>>>>>>>>>> On Fri, May 6, 2016 at 4:51 PM, Sand Stone <
>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks. Will read.
>>>>>>>>>>>
>>>>>>>>>>> Given that I am researching time series data, row locality is
>>>>>>>>>>> crucial :-)
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 6, 2016 at 3:57 PM, Jean-Daniel Cryans <
>>>>>>>>>>> jdcryans@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> We do have non-covering range partitions coming in the next few
>>>>>>>>>>>> months, here's the design (in review):
>>>>>>>>>>>> http://gerrit.cloudera.org:8080/#/c/2772/9/docs/design-docs/non-covering-range-partitions.md
>>>>>>>>>>>>
>>>>>>>>>>>> The "Background & Motivation" section should give you a good
>>>>>>>>>>>> idea of why I'm mentioning this.
>>>>>>>>>>>>
>>>>>>>>>>>> Meanwhile, if you don't need row locality, using hash
>>>>>>>>>>>> partitioning could be good enough.
>>>>>>>>>>>>
>>>>>>>>>>>> J-D
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 6, 2016 at 3:53 PM, Sand Stone <
>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Makes sense.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yeah it would be cool if users could specify/control the split
>>>>>>>>>>>>> rows after the table is created. Now, I have to "think ahead" to pre-create
>>>>>>>>>>>>> the range buckets.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:49 PM, Jean-Daniel Cryans <
>>>>>>>>>>>>> jdcryans@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> You will only get 1 tablet and no data distribution, which is
>>>>>>>>>>>>>> bad.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That's also how HBase works, but it will split regions as you
>>>>>>>>>>>>>> insert data and eventually you'll get some data distribution even if it
>>>>>>>>>>>>>> doesn't start in an ideal situation. Tablet splitting will come later for
>>>>>>>>>>>>>> Kudu.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> J-D
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:42 PM, Sand Stone <
>>>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One more question: how does the range partition work if I
>>>>>>>>>>>>>>> don't specify the split rows?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:37 PM, Sand Stone <
>>>>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks, Misty. The "advanced" impala example helped.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I was just reading the Java API, CreateTableOptions.java;
>>>>>>>>>>>>>>>> it's unclear how the range partition column names are associated with
>>>>>>>>>>>>>>>> the partial row params in the addSplitRow API.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, May 6, 2016 at 3:08 PM, Misty Stanley-Jones <
>>>>>>>>>>>>>>>> mstanleyjones@cloudera.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Sand,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please have a look at
>>>>>>>>>>>>>>>>> http://getkudu.io/docs/kudu_impala_integration.html#partitioning_tables
>>>>>>>>>>>>>>>>> and see if it is helpful to you.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Misty
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, May 6, 2016 at 2:00 PM, Sand Stone <
>>>>>>>>>>>>>>>>> sand.m.stone@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi, I am new to Kudu. I wonder how the split rows work. I
>>>>>>>>>>>>>>>>>> know from some docs that this is currently done when pre-creating the
>>>>>>>>>>>>>>>>>> table. I am researching how to partition (hash+range) some time series
>>>>>>>>>>>>>>>>>> test data.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is there an example, or notes somewhere I could read up on?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks much.
>>>>>>>>>>>>>>>>>>

--001a113d05f083cffe0532acceff
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Cool, I will study it.</div><div class=3D"gmail_extra"><br=
><div class=3D"gmail_quote">On Thu, May 12, 2016 at 2:17 PM, Dan Burkert <s=
pan dir=3D"ltr">&lt;<a href=3D"mailto:dan@cloudera.com" target=3D"_blank">d=
an@cloudera.com</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" =
style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><di=
v dir=3D"ltr"><br><div class=3D"gmail_extra"><span class=3D""><br><div clas=
s=3D"gmail_quote">On Thu, May 12, 2016 at 2:04 PM, Sand Stone <span dir=3D"=
ltr">&lt;<a href=3D"mailto:sand.m.stone@gmail.com" target=3D"_blank">sand.m=
.stone@gmail.com</a>&gt;</span> wrote:</div></span><div class=3D"gmail_quot=
e"><span class=3D""><br><blockquote class=3D"gmail_quote" style=3D"margin:0=
 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr"><sp=
an><div>&gt;<span style=3D"font-size:13px">Instead, take advantage of the i=
ndex capability of Primary Keys.</span></div></span><div>Currently I did ma=
ke the &quot;5-min&quot;=C2=A0field a part of the primary key as well. I am=
 most likely overdoing it. I will play around with the schema and use cases=
 around it.=C2=A0</div></div></blockquote><div>=C2=A0</div></span><div>Defi=
nitely take a look at the data model in Kudu TS, it has extremely efficient=
 scan semantics (all scans retrieve only the necessary data by using the pr=
imary key, no client *or* server side filtering), and it works with arbitra=
rily large time range partitions.</div><span class=3D""><div>=C2=A0</div><b=
lockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px =
#ccc solid;padding-left:1ex"><div dir=3D"ltr"><span><div></div><div>&gt;<sp=
an style=3D"font-size:13px">since each tablet server should only have on th=
e order of 10-20 tablets.</span></div></span><div>How does this 10-20 heuri=
stics come out? Is it based on certain machine profile? Or some default par=
ameters in the code/config?</div></div></blockquote><div><br></div></span><=
div>10-20 isn&#39;t a hard and fast rule, and it is very dependent on the d=
ataset and the machine size.=C2=A0 We routinely run tablet servers with 300=
+ tablets, so it&#39;s definitely not a hard limitation.=C2=A0 My intention=
 was to stress that instead of pursuing fine grained partitions as a method=
 of limiting the size of scans, instead take advantage of the primary key i=
ndexing that Kudu provides, just as you might in a traditional relational d=
atabase.=C2=A0 Partitions are better suited for ensuring that inserts and l=
arge scans can be parallelized across multiple machines.</div><div><div cla=
ss=3D"h5"><div><br></div><div>- Dan</div><div>=C2=A0</div><blockquote class=
=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padd=
ing-left:1ex"><div><div><div class=3D"gmail_extra"><br><div class=3D"gmail_=
quote">On Thu, May 12, 2016 at 1:45 PM, Dan Burkert <span dir=3D"ltr">&lt;<=
a href=3D"mailto:dan@cloudera.com" target=3D"_blank">dan@cloudera.com</a>&g=
t;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0 0 0=
 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"ltr"><div cl=
ass=3D"gmail_extra"><span><br><div class=3D"gmail_quote">On Thu, May 12, 20=
16 at 11:39 AM, Sand Stone <span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.s=
tone@gmail.com" target=3D"_blank">sand.m.stone@gmail.com</a>&gt;</span> wro=
te:</div></span><div class=3D"gmail_quote"><span><br><blockquote class=3D"g=
mail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-=
left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div =
dir=3D"ltr"><div>I don&#39;t know how Kudu load balance the data across the=
 tablet servers.</div></div></blockquote><div><br></div></span><div>Individ=
ual tablets are replicated and balanced across all available tablet servers=
, for more on that see=C2=A0<a href=3D"http://getkudu.io/docs/schema_design=
.html#data-distribution" target=3D"_blank">http://getkudu.io/docs/schema_de=
sign.html#data-distribution</a>.</div><span><div><br></div><div>=C2=A0</div=
><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border=
-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);=
padding-left:1ex"><div dir=3D"ltr"><div> For example, do I need to pre-calc=
ulate every day, a list of 5 minutes apart timestamps at table creation? [a=
ssume I have to create a new table every day].</div></div></blockquote><div=
><br></div></span><div>If you wish to range partition on the time column, t=
hen yes, currently you must specify the splits upfront during table creatio=
n (but this will change with the non-covering range partitions work).</div>=
<span><div>=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"margin:0p=
x 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-c=
olor:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div><br></div><di=
v>My hope, with the additional 5-min column, and use it as the range partit=
ion column, is that so I could spread the data evenly across the tablet ser=
vers.</div></div></blockquote><div><br></div></span><div>I don&#39;t think =
this is meaningfully different than range partitioning on the full time col=
umn with splits every 5 minutes.</div><span><div>=C2=A0=C2=A0</div><blockqu=
ote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-wid=
th:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-l=
eft:1ex"><div dir=3D"ltr"><div>Also, since 5-min interval data are always c=
olocated together, the read query could be efficient too.=C2=A0</div></div>=
</blockquote><div><br></div></span><div>Data colocation is a function of th=
e partitioning and indexing.=C2=A0 As I mentioned before, if you have times=
tamp as part of your primary key then you can guarantee that scans specifyi=
ng a time range are efficient. Overall it sounds like you are attempting to=
 get fast scans by creating many fine grained partitions, as you might with=
 Parquet.=C2=A0 This won&#39;t be an efficient strategy in Kudu, since each=
 tablet server should only have on the order of 10-20 tablets.=C2=A0 Instea=
d, take advantage of the index capability of Primary Keys.</div><div><div><=
div>=C2=A0</div><div>- Dan</div><div>=C2=A0</div><blockquote class=3D"gmail=
_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left=
-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div><div=
><div class=3D"gmail_extra"><div class=3D"gmail_quote">On Thu, May 12, 2016=
 at 11:13 AM, Dan Burkert <span dir=3D"ltr">&lt;<a href=3D"mailto:dan@cloud=
era.com" target=3D"_blank">dan@cloudera.com</a>&gt;</span> wrote:<br><block=
quote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-w=
idth:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding=
-left:1ex"><div dir=3D"ltr">Forgot to add the PK specification to the CREAT=
E TABLE, it should have read as follows:<div><br></div><div><div style=3D"f=
ont-size:12.8px"><span>CREATE TABLE metrics (metric STRING, time TIMESTAMP,=
 value DOUBLE)<br></span>PRIMARY KEY (metric, time);</div></div><div style=
=3D"font-size:12.8px"><br></div><div style=3D"font-size:12.8px">- Dan</div>=
<div><br></div></div><div><div><div class=3D"gmail_extra"><br><div class=3D=
"gmail_quote">On Thu, May 12, 2016 at 11:12 AM, Dan Burkert <span dir=3D"lt=
r">&lt;<a href=3D"mailto:dan@cloudera.com" target=3D"_blank">dan@cloudera.c=
om</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"marg=
in:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-l=
eft-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div class=3D=
"gmail_extra"><br><div class=3D"gmail_quote"><span>On Thu, May 12, 2016 at =
11:05 AM, Sand Stone <span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.stone@g=
mail.com" target=3D"_blank">sand.m.stone@gmail.com</a>&gt;</span> wrote:<br=
><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border=
-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);=
padding-left:1ex"><div dir=3D"ltr"><span>&gt;<span style=3D"font-size:13px"=
>=C2=A0Is the requirement to pre-aggregate by time window?</span></span><di=
v>No, I am thinking to create a column say, &quot;minute&quot;. It&#39;s=C2=
=A0basically the minute field of the timestamp column(even round to 5-min b=
ucket depending on the needs). So it&#39;s a computed column being filled i=
n on data ingestion. My goal is that this field would help with data filter=
ing at read/query time, say select certain projection at minute 10-15, to s=
peed up the read queries.=C2=A0</div></div></blockquote><div><br></div></sp=
an><div>In many cases, Kudu can do his for you without having to add specia=
l columns.=C2=A0 The requirements are that the timestamp is part of the pri=
mary key, and any columns that come before the timestamp in the primary key=
 (if it&#39;s a compound PK), have equality predicates.=C2=A0 So for instan=
ce, if you create a table such as:</div><div><br></div><div>CREATE TABLE me=
trics (metric STRING, time TIMESTAMP, value DOUBLE);</div><div><br></div><d=
iv>then queries such as=C2=A0</div><div><br></div><div>SELECT time, value F=
ROM metrics WHERE metric =3D &quot;my-metric&quot; AND time &gt; 2016-05-01=
T00:00 AND time &lt; 2016-05-01T00:05</div><div><br></div><div>Then only th=
e data for that 5 minute time window will be read from disk.=C2=A0 If the q=
uery didn&#39;t have the equality predicate on the &#39;metric&#39; column,=
 then it would do a much bigger scan + filter operation.=C2=A0 If you want =
more background on how this is achieved, check out the partition pruning de=
sign doc:=C2=A0<a href=3D"https://github.com/apache/incubator-kudu/blob/mas=
ter/docs/design-docs/scan-optimization-partition-pruning.md" target=3D"_bla=
nk">https://github.com/apache/incubator-kudu/blob/master/docs/design-docs/s=
can-optimization-partition-pruning.md</a>.</div><span><font color=3D"#88888=
8"><div><br></div><div>- Dan</div></font></span><div><div><div><br></div><d=
iv>=C2=A0</div><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0p=
x 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb=
(204,204,204);padding-left:1ex"><div dir=3D"ltr"><div></div><div>Thanks for=
 the info., I will follow them.=C2=A0</div></div><div><div><div class=3D"gm=
ail_extra"><br><div class=3D"gmail_quote">On Thu, May 12, 2016 at 10:50 AM,=
 Dan Burkert <span dir=3D"ltr">&lt;<a href=3D"mailto:dan@cloudera.com" targ=
et=3D"_blank">dan@cloudera.com</a>&gt;</span> wrote:<br><blockquote class=
=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;bo=
rder-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">=
<div dir=3D"ltr">Hey Sand,<div><br></div><div>Sorry for the delayed respons=
e.=C2=A0 I&#39;m not quite following your use case.=C2=A0 Is the requiremen=
t to pre-aggregate by time window? I don&#39;t think Kudu can help you dire=
ctly with that (nothing built in), but you could always create a separate t=
able to store the pre-aggregated values.=C2=A0 As far as applying functions=
 to do row splits, that is an interesting idea, but I think once Kudu has s=
upport for range bounds (the non-covering range partition design doc linked=
 above), you can simply create the bounds where the function would have put=
 them.=C2=A0 For example, if you want a partition for every five minutes, y=
ou can create the bounds accordingly.</div><div><br></div><div>Earlier this=
 week I gave a talk on timeseries in Kudu, I&#39;ve included some slides th=
at may be interesting to you.=C2=A0 Additionally, you may want to check out=
=C2=A0<a href=3D"https://github.com/danburkert/kudu-ts" target=3D"_blank">h=
ttps://github.com/danburkert/kudu-ts</a>, it&#39;s a very young =C2=A0(not =
feature complete) metrics layer on top of Kudu, it may give you some ideas.=
</div><span><font color=3D"#888888"><div><br></div><div>- Dan</div></font><=
/span></div><div><div><div class=3D"gmail_extra"><br><div class=3D"gmail_qu=
ote">On Sat, May 7, 2016 at 1:28 PM, Sand Stone <span dir=3D"ltr">&lt;<a hr=
ef=3D"mailto:sand.m.stone@gmail.com" target=3D"_blank">sand.m.stone@gmail.c=
om</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"marg=
in:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-l=
eft-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr">Thanks for sh=
aring, Dan. The diagrams explained clearly how the current system works.=C2=
=A0<div><br></div><div>As for things in my mind. Take the schema of &lt;hos=
t,metric,time,...&gt;, say, I am interested in data for the past 5 mins, 10=
 mins, etc. Or, aggregate at 5 mins interval for the past 3 days, 7 days, .=
.. Looks like I need to introduce a special 5-min bar column, use that colu=
mn to do range partition to spread data across the tablet servers so that I=
 could leverage parallel filtering.=C2=A0<div><br></div><div>The cost of th=
is extra column (INT8) is not ideal but not too bad either (storage cost wi=
se, compression should do wonders). So I am thinking whether it would be be=
tter to take &quot;functions&quot; as row split instead of only constants. =
Of course if business requires to drop down to 1-min bar, the data has to b=
e re-sharded again. So a more cost effective way of doing this on a product=
ion cluster would be good.=C2=A0</div><div><br></div><div><br></div><div><b=
r></div></div></div><div><div><div class=3D"gmail_extra"><br><div class=3D"=
gmail_quote">On Sat, May 7, 2016 at 8:50 AM, Dan Burkert <span dir=3D"ltr">=
&lt;<a href=3D"mailto:dan@cloudera.com" target=3D"_blank">dan@cloudera.com<=
/a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:=
0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left=
-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr">Hi Sand,<div><br=
></div><div>I&#39;ve been working on some diagrams to help explain some of =
the more advanced partitioning types, it&#39;s attached. =C2=A0 Still prett=
y rough at this point, but the goal is to clean it up and move it into the =
Kudu documentation proper.=C2=A0 I&#39;m interested to hear what kind of ti=
me series you are interested in Kudu for.=C2=A0 I&#39;m tasked with improvi=
ng Kudu for time series, you can follow progress <a href=3D"https://issues.=
apache.org/jira/browse/KUDU-1306" target=3D"_blank">here</a>. If you have a=
ny additional ideas I&#39;d love to hear them.=C2=A0 You may also be intere=
sted in a small project that a JD and I have been working on in the past we=
ek to build an OpenTSDB style store on top of Kudu, you can find it=C2=A0<a=
 href=3D"https://github.com/danburkert/kudu-ts" target=3D"_blank">here</a>.=
=C2=A0 Still quite feature limited at this point.</div><span><font color=3D=
"#888888"><div><br></div><div>- Dan</div></font></span></div><div><div><div=
 class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Fri, May 6, 2016 a=
t 4:51 PM, Sand Stone <span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.stone@=
gmail.com" target=3D"_blank">sand.m.stone@gmail.com</a>&gt;</span> wrote:<b=
r><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;borde=
r-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204)=
;padding-left:1ex"><div dir=3D"ltr">Thanks. Will read.=C2=A0<div><br></div>=
<div>Given that I am researching time series data, row locality is crucial =
:-) =C2=A0</div></div><div><div><div class=3D"gmail_extra"><br><div class=
=3D"gmail_quote">On Fri, May 6, 2016 at 3:57 PM, Jean-Daniel Cryans <span d=
ir=3D"ltr">&lt;<a href=3D"mailto:jdcryans@apache.org" target=3D"_blank">jdc=
ryans@apache.org</a>&gt;</span> wrote:<br><blockquote class=3D"gmail_quote"=
 style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:=
solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"=
>We do have non-covering range partitions coming in the next few months, he=
re&#39;s the design (in review):=C2=A0<a href=3D"http://gerrit.cloudera.org=
:8080/#/c/2772/9/docs/design-docs/non-covering-range-partitions.md" target=
=3D"_blank">http://gerrit.cloudera.org:8080/#/c/2772/9/docs/design-docs/non=
-covering-range-partitions.md</a><div><br></div><div>The &quot;Background &=
amp; Motivation&quot; section should give you a good idea of why I&#39;m me=
ntioning this.</div><div><br></div><div>Meanwhile, if you don&#39;t need ro=
w locality, using hash partitioning could be good enough.</div><span><font =
color=3D"#888888"><div><br></div><div>J-D</div></font></span></div><div><di=
v><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">On Fri, May 6, =
2016 at 3:53 PM, Sand Stone <span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.=
stone@gmail.com" target=3D"_blank">sand.m.stone@gmail.com</a>&gt;</span> wr=
ote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex=
;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,20=
4,204);padding-left:1ex"><div dir=3D"ltr">Makes sense.=C2=A0<div><br></div>=
<div>Yeah it would be cool if users could specify/control the split rows af=
ter the table is created. Now, I have to &quot;think ahead&quot; to pre-cre=
ate the range buckets.=C2=A0</div></div><div><div><div class=3D"gmail_extra=
"><br><div class=3D"gmail_quote">On Fri, May 6, 2016 at 3:49 PM, Jean-Danie=
l Cryans <span dir=3D"ltr">&lt;<a href=3D"mailto:jdcryans@apache.org" targe=
t=3D"_blank">jdcryans@apache.org</a>&gt;</span> wrote:<br><blockquote class=
=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;bo=
rder-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">=
<div dir=3D"ltr">You will only get 1 tablet and no data distribution, which=
 is bad.<div><br></div><div>That&#39;s also how HBase works, but it will sp=
lit regions as you insert data and eventually you&#39;ll get some data dist=
ribution even if it doesn&#39;t start in an ideal situation. Tablet splitti=
ng will come later for Kudu.</div><span><font color=3D"#888888"><div><br></=
div><div>J-D</div></font></span></div><div><div><div class=3D"gmail_extra">=
<br><div class=3D"gmail_quote">On Fri, May 6, 2016 at 3:42 PM, Sand Stone <=
span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.stone@gmail.com" target=3D"_b=
lank">sand.m.stone@gmail.com</a>&gt;</span> wrote:<br><blockquote class=3D"=
gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border=
-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">=
<div dir=3D"ltr">One more question, how does the range partition work i=
f I don=
&#39;t specify the split rows?=C2=A0<div><br></div><div>Thanks!=C2=A0</div>=
</div><div><div><div class=3D"gmail_extra"><br><div class=3D"gmail_quote">O=
n Fri, May 6, 2016 at 3:37 PM, Sand Stone <span dir=3D"ltr">&lt;<a href=3D"=
mailto:sand.m.stone@gmail.com" target=3D"_blank">sand.m.stone@gmail.com</a>=
&gt;</span> wrote:<br><blockquote class=3D"gmail_quote" style=3D"margin:0px=
 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-co=
lor:rgb(204,204,204);padding-left:1ex"><div dir=3D"ltr"><font color=3D"#000=
000" face=3D"verdana, sans-serif">Thanks, Misty. The &quot;advanced&quot; I=
mpala example helped.=C2=A0</font><div><span style=3D"background-color:rgb(=
255,255,255)"><font color=3D"#000000" face=3D"verdana, sans-serif"><br></fo=
nt></span></div><div><span style=3D"background-color:rgb(255,255,255)"><fon=
t color=3D"#000000" face=3D"verdana, sans-serif">I was just reading the Jav=
a API, CreateTableOptions.java, and it&#39;s unclear how the range partitio=
n column names are associated with the partial rows params in the=C2=A0<spa=
n style=3D"line-height:16.8px;white-space:pre-wrap">addSplitRow API.</span>=
</font>=
</span></div></div><div><div><div class=3D"gmail_extra"><br><div class=3D"g=
mail_quote">On Fri, May 6, 2016 at 3:08 PM, Misty Stanley-Jones <span dir=
=3D"ltr">&lt;<a href=3D"mailto:mstanleyjones@cloudera.com" target=3D"_blank=
">mstanleyjones@cloudera.com</a>&gt;</span> wrote:<br><blockquote class=3D"=
gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;border=
-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div=
 dir=3D"ltr">Hi Sand,<div><br></div><div>Please have a look at=C2=A0<a href=
=3D"http://getkudu.io/docs/kudu_impala_integration.html#partitioning_tables=
" target=3D"_blank">http://getkudu.io/docs/kudu_impala_integration.html#par=
titioning_tables</a> and see if it is helpful to you.</div><div><br></div><=
div>Thanks,</div><div>Misty</div></div><div><div><div class=3D"gmail_extra"=
><br><div class=3D"gmail_quote">On Fri, May 6, 2016 at 2:00 PM, Sand Stone =
<span dir=3D"ltr">&lt;<a href=3D"mailto:sand.m.stone@gmail.com" target=3D"_=
blank">sand.m.stone@gmail.com</a>&gt;</span> wrote:<br><blockquote class=3D=
"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-left-width:1px;borde=
r-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><di=
v dir=3D"ltr">Hi, I am new to Kudu. I wonder how the split rows work. I kno=
w from some docs that this is currently for pre-creating the table. I am re=
searching how to partition (hash+range) some time series test data.=C2=A0<d=
iv><br></div><div>Is there an example, or notes somewhere I could read up o=
n?=C2=A0<br></div><div><br></div><div>Thanks much.=C2=A0</div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>
</div></div></blockquote></div></div></div><br></div></div>
</blockquote></div><br></div>

--001a113d05f083cffe0532acceff--