Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
In-Reply-To: <CABNXB2CBApuKg6mVBoi6b028CFeEQ78rLZr53wrnKhHAdZutUQ@mail.gmail.com>
References: <CAHucCEEh2VoJwoYAWVStewBGBj4RvhXQBAT0tnB5Lf2PBXyPQg@mail.gmail.com>
 <CANsFX07uTn6Mb+83YDJZ9-pHfU-UjqZArd63QfijXQ-30SkGHQ@mail.gmail.com>
 <CABNXB2DwRppgiMB4g0WcGodwsBOMRR7Soy2BK6=+-UrpyFuu-A@mail.gmail.com>
 <CANsFX055jtM0QxhsU=EY3QwZdawfhVDBh_SJNjiAR71rvA5VLA@mail.gmail.com> <CABNXB2CBApuKg6mVBoi6b028CFeEQ78rLZr53wrnKhHAdZutUQ@mail.gmail.com>
From: Benjamin Roth <benjamin.roth@jaumo.com>
Date: Tue, 4 Oct 2016 14:29:33 +0200
Message-ID: <CAHucCEGrJ6gBatu_cGdgwT1KXmc4FU7h7PxnsMAHHr2Q6njXQQ@mail.gmail.com>
Subject: Re: Efficient model for a sorting
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a113dffc08c61d5053e093666
archived-at: Tue, 04 Oct 2016 12:29:41 -0000

--001a113dffc08c61d5053e093666
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Thanks guys!

Good to know that my approach is basically right, but I will check out the
Lucene indices in due time.

2016-10-04 14:22 GMT+02:00 DuyHai Doan <doanduyhai@gmail.com>:

> "What scatter/gather? "
>
> http://www.slideshare.net/doanduyhai/sasi-cassandra-on-the-full-text-search-ride-voxxed-daybelgrade-2016/23
>
> "If you partition your data by user_id then you query only 1 shard to get
> sorted by time visitors for a user"
>
> Exactly, but in this case you're using a 2nd index only for sorting, right?
> For SASI it's not even possible. Maybe it can work with the Stratio Lucene
> impl.
>
> On Tue, Oct 4, 2016 at 2:15 PM, Dorian Hoxha <dorian.hoxha@gmail.com>
> wrote:
>
>> @DuyHai
>>
>> What scatter/gather? If you partition your data by user_id then you query
>> only 1 shard to get sorted by time visitors for a user.
>>
>> On Tue, Oct 4, 2016 at 2:09 PM, DuyHai Doan <doanduyhai@gmail.com> wrote:
>>
>>> MV is right now your best choice for this kind of sorting behavior.
>>>
>>> A secondary index (whatever the impl, SASI or Lucene) has the cost of a
>>> scatter-gather once your cluster scales out. With an MV you're at least
>>> guaranteed to hit a single node every time.
>>>
>>> On Tue, Oct 4, 2016 at 1:56 PM, Dorian Hoxha <dorian.hoxha@gmail.com>
>>> wrote:
>>>
>>>> Can you use the Lucene index?
>>>> https://github.com/Stratio/cassandra-lucene-index
>>>>
>>>> On Tue, Oct 4, 2016 at 1:27 PM, Benjamin Roth <benjamin.roth@jaumo.com>
>>>> wrote:
>>>>
>>>>> Hi!
>>>>>
>>>>> I have a frequently used pattern which seems to be quite costly in C*.
>>>>> The pattern is always the same: I have a unique key and need to sort by a
>>>>> different field.
>>>>>
>>>>> To give an example, here is a real-life one from our model:
>>>>> -- Base table: one row per (profile owner, visitor) pair
>>>>> CREATE TABLE visits.visits_in (
>>>>>     user_id int,
>>>>>     user_id_visitor int,
>>>>>     created timestamp,
>>>>>     PRIMARY KEY (user_id, user_id_visitor)
>>>>> ) WITH CLUSTERING ORDER BY (user_id_visitor ASC);
>>>>>
>>>>> -- View: same rows, clustered by visit time (newest first)
>>>>> CREATE MATERIALIZED VIEW visits.visits_in_sorted_mv AS
>>>>>     SELECT user_id, created, user_id_visitor
>>>>>     FROM visits.visits_in
>>>>>     WHERE user_id IS NOT NULL AND created IS NOT NULL
>>>>>       AND user_id_visitor IS NOT NULL
>>>>>     PRIMARY KEY (user_id, created, user_id_visitor)
>>>>>     WITH CLUSTERING ORDER BY (created DESC, user_id_visitor DESC);
>>>>>
>>>>> This simply represents the people who visited my profile, sorted by date
>>>>> desc, with only one entry per visitor.
>>>>> Another example of the same pattern is a WhatsApp-like inbox where the
>>>>> last message of each sender is shown by date desc. There are lots of
>>>>> examples of this pattern.
>>>>>
>>>>> E.g. in Redis I'd just use a sorted set, where the key could be like
>>>>> "visits_${user_id}", the set member would be user_id_visitor and the score
>>>>> the created timestamp.
>>>>> In MySQL I'd create the table with a PK on user_id + user_id_visitor and
>>>>> create an index on user_id + created.
>>>>> In C* I use an MV.
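
(Inline note: the sorted-set variant above can be sketched in a few lines of
Python. This is only an in-memory stand-in under my own assumptions, not a
Redis client; record_visit and latest_visitors are made-up names.)

```python
from collections import defaultdict

# Hypothetical stand-in for one sorted set per profile owner:
# user_id -> {user_id_visitor: created}. The member is the visitor,
# the score is the visit time.
visits = defaultdict(dict)

def record_visit(user_id, user_id_visitor, created):
    """Store a visit; a repeat visit overwrites the old score."""
    visits[user_id][user_id_visitor] = created

def latest_visitors(user_id, limit=10):
    """Return (visitor, created) pairs, newest first, one per visitor."""
    entries = visits.get(user_id, {}).items()
    return sorted(entries, key=lambda e: e[1], reverse=True)[:limit]

record_visit(1, 100, 1000)
record_visit(1, 200, 2000)
record_visit(1, 100, 3000)  # visitor 100 comes back later
print(latest_visitors(1))   # [(100, 3000), (200, 2000)]
```

Because the visitor is the member, a repeat visit replaces the old entry
instead of adding a second one. That is the uniqueness I want to keep in C*.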
>>>>>
>>>>> Is this the most efficient approach?
>>>>> I could also have done this without an MV, but then the situation in
>>>>> our app would be far more complex.
>>>>> I know that denormalization is a common pattern in C* and I don't
>>>>> hesitate to use it, but this case is not that simple: it is not
>>>>> append-only, so updates have to be handled correctly.
>>>>> The first visit of a user is simple: just 2 inserts, one into the base
>>>>> table and one into the denormalized table. But on a 2nd or 3rd visit, the
>>>>> 1st or 2nd visit has to be deleted from the denormalized table first.
>>>>> Otherwise the visit would not be unique any more.
>>>>> Handling this case without an MV requires a lot more effort, I guess
>>>>> even more effort than just using an MV:
>>>>> 1. You need some kind of app-side locking to deal with race conditions
>>>>> 2. Read before write is required to determine if an old record has to
>>>>> be deleted
>>>>> 3. At least CL_QUORUM is required to make sure that read before write
>>>>> is always consistent
>>>>> 4. Old record has to be deleted on update
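
(Inline note: steps 2 and 4 above can be sketched roughly like this. It is a
single-threaded Python toy with dicts standing in for the two tables, so the
locking from step 1 and the consistency level from step 3 are not modeled.
All names here are hypothetical, not a Cassandra driver API.)

```python
# Hypothetical in-memory stand-ins for the two tables.
base = {}        # (user_id, user_id_visitor) -> created
by_time = set()  # {(user_id, created, user_id_visitor)}

def record_visit_manual(user_id, user_id_visitor, created):
    key = (user_id, user_id_visitor)
    old_created = base.get(key)  # step 2: read before write
    if old_created is not None:
        # step 4: remove the stale row from the denormalized table
        by_time.discard((user_id, old_created, user_id_visitor))
    base[key] = created
    by_time.add((user_id, created, user_id_visitor))

def visitors_by_time(user_id):
    """Return (created, visitor) pairs, newest first."""
    rows = [(c, v) for (u, c, v) in by_time if u == user_id]
    return sorted(rows, reverse=True)

record_visit_manual(1, 100, 1000)
record_visit_manual(1, 200, 2000)
record_visit_manual(1, 100, 3000)  # repeat visit replaces the old row
print(visitors_by_time(1))         # [(3000, 100), (2000, 200)]
```

Without the read and the delete, the repeat visit would leave two rows for
visitor 100 in the denormalized table.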
>>>>>
>>>>> I guess using an MV here is more efficient, as there are fewer round trips
>>>>> between C* and the app to do all that, and the MV does not require strong
>>>>> consistency, as MV updates are always local and are eventually consistent
>>>>> when the base table is. So there is also no need for distributed locks.
>>>>>
>>>>> I ask all this as we now use CS 3.x and have been advised that 3.x is
>>>>> still not considered really production ready.
>>>>>
>>>>> I guess in a perfect world this wouldn't even require an MV if SASI
>>>>> indexes could be created over more than 1 column. E.g. in MySQL this case
>>>>> is nothing more than a BTree. AFAIK SASI indices are also BTrees, so
>>>>> filtering by partition key (which should be done anyway) and sorting by a
>>>>> field would do the trick perfectly. But according to the docs, this is not
>>>>> possible right now.
>>>>>
>>>>> Does anyone see a better solution or are all my assumptions correct?
>>>>>
>>>>> --
>>>>> Benjamin Roth
>>>>> Prokurist
>>>>>
>>>>> Jaumo GmbH · www.jaumo.com
>>>>> Wehrstraße 46 · 73035 Göppingen · Germany
>>>>> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
>>>>> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>>>>>
>>>>
>>>>
>>>
>>
>


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer

--001a113dffc08c61d5053e093666--