Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 6D780200B8B for ; Tue, 4 Oct 2016 14:29:41 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 6A684160AC9; Tue, 4 Oct 2016 12:29:41 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 3E4A8160AC5 for ; Tue, 4 Oct 2016 14:29:40 +0200 (CEST) Received: (qmail 7716 invoked by uid 500); 4 Oct 2016 12:29:38 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 7706 invoked by uid 99); 4 Oct 2016 12:29:38 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 04 Oct 2016 12:29:38 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 4D6731A10D3 for ; Tue, 4 Oct 2016 12:29:38 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.379 X-Spam-Level: ** X-Spam-Status: No, score=2.379 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=jaumo.com Received: from mx2-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id BZ1bFpow2rET for ; Tue, 4 Oct 2016 12:29:35 +0000 (UTC) Received: from mail-oi0-f50.google.com (mail-oi0-f50.google.com [209.85.218.50]) by mx2-lw-us.apache.org (ASF Mail Server at mx2-lw-us.apache.org) with ESMTPS id 436DD5F399 for ; Tue, 4 Oct 2016 12:29:35 +0000 (UTC) Received: by mail-oi0-f50.google.com with SMTP id n132so168204261oih.1 for ; Tue, 04 Oct 2016 05:29:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jaumo.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=2Svq7LzaGkbWWZOv3NOOFwKDEym4mHAKywRqwkht2KM=; b=HU7or+xi/a/qLaNKSwDNLN3KbuUVgKFhHejtqLQvacZQSRwT+phy3CjbKx1FhKzSeS Tua1+1NyewRUsLRSicuS3CvlSpalvo6gAZv2L1gGfu/hxQjsq+F83EgejFCraPyw3ipF 9nDErzcwbuOF2g1mrq2SLgi1C6qX29fkBoJvI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=2Svq7LzaGkbWWZOv3NOOFwKDEym4mHAKywRqwkht2KM=; b=WDNjmpFIbhTwTLiJPGOt/UwLH0SV002zlCmNFltsCof57atwN12UV0zpf+8T/7YMZg h1s5nJpWBRAJi8E+xvjSO6eHdA2lH7rqTfP9hdI1F3ttOqk7mLwVYpD7H3c0bcij9Qr8 I2HtIWc3pPbsD8uEFAmXPCISsUmk57PYR5mJqQxay+8Kgj3PWSCiHSLBCGVRz9cEexor Vd/Aj1k0G9Zd+8Rft+TuZ5a14zRWaaC6saiUoHBjoZgnW/g3GX2+0gsc7sJnQ+sYvOha Pn/D5cyMT1VuCeOV0NEURNl6+FMxK0HLpxbjeJrVt1X7Qnnf29kOCRuUPBMPx07MCqfq gwgA== X-Gm-Message-State: AA6/9RkIx8ZeOjOtwDD2UfnrIzXHqbcKc4iPy8cMpXvYQFbnljmuM1x2kgzfZRYfHpq1ASFhvlR6onXCSAu01JtL X-Received: by 10.157.16.27 with SMTP id h27mr1990005ote.192.1475584174344; Tue, 04 Oct 2016 05:29:34 -0700 (PDT) MIME-Version: 1.0 Received: by 10.182.148.104 with HTTP; Tue, 4 Oct 2016 05:29:33 -0700 (PDT) In-Reply-To: References: From: Benjamin Roth Date: Tue, 4 Oct 2016 14:29:33 +0200 Message-ID: Subject: Re: Efficient model for a sorting To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=001a113dffc08c61d5053e093666 archived-at: Tue, 04 Oct 2016 12:29:41 -0000 --001a113dffc08c61d5053e093666 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Thanks guys! Good to know, that my approach is basically right, but I will check that lucene indices by time. 2016-10-04 14:22 GMT+02:00 DuyHai Doan : > "What scatter/gather? " > > http://www.slideshare.net/doanduyhai/sasi-cassandra-on- > the-full-text-search-ride-voxxed-daybelgrade-2016/23 > > "If you partition your data by user_id then you query only 1 shard to get > sorted by time visitors for a user" > > Exact, but in this case, you're using a 2nd index only for sorting right = ? > For SASI it's not even possible. Maybe it can work with Statrio Lucene im= pl > > On Tue, Oct 4, 2016 at 2:15 PM, Dorian Hoxha > wrote: > >> @DuyHai >> >> What scatter/gather? If you partition your data by user_id then you quer= y >> only 1 shard to get sorted by time visitors for a user. >> >> On Tue, Oct 4, 2016 at 2:09 PM, DuyHai Doan wrote= : >> >>> MV is right now your best choice for this kind of sorting behavior. >>> >>> Secondary index (whatever the impl, SASI or Lucene) has a cost of >>> scatter-gather if your cluster scale out. With MV you're at least >>> guaranteed to hit a single node everytime >>> >>> On Tue, Oct 4, 2016 at 1:56 PM, Dorian Hoxha >>> wrote: >>> >>>> Can you use the lucene index https://github.com/Stratio/cas >>>> sandra-lucene-index ? >>>> >>>> On Tue, Oct 4, 2016 at 1:27 PM, Benjamin Roth >>>> wrote: >>>> >>>>> Hi! >>>>> >>>>> I have a frequently used pattern which seems to be quite costly in CS= . >>>>> The pattern is always the same: I have a unique key and a sorting by = a >>>>> different field. >>>>> >>>>> To give an example, here a real life example from our model: >>>>> CREATE TABLE visits.visits_in ( >>>>> user_id int, >>>>> user_id_visitor int, >>>>> created timestamp, >>>>> PRIMARY KEY (user_id, user_id_visitor) >>>>> ) WITH CLUSTERING ORDER BY (user_id_visitor ASC) >>>>> >>>>> CREATE MATERIALIZED VIEW visits.visits_in_sorted_mv AS >>>>> SELECT user_id, created, user_id_visitor >>>>> FROM visits.visits_in >>>>> WHERE user_id IS NOT NULL AND created IS NOT NULL AND >>>>> user_id_visitor IS NOT NULL >>>>> PRIMARY KEY (user_id, created, user_id_visitor) >>>>> WITH CLUSTERING ORDER BY (created DESC, user_id_visitor DESC) >>>>> >>>>> This simply represents people, that visited my profile sorted by date >>>>> desc but only one entry per visitor. >>>>> Other examples with the same pattern could be a whats-app-like inbox >>>>> where the last message of each sender is shown by date desc. There ar= e lots >>>>> of examples for that pattern. >>>>> >>>>> E.g. in redis I'd just use a sorted set, where the key could be like >>>>> "visits_${user_id}", set key would be user_id_visitor and score >>>>> the created timestamp. >>>>> In MySQL I'd create the table with PK on user_id + user_id_visitor an= d >>>>> create an index on user_id + created >>>>> In C* i use an MV. >>>>> >>>>> Is this the most efficient approach? >>>>> I also could have done this without an MV but then the situation in >>>>> our app would be far more complex. >>>>> I know that denormalization is a common pattern in C* and I don't >>>>> hesitate to use it but in this case, it is not as simple as it's not = an >>>>> append-only case but updates have to be handled correctly. >>>>> If it is the first visit of a user, it's that simple, just 2 inserts >>>>> in base table + denormalized table. But on a 2nd or 3rd visit, the 1s= t or >>>>> 2nd visit has to be deleted from the denormalized table before. Other= wise >>>>> the visit would not be unique any more. >>>>> Handling this case without an MV requires a lot more effort, I guess >>>>> even more effort than just using an MV. >>>>> 1. You need kind of app-side locking to deal with race conditions >>>>> 2. Read before write is required to determine if an old record has to >>>>> be deleted >>>>> 3. At least CL_QUORUM is required to make sure that read before write >>>>> is always consistent >>>>> 4. Old record has to be deleted on update >>>>> >>>>> I guess, using an MV here is more efficient as there is less roundtri= p >>>>> between C* and the app to do all that and the MV does not require str= ong >>>>> consistency as MV updates are always local and are eventual consisten= t when >>>>> the base table is. So there is also no need for distributed locks. >>>>> >>>>> I ask all this as we now use CS 3.x and have been advised that 3.x is >>>>> still not considered really production ready. >>>>> >>>>> I guess in a perfect world, this wouldn't even require an MV if SASI >>>>> indexes could be created over more than 1 column. E.g. in MySQL this = case >>>>> is nothing else than a BTree. AFAIK SASI indices are also BTrees, fil= tering >>>>> by Partition Key (which should to be done anyway) and sorting by a fi= eld >>>>> would perfectly do the trick. But from the docs, this is not possible= right >>>>> now. >>>>> >>>>> Does anyone see a better solution or are all my assumptions correct? >>>>> >>>>> -- >>>>> Benjamin Roth >>>>> Prokurist >>>>> >>>>> Jaumo GmbH =C2=B7 www.jaumo.com >>>>> Wehrstra=C3=9Fe 46 =C2=B7 73035 G=C3=B6ppingen =C2=B7 Germany >>>>> Phone +49 7161 304880-6 =C2=B7 Fax +49 7161 304880-1 >>>>> AG Ulm =C2=B7 HRB 731058 =C2=B7 Managing Director: Jens Kammerer >>>>> >>>> >>>> >>> >> > --=20 Benjamin Roth Prokurist Jaumo GmbH =C2=B7 www.jaumo.com Wehrstra=C3=9Fe 46 =C2=B7 73035 G=C3=B6ppingen =C2=B7 Germany Phone +49 7161 304880-6 =C2=B7 Fax +49 7161 304880-1 AG Ulm =C2=B7 HRB 731058 =C2=B7 Managing Director: Jens Kammerer --001a113dffc08c61d5053e093666 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Thanks guys!

Good to know, that my appr= oach is basically right, but I will check that lucene indices by time.

2016-10-04= 14:22 GMT+02:00 DuyHai Doan <doanduyhai@gmail.com>:
<= blockquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px= #ccc solid;padding-left:1ex">
"What scatter/gather? "=C2=A0


"If you partition yo= ur data by user_id then you query only 1 shard to get sorted by time visito= rs for a user"

<= /span>
Exact, but in this= case, you're using a 2nd index only for sorting right ? For SASI it= 9;s not even possible. Maybe it can work with Statrio Lucene impl

On Tue, Oct 4, 2016 at 2:15 PM, Dorian Hoxh= a <dorian.hoxha@gmail.com> wrote:
@DuyHai

What scatter/gather? If you p= artition your data by user_id then you query only 1 shard to get sorted by = time visitors for a user.

On Tue, Oct 4, 2016 at 2:09 PM, DuyHai Doan <d= oanduyhai@gmail.com> wrote:
MV is right now your best choice for this kind of sorting= behavior.

Secondary index (whatever the impl, SASI or L= ucene) has a cost of scatter-gather if your cluster scale out. With MV you&= #39;re at least guaranteed to hit a single node everytime

<= div class=3D"gmail_quote">On Tue, Oct 4, 2016 at 1:56 PM, Dorian Hoxha <dorian.hoxha@gmail.com> wrote:

On Tue, Oct 4, 2016 a= t 1:27 PM, Benjamin Roth <benjamin.roth@jaumo.com> wro= te:
Hi!
<= br>
I have a frequently used pattern which seems to be quite cost= ly in CS. The pattern is always the same: I have a unique key and a sorting= by a different field.

To give an example, here a = real life example from our model:
CREATE TABLE visits.visits_in (
=C2=A0 =C2=A0 user_id int,<= /div>
=C2=A0 =C2=A0 user_id_visitor int,
=C2=A0 =C2=A0 create= d timestamp,
=C2=A0 =C2=A0 PRIMARY KEY (user_id, user_id_visitor)=
) WITH CLUSTERING ORDER BY (user_id_visitor ASC)

= CREATE MATERIALIZED VIEW visits.visits_in_sorted_mv AS
=C2=A0 =C2= =A0 SELECT user_id, created, user_id_visitor
=C2=A0 =C2=A0 FROM v= isits.visits_in
=C2=A0 =C2=A0 WHERE user_id IS NOT NULL AND creat= ed IS NOT NULL AND user_id_visitor IS NOT NULL
=C2=A0 =C2=A0 PRIM= ARY KEY (user_id, created, user_id_visitor)
=C2=A0 =C2=A0 WITH CL= USTERING ORDER BY (created DESC, user_id_visitor DESC)

=
This simply represents people, that visited my profile sorted by date = desc but only one entry per visitor.
Other examples with the same= pattern could be a whats-app-like inbox where the last message of each sen= der is shown by date desc. There are lots of examples for that pattern.

E.g. in redis I'd just use a sorted s= et, where the key could be like "visits_${user_id}", set key woul= d be=C2=A0user_id_visitor=C2=A0and = score the=C2=A0created=C2=A0timestamp.
In MySQL I'd creat= e the table with PK on user_id + user_id_visitor and create an index on use= r_id + created
In C* i use an MV.

Is thi= s the most efficient approach?
I also could have done this withou= t an MV but then the situation in our app would be far more complex.
<= div>I know that denormalization is a common pattern in C* and I don't h= esitate to use it but in this case, it is not as simple as it's not an = append-only case but updates have to be handled correctly.
If it = is the first visit of a user, it's that simple, just 2 inserts in base = table + denormalized table. But on a 2nd or 3rd visit, the 1st or 2nd visit= has to be deleted from the denormalized table before. Otherwise the visit = would not be unique any more.
Handling this case without an MV re= quires a lot more effort, I guess even more effort than just using an MV.= =C2=A0
1. You need kind of app-side locking to deal with race con= ditions
2. Read before write is required to determine if an old r= ecord has to be deleted
3. At least CL_QUORUM is required to make= sure that read before write is always consistent
4. Old record h= as to be deleted on update

I guess, using an MV he= re is more efficient as there is less roundtrip between C* and the app to d= o all that and the MV does not require strong consistency as MV updates are= always local and are eventual consistent when the base table is. So there = is also no need for distributed locks.

I ask all t= his as we now use CS 3.x and have been advised that 3.x is still not consid= ered really production ready.

I guess in a perfect= world, this wouldn't even require an MV if SASI indexes could be creat= ed over more than 1 column. E.g. in MySQL this case is nothing else than a = BTree. AFAIK SASI indices are also BTrees, filtering by Partition Key (whic= h should to be done anyway) and sorting by a field would perfectly do the t= rick. But from the docs, this is not possible right now.

=
Does anyone see a better solution or are all my assumptions= correct?

--
Benjamin Roth
Prokurist

Jaumo GmbH =C2=B7 = www.jaumo.com
Weh= rstra=C3=9Fe 46 =C2=B7 73035 G=C3=B6ppingen =C2=B7 Germany
Phone +49 7161 304880-6 =C2=B7 Fax +49 7161 304880-1
AG Ulm = =C2=B7 HRB 731058 =C2=B7 Managing Director: Jens Kammerer







--
=
Benjamin Roth
Prokurist

Jaumo GmbH =C2=B7 www.jaumo.com
Wehrstra=C3=9Fe 4= 6 =C2=B7 73035 G=C3=B6ppingen =C2=B7 Germany
Phone +49 7161 304880-6 =C2= =B7 Fax +49 7161 304880-1
AG Ulm =C2=B7 HRB 731058 =C2=B7 Managing Direc= tor: Jens Kammerer
--001a113dffc08c61d5053e093666--