Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id C8527200BBE for ; Fri, 11 Nov 2016 08:24:35 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id C6C32160AF5; Fri, 11 Nov 2016 07:24:35 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 9798C160AE4 for ; Fri, 11 Nov 2016 08:24:34 +0100 (CET) Received: (qmail 57579 invoked by uid 500); 11 Nov 2016 07:24:32 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 57567 invoked by uid 99); 11 Nov 2016 07:24:32 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 11 Nov 2016 07:24:32 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 547E2C0115 for ; Fri, 11 Nov 2016 07:24:32 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.679 X-Spam-Level: * X-Spam-Status: No, score=1.679 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=jaumo.com Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id XNnC8iSCYdMe for ; Fri, 11 Nov 2016 07:24:29 +0000 (UTC) Received: from mail-it0-f44.google.com (mail-it0-f44.google.com [209.85.214.44]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id 0639A5F2F2 for ; Fri, 11 Nov 2016 07:24:28 +0000 (UTC) Received: by mail-it0-f44.google.com with SMTP id e187so59259559itc.0 for ; Thu, 10 Nov 2016 23:24:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jaumo.com; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=Pg1Qp+m+f5cUJWrHbmDhlbK9nyAVfELqHAAYS9DJLJQ=; b=RVvHH38R9j+j8GsPKPtjPiS5NtI4OoldEuZuUZs5j15DlvxcOrMNZj9Ysa+D2lV4LA bsv14p2MOzWcLUfIbQxBftzqT9An+Ccudy3CEp2ry/FWKWzBkOTLI9x405/8htkH//l6 Fo0Xm1q2FUokUeIJNk18O0YQfGXNpi/X1vOCk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=Pg1Qp+m+f5cUJWrHbmDhlbK9nyAVfELqHAAYS9DJLJQ=; b=IDxisRvI1QJtgFWK1TJY06zq070tYKBkJWv/UCwq2Zdo7+IOOpPlPobFJ3IjEtEjQN xfvsNbG8V9KMjQ1AdVxoimL4SDSLcTxO12gzARbeTLUJ6DjkVQPFB7dGqy5ro5m14fj9 2kt33UTpTiCG2khqbvcMMlbN+PXS70BTjFnvuvvvaIqV1W0zisR++CCgvKBNM+8TiGLw 21v6M5x4aZCXjyU4efBxs0iMReUK5MkHX3jmKN9XkHzEzUpMz7fiij9oELw9Q0A+kx6z TfRl0RMczW7Tgc38fywKuRByhkT70mbVFoDsFUkV5hd4Bvmi8JcFWguaVFZtH3M2X8re 7DPA== X-Gm-Message-State: ABUngvea3ngshI//gxujMu8je788J8Kg7DmvzXwh6XI/d6loIsqwbD+jeOhDsc1tho1nd04CM8G7ZceDbcoQxZrs X-Received: by 10.157.43.23 with SMTP id o23mr587076otb.206.1478849068106; Thu, 10 Nov 2016 23:24:28 -0800 (PST) MIME-Version: 1.0 Received: by 10.182.39.72 with HTTP; Thu, 10 Nov 2016 23:24:27 -0800 (PST) Received: by 10.182.39.72 with HTTP; Thu, 10 Nov 2016 23:24:27 -0800 (PST) In-Reply-To: References: <158445dcfc8.ca493bb728302.6473184930597507436@winguzone.com> <158489220c0.bda6a2eb120886.4737303318592947827@winguzone.com> <1584d522620.101caee7d223832.751476256929186922@winguzone.com> From: Benjamin Roth Date: Fri, 11 Nov 2016 08:24:27 +0100 Message-ID: Subject: =?UTF-8?B?UmU6IOetlOWkjTog562U5aSNOiBBIGRpZmZpY3VsdCBkYXRhIG1vZGVsIHdpdGggQyo=?= To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=001a1141db44619bce0541016110 archived-at: Fri, 11 Nov 2016 07:24:36 -0000 --001a1141db44619bce0541016110 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable This is the reason why One would like to use an mv for it. An mv Handels this. It adds a clustering Key while preserving uniqueness of the original pk. Am 11.11.2016 02:33 schrieb "Gang Liu" : > I guess orignal design is keep one record for one video per user. maybe > their app will report many play records when user watching one video. > So there will be many records when change primary key to (user_name, > last_time). Also > SELECT * FROM user_views WHERE user_name =3D ? LIMIT 10 > without group by video_id can't fit business requirement. > > regards, > Gang > > > On Thu, Nov 10, 2016 at 6:47 PM, Carlos Alonso wrote= : > >> What about having something like >> >> CREATE TABLE user_views ( >> user_name text, >> video_id text, >> position int, >> last_time timestamp, >> PRIMARY KEY(user_name, last_time) >> ) WITH CLUSTERING ORDER BY (last_time DESC); >> >> Where you insert a record everytime a user watches a video and then >> having a batch task (every night maybe?) that deletes the extra rows tha= t >> are not needed anymore. >> The query pattern for this is quite efficient as something like SELECT * >> FROM user_views WHERE user_name =3D ? LIMIT 10; >> >> Regards >> >> Carlos Alonso | Software Engineer | @calonso >> >> >> On 10 November 2016 at 09:19, Vladimir Yudovin >> wrote: >> >>> >Do you mean the oldest one should be removed when a new play is added? >>> Sure. As you described the issue "the last ten items may be adequate fo= r >>> the business" >>> >>> Best regards, Vladimir Yudovin, >>> >>> *Winguzone - Hosted Cloud >>> CassandraLaunch your cluster in minutes.* >>> >>> >>> ---- On Wed, 09 Nov 2016 20:47:05 -0500*Diamond ben >>> >* wrote ---- >>> >>> The solution maybe work. However, the play list will grow over time and >>> somebody maybe has ten thousands that will slow down the query and sort= . >>> Do you mean the oldest one should be removed when a new play is added? >>> >>> BTW, the version is 2.1.16 in our live system. >>> >>> >>> BRs, >>> >>> BEN >>> ------------------------------ >>> >>> *=E5=8F=91=E4=BB=B6=E4=BA=BA:* Vladimir Yudovin >>> *=E5=8F=91=E9=80=81=E6=97=B6=E9=97=B4:* 2016=E5=B9=B411=E6=9C=889=E6=97= =A5 18:11:26 >>> *=E6=94=B6=E4=BB=B6=E4=BA=BA:* user >>> *=E4=B8=BB=E9=A2=98:* Re: =E7=AD=94=E5=A4=8D: A difficult data model wi= th C* >>> >>> You are welcome! ) >>> >>> >recent ten movies watched by the user within 30 days. >>> In this case you can't use PRIMARY KEY (user_name, video_id), as >>> video_id is demanded to fetch row, so all this stuff may be >>> >>> CREATE TYPE play (video_id text, position int, last_time timestamp); >>> CREATE TABLE recent (user_name text PRIMARY KEY, play_list >>> LIST>); >>> >>> You can easily retrieve play list for specific user by his ID. Instead >>> of LIST you can use MAP, I don't think that for ten entries it matters. >>> >>> >>> Best regards, Vladimir Yudovin, >>> >>> *Winguzone - Hosted Cloud Cassandra >>> Launch your cluster in minutes.* >>> >>> >>> ---- On Tue, 08 Nov 2016 22:29:48 -0500*ben ben >>> >* wrote ---- >>> >>> Hi Vladimir Yudovin, >>> >>> >>> Thank you very much for your detailed explaining. Maybe I didn't >>> describe the requirement clearly. The use cases should be: >>> >>> 1. a user login our app. >>> >>> 2. show the recent ten movies watched by the user within 30 days. >>> >>> 3. the user can click any one of the ten movie and continue to watch >>> from the last position she/he did. BTW, a movie can be watched several >>> times by a user and the last positon is needed indeed. >>> >>> BRs, >>> >>> BEN >>> ------------------------------ >>> >>> *=E5=8F=91=E4=BB=B6=E4=BA=BA:* Vladimir Yudovin >>> *=E5=8F=91=E9=80=81=E6=97=B6=E9=97=B4:* 2016=E5=B9=B411=E6=9C=888=E6=97= =A5 22:35:48 >>> *=E6=94=B6=E4=BB=B6=E4=BA=BA:* user >>> *=E4=B8=BB=E9=A2=98:* Re: A difficult data model with C* >>> >>> Hi Ben, >>> >>> if need very limited number of positions (as you said ten) may be you >>> can store them in LIST of UDT? Or just as JSON string? >>> So you'll have one row per each pair user-video. >>> >>> It can be something like this: >>> >>> CREATE TYPE play (position int, last_time timestamp); >>> CREATE TABLE recent (user_name text, video_id text, review >>> LIST>, PRIMARY KEY (user_name, video_id)); >>> >>> UPDATE recent set review =3D review + [(1234,12345)] where user_name=3D= 'some >>> user' AND video_id=3D'great video'; >>> UPDATE recent set review =3D review + [(1234,123456)] where >>> user_name=3D'some user' AND video_id=3D'great video'; >>> UPDATE recent set review =3D review + [(1234,1234567)] where >>> user_name=3D'some user' AND video_id=3D'great video'; >>> >>> You can delete the oldest entry by index: >>> DELETE review[0] FROM recent WHERE user_name=3D'some user' AND >>> video_id=3D'great video'; >>> >>> or by value, if you know the oldest entry: >>> >>> UPDATE recent SET review =3D review - [(1234,12345)] WHERE >>> user_name=3D'some user' AND video_id=3D'great video'; >>> >>> Best regards, Vladimir Yudovin, >>> >>> *Winguzone - Hosted Cloud Cassandra >>> Launch your cluster in minutes.* >>> >>> >>> ---- On Mon, 07 Nov 2016 21:54:08 -0500*ben ben >>> >* wrote ---- >>> >>> >>> Hi guys, >>> >>> We are maintaining a system for an on-line video service. ALL users' >>> viewing records of every movie are stored in C*. So she/he can continue= to >>> enjoy the movie from the last point next time. The table is designed as >>> below: >>> CREATE TABLE recent ( >>> user_name text, >>> vedio_id text, >>> position int, >>> last_time timestamp, >>> PRIMARY KEY (user_name, vedio_id) >>> ) >>> >>> It worked well before. However, the records increase every day and the >>> last ten items may be adequate for the business. The current model use >>> vedio_id as cluster key to keep a row for a movie, but as you know, the >>> business prefer to order by the last_time desc. If we use last_time as >>> cluster key, there will be many records for a singe movie and the recen= t >>> one is actually desired. So how to model that? Do you have any suggesti= ons? >>> Thanks! >>> >>> >>> BRs, >>> BEN >>> >>> >>> >>> >>> >>> >> > --001a1141db44619bce0541016110 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

This is the reason why One would like to use an mv for it. A= n mv Handels this. It adds a clustering Key while preserving uniqueness of = the original pk.


Am 11.11.2016 02:= 33 schrieb "Gang Liu" <gangban.lau@gmail.com>:
I guess orignal design is keep one record fo= r one video per user. maybe their app will report many play records when us= er watching one video.=C2=A0
So there will be many records when change = primary key to (user_name,=C2=A0last_time). Also= =C2=A0
SELECT * FROM user_views = WHERE user_name =3D ? LIMIT 10
without group by video_id can't fit business requirement.

regards,
Gang


On = Thu, Nov 10, 2016 at 6:47 PM, Carlos Alonso <info@mrcalonso.com>= wrote:
What abou= t having something like

CREATE TABLE user_views (
<= div>=C2=A0 user_name text,
=C2=A0 video_id text,=C2=A0
= =C2=A0 position int,=C2=A0
=C2=A0 last_time timestamp,
= =C2=A0 PRIMARY KEY(user_name, last_time)
)=C2=A0WITH CLUSTERING O= RDER BY (last_time DESC);

Where you insert a recor= d everytime a user watches a video and then having a batch task (every nigh= t maybe?) that deletes the extra rows that are not needed anymore.
The query pattern for this is quite efficient as something like SELECT * = FROM user_views WHERE user_name =3D ? LIMIT 10;

Re= gards

Carlos Alonso | Software Engineer |=C2=A0@calonso

On 10 November 2016 at 09:19, Vladimir Yudov= in <vladyu@winguzone.com> wrote:
>Do you mean the oldest one should be= removed when a new play is added?
Sure. As you descri= bed the issue "the last ten items may be adequate for the business&quo= t;

Best regards, Vladimir Yudovin,
=
Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes= .


---- On Wed, 09 Nov 2016 20:47:05 -0500Diamond ben <diamond.ben@outlook.com>= ; wrote ----

The solution maybe work. However, the play list will grow over= time and somebody maybe has ten thousands that will slow down the query an= d sort . Do you mean the oldest one should be removed when a new play is ad= ded?

BTW, the version is 2.1.16 in our live system.

BRs,

BEN



=
=E5=8F=91=E4=BB=B6=E4=BA=BA: V= ladimir Yudovin <vladyu@winguzone.com>
=E5=8F=91=E9=80=81=E6=97=B6=E9=97=B4= : 2016=E5=B9=B411=E6=9C=889=E6=97=A5 18:11:26
=E6=94=B6=E4=BB=B6= =E4=BA=BA: user
=E4=B8=BB=E9=A2=98: Re: =E7=AD=94=E5=A4=8D: = A difficult data model with C*
=C2=A0
You are welcome! )

>recent ten movies watch= ed by the user=C2=A0within 30 days.
In this case you c= an't use=C2=A0PRIMARY KEY (user_name, video_id), as video_id is demande= d to fetch row, so all this stuff may be
CREATE TYPE play (video_id text, position int, last_time ti= mestamp);
CREATE TABLE recent (user_name text PRIMARY KEY, pl= ay_list LIST<frozen<play>>);
Y= ou can easily retrieve play list for specific user by his ID. Instead of LI= ST you can use MAP, I don't think that for ten entries it matters.
<= /div>


Best regards, Vladimir Yudovin= ,
Winguzone - Hosted Cloud Cassandra
Launch your cluster in = minutes.


<= div>---- On Tue, 08 Nov 2016 22:29:48 -0500ben ben <diamond.ben@outlook.com>= wrote ----

Hi Vladimir Yudovin,

<= br>

=C2=A0=C2=A0=C2=A0 Thank you very much for your detailed explaini= ng. Maybe I didn't describe the requirement clearly. The use cases shou= ld be:

1. a user login our app.

2. show the recent ten = movies watched by the user=C2=A0within 30 days.

3. t= he user can click=C2=A0any one of the ten movie and continue t= o watch from the last position she/he did. BTW, a movie can be watched seve= ral times by a user and the last positon is needed indeed.


=

BRs,

BEN


=E5=8F=91=E4=BB=B6=E4=BA=BA: Vladimir Yudovin <vladyu@winguzone.com>
=E5=8F=91=E9=80=81=E6=97=B6=E9= =97=B4: 2016=E5=B9=B411=E6=9C=888=E6=97=A5 22:35:48
=E6=94=B6=E4= =BB=B6=E4=BA=BA: user
=E4=B8=BB=E9=A2=98: Re: A difficult da= ta model with C*
=C2=A0
Hi Ben,

if need very limited number of posit= ions (as you said ten) may be you can store them in LIST of UDT? Or just as= JSON string?
So you'll have one row per each pair user-v= ideo.

It can be something like this:

CREATE TYPE play (position int, last_time timestamp);<= br>
CREATE TABLE recent (user_name text, video_id text, review LI= ST<frozen<play>>, PRIMARY KEY (user_name, video_id));
=

UPDATE recent set review =3D review + [(1234,12345)] wh= ere user_name=3D'some user' AND video_id=3D'great video';
UPDATE recent set review =3D review + [(1234,123456)] where us= er_name=3D'some user' AND video_id=3D'great video';
UPDATE recent set review =3D review + [(1234,1234567)] where user_na= me=3D'some user' AND video_id=3D'great video';

You can delete the oldest entry by index:
DE= LETE review[0] FROM recent WHERE user_name=3D'some user' AND video_= id=3D'great video';

or by value, if yo= u know the oldest entry:

UPDATE recent SET rev= iew =3D review - [(1234,12345)]=C2=A0 WHERE user_name=3D'some user'= AND video_id=3D'great video';

Be= st regards, Vladimir Yudovin,
Winguzone - Hosted Cloud Cassandra=
Launch your cluster in minutes.


---- On Mon, 07 Nov 2016 21:54:08 -0500b= en ben <dia= mond.ben@outlook.com> wrote ----

<= blockquote style=3D"border-left:1.0px solid rgb(204,204,204);padding-left:6= .0px;margin:0 0 0 5.0px">


<= div>
Hi guys,

We are maintaining a system = for an on-line video service. ALL users' viewing records of every movie= are stored in C*. So she/he can continue to enjoy the movie from the last = point next time. The table is designed as below:
CREATE TABLE= recent (
user_name text,
vedio_id text,
position int,
last_time timestamp,
PRIM= ARY KEY (user_name, vedio_id)
)

= It worked well before. However, the records increase every day and the last= ten items may be adequate for the business. The current model use vedio_id= as cluster key to keep a row for a movie, but as you know, the business pr= efer to order by the last_time desc. If we use last_time as cluster key, t= here will be many records for a singe movie and the recent one is actually = desired. So how to model that? Do you have any suggestions?
= Thanks!


BRs,
BEN<= br>







--001a1141db44619bce0541016110--