From: "Vegard Berget"
Reply-To: "Vegard Berget"
To: user@cassandra.apache.org
Subject: Re: How to store large columns?
Date: Tue, 22 Jan 2013 12:45:03 +0100

Hi,

No, the keys are hashed before they are distributed, at least if you use RandomPartitioner, so a shared key prefix does not by itself place them on the same node.
From http://www.datastax.com/docs/1.0/cluster_architecture/partitioning:
"To distribute the data evenly across the number of nodes, a hashing algorithm creates an MD5 hash value of the row key"

.vegard,

----- Original Message -----
From: user@cassandra.apache.org
To: <user@cassandra.apache.org>
Cc:
Sent: Tue, 22 Jan 2013 09:40:19 -0200
Subject: Re: How to store large columns?

But these keys have the same prefix, so they will be distributed on the same node, right?

2013/1/21 Jason Brown <jasbrown@netflix.com>

The reason for multiple keys (and, by extension, multiple columns) is to better distribute the write/read load across the cluster, as keys will (hopefully) be distributed on different nodes. This helps to avoid hot spots.

Hope this helps,

-Jason Brown
Netflix

-------------------------
From: Sávio Teles [savio.teles@lupa.inf.ufg.br]
Sent: Monday, January 21, 2013 9:51 AM
To: user@cassandra.apache.org
Subject: Re: How to store large columns?

Astyanax splits large objects into multiple keys. Is that a good idea? Or is it better to split them into multiple columns?

Thanks

2013/1/21 Sávio Teles <savio.teles@lupa.inf.ufg.br>

Thanks, Keith Wright.

2013/1/21 Keith Wright <kwright@nanigans.com>

This may be helpful:
https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store

From: Vegard Berget <post@fantasista.no>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>, Vegard Berget <post@fantasista.no>
Date: Monday, January 21, 2013 8:35 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: How to store large columns?

Hi,

You could split it into multiple columns on the client side:
RowKeyData: Part1: [1mb], Part2: [1mb], Part3: [1mb]...PartN[1mb]

Now you can use multiple get() calls in parallel to get the parts back and then join them back into one file.

I _think_ maybe the new CQL3 protocol does not have the same limitation, but I have never tried large columns there, so someone with more experience than me will have to confirm this.

.vegard,

----- Original Message -----
From: user@cassandra.apache.org
To: <user@cassandra.apache.org>
Cc:
Sent: Mon, 21 Jan 2013 11:16:40 -0200
Subject: How to store large columns?

We wish to store a column in a row with a size larger than thrift_framed_transport_size_in_mb, but Thrift has a maximum frame size, configured by thrift_framed_transport_size_in_mb in cassandra.yaml.
So, how can we store columns larger than thrift_framed_transport_size_in_mb? Increasing this value does not solve the problem, since we have columns with varying sizes.

--
Regards,
Sávio S. Teles de Oliveira
voice: +55 62 9136 6996
http://br.linkedin.com/in/savioteles
Master's student in Computer Science - UFG
Software Architect
Laboratory for Ubiquitous and Pervasive Applications (LUPA) - UFG

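
To make the hashing point concrete, here is a small Python sketch showing that row keys sharing a prefix still get unrelated MD5 values. It is only an illustration under stated assumptions: the helper names `md5_token` and `node_for`, the `bigfile:partN` keys, and the equal-range placement over four nodes are all made up for this example, and the token arithmetic is simplified rather than being Cassandra's exact RandomPartitioner computation.

```python
import hashlib


def md5_token(row_key: str) -> int:
    """Hash a row key with MD5 and read the digest as an unsigned 128-bit integer.

    Simplified stand-in for a partitioner token, not Cassandra's exact formula.
    """
    return int.from_bytes(hashlib.md5(row_key.encode("utf-8")).digest(), "big")


def node_for(row_key: str, node_count: int) -> int:
    """Toy placement: split the 128-bit hash space into equal ranges, one per node."""
    return md5_token(row_key) * node_count // 2**128


# Hypothetical row keys that all share the same prefix.
keys = [f"bigfile:part{i}" for i in range(8)]
placement = {key: node_for(key, 4) for key in keys}

for key, node in placement.items():
    print(f"{key} -> token {md5_token(key):032x} -> node {node}")
```

Running it prints one token and one toy node index per key; the tokens bear no visible relation to the shared prefix, which is why the prefix does not decide placement.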

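
The client-side split into Part1..PartN and the parallel read described above could look roughly like the Python sketch below. It is a minimal illustration, not a real Cassandra client: a plain dict stands in for the row, `row.__getitem__` stands in for one get() per column, and the names `split_into_parts`, `fetch_and_join`, and `CHUNK_SIZE` are hypothetical. The chunk size would need to stay below whatever frame limit is configured.

```python
from concurrent.futures import ThreadPoolExecutor

# 1 MB per part, matching the Part1..PartN layout; an assumed value for this sketch.
CHUNK_SIZE = 1024 * 1024


def split_into_parts(data: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Split a blob into columns named Part1..PartN, each at most chunk_size bytes."""
    return {
        f"Part{index + 1}": data[offset:offset + chunk_size]
        for index, offset in enumerate(range(0, len(data), chunk_size))
    }


def fetch_and_join(row: dict, max_workers: int = 4) -> bytes:
    """Fetch every part in parallel and concatenate the parts in numeric order."""
    # Sort by the number after "Part" so that Part10 comes after Part9, not after Part1.
    names = sorted(row, key=lambda name: int(name[len("Part"):]))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # row.__getitem__ stands in for one get() call per column against the cluster.
        parts = list(pool.map(row.__getitem__, names))
    return b"".join(parts)


# In-memory stand-in for one row; a real client would write each column separately.
blob = bytes(range(256)) * 10_000  # 2,560,000 bytes of sample data
row = split_into_parts(blob)
restored = fetch_and_join(row)
print(len(row), "parts, round trip ok:", restored == blob)
```

Sorting the part names numerically matters once there are ten or more parts, since a plain string sort would put Part10 before Part2 and corrupt the joined file.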
