From: Andre Tavares
Date: Mon, 29 Oct 2012 17:37:16 -0200
Subject: Re: ColumnFamilyInputFormat - error when column name is UUID
To: user@cassandra.apache.org

Marcelo,

The times I ran into this problem, it was usually because the UUID value being passed to Cassandra did not correspond to an "exact" UUID value. To deal with that, I relied heavily on UUID.randomUUID() (to generate a valid UUID) and UUID.fromString("081f4500-047e-401c-8c0b-a41fefd099d7") (to turn a String into a valid UUID).
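
A minimal Java sketch of the two calls mentioned above, assuming only the JDK's java.util.UUID. The class and method names (UuidParsingSketch, parseOrNull) are hypothetical and chosen for illustration; they are not part of Cassandra, Astyanax, or PlayOrm.

```java
import java.util.UUID;

// Hypothetical helper class, for illustration only.
class UuidParsingSketch {

    // Returns the parsed UUID, or null when UUID.fromString rejects the text
    // (it signals rejection with an IllegalArgumentException).
    static UUID parseOrNull(String text) {
        try {
            return UUID.fromString(text);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Generate a fresh random UUID.
        UUID generated = UUID.randomUUID();
        System.out.println("generated: " + generated);

        // Turn the canonical string form back into a UUID object.
        UUID parsed = parseOrNull("081f4500-047e-401c-8c0b-a41fefd099d7");
        System.out.println("parsed:    " + parsed);

        // Text without the dash-separated structure is rejected.
        System.out.println("rejected:  " + (parseOrNull("not a uuid") == null));
    }
}
```

Going through a UUID object first, rather than passing arbitrary strings or byte arrays around, is one way to make sure every value handed to a client library starts from a well-formed UUID.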

Since we have two keyspaces in Cassandra (dmp_input -> Astyanax and dmp -> PlayOrm), it may be that these frameworks handle UUID keys differently (in the implementation we built).

So I think the solution you found is valid (sorry for not having spotted the problem earlier, if this is indeed your case...).

Regards,

André

2012/10/29 Marcelo Elias Del Valle <mvallebr@gmail.com>

> Answering myself: it seems we can't have any non-type-1 UUIDs in column names. I used the UTF8 comparator and saved my UUIDs as strings, and it worked.
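
For reference, java.util.UUID exposes the version ("type") of a UUID, and the string workaround described above stores the 36-character canonical text form instead of 16 raw bytes. A minimal sketch, assuming only the JDK; the class name, the asUtf8Name helper, and the sample type 1 value are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical class name, for illustration only.
class UuidVersionSketch {

    // Encodes a UUID the way a string-typed (UTF-8) column name would hold it:
    // the canonical 36-character text form, not the 16 raw bytes.
    static byte[] asUtf8Name(UUID id) {
        return id.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // randomUUID() produces version 4 (random) UUIDs.
        UUID random = UUID.randomUUID();
        System.out.println("random version:     " + random.version());

        // Made-up sample value whose version field is 1 (time-based).
        UUID timeBased = UUID.fromString("c9d9a3e0-21f5-11e2-81c1-0800200c9a66");
        System.out.println("time-based version: " + timeBased.version());

        // The text form occupies 36 bytes when encoded as UTF-8.
        System.out.println("UTF-8 name length:  " + asUtf8Name(random).length);
    }
}
```

One trade-off to keep in mind: names stored as strings sort by their characters, which is not the chronological order a time-based UUID comparator would give.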


> 2012/10/29 Marcelo Elias Del Valle <mvallebr@gmail.com>
>
>> Hello,
>>
>> I am using ColumnFamilyInputFormat the same way it's described in this example:
>> https://github.com/apache/cassandra/blob/trunk/examples/hadoop_word_count/src/WordCount.java#L215
>>
>> I have been able to successfully process data in Cassandra by using Hadoop. However, as this solution doesn't allow me to choose which data in Cassandra I want to process, I decided to create a query column family to list the data I want to process in Hadoop. This column family is as follows:
>>
>> row key: YYYYMM
>> column name: UUID - user ID
>> column value: timestamp - last processed date
>>
>> The problem is, when I run Hadoop, I get the exception below. Is there any limitation on having UUIDs as column names? I am generating my user IDs with java.util.UUID.randomUUID() for now. I could change the method later, but is it only type 1 UUIDs that are 16 bytes long?


>> java.lang.RuntimeException: InvalidRequestException(why:UUIDs must be exactly 16 bytes)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:391)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:397)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.computeNext(ColumnFamilyRecordReader.java:323)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:188)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> Caused by: InvalidRequestException(why:UUIDs must be exactly 16 bytes)
>>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:12254)
>>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:683)
>>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:667)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$StaticRowIterator.maybeInit(ColumnFamilyRecordReader.java:356)
>>     ... 11 more
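
The message in the trace is about the byte length of the value. A java.util.UUID of any version is 128 bits, so its binary form is 16 bytes, while its text form is 36 characters; the error may therefore point at names that were not stored in the 16-byte binary form. The sketch below, assuming only the JDK, packs a UUID into 16 bytes with a ByteBuffer and applies a length check. The names toBytes, fromBytes, and hasUuidLength are hypothetical, and the check only mirrors the wording of the error; it is not Cassandra's validation code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical class name, for illustration only.
class UuidBytesSketch {

    // Packs the two 64-bit halves of a UUID into 16 big-endian bytes.
    static byte[] toBytes(UUID id) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(id.getMostSignificantBits());
        buffer.putLong(id.getLeastSignificantBits());
        return buffer.array();
    }

    // Rebuilds a UUID from its binary form; assumes raw holds at least 16 bytes.
    static UUID fromBytes(byte[] raw) {
        ByteBuffer buffer = ByteBuffer.wrap(raw);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    // Illustrative length check in the spirit of the error message.
    static boolean hasUuidLength(byte[] raw) {
        return raw.length == 16;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        byte[] binary = toBytes(id);
        byte[] text = id.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("binary form passes check: " + hasUuidLength(binary));
        System.out.println("text form passes check:   " + hasUuidLength(text));
        System.out.println("round trip matches:       " + fromBytes(binary).equals(id));
    }
}
```

If two client libraries write to the same column family, one using the binary form and the other the text form, a reader that expects UUID-typed names would see values of both lengths.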

>> Best regards,
>> --
>> Marcelo Elias Del Valle
>> http://mvalle.com - @mvallebr



> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
