incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Pig / Map Reduce on Cassandra
Date Thu, 14 Mar 2013 13:16:52 GMT
Did the example work as presented in the README.txt?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 13/03/2013, at 11:31 AM, cscetbon.ext@orange.com wrote:

> Finally, I found the answer in CassandraStorage.java!
> 
> "columns" is not an alias but a bag that is filled with the columns (name + value) that don't have metadata.
> 
> That's why your sample doesn't return anything in my test: I only filled existing columns (those declared in the CQL CREATE command).
> I think you should update the file cassandra/examples/pig/README.txt to explain where "columns" comes from, or change the description of the test, because it doesn't really determine the top 50 column names but only the top 50 column names put in the bag "columns" (those without metadata).
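
The behaviour described above can be modelled outside Pig. The following Python sketch is only an illustration, not CassandraStorage code. It assumes the trailing {} in the dumped rows is the "columns" bag, and it relies on the documented Pig rule that FLATTEN of an empty bag emits no rows. The function name flatten_columns and the extra row "bob" with a "nickname" column are hypothetical.

```python
# Rows as dumped in the thread: row key, per-column tuples, then the trailing
# "columns" bag (modelled as a list). Both bags are empty, as in the dump.
rows = [
    ("iza", (), (), (), ("birth_year", 77), ("gender", "F"), []),
    ("cyril", (), (), (), ("birth_year", 79), ("gender", "M"), []),
]

# Hypothetical row that also has one column without metadata.
extra_row = ("bob", (), (), (), ("birth_year", 81), ("gender", "M"),
             [("nickname", "bobby")])


def flatten_columns(rows):
    """Model of `FOREACH rows GENERATE flatten(columns)`.

    Emits one (name, value) tuple per element of each row's bag.
    A row whose bag is empty contributes nothing to the output.
    """
    out = []
    for row in rows:
        bag = row[-1]  # assumption: the bag is the last field of the row
        for name, value in bag:
            out.append((name, value))
    return out


if __name__ == "__main__":
    print(flatten_columns(rows))                # rows from the thread
    print(flatten_columns(rows + [extra_row]))  # with the hypothetical row
```

With the two dumped rows the result is empty, which matches the empty output of "dump cols". Only a row carrying a column without metadata contributes anything.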
> 
> Regards
> -- 
> Cyril SCETBON
> 
> On Mar 13, 2013, at 6:44 PM, cscetbon.ext@orange.com wrote:
> 
>> I'm trying to execute your sample Pig script and I don't understand where the alias "columns" comes from:
>> 
>> grunt> rows = LOAD 'cassandra://MyKeyspace/MyColumnFamily' USING CassandraStorage();
>> grunt> cols = FOREACH rows GENERATE flatten(columns);
>> 
>> I suppose it's defined by the call to the getSchema function in src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java. However, even though Pig executes the statement without error, cols can't be dumped.
>> 
>> if I dump rows I get :
>> 
>> (iza,(),(),(),(birth_year,77),(gender,F),{})
>> (cyril,(),(),(),(birth_year,79),(gender,M),{}) 
>> 
>> but when I dump cols I get nothing :(
>> 
>> any idea ?
>> 
>> -- 
>> Cyril SCETBON
>> 
>> On Mar 13, 2013, at 10:26 AM, Cyril Scetbon <cscetbon.ext@orange.com> wrote:
>> 
>>> OK, forget it. It was a mix of mistakes: environment variables not set, package name not added in the script, and libraries not found.
>>> 
>>> Regards
>>> -- 
>>> Cyril SCETBON
>>> 
>>> On Mar 12, 2013, at 10:43 AM, cscetbon.ext@orange.com wrote:
>>> 
>>>> I'm already using Cassandra 1.2.2, with only one line to test the Cassandra access:
>>>> 
>>>> rows = LOAD 'cassandra://twissandra/users' USING org.apache.cassandra.hadoop.pig.CassandraStorage();
>>>> 
>>>> extracted from the sample script provided in the sources
>>>> -- 
>>>> Cyril SCETBON
>>>> 
>>>> On Mar 12, 2013, at 6:57 AM, aaron morton <aaron@thelastpickle.com> wrote:
>>>> 
>>>>>> any idea why the function loadFunc does not work correctly ?
>>>>> No, sorry.
>>>>> Not sure why you are linking to the CQL info or what Pig script / config you are running.
>>>>> Did you follow the example in examples/pig in the source distribution?
>>>>> 
>>>>> Also please use at least cassandra 1.1. 
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Consultant
>>>>> New Zealand
>>>>> 
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>> 
>>>>> On 11/03/2013, at 9:39 AM, cscetbon.ext@orange.com wrote:
>>>>> 
>>>>>> You said all versions. However, when I try to access cassandra://twissandra/users based on http://www.datastax.com/docs/1.0/dml/using_cql I get:
>>>>>> 
>>>>>> 2013-03-11 17:35:48,444 [main] INFO  org.apache.pig.Main - Apache Pig version 0.11.0 (r1446324) compiled Feb 14 2013, 16:40:57
>>>>>> 2013-03-11 17:35:48,445 [main] INFO  org.apache.pig.Main - Logging error messages to: /Users/cyril/pig_1363019748442.log
>>>>>> 2013-03-11 17:35:48.583 java[13809:1203] Unable to load realm info from SCDynamicStore
>>>>>> 2013-03-11 17:35:48,750 [main] INFO  org.apache.pig.impl.util.Utils - Default bootup file /Users/cyril/.pigbootup not found
>>>>>> 2013-03-11 17:35:48,831 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
>>>>>> 2013-03-11 17:35:49,235 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2245: Cannot get schema from loadFunc org.apache.cassandra.hadoop.pig.CassandraStorage
>>>>>> 
>>>>>> with pig 0.11.0
>>>>>> 
>>>>>> any idea why the function loadFunc does not work correctly ?
>>>>>> 
>>>>>> thanks
>>>>>> -- 
>>>>>> Cyril SCETBON
>>>>>> 
>>>>>> On Jan 18, 2013, at 7:00 PM, aaron morton <aaron@thelastpickle.com> wrote:
>>>>>> 
>>>>>>>> Silly question -- but does hive/pig hadoop etc work with cassandra
>>>>>>>> 1.1.8?  Or only with 1.2?  
>>>>>>> all versions. 
>>>>>>> 
>>>>>>>> We are using astyanax library, which seems
>>>>>>>> to fail horribly on 1.2, 
>>>>>>> How does it fail ? 
>>>>>>> If you think you have a bug post it at https://github.com/Netflix/astyanax
>>>>>>> 
>>>>>>> Cheers
>>>>>>> 
>>>>>>> -----------------
>>>>>>> Aaron Morton
>>>>>>> Freelance Cassandra Developer
>>>>>>> New Zealand
>>>>>>> 
>>>>>>> @aaronmorton
>>>>>>> http://www.thelastpickle.com
>>>>>>> 
>>>>>>> On 18/01/2013, at 7:48 AM, James Lyons <james.lyons@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Silly question -- but does hive/pig hadoop etc work with cassandra
>>>>>>>> 1.1.8?  Or only with 1.2?  We are using astyanax library, which seems
>>>>>>>> to fail horribly on 1.2, so we're still on 1.1.8.  But we're just
>>>>>>>> starting out with this and i'm still debating between cassandra and
>>>>>>>> hbase.  So I just want to know if there is a limitation here or not,
>>>>>>>> as I have no idea when 1.2 support will exist in astyanax.
>>>>>>>> 
>>>>>>>> That said, are there other java (scala) libraries that people use to
>>>>>>>> connect to cassandra that support 1.2?
>>>>>>>> 
>>>>>>>> -James-
>>>>>>>> 
>>>>>>>> On Thu, Jan 17, 2013 at 8:30 AM,  <cscetbon.ext@orange.com> wrote:
>>>>>>>>> Ok, I understand that I need to manage both cassandra and hadoop components
>>>>>>>>> and that pig will use hadoop components to launch its tasks which will use
>>>>>>>>> Cassandra as the Storage engine.
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> --
>>>>>>>>> Cyril SCETBON
>>>>>>>>> 
>>>>>>>>> On Jan 17, 2013, at 4:03 PM, James Schappet <jschappet@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> This really depends on how you design your Hadoop Cluster.  The testing I
>>>>>>>>> have done, had Hadoop and Cassandra Nodes collocated on the same hosts.
>>>>>>>>> Remember that Pig code runs inside of your hadoop cluster, and connects to
>>>>>>>>> Cassandra as the Database engine.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have not done any testing with Hive, so someone else will have to answer
>>>>>>>>> that question.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: <cscetbon.ext@orange.com>
>>>>>>>>> Reply-To: <user@cassandra.apache.org>
>>>>>>>>> Date: Thursday, January 17, 2013 8:58 AM
>>>>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>>>>> Subject: Re: Pig / Map Reduce on Cassandra
>>>>>>>>> 
>>>>>>>>> Jimmy,
>>>>>>>>> 
>>>>>>>>> I understand that CFS can replace HDFS for those who use Hadoop. I just want
>>>>>>>>> to use pig and hive on cassandra. I know that pig samples are provided and
>>>>>>>>> work now with cassandra natively (they are part of the core). However, does
>>>>>>>>> it mean that the process will be spread over nodes with
>>>>>>>>> number_of_mapper=number_of_nodes or something like that?
>>>>>>>>> Can Hive connect to Cassandra 1.2 easily too ?
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Cyril Scetbon
>>>>>>>>> 
>>>>>>>>> On Jan 17, 2013, at 2:42 PM, James Schappet <jschappet@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> CFS is Cassandra File System:
>>>>>>>>> http://www.datastax.com/dev/blog/cassandra-file-system-design
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> But you don't need CFS to connect from PIG to Cassandra.  The latest
>>>>>>>>> versions of Cassandra Source ship with examples of connecting from pig to
>>>>>>>>> cassandra.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> apache-cassandra-1.2.0-src/examples/pig   --
>>>>>>>>> http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-src.tar.gz
>>>>>>>>> 
>>>>>>>>> --Jimmy
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> From: <cscetbon.ext@orange.com>
>>>>>>>>> Reply-To: <user@cassandra.apache.org>
>>>>>>>>> Date: Thursday, January 17, 2013 6:35 AM
>>>>>>>>> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>>>>>>>> Subject: Re: Pig / Map Reduce on Cassandra
>>>>>>>>> 
>>>>>>>>> what do you mean ? it's not needed by Pig or Hive to access Cassandra data.
>>>>>>>>> 
>>>>>>>>> Regards
>>>>>>>>> 
>>>>>>>>> On Jan 16, 2013, at 11:14 PM, Brandon Williams <driftx@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> You won't get CFS,
>>>>>>>>> but it's not a hard requirement, either.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> _________________________________________________________________________________________________________________________
>>>>>>>>> 
>>>>>>>>> Ce message et ses pieces jointes peuvent contenir des
informations
>>>>>>>>> confidentielles ou privilegiees et ne doivent donc
>>>>>>>>> pas etre diffuses, exploites ou copies sans autorisation.
Si vous avez recu
>>>>>>>>> ce message par erreur, veuillez le signaler
>>>>>>>>> a l'expediteur et le detruire ainsi que les pieces jointes.
Les messages
>>>>>>>>> electroniques etant susceptibles d'alteration,
>>>>>>>>> France Telecom - Orange decline toute responsabilite
si ce message a ete
>>>>>>>>> altere, deforme ou falsifie. Merci.
>>>>>>>>> 
>>>>>>>>> This message and its attachments may contain confidential
or privileged
>>>>>>>>> information that may be protected by law;
>>>>>>>>> they should not be distributed, used or copied without
authorisation.
>>>>>>>>> If you have received this email in error, please notify
the sender and
>>>>>>>>> delete this message and its attachments.
>>>>>>>>> As emails may be altered, France Telecom - Orange is
not liable for messages
>>>>>>>>> that have been modified, changed or falsified.
>>>>>>>>> Thank you.
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 

