incubator-cassandra-user mailing list archives

From Miguel Angel Martin junquera <mianmarjun.mailingl...@gmail.com>
Subject Re: CqlStorage creates wrong schema for Pig
Date Mon, 26 Aug 2013 08:32:33 GMT
Hi Chad,

I have the same issue.

I sent a mail to the Pig user list, but I still cannot resolve it, and I
cannot access the column values.
In that mail I describe the things I tried without success, along with more
information about the issue:


http://mail-archives.apache.org/mod_mbox/pig-user/201308.mbox/%3CCAJeG_hQ9S2Po3_XytZX5Xki4J1maO8q26jYdG2Wndy_KYiv9CQ@mail.gmail.com%3E



I hope someone replies with a comment, an idea, or a solution for this issue
or bug.


I have reviewed the CqlStorage class in the Cassandra 1.2.8 source code, but
I do not have an environment configured to debug and trace this issue.

I only found some comments like the following, which I do not fully
understand:


/**
 * A LoadStoreFunc for retrieving data from and storing data to Cassandra
 *
 * A row from a standard CF will be returned as nested tuples:
 * (((key1, value1), (key2, value2)), ((name1, val1), (name2, val2))).
 */
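
To make the shape concrete, here is a minimal sketch in Python (not Pig). It
assumes the row looks exactly like the DUMP output in Chad's message quoted
below, and it uses plain Python tuples as stand-ins for Pig tuples. The
helper names flatten and values_only are made up for this illustration; they
are not part of CqlStorage or Pig, and this is not a fix for the problem.

```python
# Illustration only: Python tuples standing in for Pig tuples.
# Field names and values are copied from the quoted DESCRIBE/DUMP output.

# What DESCRIBE reports: five flat fields.
declared_schema = ["isbn", "bookauthor", "booktitle", "publisher", "yearofpublication"]

# What DUMP shows: five (name, value) pairs.
dumped_row = (
    ("isbn", "0425093387"),
    ("bookauthor", "Georgette Heyer"),
    ("booktitle", "Death in the Stocks"),
    ("publisher", "Berkley Pub Group"),
    ("yearofpublication", 1986),
)


def flatten(row):
    """Unnest every (name, value) pair into one flat tuple."""
    return tuple(item for pair in row for item in pair)


def values_only(row):
    """Keep only the value of each (name, value) pair."""
    return tuple(value for _name, value in row)


flat = flatten(dumped_row)
print(len(declared_schema))  # 5 fields in the declared schema
print(len(flat))             # 10 items once every pair is unnested
print(values_only(dumped_row))
```

If all five pairs were unnested this way, the names would sit at the even
positions and the values at the odd positions (1, 3, 5, 7, 9). Whether Pig
actually sees the row like this depends on how it applies the declared
schema, which is the part I cannot trace.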


If you find an idea or a solution, please post it.

Thanks

2013/8/23 Chad Johnston <cjohnston@megatome.com>

> (I'm using Cassandra 1.2.8 and Pig 0.11.1)
>
> I'm loading some simple data from Cassandra into Pig using CqlStorage. The
> CqlStorage loader defines a Pig schema based on the Cassandra schema, but
> it seems to be wrong.
>
> If I do:
>
> data = LOAD 'cql://bookdata/books' USING CqlStorage();
> DESCRIBE data;
>
> I get this:
>
> data: {isbn: chararray,bookauthor: chararray,booktitle:
> chararray,publisher: chararray,yearofpublication: int}
>
> However, if I DUMP data, I get results like these:
>
> ((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in the
> Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))
>
> Clearly the results from Cassandra are key/value pairs, as would be
> expected. I don't know why the schema generated by CqlStorage() would be so
> different.
>
> This is really causing me problems trying to access the column values. I
> tried a naive approach of FLATTENing each tuple, then trying to access the
> values that way:
>
> flattened = FOREACH data GENERATE
>   FLATTEN(isbn),
>   FLATTEN(booktitle),
>   ...
> values = FOREACH flattened GENERATE
>   $1 AS ISBN,
>   $3 AS BookTitle,
>   ...
>
> As soon as I try to access field $5, Pig complains about the index being
> out of bounds.
>
> Is there a way to solve the schema/reality mismatch? Am I doing something
> wrong, or have I stumbled across a defect?
>
> Thanks,
> Chad
>
