Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of
 mianmarjun.mailinglist@gmail.com designates 209.85.217.180 as permitted
 sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAJeG_hRWnSgAmNXWHw6ei1aAU4QS+5fg1-msmw2TcGUO0C-Qaw@mail.gmail.com>
References: 
 <CAPZMN7HquhuoNVj+hjp67Ckv8c_WOWV0HaaYnnmYyS6SZw-yqA@mail.gmail.com>
	<CAJeG_hSO-Apii=Xt4b+w3+eu_Yq-0y35c8kiL64k1ekLJtZkdg@mail.gmail.com>
	<CAJeG_hTmL700fCtMC-WV+Ds_0Hwe3H3n5Qxfik_zcsy+dH6Xdw@mail.gmail.com>
	<CAPZMN7F3dSFMoqdc=5DRxEhodr0=36bb7p9_7qzpUVjx10eYjA@mail.gmail.com>
	<CAJeG_hRWnSgAmNXWHw6ei1aAU4QS+5fg1-msmw2TcGUO0C-Qaw@mail.gmail.com>
Date: Mon, 2 Sep 2013 15:09:10 +0200
Message-ID: 
 <CAJeG_hQDwhcfVu2mffSNh0F5GYry+mf7zc2Wwu0KmKHLUw=M8w@mail.gmail.com>
Subject: Re: CqlStorage creates wrong schema for Pig
From: Miguel Angel Martin junquera <mianmarjun.mailinglist@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=001a11c352aa3517ea04e566474f

--001a11c352aa3517ea04e566474f
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hi all,

More info:

https://issues.apache.org/jira/browse/CASSANDRA-5941

I tried the following (and built Cassandra 1.2.9), but it does not work for me:

git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
cd cassandra
git checkout cassandra-1.2
patch -p1 < 5867-bug-fix-filter-push-down-1.2-branch.txt
ant


Miguel Angel Martín Junquera
Analyst Engineer.
miguelangel.martin@brainsins.com


2013/9/2 Miguel Angel Martin junquera <mianmarjun.mailinglist@gmail.com>

> Good job!
>
> I had only been testing with a UDF that handles the string schema type;
> this is better and more elaborate work.
>
> Regards
>
>
> Miguel Angel Martín Junquera
> Analyst Engineer.
> miguelangel.martin@brainsins.com
>
>
>
> 2013/8/31 Chad Johnston <cjohnston@megatome.com>
>
>> I threw together a quick UDF to work around this issue. It just extracts
>> the value portion of the tuple while taking advantage of the CqlStorage
>> generated schema to keep the type correct.
>>
>> You can get it here: https://github.com/iamthechad/cqlstorage-udf
>>
>> I'll see if I can find more useful information and open a defect, since
>> that's what this seems to be.
>>
>> Chad
>>
>>
>> On Fri, Aug 30, 2013 at 2:02 AM, Miguel Angel Martin junquera <
>> mianmarjun.mailinglist@gmail.com> wrote:
>>
>>> I tried this:
>>>
>>> rows = LOAD
>>> 'cql://keyspace1/test?page_size=1&split_size=4&where_clause=age%3D30'
>>> USING CqlStorage();
>>> dump rows;
>>> ILLUSTRATE rows;
>>> describe rows;
>>>
>>> values2 = FOREACH rows GENERATE TOTUPLE(id) as
>>> (mycolumn:tuple(name,value));
>>> dump values2;
>>> describe values2;
>>>
>>> But I get these results:
>>>
>>>
>>>
>>> -------------------------------------------------------------
>>> | rows     | id:chararray   | age:int   | title:chararray   |
>>> -------------------------------------------------------------
>>> |          | (id, 6)        | (age, 30) | (title, QA)       |
>>> -------------------------------------------------------------
>>>
>>> rows: {id: chararray,age: int,title: chararray}
>>> 2013-08-30 09:54:37,831 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>>> ERROR 1031: Incompatable field schema: left is
>>> "tuple_0:tuple(mycolumn:tuple(name:bytearray,value:bytearray))", right is
>>> "org.apache.pig.builtin.totuple_id_1:tuple(id:chararray)"
>>>
>>>
>>>
>>>
>>>
>>> or
>>>
>>> ....
>>>
>>> values2 = FOREACH rows GENERATE TOTUPLE(id);
>>> dump values2;
>>> describe values2;
>>>
>>> and the results are:
>>>
>>> ...
>>> (((id,6)))
>>> (((id,5)))
>>> values2: {org.apache.pig.builtin.totuple_id_8: (id: chararray)}
>>>
>>>
>>>
>>> Aggg!!!!!
>>>
>>>
>>>
>>>
>>>
>>> Miguel Angel Martín Junquera
>>> Analyst Engineer.
>>> miguelangel.martin@brainsins.com
>>>
>>>
>>>
>>> 2013/8/26 Miguel Angel Martin junquera <mianmarjun.mailinglist@gmail.com>
>>>
>>>> Hi Chad,
>>>>
>>>> I have this issue too.
>>>>
>>>> I sent a mail to the pig-user list, but I still cannot resolve this, and
>>>> I cannot access the column values.
>>>> In that mail I describe some things I tried without success, along with
>>>> more information about this issue:
>>>>
>>>>
>>>>
>>>> http://mail-archives.apache.org/mod_mbox/pig-user/201308.mbox/%3CCAJeG_hQ9S2Po3_XytZX5Xki4J1maO8q26jYdG2Wndy_KYiv9CQ@mail.gmail.com%3E
>>>>
>>>>
>>>>
>>>> I hope someone replies with a comment, idea, or solution for this issue
>>>> or bug.
>>>>
>>>>
>>>> I have reviewed the CqlStorage class in the Cassandra 1.2.8 code, but I
>>>> do not have an environment configured to debug and trace this issue.
>>>>
>>>> I only found some comments like the following, which I do not understand
>>>> at all:
>>>>
>>>>
>>>> /**
>>>>  * A LoadStoreFunc for retrieving data from and storing data to Cassandra
>>>>  *
>>>>  * A row from a standard CF will be returned as nested tuples:
>>>>  * (((key1, value1), (key2, value2)), ((name1, val1), (name2, val2))).
>>>>  */
>>>>
>>>>
>>>> If you find an idea or solution, please post it.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> 2013/8/23 Chad Johnston <cjohnston@megatome.com>
>>>>
>>>>> (I'm using Cassandra 1.2.8 and Pig 0.11.1)
>>>>>
>>>>> I'm loading some simple data from Cassandra into Pig using CqlStorage.
>>>>> The CqlStorage loader defines a Pig schema based on the Cassandra
>>>>> schema, but it seems to be wrong.
>>>>>
>>>>> If I do:
>>>>>
>>>>> data = LOAD 'cql://bookdata/books' USING CqlStorage();
>>>>> DESCRIBE data;
>>>>>
>>>>> I get this:
>>>>>
>>>>> data: {isbn: chararray,bookauthor: chararray,booktitle:
>>>>> chararray,publisher: chararray,yearofpublication: int}
>>>>>
>>>>> However, if I DUMP data, I get results like these:
>>>>>
>>>>> ((isbn,0425093387),(bookauthor,Georgette Heyer),(booktitle,Death in
>>>>> the Stocks),(publisher,Berkley Pub Group),(yearofpublication,1986))
>>>>>
>>>>> Clearly the results from Cassandra are key/value pairs, as would be
>>>>> expected. I don't know why the schema generated by CqlStorage() would be
>>>>> so different.
>>>>>
>>>>> This is really causing me problems trying to access the column values.
>>>>> I tried a naive approach of FLATTENing each tuple, then trying to access
>>>>> the values that way:
>>>>>
>>>>> flattened = FOREACH data GENERATE
>>>>>   FLATTEN(isbn),
>>>>>   FLATTEN(booktitle),
>>>>>   ...
>>>>> values = FOREACH flattened GENERATE
>>>>>   $1 AS ISBN,
>>>>>   $3 AS BookTitle,
>>>>>   ...
>>>>>
>>>>> As soon as I try to access field $5, Pig complains about the index
>>>>> being out of bounds.
>>>>>
>>>>> Is there a way to solve the schema/reality mismatch? Am I doing
>>>>> something wrong, or have I stumbled across a defect?
>>>>>
>>>>> Thanks,
>>>>> Chad
>>>>>
>>>>
>>>>
>>>
>>
>
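
For anyone following the thread: the mismatch quoted above (DESCRIBE reports one plain value per column, while DUMP returns (name, value) pairs) and the idea behind Chad's workaround UDF can be modelled outside Pig. The sketch below is plain Python, not Pig Latin and not the code from the cqlstorage-udf repository. The names `row` and `extract_value` are hypothetical, and the row simply reuses the sample data from Chad's first message.

```python
# One row as DUMP prints it in the thread: every column is a (name, value) pair.
row = (
    ("isbn", "0425093387"),
    ("bookauthor", "Georgette Heyer"),
    ("booktitle", "Death in the Stocks"),
    ("publisher", "Berkley Pub Group"),
    ("yearofpublication", 1986),
)


def extract_value(pair):
    """Return the value half of a (name, value) pair (hypothetical helper)."""
    _name, value = pair
    return value


# Applying the helper to each column yields the shape DESCRIBE advertises:
# one plain value per column, with the value's type left untouched.
values = tuple(extract_value(pair) for pair in row)

# Flattening every pair instead gives 2 * len(row) fields, with the values
# sitting at the odd positions (1, 3, 5, ...).
flattened = tuple(item for pair in row for item in pair)

print(values)
print(flattened[1::2] == values)
```

In the thread, this extraction step is what the UDF is described as doing on the Pig side, relying on the schema CqlStorage already generates to keep each value's type.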

--001a11c352aa3517ea04e566474f--