cassandra-commits mailing list archives

From "Pavel Yaskevich (Updated) (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-3371) Cassandra inferred schema and actual data don't match
Date Mon, 13 Feb 2012 22:58:59 GMT


Pavel Yaskevich updated CASSANDRA-3371:

    Attachment: 3371-v6-cleanup.patch

+1 on the v6 with cleanup patch attached: replaced ArrayList and HashMap with their interfaces, added
generic description to reader/writer so there are no more blind casts, changed thrift {Super}Column
to use setX(...) methods, and removed whitespace.
> Cassandra inferred schema and actual data don't match
> -----------------------------------------------------
>                 Key: CASSANDRA-3371
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.8.7
>            Reporter: Pete Warden
>            Assignee: Brandon Williams
>         Attachments: 0001-Rework-pig-schema.txt, 0002-Output-support-to-match-input.txt,
3371-v2.txt, 3371-v3.txt, 3371-v4.txt, 3371-v5-rebased.txt, 3371-v5.txt, 3371-v6-cleanup.patch,
3371-v6.txt, pig.diff, smoke_test.txt
> It's looking like there may be a mismatch between the schema that's being reported by
the latest, and the data that's actually returned. Here's an example:
> rows = LOAD 'cassandra://Frap/PhotoVotes' USING CassandraStorage();
> DESCRIBE rows;
> rows: {key: chararray,columns: {(name: chararray,value: bytearray,photo_owner: chararray,value_photo_owner: bytearray,pid: chararray,value_pid: bytearray,matched_string: chararray,value_matched_string: bytearray,src_big: chararray,value_src_big: bytearray,time: chararray,value_time: bytearray,vote_type: chararray,value_vote_type: bytearray,voter: chararray,value_voter: bytearray)}}
> DUMP rows;
> (691831038_1317937188.48955,{(photo_owner,1596090180),(pid,6855155124568798560),(matched_string,),(src_big,),(time,Thu Oct 06 14:39:48 -0700 2011),(vote_type,album_dislike),(voter,691831038)})
> getSchema() is reporting the columns as an inner bag of tuples, each of which contains
16 values. In fact, getNext() seems to return an inner bag containing 7 tuples, each of which
contains two values. 
> It appears that things got out of sync with this change:
> See more discussion at:
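
To make the mismatch described in the report concrete, here is a minimal sketch in plain Python. It uses neither Pig nor the CassandraStorage API. The field names and tuples are copied from the DESCRIBE and DUMP output quoted above. The helper arity_mismatches is hypothetical and only illustrates the counting argument: the schema promises 16 fields per inner tuple, while each returned tuple has 2.

```python
# Field names of the inner tuple as reported by DESCRIBE (copied from the report).
schema_fields = [
    "name", "value",
    "photo_owner", "value_photo_owner",
    "pid", "value_pid",
    "matched_string", "value_matched_string",
    "src_big", "value_src_big",
    "time", "value_time",
    "vote_type", "value_vote_type",
    "voter", "value_voter",
]

# Inner bag as printed by DUMP (copied from the report).
# Values that DUMP printed as empty are represented here as "".
dumped_bag = [
    ("photo_owner", "1596090180"),
    ("pid", "6855155124568798560"),
    ("matched_string", ""),
    ("src_big", ""),
    ("time", "Thu Oct 06 14:39:48 -0700 2011"),
    ("vote_type", "album_dislike"),
    ("voter", "691831038"),
]


def arity_mismatches(fields, bag):
    """Hypothetical helper: list (index, actual_arity) for every tuple
    whose number of values differs from the number of schema fields."""
    expected = len(fields)
    return [(i, len(t)) for i, t in enumerate(bag) if len(t) != expected]


mismatches = arity_mismatches(schema_fields, dumped_bag)
print("schema fields per tuple:", len(schema_fields))
print("tuples in bag:", len(dumped_bag))
print("tuples not matching the schema:", len(mismatches))
```

Every tuple in the bag fails the check, which matches the report: the schema and the data disagree on the shape of the inner tuples, not just on one column.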
