cassandra-commits mailing list archives

From "Will Oberman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4789) CassandraStorage.getNextWide produces corrupt data
Date Wed, 10 Oct 2012 21:07:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473559#comment-13473559 ]

Will Oberman commented on CASSANDRA-4789:
-----------------------------------------

I'm now 99% sure the problem is that keys that map to a single column are being skipped over, and
their values glued onto the key that follows them.  But I'm not sure what the most elegant fix is...
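
A minimal Java sketch of this failure mode, reconstructed from the logging in the quoted description below rather than from the actual CassandraStorage source. The class `WideRowSketch`, the `Entry` type, and the methods `buggy` and `fixed` are all hypothetical names for illustration. The assumption is that each call advances the reader before it looks at the stashed row, so a stashed row whose key has only one column gets attached to whatever key the reader lands on next. The `fixed` variant is only one possible approach (start each call from the stashed row and take the key from it), not necessarily the fix that belongs in the real class.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class WideRowSketch {
    // Hypothetical stand-in for one (row key, column value) pair handed out by the reader.
    static final class Entry {
        final String key;
        final String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    // Models the logged behavior: each outer iteration is one getNextWide() call.
    static List<String> buggy(List<Entry> input) {
        List<String> tuples = new ArrayList<>();
        Iterator<Entry> reader = input.iterator();
        Entry lastRow = null;                       // row stashed when the key changed
        while (reader.hasNext()) {
            Entry row = reader.next();              // advances BEFORE using the stashed row
            String key = row.key;                   // "set key = ..." comes from the reader
            List<String> bag = new ArrayList<>();
            if (lastRow != null) {
                bag.add(lastRow.value);             // glued on even if lastRow.key differs from key
                lastRow = null;
            }
            bag.add(row.value);
            while (reader.hasNext()) {
                row = reader.next();
                if (!row.key.equals(key)) {         // "key changed": stash the row and return
                    lastRow = row;
                    break;
                }
                bag.add(row.value);
            }
            tuples.add(key + "=" + bag);
        }
        if (lastRow != null) {                      // sketch-only handling of a trailing stashed row
            tuples.add(lastRow.key + "=[" + lastRow.value + "]");
        }
        return tuples;
    }

    // One possible fix: begin the next call from the stashed row and use ITS key.
    static List<String> fixed(List<Entry> input) {
        List<String> tuples = new ArrayList<>();
        Iterator<Entry> reader = input.iterator();
        Entry lastRow = null;
        while (lastRow != null || reader.hasNext()) {
            Entry row = (lastRow != null) ? lastRow : reader.next();
            lastRow = null;
            String key = row.key;
            List<String> bag = new ArrayList<>();
            bag.add(row.value);
            while (reader.hasNext()) {
                row = reader.next();
                if (!row.key.equals(key)) {
                    lastRow = row;
                    break;
                }
                bag.add(row.value);
            }
            tuples.add(key + "=" + bag);
        }
        return tuples;
    }

    public static void main(String[] args) {
        // Key B has a single column; A, C and D have two columns each.
        List<Entry> input = List.of(
            new Entry("A", "a1"), new Entry("A", "a2"),
            new Entry("B", "b1"),
            new Entry("C", "c1"), new Entry("C", "c2"),
            new Entry("D", "d1"), new Entry("D", "d2"));
        System.out.println("buggy: " + buggy(input));
        System.out.println("fixed: " + fixed(input));
    }
}
```

With this input, the `buggy` variant never emits key B and puts `b1` into C's bag, while D (two columns) still comes out right because the extra advance lands on another column of the same key. That matches the observation that the key is usually, but not always, the same before and after the call boundary.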
                
> CassandraStorage.getNextWide produces corrupt data
> --------------------------------------------------
>
>                 Key: CASSANDRA-4789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4789
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1.5
>            Reporter: Will Oberman
>            Assignee: Brandon Williams
>
> This took me a while to track down.  I'm seeing the problem when the "key changes" case
> happens.  The intended behavior (as far as I can tell) when the key changes is that the method
> returns the current tuple and picks up where it left off on the next call to getNextWide().
> The problem I'm seeing is that the current key sometimes advances between method calls and
> sometimes does not.  "Not" is the correct behavior, since the code saves the value into an
> instance variable; when the key advances there is a key/value mismatch (the result being that
> the values for two different keys are glued together).  I think the problem might be related to
> keys that only have a single column.  I'm still trying to track that down to help
> solve this case...
> Maybe this will be clearer if I paste a bunch of logging I added to the class.
> The log messages are fairly self-documenting (I hope):
> ...lots of previous logging...
> enter getNextWide
> hasNext = true
> set key = dVNhbXAxMzQ3ODM1OA%3D%3D
> lastRow != null
> added 1 items to bag from lastRow
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 669392df09572d0045b964bc65f86a2c
> exit getNextWide
> enter getNextWide
> hasNext = true
> //!!!THIS IS THE PROBLEM HERE I THINK!!!
> //!!!Usually the key here == key before "exit getNextWide"!!!
> set key = 5f900ee4bb1850f8cf387cc3d5fc23ca
> //!!! lastRow is data for 669392df09572d0045b964bc65f86a2c !!! 
> //!!! but it's being added to key 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> lastRow != null
> added 1 items to bag from lastRow
> //!!! Here are the real values for 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 50438549-cdb6-8c44-f93a-d18d7daeffd8
> exit getNextWide
> enter getNextWide
> hasNext = true
> set key = 50438549-cdb6-8c44-f93a-d18d7daeffd8

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
