X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id C9D33D067;
    Wed, 10 Oct 2012 21:07:03 +0000 (UTC)
Received: (qmail 60471 invoked by uid 500); 10 Oct 2012 21:07:03 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 60437 invoked by uid 500); 10 Oct 2012 21:07:03 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 60196 invoked by uid 99); 10 Oct 2012 21:07:03 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 10 Oct 2012 21:07:03 +0000
Date: Wed, 10 Oct 2012 21:07:03 +0000 (UTC)
From: "Will Oberman (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <1413199921.22401.1349903223465.JavaMail.jiratomcat@arcas>
In-Reply-To: <1953580761.22349.1349902143829.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (CASSANDRA-4789) CassandraStorage.getNextWide produces corrupt data
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/CASSANDRA-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473559#comment-13473559 ]

Will Oberman commented on CASSANDRA-4789:
-----------------------------------------

I'm now 99% sure the problem is that keys that map to a single column are being skipped over, and their values are glued onto the key after them. But I'm not sure what the most elegant fix is...
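To make the suspected failure mode concrete, here is a minimal sketch of the "key changes" hand-off, written so that a carried-over row always travels with its own key. It is not the actual CassandraStorage code: the class WideRowGrouper, the helper row(), and the use of a plain Iterator in place of the Hadoop record reader are all assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the wide-row grouping loop; not the real CassandraStorage. */
public class WideRowGrouper {
    // (key, column value) pairs, with all pairs for one key adjacent.
    private final Iterator<Map.Entry<String, String>> reader;
    // First pair of the next key, saved together with its key when the key changes.
    private Map.Entry<String, String> pending;

    public WideRowGrouper(Iterator<Map.Entry<String, String>> reader) {
        this.reader = reader;
    }

    /** Illustrative helper for building one (key, value) pair. */
    public static Map.Entry<String, String> row(String key, String value) {
        return new SimpleEntry<String, String>(key, value);
    }

    /** Returns (key, bag of values) for the next key, or null when input is exhausted. */
    public Map.Entry<String, List<String>> getNextWide() {
        // Resume from the carried-over pair if there is one; read ahead only otherwise,
        // so the reader never advances past a key whose data is still pending.
        Map.Entry<String, String> first = pending;
        pending = null;
        if (first == null) {
            if (!reader.hasNext()) {
                return null;
            }
            first = reader.next();
        }
        String key = first.getKey();
        List<String> bag = new ArrayList<String>();
        bag.add(first.getValue());
        while (reader.hasNext()) {
            Map.Entry<String, String> next = reader.next();
            if (!next.getKey().equals(key)) {
                pending = next; // key changed: keep the key with its value for the next call
                break;
            }
            bag.add(next.getValue());
        }
        return new SimpleEntry<String, List<String>>(key, bag);
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> rows = Arrays.asList(
            row("a", "a1"), row("a", "a2"),
            row("b", "b1"),                  // key with a single column
            row("c", "c1"), row("c", "c2"));
        WideRowGrouper grouper = new WideRowGrouper(rows.iterator());
        Map.Entry<String, List<String>> tuple;
        while ((tuple = grouper.getNextWide()) != null) {
            System.out.println(tuple.getKey() + " -> " + tuple.getValue());
        }
    }
}
```

In this sketch the single-column key "b" comes out as its own tuple, because the key is taken from the saved pair rather than from a fresh read at the top of the call. Whether the real reader can be handled the same way is an open question.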
> CassandraStorage.getNextWide produces corrupt data
> --------------------------------------------------
>
>                 Key: CASSANDRA-4789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4789
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1.5
>            Reporter: Will Oberman
>            Assignee: Brandon Williams
>
> This took me a while to track down. I'm seeing the problem when the "key changes" case happens. The intended behavior (as far as I can tell) when the key changes is that the method returns the current tuple and picks up where it left off on the next call to getNextWide(). The problem I'm seeing is that sometimes the current key advances between method calls, and sometimes it does not. "Not" is the correct behavior, since the code saves the value into an instance variable; when the key does advance there is a key/value mismatch (the result being that the values for two different keys are glued together). I think the problem might be related to keys that only have a single column??? I'm still trying to track that down to help solve this case...
> Maybe this will be clearer from me pasting a bunch of logging I added to the class. The log messages are fairly self-documenting (I hope):
> ...lots of previous logging...
> enter getNextWide
> hasNext = true
> set key = dVNhbXAxMzQ3ODM1OA%3D%3D
> lastRow != null
> added 1 items to bag from lastRow
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 669392df09572d0045b964bc65f86a2c
> exit getNextWide
> enter getNextWide
> hasNext = true
> //!!!THIS IS THE PROBLEM HERE I THINK!!!
> //!!!Usually the key here == key before "exit getNextWide"!!!
> set key = 5f900ee4bb1850f8cf387cc3d5fc23ca
> //!!! lastRow is data for 669392df09572d0045b964bc65f86a2c !!!
> //!!! but it's being added to key 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> lastRow != null
> added 1 items to bag from lastRow
> //!!! Here are the real values for 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 50438549-cdb6-8c44-f93a-d18d7daeffd8
> exit getNextWide
> enter getNextWide
> hasNext = true
> set key = 50438549-cdb6-8c44-f93a-d18d7daeffd8

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira