cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
Date Thu, 25 Apr 2013 22:48:16 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5504:
--------------------------------------

    Attachment: 5504-v3.txt

Thanks for the patch, Oleksandr.

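It looks to me like the root of the problem is that {{key.put(this.getCurrentKey())}} destructively
modifies currentKey: put() advances the source buffer's position to its limit, so nothing
remains to be read from it afterwards.  Attached is a patch to duplicate the buffer first.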

This has the added benefit that we don't have to impose any overhead on the new mapreduce
api to solve this problem in the old mapred one.
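
To illustrate the behaviour described above, here is a minimal standalone sketch, not the contents of 5504-v3.txt. The class and method names (DuplicateBeforePut, copyDestructive, copyWithDuplicate) are hypothetical; only the java.nio.ByteBuffer calls are real. It shows that put(src) consumes the source buffer, while put(src.duplicate()) leaves the source's position untouched.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DuplicateBeforePut {
    // Hypothetical helper: copies src into dst the destructive way.
    // ByteBuffer.put(ByteBuffer) advances src's position to its limit.
    static void copyDestructive(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src);
        dst.flip();
    }

    // Hypothetical helper: copies via a duplicate. The duplicate shares the
    // bytes but has its own position and limit, so src is left unchanged.
    static void copyWithDuplicate(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src.duplicate());
        dst.flip();
    }

    public static void main(String[] args) {
        ByteBuffer currentKey = ByteBuffer.wrap("row-1".getBytes(StandardCharsets.UTF_8));
        ByteBuffer key = ByteBuffer.allocate(16);

        copyDestructive(key, currentKey);
        // currentKey now has no remaining bytes, so it looks empty to later readers.
        System.out.println("after put(src): " + currentKey.remaining());

        currentKey.rewind();
        copyWithDuplicate(key, currentKey);
        // currentKey still exposes all of its bytes.
        System.out.println("after put(src.duplicate()): " + currentKey.remaining());
    }
}
```

Duplicating is cheap because no bytes are copied; only the position, limit and mark are independent of the original buffer.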
                
> Eternal iteration when using newer hadoop version due to next() call and empty key value
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5504
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.2.3
>            Reporter: Oleksandr Petrov
>            Priority: Critical
>         Attachments: 5504-v3.txt, patch2.diff, patch.diff
>
>
> Currently, when using newer Hadoop versions, the call to
> next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value)
> within ColumnFamilyRecordReader empties the key, because `key.clear();` is called. That
> causes StaticRowIterator and WideRowIterator to misbehave: by the time
> Iterables.getLast(rows).key is read, the key is already empty. As a result, Hadoop
> requests the same range again and again, indefinitely.
> Please see the attached patch/diff. It adds lastRowKey (ByteBuffer) and saves it for
> the next iteration along with all the rows, so that the query for the next range is
> correct.
> This patch is branched from version 1.2.3.
> Tested against Cassandra 1.2.3, with Hadoop 1.0.3, 1.0.4 and 0.20.2

