cassandra-commits mailing list archives

From "Mck SembWever (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3150) ColumnFormatRecordReader loops forever
Date Wed, 07 Sep 2011 19:19:09 GMT


Mck SembWever commented on CASSANDRA-3150:

Here keyRange runs from startToken to split.getEndToken().
startToken is updated on each iteration to the last row read (each iteration reads batchRowCount rows).

What happens if split.getEndToken() doesn't correspond to any of the row keys?
To me it reads that startToken will hop over split.getEndToken() and get_range_slices(..) will
start returning wrapping ranges. These will still return rows, so the iteration will continue,
now forever.

The only ways out of this code today are a) startToken equals split.getEndToken(), or b) get_range_slices(..)
is called with startToken equal to split.getEndToken(), or with a gap so small that no rows exist
in between.
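
A toy model may make the scenario concrete. The sketch below is not Cassandra code: tokens are
plain integers, range_slice is a hypothetical stand-in for get_range_slices(..), and the overshoot
flag is purely an assumption that models the suspected behaviour (a short batch handing back one
row past the end token). Whether the real server ever does this is exactly the open question.

```python
def range_slice(tokens, start, end, count):
    """Toy range query over a sorted ring of integer tokens.

    Returns up to `count` tokens in (start, end]. When start >= end the
    range is treated as wrapping around the ring.
    """
    if start < end:
        hits = [t for t in tokens if start < t <= end]
    else:
        hits = [t for t in tokens if t > start] + [t for t in tokens if t <= end]
    return hits[:count]


def read_split(tokens, split_start, split_end, batch, overshoot, max_batches=50):
    """Page through one split as described in the comment above.

    overshoot=True is the HYPOTHETICAL part: a short batch is allowed to
    return the first row beyond split_end, so start can hop over split_end.
    max_batches only exists so the demonstration itself terminates.
    Returns (rows_read, terminated).
    """
    start, read = split_start, []
    for _ in range(max_batches):
        if start == split_end:           # exit a)
            return read, True
        rows = range_slice(tokens, start, split_end, batch)
        if overshoot and start < split_end and len(rows) < batch:
            rows += [t for t in tokens if t > split_end][:1]
        if not rows:                     # exit b): nothing left in the gap
            return read, True
        read += rows
        start = rows[-1]                 # next page starts at the last row read
    return read, False


tokens = list(range(10, 101, 10))        # row tokens 10, 20, ..., 100

# End token 55 matches no row; a strictly clipped query still ends via exit b).
clipped_rows, clipped_done = read_split(tokens, 0, 55, batch=3, overshoot=False)

# Same split, but start is allowed to hop over 55: the loop keeps wrapping.
hopped_rows, hopped_done = read_split(tokens, 0, 55, batch=3, overshoot=True)
```

In this model the clipped run stops after reading the five rows up to token 50, while the
overshoot run is still reading (and re-reading) rows when the max_batches cap cuts it off.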

> ColumnFormatRecordReader loops forever
> --------------------------------------
>                 Key: CASSANDRA-3150
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.8.4
>            Reporter: Mck SembWever
>            Assignee: Mck SembWever
>            Priority: Critical
>         Attachments: CASSANDRA-3150.patch
> From
> {quote}
> bq. Cassandra-0.8.4 w/ ByteOrderedPartitioner
> bq. CFIF's inputSplitSize=196608
> bq. 3 map tasks (of 4013) are still running after reading 25 million rows.
> bq. Can this be a bug in StorageService.getSplits(..) ?
> getSplits looks pretty foolproof to me but I guess we'd need to add
> more debug logging to rule out a bug there for sure.
> I guess the main alternative would be a bug in the recordreader paging.
> {quote}

This message is automatically generated by JIRA.
