accumulo-user mailing list archives

From Mike Hugo <>
Subject MapReduce mapper not seeing all rows
Date Tue, 26 Feb 2013 20:17:39 GMT

I'm running a MapReduce job over a table using AccumuloRowInputFormat.
For debugging purposes I'm logging key.getRow() so I can see which rows
it finds as it progresses.

If I don't specify any ranges on the input format, it skips a significant
number of rows - that is, I don't see any logging indicating that it
traversed them.

To see if it was a visibility issue, I tried explicitly setting a range,
like this:

        AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);

When I do that, it does process the rows that it otherwise skips.
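
To pin down exactly which rows are skipped, the rows logged by the two runs
can be diffed. The following is only a rough sketch of that comparison, not
part of the job itself: the class name RowLogDiff, the method missingRows, and
the sample row IDs are all hypothetical, and it assumes the logged
key.getRow() values from each run have already been collected into lists of
strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: compares the row IDs logged by two runs of the job.
public class RowLogDiff {

    // Returns the rows seen in the run with explicit ranges that were
    // never logged by the run without ranges, in sorted order.
    public static SortedSet<String> missingRows(List<String> withRanges,
                                                List<String> withoutRanges) {
        SortedSet<String> missing = new TreeSet<>(withRanges);
        missing.removeAll(withoutRanges);
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the logged row IDs.
        List<String> withRanges = Arrays.asList("row_a", "row_b", "row_c", "row_d");
        List<String> withoutRanges = Arrays.asList("row_a", "row_d");

        // Prints the rows that only the run with ranges saw.
        System.out.println(missingRows(withRanges, withoutRanges));
    }
}
```

If the missing rows turn out to be contiguous in sort order, that would
suggest whole ranges are being skipped rather than scattered entries.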

The same TimestampFilter is applied in both scenarios, and no other
filters or iterators are being used.

Any thoughts on why, when run without the ranges specified, it isn't seeing
a significant portion of the data?
