accumulo-user mailing list archives

From Mike Hugo <>
Subject Re: MapReduce mapper not seeing all rows
Date Tue, 26 Feb 2013 20:31:49 GMT
Our row keys are a combination of two elements, like this:



When running without any ranges set, we're missing an entire prefix's worth
of rows, e.g. we don't get any rows that start with "foo".

When I tried running with the range set, I used a prefix range on "foo", and
it then found the rows starting with "foo".
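
For anyone following along, here is roughly what I mean by a prefix range, as a plain-Java sketch with no Accumulo dependency. The class name, helper names, and sample row keys are all made up for illustration (they are not our real keys), and it assumes ASCII row keys so that Java string order matches byte order. The range starts at the prefix (inclusive) and ends at the smallest key that sorts after everything beginning with the prefix (exclusive).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration only; not Accumulo's own Range implementation.
class PrefixRangeSketch {

    // Smallest string that sorts after every string starting with `prefix`,
    // or null if no such bound exists (empty prefix or all max-value chars).
    static String followingPrefix(String prefix) {
        int end = prefix.length();
        // Trailing max-value chars cannot be incremented, so drop them.
        while (end > 0 && prefix.charAt(end - 1) == Character.MAX_VALUE) {
            end--;
        }
        if (end == 0) {
            return null;
        }
        char bumped = (char) (prefix.charAt(end - 1) + 1);
        return prefix.substring(0, end - 1) + bumped;
    }

    // All rows in [prefix, followingPrefix(prefix)), in sorted order.
    static List<String> rowsWithPrefix(NavigableSet<String> rows, String prefix) {
        String end = followingPrefix(prefix);
        NavigableSet<String> hits = (end == null)
                ? rows.tailSet(prefix, true)
                : rows.subSet(prefix, true, end, false);
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        // Made-up row keys standing in for a sorted table.
        NavigableSet<String> rows = new TreeSet<>(
                Arrays.asList("bar_x", "foo_a", "foo_b", "fop_a", "zzz"));
        System.out.println(rowsWithPrefix(rows, "foo")); // prints [foo_a, foo_b]
    }
}
```

A range built this way should only ever narrow what a full-table scan returns, which is why it is surprising that the "foo" rows show up with the range set but not without it.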

On Tue, Feb 26, 2013 at 2:28 PM, Billie Rinaldi <> wrote:

> Have you noticed any pattern in the rows it seems to be missing?  E.g.
> every other row, the last row in each tablet, etc.?  When you set a range,
> what range did you set?
> Billie
> On Tue, Feb 26, 2013 at 12:17 PM, Mike Hugo <> wrote:
>> Hello,
>> I'm running a map reduce job over a table using AccumuloRowInputFormat.
>>  For debugging purposes I'm logging the key.getRow() so I can see what rows
>> it's finding as it progresses.
>> If I don't specify any ranges on the input format, it skips a significant
>> number of rows - that is, I don't see any logging indicating that it
>> traversed them.
>> To see if it was a visibility issue, I tried explicitly setting a range,
>> like this:
>>         AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);
>> When doing that it does process the rows that it otherwise skips.
>> The same TimestampFilter is being applied in both scenarios, no other
>> filters / iterators are being used.
>> Any thoughts on why, when run without the ranges specified, it isn't
>> seeing a significant portion of the data?
>> Thanks,
>> Mike