accumulo-user mailing list archives

From Billie Rinaldi <bil...@apache.org>
Subject Re: MapReduce mapper not seeing all rows
Date Tue, 26 Feb 2013 20:28:57 GMT
Have you noticed any pattern in the rows it seems to be missing?  E.g.
every other row, the last row in each tablet, etc.?  When you set a range,
what range did you set?

Billie


On Tue, Feb 26, 2013 at 12:17 PM, Mike Hugo <mike@piragua.com> wrote:

> Hello,
>
> I'm running a map reduce job over a table using AccumuloRowInputFormat.
>  For debugging purposes I'm logging the key.getRow() so I can see what rows
> it's finding as it progresses.
>
> If I don't specify any ranges on the input format, it skips a significant
> number of rows - that is, I don't see any log output indicating that it
> traversed them.
>
> To see if it was a visibility issue, I tried explicitly setting a range,
> like this:
>
>         AccumuloRowInputFormat.setRanges(job.getConfiguration(), ranges);
>
> When doing that it does process the rows that it otherwise skips.
>
> The same TimestampFilter is being applied in both scenarios, no other
> filters / iterators are being used.
>
> Any thoughts on why, when run without the ranges specified, it isn't
> seeing a significant portion of the data?
>
> Thanks,
>
> Mike
>
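
One way to look for the kind of pattern asked about above is to diff the row IDs logged by the two runs (with and without explicit ranges) and group the missing rows into contiguous runs. The following is only a rough sketch under assumptions: it presumes the logged row IDs have already been collected into sorted sets of strings, and the class and method names (MissingRowReport, missingRows, describeGaps) are hypothetical helpers, not part of Accumulo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Accumulo: compares the row IDs logged by
// two runs of the same job and summarizes which rows one run never saw.
public class MissingRowReport {

    // Rows logged by the run with explicit ranges but absent from the run
    // without ranges, in sorted order.
    static SortedSet<String> missingRows(Iterable<String> withRanges,
                                         Iterable<String> withoutRanges) {
        SortedSet<String> missing = new TreeSet<>();
        for (String row : withRanges) {
            missing.add(row);
        }
        for (String row : withoutRanges) {
            missing.remove(row);
        }
        return missing;
    }

    // Walks all rows in sorted order and collapses consecutive missing rows
    // into "first .. last" gaps. Many one-row gaps suggest a per-row problem;
    // a few long gaps suggest whole tablets or splits are being skipped.
    static List<String> describeGaps(SortedSet<String> allRows,
                                     SortedSet<String> missing) {
        List<String> gaps = new ArrayList<>();
        String start = null;
        String prev = null;
        for (String row : allRows) {
            if (missing.contains(row)) {
                if (start == null) {
                    start = row;
                }
                prev = row;
            } else if (start != null) {
                gaps.add(start + " .. " + prev);
                start = null;
            }
        }
        if (start != null) {
            gaps.add(start + " .. " + prev);
        }
        return gaps;
    }

    public static void main(String[] args) {
        // Made-up row IDs, for illustration only.
        SortedSet<String> all = new TreeSet<>(
                Arrays.asList("row1", "row2", "row3", "row4", "row5"));
        SortedSet<String> seen = new TreeSet<>(Arrays.asList("row1", "row4"));
        for (String gap : describeGaps(all, missingRows(all, seen))) {
            System.out.println("missing: " + gap);
        }
    }
}
```

Comparing the gap boundaries against the table's split points would then show whether the skipped rows line up with tablet boundaries.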
