accumulo-dev mailing list archives

From John Vines <>
Subject Re: AccumuloInputFormat.setRanges
Date Thu, 03 Jan 2013 22:33:15 GMT
As Josh said, using a series of Ranges would be more efficient. There is a
known bug in older releases that shows up when you supply a LOT of ranges,
but barring that it should work for you. Instead of one range covering the
entire table, supply a set of single-row ranges, one per query term. The
mappers should only ever see data that falls within the set of ranges
supplied.
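
To make the difference concrete, here is a minimal sketch in plain Java. It
does not use Accumulo at all: a TreeMap stands in for the sorted table (one
value per row, which is a simplification), and the class name, method names
and sample data are made up for illustration. It compares "scan everything
and filter against a HashSet" with "one single-row range per query term".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in: a TreeMap keyed by row ID models the sorted table.
class SingleRowRanges {

    // Full-table scan plus client-side filter: every row is read,
    // and rows that are not query terms are thrown away.
    static List<String> scanAllAndFilter(SortedMap<String, String> table, Set<String> terms) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (terms.contains(e.getKey())) {
                hits.add(e.getKey() + "=" + e.getValue());
            }
        }
        return hits;
    }

    // One single-row range per term: only the requested rows are touched.
    static List<String> scanSingleRowRanges(SortedMap<String, String> table, Set<String> terms) {
        List<String> hits = new ArrayList<>();
        // Sort the terms so results come back in table order.
        for (String term : new TreeSet<>(terms)) {
            // term + "\0" is the smallest string greater than term, so the
            // half-open interval [term, term + "\0") contains exactly one row.
            SortedMap<String, String> row = table.subMap(term, term + "\0");
            for (Map.Entry<String, String> e : row.entrySet()) {
                hits.add(e.getKey() + "=" + e.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SortedMap<String, String> table = new TreeMap<>();
        table.put("apple", "1");
        table.put("banana", "2");
        table.put("bananas", "3");
        table.put("cherry", "4");
        table.put("date", "5");

        Set<String> terms = new HashSet<>(Arrays.asList("banana", "date", "fig"));

        List<String> filtered = scanAllAndFilter(table, terms);
        List<String> ranged = scanSingleRowRanges(table, terms);
        System.out.println(filtered);
        System.out.println(ranged);
        System.out.println(filtered.equals(ranged));
    }
}
```

In a real job, the loop over terms would instead build a collection of Range
objects, for example new Range(new Text(term)) for each term, and pass that
collection to AccumuloInputFormat.setRanges. The exact setRanges signature
depends on the release, so check the javadoc for the version you are running.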

On Wed, Jan 2, 2013 at 6:30 PM, Seastrom, Jessica K <> wrote:

> Using AccumuloInputFormat.setRanges(conf, someRange), should I expect that
> the Key,Values as input to the Map method will be restricted to those keys
> in the set contained in someRange?
> My current implementation filters K,V pairs using the DistributedCache to
> hold the query terms
> (if(myDistributedCacheQueryTermsHashSet.contains(key.getRow())…) but I
> wonder if AccumuloInputFormat.setRanges is an alternate implementation. It
> didn't seem to filter as above, but perhaps I'm just not implementing it
> correctly.
> Thank you,
> Jessica
