accumulo-user mailing list archives

From Jamie Johnson <jej2...@gmail.com>
Subject Re: Accumulo 1.7 InputFormat Iterator Question
Date Thu, 18 Aug 2016 16:28:24 GMT
Thanks Josh, that is what I ultimately ended up doing. I suppose I was just
misusing the API before, and it became really apparent with the upgrade.
Again, thanks for all of the feedback.

On Thu, Aug 18, 2016 at 11:35 AM, Josh Elser <josh.elser@gmail.com> wrote:

> You could try following the same pattern as the AccumuloInputFormat:
> create your own JamieInputFormatWithIterator with static methods that
> make all of the AIF.addIterator(...) calls you need, delegating the
> interface methods to AIF. These could also just be utility methods, in
> which case you would leave your AIF calls as-is.
>
> IMO, this is often just done in your org.apache.hadoop.util.Tool
> implementation before submitting the job to run.
>
> Jamie Johnson wrote:
>
>> I had been handling this in the input format where I don't have access
>> to the job.  Should this be handled in a tool instead?
>>
>> I have thought about doing it in the input split's initialize step, but
>> that requires a cast to RangeInputSplit, so it seemed like there might be
>> a better way.
>>
>>
>> On Aug 17, 2016 5:31 PM, "Russ Weeks" <rweeks@newbrightidea.com
>> <mailto:rweeks@newbrightidea.com>> wrote:
>>
>>     Hi, Jamie,
>>
>>     Try the static method AccumuloInputFormat.addIterator(job, new
>>     IteratorSetting(...)).
>>
>>     Note that the method isn't idempotent. To clear the iterators on a
>>     job you can call
>>     job.getConfiguration().unset("AccumuloInputFormat.ScanOpts.Iterators")
>>     (but that isn't officially part of the public API).
>>
>>     -Russ
>>
>>     On Wed, Aug 17, 2016 at 2:26 PM Jamie Johnson <jej2003@gmail.com
>>     <mailto:jej2003@gmail.com>> wrote:
>>
>>         I am upgrading from Accumulo 1.6 to 1.7 and I am trying to
>>         understand how iterators are supposed to be set in 1.7 for an
>>         input format.  In my situation, if a particular property is set
>>         an additional iterator needs to be added to do some additional
>>         checking.  Previously I had done this in the
>>         AbstractRecordReader.setupIterators() method but this has been
>>         deprecated.  I had attempted to put them in
>>         AbstractRecordReader.contextIterators(), but this isn't always
>>         called.  This change has made me question if I was ever doing
>>         this according to best practices and now wonder what the correct
>>         way to do this is.  Any pointers would be greatly appreciated.
>>
>>
>>
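
Editor's note: the sketch below illustrates the two points raised in the
thread, namely that adding an iterator is not idempotent and that the
conditional setup can live in a small utility method called before the job
is submitted. It is a stand-alone model, not Accumulo code: a plain
Map<String, String> stands in for the Hadoop job configuration, and
addIterator here only imitates the append behaviour Russ describes. The
class name, the property name "example.extra.check.enabled", and the
iterator class "com.example.ExtraCheckIterator" are all hypothetical. Only
the configuration key is taken from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-alone model of the pattern discussed above. A Map plays the role of
// the job configuration; nothing here calls the real Accumulo or Hadoop APIs.
final class IteratorConfigSketch {
    // Configuration key quoted in the thread (not part of the public API).
    static final String ITERATORS_KEY = "AccumuloInputFormat.ScanOpts.Iterators";
    // Hypothetical property that switches the extra iterator on.
    static final String EXTRA_CHECK_PROPERTY = "example.extra.check.enabled";
    // Hypothetical iterator class name.
    static final String EXTRA_CHECK_ITERATOR = "com.example.ExtraCheckIterator";

    // Imitates a non-idempotent add: each call appends to the stored list.
    static void addIterator(Map<String, String> conf, String iteratorClass) {
        conf.merge(ITERATORS_KEY, iteratorClass, (old, added) -> old + "," + added);
    }

    // Imitates unsetting the key, which drops every configured iterator.
    static void clearIterators(Map<String, String> conf) {
        conf.remove(ITERATORS_KEY);
    }

    // Returns the configured iterators in order, or an empty list if none.
    static List<String> listIterators(Map<String, String> conf) {
        String value = conf.get(ITERATORS_KEY);
        if (value == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(value.split(",")));
    }

    // Utility in the style suggested in the thread: clear first, then add the
    // extra iterator only when the property is set, so repeated calls are safe.
    static void configureIterators(Map<String, String> conf) {
        clearIterators(conf);
        if (Boolean.parseBoolean(conf.get(EXTRA_CHECK_PROPERTY))) {
            addIterator(conf, EXTRA_CHECK_ITERATOR);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(EXTRA_CHECK_PROPERTY, "true");

        // Calling the raw add twice stores the iterator twice.
        addIterator(conf, EXTRA_CHECK_ITERATOR);
        addIterator(conf, EXTRA_CHECK_ITERATOR);
        System.out.println("naive adds: " + listIterators(conf).size());

        // The clear-then-add utility leaves exactly one entry, however often it runs.
        configureIterators(conf);
        configureIterators(conf);
        System.out.println("after configureIterators: " + listIterators(conf).size());
    }
}
```

In a real job the equivalent of configureIterators would presumably sit in
the Tool implementation's run method, calling AccumuloInputFormat.addIterator
on the Job before submission. Note that clearing the key removes every
iterator on the job, including any added elsewhere, so this approach only
suits code that owns the whole iterator setup.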
