accumulo-dev mailing list archives

From Kevin Faro <kevin.f...@gmail.com>
Subject Re: MultipleInputs with AccumuloInputFormat
Date Tue, 05 Nov 2013 17:13:29 GMT
I recently looked into that and came to the same realization.

I ended up writing a new input format that computed the Cartesian product of
two tables. To do that, I had to store separate values for the left and right
configurations, then copy over whichever config settings I wanted the AIF to
use, depending on which split I needed in the RecordReader.
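
A minimal sketch of that idea, not the actual input format: a plain Map stands in for Hadoop's Configuration so the example runs on its own, and the key names ("aif.table", "aif.auths"), the "left."/"right." prefixes, and the class and method names are all hypothetical. Each side's settings are stored under its own prefix, and the chosen side's values are copied onto the single set of unprefixed keys just before that side's split is read.

```java
import java.util.HashMap;
import java.util.Map;

public class SideConfigSketch {
    // Hypothetical key names; the real AccumuloInputFormat keys differ.
    static final String[] AIF_KEYS = {"aif.table", "aif.auths"};

    // Each side of the product gets its own (assumed) key prefix.
    enum Side {
        LEFT("left."), RIGHT("right.");
        final String prefix;
        Side(String prefix) { this.prefix = prefix; }
    }

    /** Store one side's settings under its prefix so both sides can coexist. */
    static void storeSide(Map<String, String> conf, Side side,
                          String table, String auths) {
        conf.put(side.prefix + "aif.table", table);
        conf.put(side.prefix + "aif.auths", auths);
    }

    /** Return a copy of conf whose unprefixed keys hold the chosen side's values. */
    static Map<String, String> activate(Map<String, String> conf, Side side) {
        Map<String, String> copy = new HashMap<>(conf);
        for (String key : AIF_KEYS) {
            String value = conf.get(side.prefix + key);
            if (value != null) {
                copy.put(key, value);
            } else {
                copy.remove(key);
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        storeSide(conf, Side.LEFT, "users", "public");
        storeSide(conf, Side.RIGHT, "events", "private");

        // A record reader would pick the side that matches its current split.
        System.out.println(activate(conf, Side.LEFT).get("aif.table"));  // users
        System.out.println(activate(conf, Side.RIGHT).get("aif.table")); // events
    }
}
```

Working on a copy keeps the shared settings untouched, so switching sides between splits cannot leak one side's values into the other.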

It would have been awesome if I could have just used the MultipleInputs ...

--Kevin


On Tue, Nov 5, 2013 at 10:24 AM, Josh Elser <josh.elser@gmail.com> wrote:

> In executing some MapReduce over Accumulo with the AccumuloInputFormat, I
> came to the realization that AIF fundamentally doesn't work with concepts
> like MultipleInputs in Hadoop
> (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/lib/input/MultipleInputs.html).
> Given that you can only write one set of configuration for AIF into a
> Configuration object, there's not a mechanism to support multiple. This
> appears to be the case across all versions.
>
> Is this correct? Have I overlooked something?
>
