accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-391) Multi-table Accumulo input format
Date Wed, 09 Oct 2013 19:28:43 GMT


Christopher Tubbs commented on ACCUMULO-391:

{quote}the complexity introduced by keeping separate iterator, range, columns, and
tablenames on the job just made it very prone to falling into a bad state{quote}

Right, I'd hate to have two implementations to maintain as well, especially since one is a
specialized case of the other. We can simplify that, while preserving the existing single-table
API, by having it delegate internally to the general-purpose implementation. All of this can
happen in the common Configurator, so we don't have to maintain it separately for each of the
mapred and mapreduce APIs. I'm thinking that if the TableQueryConfig is easily
serialized/deserialized and fully mutable, it's a simple matter for the existing separate
methods to deserialize the one table's config, mutate it, and serialize it back to the job
configuration. The rest of the code (the getSplits implementation, etc.) would be common.
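
To make the delegation idea concrete, here is a rough sketch of what I have in mind. It is not the patch: the TableQueryConfig below is a simplified stand-in, the ConfiguratorSketch class, its method names, and the config keys are all made up for illustration, and a plain Map stands in for the Hadoop job configuration. Ranges and columns are kept as strings, and the text encoding assumes values contain no '|' or ','.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the per-table query settings.
final class TableQueryConfig {
    final String tableName;
    final List<String> ranges = new ArrayList<>();
    final List<String> columns = new ArrayList<>();

    TableQueryConfig(String tableName) {
        this.tableName = tableName;
    }

    // Illustrative encoding: name|r1,r2|c1,c2 (assumes no '|' or ',' in values).
    String serialize() {
        return tableName + "|" + String.join(",", ranges) + "|" + String.join(",", columns);
    }

    static TableQueryConfig deserialize(String s) {
        String[] parts = s.split("\\|", -1); // -1 keeps trailing empty fields
        TableQueryConfig c = new TableQueryConfig(parts[0]);
        if (!parts[1].isEmpty()) c.ranges.addAll(Arrays.asList(parts[1].split(",")));
        if (!parts[2].isEmpty()) c.columns.addAll(Arrays.asList(parts[2].split(",")));
        return c;
    }
}

// Hypothetical common configurator; the Map stands in for the job configuration.
final class ConfiguratorSketch {
    private static final String SINGLE_TABLE = "sketch.single.table";  // made-up key
    private static final String PREFIX = "sketch.table.config.";       // made-up key prefix

    // General-purpose (multi-table) entry point: one serialized config per table.
    static void setTableQueryConfig(Map<String, String> conf, TableQueryConfig c) {
        conf.put(PREFIX + c.tableName, c.serialize());
    }

    // Common read path that something like getSplits would iterate over.
    static List<TableQueryConfig> getTableQueryConfigs(Map<String, String> conf) {
        List<TableQueryConfig> out = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                out.add(TableQueryConfig.deserialize(e.getValue()));
            }
        }
        return out;
    }

    // Single-table style method: remembers the table and stores an empty config for it.
    static void setInputTableName(Map<String, String> conf, String table) {
        conf.put(SINGLE_TABLE, table);
        setTableQueryConfig(conf, new TableQueryConfig(table));
    }

    // Single-table style method: deserialize the one table, mutate it, serialize it back.
    static void addRange(Map<String, String> conf, String range) {
        String table = conf.get(SINGLE_TABLE);
        if (table == null) {
            throw new IllegalStateException("input table not set");
        }
        TableQueryConfig c = TableQueryConfig.deserialize(conf.get(PREFIX + table));
        c.ranges.add(range);
        setTableQueryConfig(conf, c);
    }
}
```

With something like this, a caller using only setInputTableName and addRange ends up with exactly one entry in getTableQueryConfigs, so the single-table case is just the multi-table case with a list of size one.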

> Multi-table Accumulo input format
> ---------------------------------
>                 Key: ACCUMULO-391
>                 URL:
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: John Vines
>            Assignee: Corey J. Nolet
>            Priority: Minor
>              Labels: mapreduce,
>             Fix For: 1.6.0
>         Attachments: ACCUMULO-391.patch, multi-table-if.patch, new-multitable-if.patch
> Just realized we had no MR input method which supports multiple Tables for an input format.
> I would see it making the table the mapper's key and making the Key/Value a tuple, or alternatively
> have the Table/Key be the key tuple and stick with Values being the value.

This message was sent by Atlassian JIRA
