accumulo-dev mailing list archives

From "Billie Rinaldi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-391) Multi-table Accumulo input format
Date Tue, 24 Jul 2012 17:11:35 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421549#comment-13421549 ]

Billie Rinaldi commented on ACCUMULO-391:
-----------------------------------------

The patch is looking pretty good. One more thing you'll need to do is update the fetchColumns
method so that it fetches columns for particular tables. Also, how would you feel about
getting rid of TableKey and leaving this as an InputFormat<Key,Value>? I suggest this
because the RangeInputSplit already contains the table name, and a Mapper can access it through
((RangeInputSplit) context.getInputSplit()).getTableName(). The cast is somewhat awkward, but the
advantage of keeping InputFormat<Key,Value> is that Mappers that don't care which table
they're running over can be used with either the single-table or the multi-table
input format. If we want to make it easier to grab the table name, we could add a public
static method that pulls it from a Context or from an InputSplit.
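
As a rough illustration of the helper idea above, here is a minimal, self-contained sketch.
It does not use the real Hadoop or Accumulo classes: InputSplit, RangeInputSplit and Context
below are simplified stand-ins defined only for this example, and the getTableName helpers
and the describe method are hypothetical names, not part of the patch.

```java
public class TableNameSketch {

    // Stand-in for Hadoop's InputSplit (assumption: simplified for this sketch).
    interface InputSplit {}

    // Stand-in for the RangeInputSplit mentioned above; only the table name is modeled.
    static final class RangeInputSplit implements InputSplit {
        private final String tableName;

        RangeInputSplit(String tableName) {
            this.tableName = tableName;
        }

        String getTableName() {
            return tableName;
        }
    }

    // Stand-in for the Mapper context; only getInputSplit() is modeled.
    interface Context {
        InputSplit getInputSplit();
    }

    // Hypothetical helper: pull the table name from a split, hiding the cast.
    static String getTableName(InputSplit split) {
        if (!(split instanceof RangeInputSplit)) {
            throw new IllegalArgumentException("not a RangeInputSplit: " + split);
        }
        return ((RangeInputSplit) split).getTableName();
    }

    // Hypothetical helper: same lookup, starting from a Context.
    static String getTableName(Context context) {
        return getTableName(context.getInputSplit());
    }

    // Mapper-style logic that still receives a plain key and value,
    // and asks for the table name only when it needs it.
    static String describe(Context context, String key, String value) {
        return getTableName(context) + ":" + key + "=" + value;
    }

    public static void main(String[] args) {
        Context ctx = () -> new RangeInputSplit("tableA");
        System.out.println(describe(ctx, "row1", "v1")); // prints tableA:row1=v1
    }
}
```

A Mapper that never calls the helper stays independent of the table, which is the point of
keeping the key and value types unchanged.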

A separate thing I want to do is get rid of the AccumuloIterator and AccumuloIteratorOption
configuration objects and just make IteratorSetting Writable so it can be used directly. 
I'll open another ticket about that, though.
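
To give a sense of what "Writable" would involve, here is a small sketch of a settings
object that serializes itself with java.io.DataOutput and reads itself back with
java.io.DataInput, which is the shape of Hadoop's Writable contract. IteratorSettingSketch
is a hypothetical stand-in, not the real IteratorSetting, and the field layout and wire
format are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

public class IteratorSettingSketch {
    // Assumed fields for this sketch: a priority, a name, a class name, and options.
    int priority;
    String name = "";
    String iteratorClass = "";
    final Map<String, String> options = new TreeMap<>();

    // Same shape as Writable.write(DataOutput): fields first, then the option count and entries.
    void write(DataOutput out) throws IOException {
        out.writeInt(priority);
        out.writeUTF(name);
        out.writeUTF(iteratorClass);
        out.writeInt(options.size());
        for (Map.Entry<String, String> e : options.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeUTF(e.getValue());
        }
    }

    // Same shape as Writable.readFields(DataInput): reads fields in the order written.
    void readFields(DataInput in) throws IOException {
        priority = in.readInt();
        name = in.readUTF();
        iteratorClass = in.readUTF();
        options.clear();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            options.put(key, value);
        }
    }

    // Serializes to bytes and back, as a job configuration round trip would.
    static IteratorSettingSketch roundTrip(IteratorSettingSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            IteratorSettingSketch copy = new IteratorSettingSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        IteratorSettingSketch setting = new IteratorSettingSketch();
        setting.priority = 20;
        setting.name = "example";
        setting.iteratorClass = "com.example.ExampleIterator"; // hypothetical class name
        setting.options.put("maxVersions", "1");

        IteratorSettingSketch copy = roundTrip(setting);
        System.out.println(copy.priority + " " + copy.name + " " + copy.options);
    }
}
```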
                
> Multi-table Accumulo input format
> ---------------------------------
>
>                 Key: ACCUMULO-391
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-391
>             Project: Accumulo
>          Issue Type: New Feature
>    Affects Versions: 1.5.0-SNAPSHOT
>            Reporter: John Vines
>            Assignee: William Slacum
>            Priority: Minor
>              Labels: mapreduce,
>         Attachments: multi-table-if.patch, new-multitable-if.patch
>
>
> Just realized we have no MR input format that supports multiple tables.
> I could see it making the table the mapper's key and the Key/Value pair a tuple, or alternatively
> making the Table/Key pair the key tuple and keeping the Value as the value.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
