accumulo-notifications mailing list archives

From "Corey J. Nolet (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-2553) AccumuloFileOutputFormat should be able to support output for multiple tables.
Date Thu, 10 Apr 2014 15:44:15 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965462#comment-13965462 ]

Corey J. Nolet commented on ACCUMULO-2553:
------------------------------------------

I've got this working with a GroupedKeyRangePartitioner that accepts multiple splits files
for groups and merges them into one logical set of cut points covering all of the splits. To
guarantee that split points don't bleed between tables, however, I had to bound each of the ranges.

I was able to create the key/value pairs and bulk ingest them into 6 tables at the same time
using MultipleOutputs, the GroupedKeyRangePartitioner, and no changes to the AccumuloFileOutputFormat.
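
The grouped partitioning idea described above can be illustrated with a minimal sketch. This is
not the GroupedKeyRangePartitioner from the comment; the class name GroupedPartitionSketch and
its methods are hypothetical, rows are plain Strings rather than Accumulo keys, and no Hadoop
classes are used. It assumes each group (table) gets its own contiguous block of partition
numbers, so a row is only ever compared against its own group's cut points and can never land
in a partition that belongs to another table.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: one logical partition space built from
// several per-group split lists, with each group's range bounded.
final class GroupedPartitionSketch {
    // group name -> sorted cut points for that group
    private final Map<String, String[]> cutPoints = new LinkedHashMap<>();
    // group name -> index of the first partition owned by that group
    private final Map<String, Integer> offsets = new LinkedHashMap<>();
    private int numPartitions = 0;

    // Registers a group and reserves (number of cut points + 1) partitions for it.
    void addGroup(String group, List<String> splits) {
        if (cutPoints.containsKey(group)) {
            throw new IllegalArgumentException("group already added: " + group);
        }
        String[] sorted = splits.toArray(new String[0]);
        Arrays.sort(sorted);
        cutPoints.put(group, sorted);
        offsets.put(group, numPartitions);
        numPartitions += sorted.length + 1;
    }

    int getNumPartitions() {
        return numPartitions;
    }

    // Maps (group, row) to a partition inside that group's reserved block.
    // A row equal to a cut point falls in the partition that ends at it.
    int getPartition(String group, String row) {
        String[] cuts = cutPoints.get(group);
        if (cuts == null) {
            throw new IllegalArgumentException("unknown group: " + group);
        }
        int idx = Arrays.binarySearch(cuts, row);
        int local = idx < 0 ? -(idx + 1) : idx;
        return offsets.get(group) + local;
    }
}
```

In a real job the number of reducers would presumably need to match getNumPartitions(), and the
group name would have to travel with each key so the partitioner can pick the right block; how
the actual partitioner does this is not stated in the comment.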


> AccumuloFileOutputFormat should be able to support output for multiple tables.
> ------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-2553
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-2553
>             Project: Accumulo
>          Issue Type: New Feature
>            Reporter: Corey J. Nolet
>            Priority: Minor
>
> This may not necessarily be something that would require changes in the AccumuloFileOutputFormat
> itself. Perhaps the ability to use it with Hadoop's MultipleOutputs is really the solution.
> It would be useful if the user could specify multiple directories where RFiles should
> be placed and have a mechanism for populating the RFiles in the necessary directories based
> on a table name or group name.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
