accumulo-notifications mailing list archives

From "Dave Marion (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
Date Fri, 24 May 2013 13:52:20 GMT


Dave Marion commented on ACCUMULO-118:

Personally I am not a fan of the hash idea. I would rather see a mapping of namespace prefix
to NN in the configuration (ns1 = hdfs://host:port, ns2 = hdfs://host:port). I'm thinking
forward to table file load balancing across namespaces and backups (see my comment from 3/Apr/12).
If for example you quiesced the database and performed a backup, then you could change the
namespace mapping such that ns1 and ns2 point to the same hdfs://host:port if for some reason
you lost the 2nd hdfs instance (it crashed, you wanted to remove it, etc). 
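
To make the mapping idea concrete, here is a minimal sketch. It assumes files are recorded with a namespace prefix such as "ns1:/accumulo/tables/1/f.rf" and resolved to a full URI only when accessed. The path format, the resolve() helper, and the host names are all hypothetical. Nothing here is existing Accumulo code.

```python
# Hypothetical sketch: the path format and every name below are illustrative only.

def resolve(path, namespaces):
    """Expand a namespace-prefixed path (e.g. 'ns1:/a/b') into a full HDFS URI."""
    prefix, sep, rest = path.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"unknown namespace in path: {path!r}")
    return namespaces[prefix].rstrip("/") + rest


# Assumed configuration: namespace prefix -> NameNode URI.
namespaces = {
    "ns1": "hdfs://nn1.example.com:8020",
    "ns2": "hdfs://nn2.example.com:8020",
}

before = resolve("ns2:/accumulo/tables/2/g.rf", namespaces)

# After losing the 2nd HDFS instance and restoring its files onto the first,
# only the mapping changes; the stored file references stay the same.
namespaces["ns2"] = namespaces["ns1"]
after = resolve("ns2:/accumulo/tables/2/g.rf", namespaces)
```

The point of the indirection is that file references never embed a host and port, so repointing a namespace is a configuration change rather than a metadata rewrite.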

This could also allow for upgrades of Hadoop while Accumulo is still running. Think about the scenario
where ns1 is on racks 1&2 and ns2 is on racks 3&4 of a cluster and the files of table
T are spread across ns1 and ns2. You could change the configuration of the table file load
balancer (new feature) so that it puts new files on ns2. You recompact the table so now all new
files are on ns2. When done for all tables (and walogs), you can shut down ns1 and upgrade
it to a new version of Hadoop.
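
The drain-then-upgrade scenario above can be sketched as follows. This is a rough simulation under stated assumptions: files carry a hypothetical namespace prefix such as "ns1:/...", the "table file load balancer" is modelled as a function that picks the least-loaded allowed namespace, and a compaction is simplified to rewriting each file one-to-one (a real compaction merges files). All names are hypothetical.

```python
from collections import Counter

# Hypothetical sketch: not Accumulo code; every name here is illustrative.

def choose_namespace(allowed, file_counts):
    """Pick the allowed namespace with the fewest files (ties broken by name)."""
    return min(allowed, key=lambda ns: (file_counts.get(ns, 0), ns))


def recompact(table_files, allowed):
    """Simulate a compaction: every file is rewritten into a chosen namespace."""
    counts = Counter()
    new_files = []
    for old in table_files:
        name = old.split(":", 1)[1]           # drop the old namespace prefix
        ns = choose_namespace(allowed, counts)
        counts[ns] += 1
        new_files.append(f"{ns}:{name}")
    return new_files


def is_drained(all_files, ns):
    """True when no file reference points at the given namespace."""
    return not any(f.startswith(ns + ":") for f in all_files)


table_t = ["ns1:/tables/T/a.rf", "ns2:/tables/T/b.rf", "ns1:/tables/T/c.rf"]

# Reconfigure the balancer to allow only ns2, then recompact table T.
table_t_after = recompact(table_t, allowed=["ns2"])
safe_to_shut_down_ns1 = is_drained(table_t_after, "ns1")
```

In practice the drained check would have to cover every table and the write-ahead logs before ns1 could be taken down.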
> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>                 Key: ACCUMULO-118
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>         Attachments: ACCUMULO-118-01.txt
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
> Consider using full path names to files, which would allow the servers to access the
> files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up
> the namespace.
> We may need a pluggable strategy to determine namespace for new files.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
