accumulo-dev mailing list archives

From "Eric Newton (Commented) (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
Date Tue, 03 Apr 2012 17:28:24 GMT


Eric Newton commented on ACCUMULO-118:

If we used viewfs, we would still need a more flexible way of choosing where in the file system
we want to put a tablet.  Right now we use

    <configured root of everything>/tables/<tableId>/<generated id>/<generated id>.<file extension>
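
As a rough illustration of the layout above, here is a minimal sketch that joins the components in the order listed. The function name and every example value (root, table id, directory and file ids, extension) are hypothetical and are not taken from the Accumulo code.

```python
import posixpath


def tablet_file_path(root, table_id, tablet_dir, file_id, extension):
    """Join the path components in the order described above.

    All arguments are illustrative placeholders; the real ids are
    generated by the servers.
    """
    return posixpath.join(root, "tables", table_id, tablet_dir,
                          f"{file_id}.{extension}")


# Hypothetical values, for illustration only.
print(tablet_file_path("/accumulo", "3", "t-0001", "F0000a", "rf"))
```

Because the root is a single configured value, every file of every table lands in the same namespace, which is the limitation the options below try to lift.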

Some possible implementations:

 * Configure a particular table onto a namespace by mounting another namespace at the tableId.
But this would be difficult to predict and configure.
 * Have a "namespace" property in the per-table config that provides the root directory
to use for the table. You could never change it, though. Also, it might be nice to have
a table spread over multiple namespaces.
 * Use a hashing technique to map names into different real namespaces. But if we did this
as a layer over other filesystems, we would need to perform a read against all NameNodes in
order to list the contents of a directory. I'm not sure we do very many directory listings,
so maybe this wouldn't be a problem.
 * Use a pluggable component that chooses the filenames to use. You could use hashes to
distribute files, or choose based on NameNode health, decommission status, etc. Unfortunately,
this would make the organization of the table's files less coherent.
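
To make the last two options concrete, here is a minimal sketch of a pluggable chooser with a hash-based strategy and a health-aware wrapper. It is only an illustration under stated assumptions: the names `NamespaceChooser`, `HashChooser`, `HealthAwareChooser` and `full_path`, and the `hdfs://nn1` / `hdfs://nn2` URIs, are all hypothetical and do not correspond to any existing Accumulo or Hadoop API.

```python
import hashlib
from typing import Protocol, Sequence, Set


class NamespaceChooser(Protocol):
    """Hypothetical plug-in point: pick a namespace root for a new file."""

    def choose(self, table_id: str, file_name: str,
               namespaces: Sequence[str]) -> str: ...


class HashChooser:
    """Spread files over namespaces by hashing the table id and file name."""

    def choose(self, table_id, file_name, namespaces):
        key = f"{table_id}/{file_name}".encode("utf-8")
        digest = hashlib.sha256(key).digest()
        # Use the first 8 bytes of the digest as a stable integer.
        index = int.from_bytes(digest[:8], "big") % len(namespaces)
        return namespaces[index]


class HealthAwareChooser:
    """Skip namespaces marked unhealthy, then defer to another chooser."""

    def __init__(self, unhealthy: Set[str], fallback: NamespaceChooser):
        self.unhealthy = unhealthy
        self.fallback = fallback

    def choose(self, table_id, file_name, namespaces):
        candidates = [ns for ns in namespaces if ns not in self.unhealthy]
        if not candidates:
            raise RuntimeError("no healthy namespace available")
        return self.fallback.choose(table_id, file_name, candidates)


def full_path(chooser, namespaces, table_id, tablet_dir, file_name):
    """Build the full path name that would be recorded for the new file."""
    root = chooser.choose(table_id, file_name, namespaces)
    return f"{root}/tables/{table_id}/{tablet_dir}/{file_name}"


# Hypothetical namespace roots, for illustration only.
roots = ["hdfs://nn1:8020/accumulo", "hdfs://nn2:8020/accumulo"]
chooser = HealthAwareChooser({"hdfs://nn1:8020/accumulo"}, HashChooser())
print(full_path(chooser, roots, "3", "t-0001", "F0000a.rf"))
```

In this sketch the chooser is consulted only when a file is created, and the resulting full path is what gets recorded. A reader would then never need to re-derive the namespace, so a file would stay reachable even if the strategy or the set of namespaces changed later.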

> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>                 Key: ACCUMULO-118
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>    Affects Versions: 1.5.0
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
> Consider using full path names to files, which would allow the servers to access the
> files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up
> the namespace.
> We may need a pluggable strategy to determine namespace for new files.


