accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-118) accumulo could work across HDFS instances, which would help it to scale past a single namenode
Date Thu, 13 Feb 2014 23:58:20 GMT


Josh Elser commented on ACCUMULO-118:

I was just testing this out. I tried to add a new volume to an existing 1.6 instance I had
lying around.

I expected that each volume I specified in {{instance.volumes}} was the "equivalent" of how
multiple {{instance.dfs.uri}}+{{instance.dfs.dir}} would have worked. In other words, I expected
each element in {{instance.volumes}} to be the base directory that Accumulo would write to.
Instead, it actually wrote to that volume + {{instance.dfs.dir}}.

This irks me for a few reasons:

# I must have the same base directory used in HDFS across all volumes (not the end of the
world, but I don't see any reason to impose that on our end).
# I expected {{instance.volumes}} to be a replacement for {{instance.dfs.dir}} and {{instance.dfs.uri}},
but the new configuration still depends on the old one.

Let me try to be crystal clear. I had an existing installation on machine1 in {{/accumulo1.6}}
in HDFS. I tried to add a second volume, stored on machine2, in {{/accumulo1.6-newvolume}}
(I already had an /accumulo1.6 from other testing on machine2). I configured my {{instance.volumes}}
value to {{hdfs://machine1:8020/accumulo1.6,hdfs://machine2:8020/accumulo1.6-newvolume}}.
Sadly, when invoking {{bin/accumulo init --add-volumes}}, this failed on me because it actually
looked in {{hdfs://machine1:8020/accumulo1.6/accumulo}} and {{hdfs://machine2:8020/accumulo1.6-newvolume/accumulo}}.
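To make the mismatch concrete, here is a minimal sketch of the two resolution rules described above. It is plain Python for illustration only, not Accumulo code; the function names are hypothetical, and the {{instance.dfs.dir}} value of {{/accumulo}} is an assumption inferred from the paths that were actually probed.

```python
# Illustration only: these helpers are hypothetical and are not part of Accumulo.

def expected_dirs(volumes):
    """Expected rule: each entry in instance.volumes is itself the base directory."""
    return [v.rstrip("/") for v in volumes]


def observed_dirs(volumes, dfs_dir):
    """Observed rule: instance.dfs.dir is appended to every volume."""
    return [v.rstrip("/") + "/" + dfs_dir.strip("/") for v in volumes]


# Value of instance.volumes taken from the comment above.
instance_volumes = (
    "hdfs://machine1:8020/accumulo1.6,"
    "hdfs://machine2:8020/accumulo1.6-newvolume"
)
volumes = instance_volumes.split(",")

# Assumption: instance.dfs.dir was "/accumulo", as the probed paths suggest.
dfs_dir = "/accumulo"

expected = expected_dirs(volumes)
observed = observed_dirs(volumes, dfs_dir)

for want, got in zip(expected, observed):
    print("expected:", want)
    print("observed:", got)
```

Under the expected rule, each volume could use a different base directory; under the observed rule, every volume is forced to share the same {{instance.dfs.dir}} suffix.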

> accumulo could work across HDFS instances, which would help it to scale past a single namenode
> ----------------------------------------------------------------------------------------------
>                 Key: ACCUMULO-118
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: master, tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>            Priority: Blocker
>             Fix For: 1.6.0
>         Attachments: ACCUMULO-118-01.txt, ACCUMULO-118-02.txt
>   Original Estimate: 2,016h
>  Remaining Estimate: 2,016h
> Consider using full path names to files, which would allow the servers to access the files on any HDFS file system.
> Work may exist elsewhere to run HDFS using a number of NameNode instances to break up the namespace.
> We may need a pluggable strategy to determine namespace for new files.

This message was sent by Atlassian JIRA
