hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6727) Refresh data volumes on DataNode based on configuration changes
Date Wed, 17 Sep 2014 01:02:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136600#comment-14136600 ]

Colin Patrick McCabe commented on HDFS-6727:
--------------------------------------------

Thinking about this a little more, we might have timeouts if the reconfigurable API does everything
synchronously.  So I think we need to modify the existing API to just start a reconfiguration,
and add another API that queries what keys are currently being reconfigured.

From the command line, this would be like
{code}
$ hdfs reconfigure -datanode -start
Datanode reconfiguration started at 12:34 PM, Dec 12 2014
{code}
followed by
{code}
$ hdfs reconfigure -datanode -list
Datanode is in the process of reconfiguring {{dfs.data.dirs}} and {{foo.bar.baz}}
{code}

later....
{code}
$ hdfs reconfigure -datanode -list
Datanode reconfiguration completed at 1:46 PM, Dec 12 2014
{code}

In the logs, we'd see information about whether the storage directories were all successfully
reconfigured.

If the user tried to reconfigure while another reconfiguration was going on, we'd get:
{code}
$ hdfs reconfigure -datanode -start
Can't start a new reconfiguration because the one started at 12:34 PM, Dec 12 2014 is still going on
{code}

I think we can do this in a follow-up change, though.
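
To make the start/list split concrete, here is a rough sketch of the shape I have in mind. Everything in it is hypothetical ({{ReconfigSketch}}, {{start}}, {{list}}, {{awaitCompletion}}); it is not the existing reconfiguration framework, only an illustration of "start returns immediately, list reports what is still in flight, a second start is rejected."

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of an asynchronous reconfiguration tracker. */
class ReconfigSketch {
    private boolean running = false;                             // guarded by "this"
    private List<String> inProgress = Collections.emptyList();   // guarded by "this"

    /**
     * Starts the reconfiguration work on a background thread.
     * Returns false without doing anything if one is already running.
     */
    public synchronized boolean start(List<String> keys, Runnable work) {
        if (running) {
            return false;
        }
        running = true;
        inProgress = new ArrayList<>(keys);
        Thread t = new Thread(() -> {
            try {
                work.run();
            } finally {
                finish();   // always clear the state, even if the work fails
            }
        }, "reconfig-worker");
        t.setDaemon(true);
        t.start();
        return true;
    }

    private synchronized void finish() {
        running = false;
        inProgress = Collections.emptyList();
        notifyAll();
    }

    /** Returns a copy of the keys currently being reconfigured (empty when idle). */
    public synchronized List<String> list() {
        return new ArrayList<>(inProgress);
    }

    /** Blocks until the current reconfiguration, if any, has finished. */
    public synchronized void awaitCompletion() {
        while (running) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

The RPC behind {{-start}} would only need to call something like {{start}}, so it could not time out on slow disks, and {{-list}} would map onto {{list}}.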

Back to the patch itself....
{code}
+  public synchronized List<StorageLocation> addVolumes(
+      final List<StorageLocation> volumes,
+      final Collection<String> bpids, final Configuration conf)
       throws IOException {
{code}

Let's change this to not throw IOException.  It's confusing to have both a return value of
the volumes that we successfully added and an IOE.  I would pick one... just the list of volumes.
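
Roughly what I mean, as a simplified sketch: the types here are stand-ins (plain strings instead of {{StorageLocation}}, and a hypothetical {{VolumeLoader}} callback instead of the real volume-loading code), so treat it as an illustration of the contract rather than a drop-in replacement.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: report failures by omission from the result, not by throwing. */
class AddVolumesSketch {
    /** Stand-in for whatever actually loads a single volume. */
    interface VolumeLoader {
        void load(String volume) throws IOException;
    }

    /**
     * Tries to add every volume and returns only those that were added.
     * A failed volume is logged and skipped; nothing is thrown to the caller.
     */
    static List<String> addVolumes(List<String> volumes, VolumeLoader loader) {
        List<String> added = new ArrayList<>();
        for (String volume : volumes) {
            try {
                loader.load(volume);
                added.add(volume);
            } catch (IOException e) {
                System.err.println("Failed to add volume " + volume + ": " + e.getMessage());
            }
        }
        return added;
    }
}
```

The caller can then compare the returned list against what it asked for to find out which volumes failed, and there is only one place to look.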

{code}
    final List<IOException> exceptions = Collections.synchronizedList(
        new ArrayList<IOException>());
    List<Thread> volumeLoadingThreads = Lists.newArrayList();
    for (final String bpid : bpids) {
      // Multi threading
      Thread t = new Thread() {
        public void run() {
          try {
            fsVolume.addBlockPool(bpid, conf);
            fsVolume.getVolumeMap(bpid, tempVolumeMap);
          } catch (IOException e) {
            LOG.warn("Caught exception when adding " + fsVolume +
              ". Will throw later.", e);
            exceptions.add(e);
          }
        }
      };
      volumeLoadingThreads.add(t);
      t.start();
    }
    for (Thread t : volumeLoadingThreads) {
      try {
        t.join();
      } catch (InterruptedException ie) {
        exceptions.add(new IOException(ie));
      }
    }
{code}
We're already creating a thread per volume... I think multiplying that by a thread per bpid
is too much.  Keep in mind that when dealing with a volume, having multiple threads will be
less useful since we'll often be I/O bound (a single volume is normally a single disk).  So
let's just have a for loop here, not multiple threads.
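
Something along these lines, as a sketch only: the {{Volume}} interface below is a hypothetical stand-in for the real volume class, and only the {{addBlockPool}} call is modeled. Since everything runs on the calling thread, the exception list no longer needs to be synchronized.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical sketch: load block pools for one volume sequentially. */
class BlockPoolLoopSketch {
    /** Stand-in for the per-volume operation used in the patch. */
    interface Volume {
        void addBlockPool(String bpid) throws IOException;
    }

    /**
     * Adds each block pool to the volume in turn, on the calling thread.
     * Returns the exceptions that were hit, in the order they occurred.
     */
    static List<IOException> addBlockPools(Volume volume, Collection<String> bpids) {
        List<IOException> exceptions = new ArrayList<>();
        for (String bpid : bpids) {
            try {
                volume.addBlockPool(bpid);
            } catch (IOException e) {
                // Keep going so one bad block pool does not hide the others.
                exceptions.add(e);
            }
        }
        return exceptions;
    }
}
```

The per-volume thread that already exists still gives parallelism across disks; this just avoids a second level of threads competing for the same disk.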

Looks pretty good aside from that.

> Refresh data volumes on DataNode based on configuration changes
> ---------------------------------------------------------------
>
>                 Key: HDFS-6727
>                 URL: https://issues.apache.org/jira/browse/HDFS-6727
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>    Affects Versions: 2.5.0, 2.4.1
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>              Labels: datanode
>         Attachments: HDFS-6727.000.delta-HDFS-6775.txt, HDFS-6727.001.patch, HDFS-6727.002.patch, HDFS-6727.003.patch, HDFS-6727.combo.patch
>
>
> HDFS-1362 requires the DataNode to reload its configuration file at runtime, so that the DN can change its data volumes dynamically. This JIRA reuses the reconfiguration framework introduced by HADOOP-7001 to enable the DN to reconfigure at runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
