hadoop-hdfs-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: getFileBlockLocations on a newly created and closed file
Date Thu, 07 Jun 2012 22:41:36 GMT
Hey Nicholas,

This is the expected behavior with the default setting of
dfs.replication.min (1). When you close the file, the client waits until
all of the DNs have the block fully written, but the DNs report their
replicas to the NN asynchronously. So with the default configuration,
only 1 replica has to be reported to the NN before the file is allowed
to be closed, and getFileBlockLocations can briefly return fewer than
3 hosts right after close().

If you need to wait for more replicas, I would recommend polling after
closing the file.
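
A minimal sketch of such a polling loop, in case it helps. This is not
code from HDFS itself: the ReplicaPoller class, the ReplicaCounter
interface, and the timeout/interval parameters are hypothetical names
chosen for illustration. It only assumes you have some way to ask how
many replicas are currently visible.

```java
// Hypothetical helper: retries a replica-count lookup until it reaches the
// expected value or a timeout elapses. Nothing here is part of the Hadoop API.
class ReplicaPoller {

    /** Hypothetical callback; may throw, e.g. IOException from a FileSystem call. */
    interface ReplicaCounter {
        int count() throws Exception;
    }

    /**
     * Returns true as soon as counter reports at least `expected` replicas,
     * or false if `timeoutMillis` passes first.
     */
    static boolean waitForReplicas(ReplicaCounter counter, int expected,
                                   long timeoutMillis, long intervalMillis)
            throws Exception {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (counter.count() >= expected) {
                return true;   // enough replicas have been reported
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;  // gave up waiting
            }
            Thread.sleep(intervalMillis);
        }
    }
}
```

In the test from this thread, the counter could be something like
() -> fs.getFileBlockLocations(status, 0, status.getLen())[0].getHosts().length
called right after out.close(), with expected set to 3.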

-Todd

On Thu, Jun 7, 2012 at 3:36 PM, N Keywal <nkeywal@gmail.com> wrote:
> Hello,
>
> I have a hdfs behavior that I cannot explain: when creating a file in
> hdfs, closing it, and counting the number of servers containing a
> replicated block, I don't have immediately the right answer (3): I may
> get 2 sometimes. If I wait before counting, then it's always 3. Is
> this to be expected? Here is a piece of code showing the point.
>
> Thanks,
>
> N.
>
> --
> package org.apache.hadoop.test;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
> import org.junit.Test;
>
> import static junit.framework.Assert.assertEquals;
>
> public class TestHDFS {
>
>  @Test
>  public void testFSUTils() throws Exception {
>    final Configuration conf = new Configuration();
>    final String hosts[] = {"host1", "host2", "host3", "host4"};
>    final byte[] data = new byte[1]; // Will fit in one block
>    final Path testFile = new Path("/test1.txt");
>
>    MiniDFSCluster dfsCluster = new MiniDFSCluster(0, conf,
>        hosts.length, true, true, true, null, null, hosts, null);
>    try {
>      FileSystem fs = dfsCluster.getFileSystem();
>      dfsCluster.waitClusterUp();
>
>      for (int i = 0; i < 200; ++i) {
>        FSDataOutputStream out = fs.create(testFile);
>        out.write(data, 0, 1);
>        out.close();
>
>        // Put a sleep here to make me work
>        //Thread.sleep(1000);
>
>        FileStatus status = fs.getFileStatus(testFile);
>        int nbHosts = fs.getFileBlockLocations(status, 0,
>            status.getLen())[0].getHosts().length;
>        assertEquals(1, fs.getFileBlockLocations(status, 0,
>            status.getLen()).length);
>        assertEquals("Wrong number of hosts distributing blocks at iteration " + i,
>            3, nbHosts);
>
>        fs.delete(testFile, true);
>      }
>
>    } finally {
>      dfsCluster.shutdown();
>    }
>  }
> }



-- 
Todd Lipcon
Software Engineer, Cloudera
