hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3507) Rename of a directory with many opened files blocks name-node for a long time. changeLease() to blame.
Date Fri, 06 Jun 2008 02:26:46 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HADOOP-3507:
----------------------------------------

    Description: 
I am creating a directory containing 200,000 files and then renaming it.
The rename operation takes twice as long as the total time for creating all those files.
The worst thing is that the rename blocks the name-node for minutes. I tried it with a bigger
directory containing 1 million files; it blocks for 30 minutes.
The rename itself is fast; it is the changeLease() call made after the rename that takes all
the time.
As far as I can see from the code, changeLease() takes the tailMap() of the renamed directory's
path and scans the whole tail.
If the number of open files is large, as in my case, this takes forever because the tailMap
includes all the files in the subtree.

A simple way to reproduce this is to run
{code}
NNThroughputBenchmark -op open -files N
{code}
with a large N. This will first create N files in the directory "/NNThroughputBenchmark/create"
and then rename it to "/NNThroughputBenchmark/open".
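
To make the scan concrete, here is a minimal sketch of the pattern described above, assuming leases for open files are kept in a map sorted by file path as the description states. This is not the actual name-node/LeaseManager code; the class, method and field names other than changeLease() and tailMap() are hypothetical.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative sketch only -- not the real name-node code.
public class LeasePathSketch {

  // Hypothetical structure: path of an open file -> lease holder, sorted by path.
  private final SortedMap<String, String> leasesByPath = new TreeMap<String, String>();

  public void open(String path, String holder) {
    leasesByPath.put(path, holder);
  }

  // After "src" is renamed to "dst", rewrite the lease paths under it.
  // tailMap(src) is a view of every entry whose path sorts at or after src;
  // with many open files under src that is the entire subtree, and walking
  // it is the long scan the description complains about.
  public void changeLease(String src, String dst) {
    List<Map.Entry<String, String>> tail =
        new ArrayList<Map.Entry<String, String>>(leasesByPath.tailMap(src).entrySet());
    for (Map.Entry<String, String> e : tail) {
      String path = e.getKey();
      String holder = e.getValue();
      if (path.startsWith(src)) {
        // Re-key the lease under the new directory name.
        leasesByPath.remove(path);
        leasesByPath.put(dst + path.substring(src.length()), holder);
      }
    }
  }
}
{code}
With N open files under the renamed directory, the loop above visits all of them, which matches the behaviour reported for the 200,000-file and 1-million-file cases.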

  was:
I am creating a directory containing 200,000 files and then renaming it.
The rename operation takes twice as long as the total time for creating all those files.
The worst thing is that the rename blocks the name-node for minutes. I tried it with a bigger
directory containing 1 million files; it blocks for 30 minutes.
The rename itself is fast; it is the changeLease() that takes place after the rename that takes
all the time.
As far as I can see from the code, changeLease() takes the tailMap() of the renamed directory's
path and scans the whole tail.
If the number of open files is large, as in my case, this takes forever.

A simple way to reproduce this is to run
{code}
NNThroughputBenchmark -op open -files N
{code}
with a large N. This will first create N files in the directory "/NNThroughputBenchmark/create"
and then rename it to "/NNThroughputBenchmark/open".


> Rename of a directory with many opened files blocks name-node for a long time. changeLease() to blame.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3507
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3507
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Konstantin Shvachko
>             Fix For: 0.18.0
>
>
> I am creating a directory containing 200,000 files and then renaming it.
> The rename operation takes twice as long as the total time for creating all those files.
> The worst thing is that the rename blocks the name-node for minutes. I tried it with a bigger directory containing 1 million files; it blocks for 30 minutes.
> The rename itself is fast; it is the changeLease() call made after the rename that takes all the time.
> As far as I can see from the code, changeLease() takes the tailMap() of the renamed directory's path and scans the whole tail.
> If the number of open files is large, as in my case, this takes forever because the tailMap includes all the files in the subtree.
> A simple way to reproduce this is to run
> {code}
> NNThroughputBenchmark -op open -files N
> {code}
> with a large N. This will first create N files in the directory "/NNThroughputBenchmark/create" and then rename it to "/NNThroughputBenchmark/open".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

