hadoop-common-user mailing list archives

From Pete Wyckoff <pwyck...@facebook.com>
Subject Re: Status FUSE-Support of HDFS
Date Mon, 03 Nov 2008 16:48:38 GMT

Reads through fuse-dfs are 20-30% slower.
Writes are 33% slower before https://issues.apache.org/jira/browse/HADOOP-3805. You need
a kernel > 2.6.26-rc* to test 3805, which I don't have :(

These #s are with hadoop 0.17 and the 0.18.2 version of fuse-dfs.

-- pete
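
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of one way to time sequential writes and reads under a directory, such as a fuse-dfs mount point versus a local disk. It is not the method used for the numbers above. The function names, the scratch file name, and the default sizes are all made up for illustration, and read figures may be inflated by the OS page cache.

```python
import os
import time


def measure_throughput(directory, size_mb=16, block_size=1 << 20):
    """Write, then read back, a scratch file in `directory`.

    Returns (write_mb_per_s, read_mb_per_s). The scratch file name is
    hypothetical; the file is created fresh and removed afterwards.
    """
    block = b"\0" * block_size
    blocks = (size_mb * (1 << 20)) // block_size
    path = os.path.join(directory, "throughput-%d.tmp" % os.getpid())
    try:
        # Time sequential writes, including the close at the end of the block.
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(blocks):
                f.write(block)
        write_secs = time.perf_counter() - start

        # Time a sequential read of the same file.
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(block_size):
                pass
        read_secs = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)

    total_mb = blocks * block_size / (1 << 20)
    return total_mb / write_secs, total_mb / read_secs


def slowdown_percent(baseline, measured):
    """How much slower `measured` is than `baseline`, as a percentage."""
    return (baseline - measured) / baseline * 100.0
```

Calling `measure_throughput` once on a local directory and once on the mount point (for example a hypothetical `/mnt/hdfs`), then passing the two read or write figures to `slowdown_percent`, gives a rough percentage in the same style as the numbers quoted above.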


On 11/2/08 6:23 AM, "Robert Krüger" <krueger@signal7.de> wrote:



Hi Pete,

thanks for the info. That helps a lot. We will probably test it for our
use cases then. Did you benchmark throughput when reading and writing files
through fuse-dfs and compare it to command line tool or API access? Is
there a notable difference?

Thanks again,

Robert



Pete Wyckoff wrote:
> It has come a long way since 0.18 and facebook keeps our (0.17) dfs mounted via fuse
> and uses that for some operations.
>
> There have recently been some problems with fuse-dfs when used in a multithreaded environment,
> but those have been fixed in 0.18.2 and 0.19. (do not use 0.18 or 0.18.1)
>
> The current (known) issues are:
>   1. Wrong semantics when copying over an existing file - namely it does a delete and
>      then re-creates the file, so ownership/permissions may end up wrong. There is a patch for
>      this.
>   2. When directories have tens of thousands of files, performance can be very poor.
>   3. POSIX truncate is supported only for truncating a file to 0 size, since HDFS doesn't
>      support truncate.
>   4. Appends are not supported - this is a libhdfs problem and there is a patch for it.
>
> It is still a pre-1.0 product for sure, but it has been pretty stable for us.
>
>
> -- pete
>
>
> On 10/31/08 9:08 AM, "Robert Krüger" <krueger@signal7.de> wrote:
>
>
>
> Hi,
>
> could anyone tell me what the current status of FUSE support for HDFS
> is? Is this something that can be expected to be usable in a few
> weeks/months in a production environment? We have been really
> happy/successful with HDFS in our production system. However, some
> software we use in our application simply requires an OS-level file
> system, which currently requires us to do a lot of copying between HDFS
> and a regular file system for processes that require that software.
> FUSE support would eliminate that one disadvantage we have with
> HDFS. We wouldn't even require the performance of that to be outstanding
> because just by eliminating the copy step, we would greatly increase
> the throughput of those processes.
>
> Thanks for sharing any thoughts on this.
>
> Regards,
>
> Robert
>
>
>



