hadoop-mapreduce-user mailing list archives

From Vinod Kumar Vavilapalli <vino...@hortonworks.com>
Subject Re: Is FileSystem thread-safe?
Date Fri, 17 May 2013 17:34:57 GMT

I see. The lots-of-part-files pattern is what most of us end up using.
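
A minimal sketch of that pattern, assuming the local filesystem (java.nio) as a stand-in for HDFS so that it runs without a cluster. The class name PartFiles, the part-%05d naming, and the merge step are illustrative assumptions, not Hadoop APIs. Each task writes only its own part file, so no file ever has more than one writer, and the parts can be read directly or concatenated later:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PartFiles {
    // Each task writes only its own part file, so no two writers share a file.
    // (Hypothetical helper; the local filesystem stands in for HDFS.)
    static Path writePart(Path dir, int taskId, String payload) throws IOException {
        Path part = dir.resolve(String.format("part-%05d", taskId));
        Files.write(part, payload.getBytes(StandardCharsets.UTF_8));
        return part;
    }

    // Optional later step: concatenate the parts, in name order, into one file.
    static void merge(Path dir, Path target) throws IOException {
        List<Path> parts;
        try (Stream<Path> listing = Files.list(dir)) {
            parts = listing
                    .filter(p -> p.getFileName().toString().startsWith("part-"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path p : parts) {
                Files.copy(p, out);
            }
        }
    }

    // Runs `tasks` concurrent writers, merges their parts, returns the merged text.
    static String run(int tasks) throws Exception {
        Path dir = Files.createTempDirectory("output");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Path>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int id = i;
                futures.add(pool.submit(() -> writePart(dir, id, "task " + id + "\n")));
            }
            for (Future<Path> f : futures) {
                f.get(); // wait for every part to be written before merging
            }
        } finally {
            pool.shutdown();
        }
        Path merged = Files.createTempFile("merged", ".txt");
        merge(dir, merged);
        return new String(Files.readAllBytes(merged), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.print(run(3));
    }
}
```

Against a real cluster the same shape would presumably go through the Hadoop FileSystem API rather than java.nio; that substitution is not shown here.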

Thanks,
+Vinod Kumar Vavilapalli

On May 17, 2013, at 10:16 AM, John Lilley wrote:

> Vinod,
> Thanks, I was mostly asking in the context of attempting to unify the output of multiple tasks. I’ve seen that in most cases, users opt to output a folder full of file parts into HDFS and then read them directly or unify them later.
> John
>  
>  
> From: Vinod Kumar Vavilapalli [mailto:vinodkv@hortonworks.com] 
> Sent: Friday, May 17, 2013 11:14 AM
> To: user@hadoop.apache.org
> Subject: Re: Is FileSystem thread-safe?
>  
>  
> As of today, there is no atomic append, so no, what you say isn't possible. FWIU, it is one appender at a time - achieved through a lease per file, and multiple concurrent leases aren't allowed for any given file.
>  
> Thanks,
> +Vinod Kumar Vavilapalli
>  
> On May 17, 2013, at 6:40 AM, John Lilley wrote:
> 
> 
> Thanks! Does this also imply that multiple clients may open the same HDFS file for append simultaneously, and expect append requests to be interleaved?
> john
>  
> From: Arpit Agarwal [mailto:aagarwal@hortonworks.com] 
> Sent: Monday, April 01, 2013 4:18 PM
> To: user@hadoop.apache.org
> Subject: Re: Is FileSystem thread-safe?
>  
> Hi John,
> 
> DistributedFileSystem is intended to be thread-safe, true to its name. 
> 
> Metadata operations are handled by the NameNode server, which synchronizes concurrent client requests via locks (you can look at the FSNamesystem class).
> 
> Some discussion on the thread-safety aspects of HDFS:
> http://storageconference.org/2010/Papers/MSST/Shvachko.pdf
> 
> -Arpit
> 
> 
> 
> On Sun, Mar 31, 2013 at 11:52 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> If you look at DistributedFileSystem source code, you would see that it calls the DFSClient field member for most of the actions.
> Requests to Namenode are then made through ClientProtocol.
>  
> An HDFS committer would be able to give you an affirmative answer.
>  
> 
> On Sun, Mar 31, 2013 at 11:27 AM, John Lilley <john.lilley@redpoint.net> wrote:
> From: Ted Yu [mailto:yuzhihong@gmail.com] 
> Subject: Re: Is FileSystem thread-safe?
> >>FileSystem is an abstract class; what concrete class are you using (DistributedFileSystem, etc.)?
> Good point. I am calling FileSystem.get(URI uri, Configuration conf) with a URI like “hdfs://server:port/…” on a remote server, so I assume it is creating a DistributedFileSystem. However, I am not finding any documentation discussing its thread-safety (or lack thereof); perhaps you can point me to it?
> Thanks, john
>  

