hadoop-general mailing list archives

From Aaron Kimball <akimbal...@gmail.com>
Subject Re: problem to write on HDFS
Date Mon, 14 Mar 2011 18:15:27 GMT

I think your requirements are outside the operating envelope for HDFS'
design. HDFS is not particularly well-suited for interactive operation --
it's designed for batch workloads like those performed by MapReduce.
Opening, writing, and closing 100,000 files/second is unlikely to work on
HDFS.

When users head to your site, why do they need to write files in HDFS? What
do those files represent?

If each servlet is just trying to log its operations to HDFS, then you
should use a system like Flume which will aggregate the log records together
into a smaller number of streams and then write those to a more reasonable
number of files which are kept open for a longer amount of time.
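
To make the aggregation idea concrete, here is a minimal sketch in plain Java.
It is not Flume and not the HDFS API: the class name AggregatingLogWriter, the
queue capacity, and the batch size are all hypothetical, and a local file
stands in for the HDFS output stream. Request threads only enqueue records; a
single background thread owns the one file that stays open.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: many producers, one writer thread, one long-lived open file. */
class AggregatingLogWriter implements AutoCloseable {
    private final BlockingQueue<String> queue;
    private final Thread writerThread;
    private volatile boolean running = true;
    private volatile IOException failure;

    AggregatingLogWriter(Path file, int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.writerThread = new Thread(() -> writeLoop(file), "log-aggregator");
        this.writerThread.start();
    }

    /** Called from request threads. Never opens a file; returns false if closed or full. */
    boolean submit(String record) {
        return running && queue.offer(record);
    }

    private void writeLoop(Path file) {
        // Assumption: a local file stands in for the HDFS stream in this sketch.
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            List<String> batch = new ArrayList<>();
            while (running || !queue.isEmpty()) {
                String first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue; // nothing arrived; re-check the shutdown flag
                }
                batch.add(first);
                queue.drainTo(batch, 999); // take up to 999 more records in one go
                for (String record : batch) {
                    out.write(record);
                    out.newLine();
                }
                out.flush(); // one flush per batch instead of one file per request
                batch.clear();
            }
        } catch (IOException e) {
            failure = e; // reported to the caller of close()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting records, drains what is queued, then closes the file. */
    @Override
    public void close() throws IOException {
        running = false;
        try {
            writerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

A servlet would create one such writer at startup, call submit(...) from each
request, and close it at shutdown, so the number of open files no longer grows
with the request rate. A real deployment would still want what Flume adds on
top of this: file rolling, durability across crashes, and delivery from many
web servers.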

If instead each servlet is saving some sort of per-user state that you
expect to retrieve in an online fashion, you should look at HBase or another
distributed (key, value) store (there are several options), which will
enable faster retrieval of this information.

Good luck,
- Aaron

On Mon, Mar 14, 2011 at 10:48 AM, Alessandro Binhara <binhara@gmail.com> wrote:

> Hello,
> I have a servlet on Tomcat that opens HDFS and writes a simple file with
> the content of the POST information.
> In the first test we had 14,000 requests per second.
> My servlet starts many threads to write to the filesystem.
> I got this message from Tomcat:
> Mar 11, 2011 6:00:20 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
> SEVERE: Socket accept failed
> java.net.SocketException: Too many open files
> Is HDFS slow to write a file?
> What is a better strategy for writing to HDFS?
> In the real application we will have 100,000 requests per second to save in
> HDFS.
> Thanks.
