hadoop-common-user mailing list archives

From "David B. Ritch" <david.ri...@gmail.com>
Subject HDFS and long-running processes
Date Thu, 02 Jul 2009 12:21:52 GMT
I have been told that it is not a good idea to keep HDFS files open for
a long time.  The reason sounded like a memory leak in the namenode:
over time, the resources consumed by an open file keep growing.

Is this still an issue with Hadoop 0.19.x and 0.20.x?  Was it ever an
issue?

I have an application that keeps a number of files open and performs
pseudo-random reads from them in response to externally generated
queries.  Should it close and re-open files that have been open longer
than a certain amount of time?  If so, how long is too long to keep a
file open?  And why?
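
For concreteness, here is a minimal sketch of the access pattern I mean
(the class name, the one-hour threshold, and the maybeReopen() helper are
just placeholders for illustration, not what we actually run):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RandomReader {
    private final FileSystem fs;
    private final Path path;
    private FSDataInputStream in;
    private long openedAt;

    // Hypothetical limit; the question is whether any such limit is needed.
    private static final long MAX_OPEN_MILLIS = 60L * 60L * 1000L;

    public RandomReader(Configuration conf, String file) throws IOException {
        this.fs = FileSystem.get(conf);
        this.path = new Path(file);
        this.in = fs.open(path);
        this.openedAt = System.currentTimeMillis();
    }

    // Serve one externally generated query with a positioned read.
    public int readAt(long position, byte[] buf) throws IOException {
        maybeReopen();
        // Positioned read; does not move the stream's current offset.
        return in.read(position, buf, 0, buf.length);
    }

    // Close and re-open the stream if it has been open "too long".
    private void maybeReopen() throws IOException {
        if (System.currentTimeMillis() - openedAt > MAX_OPEN_MILLIS) {
            in.close();
            in = fs.open(path);
            openedAt = System.currentTimeMillis();
        }
    }
}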

Thanks!

David
