hadoop-mapreduce-user mailing list archives

From Adeel Qureshi <adeelmahm...@gmail.com>
Subject hdfs write files in streaming fashion
Date Mon, 19 Aug 2013 19:38:50 GMT
I have a servlet that receives files in a streaming fashion. Our original
design was to receive each file in the /tmp directory and then move it to
HDFS via an external process, but that seems to add an additional (maybe
unnecessary) step. My question is: if I receive a file in a servlet as a
POST request (the file is in the body of the request) and I open a
BufferedWriter on HDFS:

1. Are the files really written in a streaming fashion, such that nothing is
held in memory? These are huge files, and holding one in memory and then
sending the whole file to HDFS at the end won't make sense.

2. If for some reason we decide halfway through the file to reject it and not
move it to HDFS, do we have to remove the file ourselves, since it was being
streamed? Or will the file system clean it up automatically because the
write stream was never closed or an exception was thrown?
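
To make the two questions concrete, here is a minimal sketch of the pattern
being asked about: copy the request body through a small fixed-size buffer,
and delete the partial file explicitly if the upload is rejected or fails.
It is not tested against a cluster. It uses the local file system as a
stand-in so that it runs with the JDK alone; with HDFS, the output stream
would presumably come from FileSystem.create(Path) and the cleanup would be
FileSystem.delete(Path, boolean). The class name, the method name, and the
size-based rejection rule are hypothetical and only for illustration. It
also uses a byte stream rather than a BufferedWriter, since a Writer is
meant for character data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamingUpload {

    // Copies `in` to `target` chunk by chunk. Returns false (and removes the
    // partial file) if the body grows past maxBytes, a hypothetical rejection
    // rule standing in for whatever check the servlet really applies.
    static boolean copyOrReject(InputStream in, Path target, long maxBytes)
            throws IOException {
        byte[] buffer = new byte[8192]; // the only file data held in memory
        long total = 0;
        boolean accepted = true;

        // Assumption: with HDFS this stream would come from FileSystem.create(Path).
        try (OutputStream out = Files.newOutputStream(target)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
                if (total > maxBytes) {
                    accepted = false; // reject halfway through the file
                    break;
                }
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            // The stream is already closed here; remove the partial file.
            Files.deleteIfExists(target);
            throw e;
        }

        if (!accepted) {
            // Assumption: with HDFS this would be FileSystem.delete(Path, false).
            Files.deleteIfExists(target);
        }
        return accepted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upload-demo");
        InputStream body = new ByteArrayInputStream(new byte[20000]);
        boolean accepted = copyOrReject(body, dir.resolve("upload.bin"), 50000);
        System.out.println("accepted: " + accepted);
    }
}
```

In a servlet, the input stream would be request.getInputStream(). The
explicit delete reflects the assumption that a partially written file is not
removed automatically when the stream is abandoned or an exception is thrown.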

