hadoop-mapreduce-user mailing list archives

From psdc1978 <psdc1...@gmail.com>
Subject Re: HDFS and MapReduce and /tmp directory
Date Mon, 05 Apr 2010 12:40:15 GMT
Yes, I know that, but it doesn't answer my questions.

On Mon, Apr 5, 2010 at 12:31 PM, Rekha Joshi <rekhajos@yahoo-inc.com> wrote:

> As a quick pointer: the /tmp folder (check hadoop.tmp.dir) is used only
> temporarily by the MR process, and its contents are ideally cleaned up
> when the job finishes, on cleanup/abort.
> MR is a process that loads and stores data in HDFS. Most of your queries
> relate to knowing your default HDFS location. You can find that with "hadoop
> dfs -ls". The path preceding .Trash is your default HDFS location.
>
> HTH,
> /
>
> On 4/5/10 4:24 PM, "psdc1978" <psdc1978@gmail.com> wrote:
>
> Hi,
>
> When I run a MapReduce example, I've noticed that some temporary
> directories are built in the /tmp directory.
>
> In my case, the following directory tree was created in /tmp/hadoop
> during the execution of the wordcount example:
>
>
> job_201004041803_0002/
> |-- attempt_201004041803_0002_m_000000_0_0_m_0
> |   |-- job.xml
> |   |-- output
> |   |   |-- file.out
> |   |   `-- file.out.index
> |   |-- pid
> |   `-- split.dta
>
> 1 - The map task attempt contains a file.out and a split.dta file. Is
> split.dta the output produced by the map, which will be fetched by the
> reducer?
>
> 2 - What are file.out and file.out.index?
>
> 3 - Is the data written by MR related in any way to HDFS?
>
> 4 - I'm having trouble distinguishing between the files that are written
> in the /tmp directory during the execution of my example and the place
> where files are written by the command
> "bin/hadoop dfs -copyFromLocal".
>
> a) When I execute the "bin/hadoop dfs -copyFromLocal <from> <to>" command,
> where is the destination folder?
>
> b) Is it in memory, or is it physically on my HD?
>
> c) If the files are written to the HD, in which directory are they?
>
> d) What is the difference between the data written with the
> -copyFromLocal command and the data written in the /tmp directory?
>
>
> 5 - The output of a reducer example comes in the form part_0000, which is
> written in gutenberg-output. Where is this file? Is it on my HD?
>
>
> Thank you,
>
>


-- 
Pedro
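
The "check hadoop.tmp.dir" hint in the quoted reply can be sketched roughly as follows. This is only an illustration, not part of the original exchange: the sample configuration text, the helper name lookup_property, and the /tmp/hadoop value (taken from the directory mentioned above) are assumptions, and a real installation would read its own site configuration file instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a Hadoop site configuration; a real cluster has its
# own file, and the value below is an assumption based on the thread.
SAMPLE_SITE_CONFIG = """
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp/hadoop</value>
  </property>
</configuration>
"""

# Fallback used when the property is not overridden
# (the stock default in Hadoop 0.20-era releases).
DEFAULT_TMP_DIR = "/tmp/hadoop-${user.name}"


def lookup_property(xml_text, name, default=None):
    """Return the value of a named <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return default


tmp_dir = lookup_property(SAMPLE_SITE_CONFIG, "hadoop.tmp.dir", DEFAULT_TMP_DIR)
print("MapReduce scratch files are expected under:", tmp_dir)
```

The same lookup could be pointed at other property names, such as dfs.data.dir, to see where a given setup keeps HDFS block data on the local disk.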
