hadoop-pig-dev mailing list archives

From Craig Macdonald <cra...@dcs.gla.ac.uk>
Subject Re: A question regarding the execution engine
Date Thu, 06 Mar 2008 14:51:04 GMT
Hi Pi,

I have a JIRA on this issue: PIG-102
It needs feedback from the community on whether it should be a
configuration property or a high-level command.

Craig

Benjamin Reed wrote:
> This uses the FileLocalizer: all file references are sent through it. If we
> are running in MAPREDUCE mode and a file reference starts with file:, we copy
> it to a temp file in HDFS before starting the job and use that temp file as
> the input or output of the MapReduce job.
>
> ben
>
> On Thursday 06 March 2008 04:07:41 pi song wrote:
>   
>> Dear pig-dev mailing list,
>>
>> I just want to understand this bit quickly. Below is the code from
>> TestMapReduce.java. As you can see, the temp file is created on the local
>> machine, but I don't understand how Hadoop MapReduce picks up the file from
>> the local file system rather than from HDFS.
>>
>>         PigServer pig = new PigServer(MAPREDUCE);
>>         File tmpFile = File.createTempFile("test", ".txt");
>>         PrintStream ps = new PrintStream(new FileOutputStream(tmpFile));
>>         for(int i = 0; i < 10; i++) {
>>             ps.println(i+"\t"+i);
>>         }
>>         ps.close();
>>         String query = "foreach (load 'file:"+tmpFile+"') generate $0,$1;";
>>         System.out.println(query);
>>         pig.registerQuery("asdf_id = " + query);
>>         try {
>>             pig.deleteFile("frog");
>>         } catch(Exception e) {}
>>
>> Cheers,
>> Pi
>>     
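Ben's description above can be sketched as follows. This is an illustrative
sketch, not Pig's actual FileLocalizer code: the class, enum, and method names
are invented for this example, and only the scheme check on the file reference
mirrors the behavior he describes. The actual copy step in Pig would use the
Hadoop FileSystem API (e.g. copying the local file up to a temp path in HDFS
before submitting the job).

```java
import java.net.URI;

// Hypothetical sketch of the decision Pig's FileLocalizer makes when the
// execution engine is MAPREDUCE: a reference with a "file:" scheme must be
// staged into HDFS before the job runs, while HDFS paths pass through as-is.
public class FileLocalizerSketch {
    enum ExecType { LOCAL, MAPREDUCE }

    // Returns true when the reference must first be copied to a temp file
    // in HDFS (the case Ben describes); false when it can be used directly.
    static boolean needsHdfsCopy(ExecType mode, String fileRef) {
        if (mode != ExecType.MAPREDUCE) {
            return false; // local execution reads the local path directly
        }
        String scheme = URI.create(fileRef).getScheme();
        return "file".equals(scheme); // "file:" references get copied up
    }

    public static void main(String[] args) {
        // The test in Pi's question builds exactly this kind of reference:
        // "load 'file:" + tmpFile + "'" under ExecType MAPREDUCE.
        System.out.println(needsHdfsCopy(ExecType.MAPREDUCE, "file:/tmp/test123.txt"));
        System.out.println(needsHdfsCopy(ExecType.MAPREDUCE, "hdfs://namenode/data/in.txt"));
        System.out.println(needsHdfsCopy(ExecType.LOCAL, "file:/tmp/test123.txt"));
    }
}
```

So in the TestMapReduce.java snippet, the `file:` prefix on the load path is
what triggers the copy into HDFS, which is why the MapReduce job never reads
the local file system directly.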

