hadoop-pig-dev mailing list archives

From Jeff Zhang <zjf...@gmail.com>
Subject Re: load files
Date Mon, 28 Jun 2010 14:06:14 GMT
part-xxxxx is the naming used by the old Hadoop mapred API, while part-m-xxxxx
and part-r-xxxxx are used by the new Hadoop MapReduce API.
You can use Hadoop's globStatus("part-*") to handle both cases.
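
To illustrate why a single "part-*" pattern covers both naming schemes, here is a minimal sketch. It uses Python's fnmatch module as a stand-in for Hadoop's glob matching (it does not call Hadoop), and the file names and the select_parts helper are made up for the example:

```python
from fnmatch import fnmatchcase

# Hypothetical contents of an output directory (names invented for illustration).
old_api_files = ["part-00000", "part-00001"]      # old mapred API style
new_api_files = ["part-m-00000", "part-r-00000"]  # new API style (map / reduce)
other_files = ["_logs"]                           # non-data entry, should be skipped


def select_parts(names, pattern="part-*"):
    """Return the names matching the glob pattern, preserving input order."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    for name in select_parts(old_api_files + new_api_files + other_files):
        print(name)
```

Both the old-style and new-style names start with "part-", so the one pattern selects all of them and skips anything else in the directory.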



2010/6/28 Gang Luo <lgpublic@yahoo.com.cn>:
> Thanks, Jeff.
> In Pig, the file names look like this: part-m-xxxxx (for map results) or part-r-xxxxx (for
> reduce results), which differ from the Hadoop style (part-xxxxx). So, can we control
> the name of each generated file? How?
>
> Thanks,
> -Gang
>
>
>
> ----- Original Message ----
> From: Jeff Zhang <zjffdu@gmail.com>
> To: pig-dev@hadoop.apache.org
> Sent: 2010/6/27 (Sun) 9:22:30 PM
> Subject: Re: load files
>
> Hi Gang,
>
> The path specified in load can be either a file or a directory; you
> can also use glob patterns via Hadoop's globStatus.  The path specified
> in store is a directory.
>
>
>
> On Mon, Jun 28, 2010 at 4:44 AM, Gang Luo <lgpublic@yahoo.com.cn> wrote:
>> Hi all,
>> When we specify the input path for a load operator, is it a file or a directory?
>> Similarly, when we use store-load to connect two MR operators, is the path specified in the
>> store and load a directory?
>>
>> Thanks,
>> -Gang
>>
>>
>>
>>
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
>
>
>
>



-- 
Best Regards

Jeff Zhang
