hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (PIG-860) Load split by 'file' - not documented in pig latin reference manual
Date Mon, 22 Jun 2009 18:58:07 GMT

    [ https://issues.apache.org/jira/browse/PIG-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722760#action_12722760 ]

Thejas M Nair edited comment on PIG-860 at 6/22/09 11:56 AM:
-------------------------------------------------------------

Use case, from a discussion on the user mailing list:

------ Forwarded Message
From: Pradeep Kamath <pradeepk@yah...c.com>
Reply-To: "pig-user@hadoop.apache.org" <pig-user@hadoop.apache.org>
Date: Mon, 22 Jun 2009 11:04:04 -0700
To: <pig-user@hadoop.apache.org>
Conversation: How to make the pig unsplittable ?
Subject: RE: How to make the pig unsplittable ?

You can load using the following syntax:
A = load 'inputfile' split by 'file';

The "split by 'file'" clause ensures that issplittable is set to false, so
the input is not split.

-Pradeep

-----Original Message-----
From: zhang jianfeng [mailto:zjff..ail.com] 
Sent: Sunday, June 21, 2009 10:48 PM
To: pig-user@hadoop.apache.org
Subject: How to make the pig unsplittable ?

Hi all,

In my input file format, the first line of the file defines each field,
and the remaining lines are records. Because of this, I have not found a
good way to use a custom slicer.

So I'd like Pig not to split my file, but I have not found an easy way.
For now I have to change the code in POLoad and LOLoad to set the
variable isSplitable to false.

Is there an easier way to make the input unsplittable, such as a
configuration setting?


Thank you for any help.


Jeff Zhang

------ End of Forwarded Message
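
To illustrate the use case above, here is a minimal Python sketch (not Pig code) of why a file whose first line defines the fields is awkward to process in independent splits. The function names, the sample data, and the line-based split model are all assumptions made for illustration; this is not how Pig or Hadoop actually computes splits.

```python
# Illustrative only: models input splits as consecutive chunks of lines.
# Real split boundaries are computed differently; only the header problem
# is being demonstrated here.

def make_splits(lines, lines_per_split):
    """Cut the file into consecutive chunks, as a splittable load might."""
    return [lines[i:i + lines_per_split]
            for i in range(0, len(lines), lines_per_split)]

def parse_chunk(chunk):
    """Treat the first line of the chunk as the field definition."""
    header = chunk[0].split(",")
    return [dict(zip(header, row.split(","))) for row in chunk[1:]]

# Hypothetical input: one header line followed by records.
data = ["name,age", "alice,30", "bob,25", "carol,41"]

# Split input: the second chunk has no header, so "bob,25" is misread as one.
split_records = [parse_chunk(c) for c in make_splits(data, 2)]

# Unsplit input (the effect described for split by 'file'): one chunk,
# one header, every record parsed against the right field names.
whole_records = parse_chunk(data)
```

In this sketch only the chunk that contains the first line sees the real field definition, which is why keeping the whole file in a single split is the simpler fix for this format.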

> Load split by 'file' - not documented in pig latin reference manual
> -------------------------------------------------------------------
>
>                 Key: PIG-860
>                 URL: https://issues.apache.org/jira/browse/PIG-860
>             Project: Pig
>          Issue Type: Task
>          Components: documentation
>            Reporter: Thejas M Nair
>            Priority: Minor
>
> "split by 'file'" is not documented in the Pig Latin Reference Manual (http://hadoop.apache.org/pig/docs/r0.2.0/piglatin.html).
> The option is described here: http://wiki.apache.org/pig/PigStreamingFunctionalSpec (section 4.3).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

