Subject: svn commit: r1422348 - in /pig/trunk: CHANGES.txt src/docs/src/documentation/content/xdocs/func.xml
Date: Sat, 15 Dec 2012 20:29:09 -0000
To: commits@pig.apache.org
From: cheolsoo@apache.org

Author: cheolsoo
Date: Sat Dec 15 20:29:08 2012
New Revision: 1422348

URL: http://svn.apache.org/viewvc?rev=1422348&view=rev
Log:
PIG-3085: Errors and lacks in document "Built In Functions" (miyakawataku via cheolsoo)

Modified:
    pig/trunk/CHANGES.txt
    pig/trunk/src/docs/src/documentation/content/xdocs/func.xml

Modified: pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/trunk/CHANGES.txt?rev=1422348&r1=1422347&r2=1422348&view=diff
==============================================================================
--- pig/trunk/CHANGES.txt (original)
+++ pig/trunk/CHANGES.txt Sat Dec 15 20:29:08 2012
@@ -64,6 +64,8 @@ PIG-3013: BinInterSedes improve chararra
 
 BUG FIXES
 
+PIG-3085: Errors and lacks in document "Built In Functions" (miyakawataku via cheolsoo)
+
 PIG-3084: Improve exceptions messages in POPackage (julien)
 
 PIG-3072: Pig job reporting negative progress (knoguchi via rohini)

Modified: pig/trunk/src/docs/src/documentation/content/xdocs/func.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/func.xml?rev=1422348&r1=1422347&r2=1422348&view=diff
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/func.xml (original)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/func.xml Sat Dec 15 20:29:08 2012
@@ -410,7 +410,7 @@ DUMP X;
Example -

In this example COUNT_STAR is used the count the tuples in a bag.

+

In this example COUNT_STAR is used to count the tuples in a bag.

X = FOREACH B GENERATE COUNT_STAR(A);
@@ -1339,7 +1339,7 @@ dump X;

Use JsonStorage to store JSON data.

-

Note that there is no concept of delimit in JsonLoader or JsonStorer. The data is encoded in standard JSON format. JsonLoader optionally takes a schema as the construct argument.

+

Note that there is no concept of delimit in JsonLoader or JsonStorage. The data is encoded in standard JSON format. JsonLoader optionally takes a schema as the construct argument.

@@ -1453,7 +1453,7 @@ STORE X INTO 'output' USING PigDump();

Load/Store Statements

Load statements – PigStorage expects data to be formatted using field delimiters, either the tab character ('\t') or other specified character.

-

Store statements – PigStorage outputs data using field deliminters, either the tab character ('\t') or other specified character, and the line feed record delimiter ('\n').

+

Store statements – PigStorage outputs data using field delimiters, either the tab character ('\t') or other specified character, and the line feed record delimiter ('\n').

Field/Record Delimiters

Field Delimiters – For load and store statements the default field delimiter is the tab character ('\t'). You can use other characters as field delimiters, but separators such as ^A or Ctrl-A should be represented in Unicode (\u0001) using UTF-16 encoding (see Wikipedia ASCII, Unicode, and UTF-16).

@@ -1470,7 +1470,7 @@ STORE X INTO 'output' USING PigDump();

If the noschema option is NOT specified, and a schema is found, it gets loaded when loading data.

-

Note that regardless of whether or not you store the schema, you always need to specify the correct delimiter to read your data. If you store reading delimiter "#" and then load using the default delimiter, your data will not be parsed correctly.

+

Note that regardless of whether or not you store the schema, you always need to specify the correct delimiter to read your data. If you store using delimiter "#" and then load using the default delimiter, your data will not be parsed correctly.

Record Provenance

If tagPath or tagFile option is specified, PigStorage will add a pseudo-column INPUT_FILE_PATH or INPUT_FILE_NAME respectively to the beginning of the record. As the name suggests, it is the input file path/name containing this particular record. Please note tagsource is deprecated.
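The delimiter note in this hunk (store with "#", then load with the default tab) can be illustrated with a small Java sketch. It is a simplified stand-in for field splitting, not PigStorage's actual code; the class name `DelimiterMismatchDemo`, the `parse` helper, and the sample record are all hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only: models field splitting with String.split,
// which is an assumption and not how PigStorage is implemented.
class DelimiterMismatchDemo {
    // Cuts one record into fields on a single literal delimiter character.
    static String[] parse(String record, char delimiter) {
        // Pattern.quote keeps characters such as '*' from being read as regex syntax.
        return record.split(Pattern.quote(String.valueOf(delimiter)), -1);
    }

    public static void main(String[] args) {
        // A record as it would look after being stored with "#" as the field delimiter.
        String stored = String.join("#", "alice", "20", "3.9");

        // Reading with the same delimiter recovers three fields.
        System.out.println(parse(stored, '#').length);

        // Reading with the default tab finds no delimiter, so the whole
        // record comes back as a single field.
        System.out.println(parse(stored, '\t').length);
    }
}
```

The second call shows the failure mode the note warns about: nothing errors out, the record is simply not split.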

@@ -1511,7 +1511,7 @@ A = LOAD 'student' USING PigStorage('\t'
A = LOAD 'student' AS (name: chararray, age:int, gpa: float);
-

In this example PigStorage stores the contents of X into files with fields that are delimited with an asterisk ( * ). The STORE function specifies that the files will be located in a directory named output and that the files will be named part-nnnnn (for example, part-00000).

+

In this example PigStorage stores the contents of X into files with fields that are delimited with an asterisk ( * ). The STORE statement specifies that the files will be located in a directory named output and that the files will be named part-nnnnn (for example, part-00000).

STORE X INTO 'output' USING PigStorage('*');
@@ -1708,8 +1708,8 @@ STORE A INTO 'hbase://users_table' USING
Math Functions -

For general information about these functions, see the Java API Specification, -Class Math. Note the following:

+

For general information about these functions, see the Java API Specification, +Class Math. Note the following:

  • @@ -2464,7 +2464,7 @@ Use the ROUND function to return the val

    x

    -

    CEIL(x)

    +

    ROUND(x)

    @@ -2746,8 +2746,8 @@ Use the TANH function to return the hype
    String Functions -

    For general information about these functions, see the Java API Specification, -Class String. Note the following:

    +

    For general information about these functions, see the Java API Specification, +Class String. Note the following:

    • @@ -2821,14 +2821,14 @@ Use the INDEXOF function to determine th
      LAST_INDEX_OF -

      Returns the index of the last occurrence of a character in a string, searching backward from a start index.

      +

      Returns the index of the last occurrence of a character in a string, searching backward from the end of the string.

      Syntax
      -

      LAST_INDEX_OF(expression)

      +

      LAST_INDEX_OF(string, 'character')

      @@ -2853,22 +2853,13 @@ Use the INDEXOF function to determine th

      The character being searched for, in quotes.

      - - -

      startIndex

      - - -

      The index from which to begin the backward search.

      -

      The string index begins with zero (0).

      - -
      Usage

      -Use the LAST_INDEX_OF function to determine the index of the last occurrence of a character in a string. The backward search for the character begins at the designated start index.
      +Use the LAST_INDEX_OF function to determine the index of the last occurrence of a character in a string. The backward search for the character begins at the end of the string.
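The corrected description (backward search from the end of the string, zero-based index, no startIndex argument) matches the behavior of Java's `String.lastIndexOf(String)`. Assuming LAST_INDEX_OF behaves the same way, which this commit implies but does not state, a minimal Java sketch looks like this; the class name `LastIndexOfDemo` and the sample string are hypothetical.

```java
// Hypothetical illustration: uses Java's String.lastIndexOf as a stand-in
// for the two-argument LAST_INDEX_OF(string, 'character') form.
class LastIndexOfDemo {
    // Returns the zero-based index of the last occurrence, or -1 if absent.
    static int lastIndexOf(String string, String character) {
        return string.lastIndexOf(character);
    }

    public static void main(String[] args) {
        String s = "hello world";

        // The backward search starts at the end of the string,
        // so the 'o' in "world" (index 7) is found first.
        System.out.println(lastIndexOf(s, "o"));

        // For comparison, a forward search finds the 'o' in "hello" (index 4).
        System.out.println(s.indexOf("o"));

        // A character that does not occur yields -1.
        System.out.println(lastIndexOf(s, "z"));
    }
}
```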

      @@ -3031,7 +3022,7 @@ REGEX_EXTRACT('192.168.1.5:8020', '(.*):
      -

      REGEX_EXTRACT (string, regex)

      +

      REGEX_EXTRACT_ALL (string, regex)

      @@ -3137,7 +3128,7 @@ Use the REPLACE function to replace exis

      For example, to change "open source software" to "open source wiki" use this statement:
      -REPLACE(string,'software','wiki');
      +REPLACE(string,'software','wiki')

      Note that the REPLACE function is internally implemented using
      @@ -3189,10 +3180,12 @@ by prefixing them with double backslashe
      -

      Limit

      +

      limit

      -

      The number of times the pattern (the compiled representation of the regular expression) is applied.

      +

      If the value is positive, the pattern (the compiled representation of the regular expression) is applied at most limit-1 times, therefore the value of the argument means the maximum length of the result tuple. The last element of the result tuple will contain all input after the last match.

      +

      If the value is negative, no limit is applied for the length of the result tuple.

      +

      If the value is zero, no limit is applied for the length of the result tuple too, and trailing empty strings (if any) will be removed.
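The three cases for the limit argument added here are the same rules that Java's `String.split(regex, limit)` follows. Assuming the documented function delegates to that method (this hunk does not say so), the cases can be sketched in Java as follows; the class name `SplitLimitDemo` and the sample input are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration: demonstrates the positive / negative / zero
// limit cases using Java's String.split(regex, limit).
class SplitLimitDemo {
    // Splits the input on commas with the given limit.
    static String[] split(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";

        // Positive limit 2: the pattern is applied at most once, so the result
        // has at most 2 elements and the last one holds the rest: "a" and "b,c,,".
        System.out.println(Arrays.toString(split(input, 2)));

        // Negative limit: no cap on the length, and the two trailing
        // empty strings are kept (5 elements).
        System.out.println(Arrays.toString(split(input, -1)));

        // Zero limit: no cap on the length, but trailing empty strings
        // are removed (3 elements: "a", "b", "c").
        System.out.println(Arrays.toString(split(input, 0)));
    }
}
```

The input ends with two delimiters on purpose, since the negative and zero cases differ only in how trailing empty strings are handled.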

      @@ -3392,7 +3385,7 @@ Use the UPPER function to convert all ch
      Datetime Functions

      -For general information about datetime type operations, see the Java API Specification,
      +For general information about datetime type operations, see the Java API Specification,
      Java Date class, and JODA DateTime class. And for the information of ISO date and time formats, please refer to Date and Time Formats.

      @@ -4580,7 +4573,7 @@ In this example the top 10 occurrences a
      A = LOAD 'data' as (first: chararray, second: chararray);
      B = GROUP A BY (first, second);
      -C = FOREACH B generate FLATTEN(group), COUNT(*) as count;
      +C = FOREACH B generate FLATTEN(group), COUNT(A) as count;
      D = GROUP C BY first; // again group by first
      topResults = FOREACH D {
          result = TOP(10, 2, C); // and retain top 10 occurrences of 'second' in first