From commits-return-3664-apmail-pig-commits-archive=pig.apache.org@pig.apache.org Thu Dec 16 22:50:10 2010
Subject: svn commit: r1050207 - in /pig/trunk: CHANGES.txt
src/docs/src/documentation/content/xdocs/basic.xml
src/docs/src/documentation/content/xdocs/test.xml
Date: Thu, 16 Dec 2010 22:49:43 -0000
To: commits@pig.apache.org
From: olga@apache.org
X-Mailer: svnmailer-1.0.8
Message-Id: <20101216224943.83D212388A64@eris.apache.org>
Author: olga
Date: Thu Dec 16 22:49:43 2010
New Revision: 1050207
URL: http://svn.apache.org/viewvc?rev=1050207&view=rev
Log:
PIG-1768: 09 docs: illustrate (changec via olgan)
Modified:
pig/trunk/CHANGES.txt
pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml
pig/trunk/src/docs/src/documentation/content/xdocs/test.xml
Modified: pig/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/pig/trunk/CHANGES.txt?rev=1050207&r1=1050206&r2=1050207&view=diff
==============================================================================
--- pig/trunk/CHANGES.txt (original)
+++ pig/trunk/CHANGES.txt Thu Dec 16 22:49:43 2010
@@ -24,6 +24,8 @@ INCOMPATIBLE CHANGES
IMPROVEMENTS
+PIG-1768: 09 docs: illustrate (changec via olgan)
+
PIG-1768: docs reorg (changec via olgan)
PIG-1712: ILLUSTRATE rework (yanz)
Modified: pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml
URL: http://svn.apache.org/viewvc/pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml?rev=1050207&r1=1050206&r2=1050207&view=diff
==============================================================================
--- pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml (original)
+++ pig/trunk/src/docs/src/documentation/content/xdocs/basic.xml Thu Dec 16 22:49:43 2010
@@ -284,12 +284,36 @@ grunt> C = FOREACH B GENERATE COUNT ($0)
grunt> DUMP C;
-
-
+
+
Data Types and More
+
+
+Identifiers
+
Identifiers include the names of relations (aliases), fields, variables, and so on.
+In Pig, identifiers start with a letter and can be followed by any number of letters, digits, or underscores.
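
As an illustration of the identifier rule stated above, here is a minimal sketch in Python (not part of the commit, and not Pig's own lexer). The function name is_valid_identifier is hypothetical, and the sketch assumes "letter" means an ASCII letter.

```python
import re

# Assumption: "letter" means ASCII A-Z or a-z. Pig's real lexer may differ.
# Rule from the text: a letter first, then any number of letters,
# digits, or underscores.
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if name follows the identifier rule described above."""
    return _IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["A", "A123", "abc_123_BeX_", "_A123", "abc_$", "A!B"]:
        print(candidate, is_valid_identifier(candidate))
```

Under this rule, names such as A123 pass, while names that begin with an underscore or contain characters like $ or ! do not.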
Displays a step-by-step execution of a sequence of statements.
@@ -372,7 +372,7 @@ Local Rearrange[tuple]{chararray}(false)
-script scriptfile
-
The script keyword followed by the name of a Pig script file (for example, myscript.pig).
+
The script keyword followed by the name of a Pig script (for example, myscript.pig).
The script file should not contain an ILLUSTRATE statement.
@@ -380,92 +380,128 @@ Local Rearrange[tuple]{chararray}(false)
Usage
-
Use the ILLUSTRATE operator to review how data is transformed through a sequence of Pig Latin statements.
- You can run ILLUSTRATE with a relation or a Pig script.
+
Use the ILLUSTRATE operator to review how data is transformed through a sequence of Pig Latin statements.
+ ILLUSTRATE allows you to test your programs on small datasets and get faster turnaround times.
ILLUSTRATE accesses the ExampleGenerator algorithm which can select an appropriate and concise set of example data automatically. It does a better job than random sampling would do; for example, random sampling suffers from the drawback that selective operations such as filters or joins can eliminate all the sampled data, giving you empty results which will not help with debugging.
+The algorithm works by retrieving a small sample of the input data and then propagating this data through the pipeline. However, some operators, such as JOIN or FILTER, can eliminate tuples from the data, and this could result in no data flowing through the pipeline. To address this issue, the algorithm automatically generates example data in near real-time. Thus, you might see data propagating through the pipeline that was not found in the original input data, but this data changes nothing and ensures that you will be able to examine the semantics of your Pig Latin statements.
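
The idea described above (sample the input, push it through the pipeline, and generate an example record when an operator eliminates everything) can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions, not the actual ExampleGenerator implementation: the function illustrate_filter, its parameters, and the make_example callback are all hypothetical, and only a single FILTER step is modeled.

```python
import random


def illustrate_filter(rows, predicate, make_example, sample_size=3, seed=0):
    """Hypothetical sketch: sample rows, apply a filter, and add a
    generated record if the filter would otherwise leave nothing."""
    rng = random.Random(seed)
    # Step 1: take a small sample of the input data.
    sample = rng.sample(rows, min(sample_size, len(rows)))
    # Step 2: propagate the sample through the (single) FILTER step.
    output = [row for row in sample if predicate(row)]
    generated = False
    # Step 3: if the filter eliminated every sampled row, generate one
    # record that satisfies the predicate so the step can still be examined.
    if not output:
        example = make_example()
        sample = sample + [example]
        output = [row for row in sample if predicate(row)]
        generated = True
    return sample, output, generated


if __name__ == "__main__":
    # Made-up input: nobody in the data is 65 or older.
    people = [{"name": "p%d" % i, "age": 20 + i} for i in range(10)]
    sample, output, generated = illustrate_filter(
        people,
        predicate=lambda row: row["age"] >= 65,
        make_example=lambda: {"name": "generated", "age": 65},
    )
    print(sample)
    print(output)
    print(generated)
```

Plain random sampling alone would show an empty result for the filter in this made-up input; the generated record is what makes the step observable. The original input list is left unmodified.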
-
With the ILLUSTRATE operator you can test your programs on small datasets and get faster turnaround times. The ExampleGenerator algorithm uses Pig's local mode (rather than Pig's mapreduce mode) which means that illustrative example data is generated in near real-time.
-
-
+
As shown in the examples below, you can use ILLUSTRATE to review a relation or an entire Pig script.
+
Example - Relation
This example demonstrates how to use ILLUSTRATE with a relation. Note that the LOAD statement must include a schema (the AS clause).