pig-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Pig Wiki] Update of "GettingStarted" by AndrzejBialecki
Date Sat, 03 Nov 2007 13:15:16 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.

The following page has been changed by AndrzejBialecki:

The comment on the change is:
Getting Started web page, with some additions about the "local" mode.

New page:
= Getting Started =

== Requirements ==

   1. '''Java 1.6.x''', preferably from Sun. Set JAVA_HOME to the root of your Java installation.
   2. '''Ant''' build tool: [http://ant.apache.org/].
   3. To run unit tests, you also need '''JUnit''': [http://junit.sourceforge.net/].
   4. To run pig programs, you need access to a '''Hadoop cluster''': [http://lucene.apache.org/hadoop/].
Pig can also run in "local" mode, which does not require setting up a Hadoop cluster
but has severely limited performance.
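
The requirements above can be checked before building. The following is only a minimal sketch: `have_cmd` and `check_requirements` are hypothetical helper names (not part of Pig), and it assumes `java` and `ant` are expected on the PATH.

```shell
#!/bin/sh
# Sketch: report which build prerequisites appear to be missing.
# have_cmd and check_requirements are hypothetical names, not part of Pig.

have_cmd() {
    # Succeeds if the named program can be found on PATH.
    command -v "$1" >/dev/null 2>&1
}

check_requirements() {
    missing=0
    # JAVA_HOME should point at the root of the Java installation.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        missing=1
    fi
    # Assumption: both tools are expected to be reachable through PATH.
    for tool in java ant; do
        if ! have_cmd "$tool"; then
            echo "$tool not found on PATH"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_requirements` prints one line per missing item and returns a non-zero status if anything is missing.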

== Building Pig ==

   1. Check out pig code from svn: `svn co http://svn.apache.org/repos/asf/incubator/pig/trunk`.
   2. Build the code by running `ant` in the top directory of the checkout. If the build
succeeds, `pig.jar` is created in that directory.
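
After running `ant`, the result can be confirmed with a small check. This is a sketch only: `built_ok` is a hypothetical helper name, and it assumes nothing beyond `pig.jar` appearing in the directory where the build was run.

```shell
#!/bin/sh
# Sketch: confirm that a build produced pig.jar in the given source directory.
# built_ok is a hypothetical helper name, not part of Pig or Ant.

built_ok() {
    # Default to the current directory when no argument is given.
    src_dir=${1:-.}
    if [ -f "$src_dir/pig.jar" ]; then
        echo "found $src_dir/pig.jar"
    else
        echo "pig.jar not found in $src_dir; check the ant output for errors" >&2
        return 1
    fi
}
```

For example, `built_ok .` can be run from the top directory right after the build.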

== Running Pig Programs ==

There are two ways to run pig. The first is the `pig.pl` script, found in the scripts
directory of your source tree. The script requires Perl to be installed on your machine.
Invoke it as `pig.pl -cp pig.jar:HADOOPSITEPATH`, where HADOOPSITEPATH is the directory
that contains the `hadoop-site.xml` file for your Hadoop cluster. Example:

`pig.pl -cp pig.jar:/hadoop/conf`

The second way is to invoke java directly:

`java -cp pig.jar:HADOOPSITEPATH org.apache.pig.Main`

This starts pig in the default map-reduce mode. You can also start pig in "local" mode:

`java -cp pig.jar org.apache.pig.Main -x local`

or

`java -jar pig.jar -x local`

However you invoke pig, the commands above start an interactive shell called grunt, where
you can run DFS and pig commands. Documentation for grunt will be posted on the wiki soon.
To run Pig in batch mode, append your pig script to any of the commands above. Example:

{{{pig.pl -cp pig.jar:/hadoop/conf myscript.pig}}}

or

{{{java -cp pig.jar:/hadoop/conf org.apache.pig.Main myscript.pig}}}
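
The java invocations described above differ only in the classpath, the `-x local` flag, and an optional script name. The following sketch prints the matching command line rather than running it. `pig_cmd` and its argument layout are hypothetical and not part of Pig. It assumes `pig.jar` is in the current directory.

```shell
#!/bin/sh
# Sketch: print the java command line for the launch modes described above.
# pig_cmd and its argument layout are hypothetical, not part of Pig.
#   pig_cmd local [script]              local mode
#   pig_cmd mapreduce CONFDIR [script]  map-reduce mode; CONFDIR holds hadoop-site.xml

pig_cmd() {
    mode=${1:-}
    if [ $# -gt 0 ]; then
        shift
    fi
    case "$mode" in
        local)
            cmd="java -cp pig.jar org.apache.pig.Main -x local"
            ;;
        mapreduce)
            if [ $# -lt 1 ]; then
                echo "mapreduce mode needs the directory containing hadoop-site.xml" >&2
                return 1
            fi
            cmd="java -cp pig.jar:$1 org.apache.pig.Main"
            shift
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
    # An optional trailing script name selects batch mode instead of the grunt shell.
    if [ $# -ge 1 ]; then
        cmd="$cmd $1"
    fi
    echo "$cmd"
}
```

For example, `pig_cmd mapreduce /hadoop/conf myscript.pig` prints the batch-mode command for a cluster whose configuration lives in `/hadoop/conf`, and `pig_cmd local` prints the command that starts grunt in local mode.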
