cassandra-commits mailing list archives

From brandonwilli...@apache.org
Subject [2/3] git commit: Update pig readme
Date Sat, 26 May 2012 15:50:17 GMT
Update pig readme


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2dc27a17
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2dc27a17
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2dc27a17

Branch: refs/heads/trunk
Commit: 2dc27a17567fa448aae335e74cc46ab94339eba4
Parents: db68e03
Author: Brandon Williams <brandonwilliams@apache.org>
Authored: Sat May 26 10:50:00 2012 -0500
Committer: Brandon Williams <brandonwilliams@apache.org>
Committed: Sat May 26 10:50:00 2012 -0500

----------------------------------------------------------------------
 examples/pig/README.txt |   19 +++++++++++++++++--
 1 files changed, 17 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2dc27a17/examples/pig/README.txt
----------------------------------------------------------------------
diff --git a/examples/pig/README.txt b/examples/pig/README.txt
index 3bdbf10..57b8f57 100644
--- a/examples/pig/README.txt
+++ b/examples/pig/README.txt
@@ -1,7 +1,8 @@
 A Pig storage class that reads all columns from a given ColumnFamily, or writes
 properly formatted results into a ColumnFamily.
 
-Setup:
+Getting Started
+===============
 
 First build and start a Cassandra server with the default
 configuration and set the PIG_HOME and JAVA_HOME environment
@@ -31,7 +32,6 @@ for input and output:
 * PIG_OUTPUT_RPC_PORT : the port thrift is listening on for writing
 * PIG_OUTPUT_PARTITIONER : cluster partitioner for writing
 
-
 Then you can run it like this:
 
 examples/pig$ bin/pig_cassandra -x local example-script.pig
@@ -70,3 +70,18 @@ Which will copy the ColumnFamily.  Note that the destination ColumnFamily must
 already exist for this to work.
 
 See the example in test/ to see how schema is inferred.
+
+Advanced Options
+================
+
+The following environment variables default to false but can be set to true to enable them:
+
+PIG_WIDEROW_INPUT:  this enables loading of rows with many columns without
+                    incurring memory pressure.  All columns will be in a bag and indexes are not
+                    supported.
+
+PIG_USE_SECONDARY:  this allows easy use of secondary indexes within your
+                    script, by appending every index to the schema as 'index_$name', allowing
+                    filtering of loaded rows with a statement like "FILTER rows BY index_color eq
+                    'blue'" if you have an index called 'color' defined.
+

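For reference (not part of the commit itself), the setup steps in the README hunks above might translate into a shell session roughly like the following. The JDK and Pig paths are placeholders, and only output-side variables actually named in the README are shown:

  export JAVA_HOME=/path/to/jdk               # placeholder: your JDK install
  export PIG_HOME=/path/to/pig                # placeholder: your Pig install
  # optional write-side settings listed in the README (9160 is Cassandra's default Thrift port)
  export PIG_OUTPUT_RPC_PORT=9160
  export PIG_OUTPUT_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
  cd examples/pig
  bin/pig_cassandra -x local example-script.pig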

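Likewise, a hypothetical Pig Latin sketch of the PIG_USE_SECONDARY behaviour added in the new "Advanced Options" section, assuming PIG_USE_SECONDARY=true has been exported before launching pig_cassandra, and that a keyspace 'Demo' with a column family 'Users' and a secondary index named 'color' exist (all of these names are made up for illustration):

  -- load rows through CassandraStorage; 'Demo'/'Users' are placeholder names
  rows = LOAD 'cassandra://Demo/Users' USING org.apache.cassandra.hadoop.pig.CassandraStorage();
  -- with PIG_USE_SECONDARY=true the 'color' index appears in the schema as index_color
  blue_rows = FILTER rows BY index_color eq 'blue';
  DUMP blue_rows;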