hadoop-pig-dev mailing list archives

From "pi song" <pi.so...@gmail.com>
Subject Change in Pig Wiki
Date Mon, 10 Mar 2008 22:27:22 GMT
I saw a change on the Pig Wiki front page:

- [http://incubator.apache.org/pig/ Pig] is a platform for analyzing large
data sets. Pig's language, Pig Latin, is a simple query algebra that lets
you express data transformations such as merging data sets, filtering them,
and applying functions to records or groups of records. Users can create
their own functions to do special-purpose processing.

+ [http://incubator.apache.org/pig/ Pig] is a dataflow programming
environment for processing very large files. Pig's language is called Pig
Latin. A Pig Latin program consists of a directed acyclic graph where each
node represents an operation that transforms data. Operations are of two
flavors: (1) relational-algebra style operations such as join, filter,
project; (2) functional-programming style operators such as map, reduce.
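
To make the new wording concrete, here is a minimal sketch of a program as a directed acyclic graph of data-transforming nodes. It is plain Python, not Pig Latin, and says nothing about how Pig itself executes a program; the `Node` class and the sample records are made up for illustration.

```python
from functools import reduce


class Node:
    """Hypothetical DAG node: an operation plus the upstream nodes it reads."""

    def __init__(self, op, *inputs):
        self.op = op
        self.inputs = inputs

    def run(self):
        # Evaluate upstream nodes first, then apply this node's operation.
        return self.op(*(n.run() for n in self.inputs))


# Source nodes with made-up sample records.
users = Node(lambda: [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}])
visits = Node(lambda: [{"id": 1, "hits": 3}, {"id": 1, "hits": 4},
                       {"id": 2, "hits": 0}])

# Relational-algebra style operations: filter, join, project.
active = Node(lambda rows: [r for r in rows if r["hits"] > 0], visits)
joined = Node(
    lambda u, v: [{**a, **b} for a in u for b in v if a["id"] == b["id"]],
    users, active,
)
hits = Node(lambda rows: [r["hits"] for r in rows], joined)

# Functional-programming style operations: map, then reduce.
doubled = Node(lambda xs: list(map(lambda x: x * 2, xs)), hits)
total = Node(lambda xs: reduce(lambda a, b: a + b, xs, 0), doubled)

result = total.run()
```

Each node only knows its inputs, so the join node with two parents is what makes this a graph rather than a simple chain of steps.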

Is there any change in philosophy? What is the difference between "a
platform for analyzing large data sets" and a "dataflow programming
environment"? Does the term "dataflow programming environment" imply that
Pig can run across multiple file systems at the same time?

Cheers,
Pi
