incubator-general mailing list archives

From Thilo Goetz
Subject Re: Incubator Proposal: Pig
Date Mon, 24 Sep 2007 05:59:57 GMT
Niclas Hedhman wrote:
> b) I can't say that I understand the technical merits of the proposal, and 
> just see the headline "analyzing large data sets". And I would like to know 
> the relationship with UIMA's statement "... analyze large volumes of 
> unstructured information..." and hear whether there are overlap, synergies 
> and/or collaboration in view.


I'm not 100% clear on where there could be synergies between
Pig and UIMA.  Map/reduce is a natural distribution
strategy for UIMA, so running UIMA programs on top of Hadoop
seems like a good fit.  Maybe Pig can help with that and make it
easier somehow.  However, that is not clear to me from the proposal
at this time.

At the same time, I don't really think there is any overlap.
Pig is concerned with computation in a distributed environment,
while UIMA is agnostic in that respect.  On the other hand,
UIMA offers a component model to develop analysis modules and
combine them into processing chains (with an emphasis on reuse).
I do not see from the proposal that Pig is in the business of
defining a component model.

So synergies probably yes, no overlap as far as I can see.

