From: Thilo Goetz
Date: Mon, 24 Sep 2007 07:59:57 +0200
To: general@incubator.apache.org
Subject: Re: Incubator Proposal: Pig
Message-ID: <46F7525D.7020506@gmx.de>
In-Reply-To: <200709231606.49621.niclas@hedhman.org>

Niclas Hedhman wrote:
[...]
>
> b) I can't say that I understand the technical merits of the proposal, and
> just see the headline "analyzing large data sets". And I would like to know
> the relationship with UIMA's statement "... analyze large volumes of
> unstructured information..." and hear whether there are overlap, synergies
> and/or collaboration in view.

Niclas,

I'm not 100% clear on where there could be synergies between Pig and UIMA.
Map/reduce is a natural distribution strategy for UIMA, so executing UIMA
programs on top of Hadoop seems natural. Maybe Pig can help with that and
make it easier somehow. However, that is not clear to me from the proposal
at this time.

At the same time, I don't really think there is any overlap. Pig is
concerned with computation in a distributed environment, while UIMA is
agnostic in that respect. On the other hand, UIMA offers a component model
to develop analysis modules and combine them into processing chains (with
an emphasis on reuse). I do not see from the proposal that Pig is in the
business of defining a component model.

So synergies probably yes, no overlap as far as I can see.

--Thilo