From: "Kiyan Ahmadizadeh (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: crunch-dev@incubator.apache.org
Date: Fri, 7 Sep 2012 06:47:07 +1100 (NCT)
Message-ID: <1856110551.46383.1346960827497.JavaMail.jiratomcat@arcas>
In-Reply-To: <119345965.46356.1346960347649.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (CRUNCH-58) Implement PObject in Crunch/Scrunch

    [ https://issues.apache.org/jira/browse/CRUNCH-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13449983#comment-13449983 ]

Kiyan Ahmadizadeh commented on CRUNCH-58:
-----------------------------------------

Discussion of implementing PObjects started on the CRUNCH-57 ticket. Josh gave this suggestion for an implementation:

--
Kiyan, do you have an opinion on how you want to go about this one?
Do you want to take on defining PObject (which in my mind, could just be a simple wrapper that materialized a PCollection and then implemented some abstract function that did a computation on the materialized Iterable) and incorporate it here?
--

Josh, I think PObject should be a wrapper around PCollection, but the underlying PCollection should contain only one element (or be treated as if it does). In other words, it should wrap the result of a distributed computation that reduces/combines a source PCollection into a target PCollection of one element. PObject could then have a getValue method that materializes the underlying PCollection and returns the singleton element found within.

I'm not sure whether we want to strictly enforce that the underlying PCollection of a PObject contains exactly one element by throwing an exception, or whether we should simply ignore every element but the first.

Your suggestion of "some abstract function that did a computation on the materialized Iterable" doesn't make sense to me, since in my mind a PObject should only care about the first element in its underlying PCollection. Could you clarify?

> Implement PObject in Crunch/Scrunch
> -----------------------------------
>
>                 Key: CRUNCH-58
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-58
>             Project: Crunch
>          Issue Type: New Feature
>    Affects Versions: 0.3.0
>            Reporter: Kiyan Ahmadizadeh
>            Assignee: Kiyan Ahmadizadeh
>
> FlumeJava has the concept of a PObject, a container for a singleton of type T. It is meant to represent the result of a distributed computation that yields a singleton value (for example, the max, min, and length methods on PCollection). Generally speaking, the result of any computation that combines/reduces a PCollection into a singleton value could be represented by a PObject.
> Like PCollection, a PObject defers distributed computation until its value is actually used.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
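
For reference, the two designs discussed in the comment above can be sketched in plain Java. This is only an illustration under stated assumptions, not Crunch's API: the names PObjectSketch, MaterializedPObject, FirstElementPObject, and process are hypothetical, and a Supplier of an Iterable stands in for materializing a PCollection. It shows that the "first element" design can be written as one instance of the "abstract function over the materialized Iterable" design, with the strict/lenient choice as a flag.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class PObjectSketch {

    /** Hypothetical interface: a container for a deferred singleton value. */
    interface PObject<T> {
        T getValue();
    }

    /**
     * Josh's suggestion, as a sketch: materialize the collection, then apply an
     * abstract function to the resulting Iterable. The Supplier is an assumption
     * standing in for materializing a PCollection.
     */
    abstract static class MaterializedPObject<S, T> implements PObject<T> {
        private final Supplier<Iterable<S>> materializer;
        private T cached;
        private boolean computed;

        MaterializedPObject(Supplier<Iterable<S>> materializer) {
            this.materializer = materializer;
        }

        /** Computes the singleton value from the materialized elements. */
        protected abstract T process(Iterable<S> input);

        @Override
        public T getValue() {
            // Nothing is materialized until the value is first requested.
            if (!computed) {
                cached = process(materializer.get());
                computed = true;
            }
            return cached;
        }
    }

    /**
     * Kiyan's suggestion, as a sketch: the value is the first element of the
     * underlying collection. With strict=true, extra elements are an error;
     * with strict=false, they are ignored.
     */
    static class FirstElementPObject<T> extends MaterializedPObject<T, T> {
        private final boolean strict;

        FirstElementPObject(Supplier<Iterable<T>> materializer, boolean strict) {
            super(materializer);
            this.strict = strict;
        }

        @Override
        protected T process(Iterable<T> input) {
            Iterator<T> it = input.iterator();
            if (!it.hasNext()) {
                throw new NoSuchElementException("underlying collection is empty");
            }
            T first = it.next();
            if (strict && it.hasNext()) {
                throw new IllegalStateException("expected exactly one element");
            }
            return first;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a collection already reduced to one element (e.g. a max).
        List<Integer> reduced = Arrays.asList(42);
        PObject<Integer> max = new FirstElementPObject<>(() -> reduced, true);
        System.out.println(max.getValue()); // prints 42

        // The general form: compute a length over all materialized elements.
        List<String> words = Arrays.asList("a", "b", "c");
        PObject<Integer> length = new MaterializedPObject<String, Integer>(() -> words) {
            @Override
            protected Integer process(Iterable<String> input) {
                int n = 0;
                for (String ignored : input) {
                    n++;
                }
                return n;
            }
        };
        System.out.println(length.getValue()); // prints 3
    }
}
```

In this sketch the open question from the comment (throw on extra elements, or ignore them) is reduced to the strict flag; which default is appropriate is left undecided here, as it is in the discussion.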