From: "Alan Gates (JIRA)"
To: pig-dev@hadoop.apache.org
Date: Wed, 2 Dec 2009 17:52:20 +0000 (UTC)
Subject: [jira] Commented: (PIG-966) Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces

[ https://issues.apache.org/jira/browse/PIG-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784931#action_12784931 ]

Alan Gates commented on PIG-966:
--------------------------------

You can make an argument for putting it in either place. I argue for putting it in the schema for a couple of reasons:

1. It is useful to a large number of potential optimizations.
2. Unlike most other statistics, it can be used in correctness checks (e.g., the user asked for a merge join; is the data sorted on the join key?).

The only downside I can see is that some systems that understand column names and types won't necessarily understand sortedness (like JSON). But it's no harder for the loader to figure out sortedness for the schema than it is for the statistics.

> Proposed rework for LoadFunc, StoreFunc, and Slice/r interfaces
> ---------------------------------------------------------------
>
>                 Key: PIG-966
>                 URL: https://issues.apache.org/jira/browse/PIG-966
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>
> I propose that we rework the LoadFunc, StoreFunc, and Slice/r interfaces significantly. See http://wiki.apache.org/pig/LoadStoreRedesignProposal for full details.
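
For illustration only, the correctness check mentioned in the comment above (the user asks for a merge join; is the data sorted on the join key?) might be sketched as follows. This is not Pig code: `Schema`, `sort_keys`, and `can_merge_join` are hypothetical names, and the sketch assumes the loader reports sort columns as an ordered list and ignores sort direction.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Schema:
    """Hypothetical schema: column names plus the columns the data is sorted on."""
    columns: List[str]
    # Ordered sort columns as reported by the loader; empty means unknown or unsorted.
    sort_keys: List[str] = field(default_factory=list)


def can_merge_join(schema: Schema, join_keys: List[str]) -> bool:
    """Return True if the declared sort order begins with the join keys.

    Data sorted on (a, b) is also sorted on (a), so a prefix match is enough.
    Sort direction is ignored in this sketch.
    """
    if not join_keys:
        return False
    return schema.sort_keys[:len(join_keys)] == join_keys


sorted_input = Schema(columns=["id", "name"], sort_keys=["id", "name"])
unsorted_input = Schema(columns=["id", "name"])

print(can_merge_join(sorted_input, ["id"]))
print(can_merge_join(unsorted_input, ["id"]))
```

With sortedness carried in the schema, a planner could reject or fall back from a requested merge join before running it, rather than treating sortedness as an optional statistic.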