Return-Path: X-Original-To: apmail-mahout-dev-archive@www.apache.org Delivered-To: apmail-mahout-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 30A92DC2D for ; Thu, 23 May 2013 18:55:22 +0000 (UTC) Received: (qmail 23233 invoked by uid 500); 23 May 2013 18:55:21 -0000 Delivered-To: apmail-mahout-dev-archive@mahout.apache.org Received: (qmail 22982 invoked by uid 500); 23 May 2013 18:55:21 -0000 Mailing-List: contact dev-help@mahout.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@mahout.apache.org Delivered-To: mailing list dev@mahout.apache.org Received: (qmail 22943 invoked by uid 99); 23 May 2013 18:55:20 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 23 May 2013 18:55:20 +0000 Date: Thu, 23 May 2013 18:55:20 +0000 (UTC) From: "Jake Mannix (JIRA)" To: dev@mahout.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (MAHOUT-1227) Vector.iterateNonZero() is super-clumsy to use: add Iterable allNonZero() MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/MAHOUT-1227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jake Mannix updated MAHOUT-1227: -------------------------------- Description: Currently, our codebase is littered with the following: {code} Iterator it = vector.iterateNonZero(); while (it.hasNext()) { Element e = it.next(); ... {code} wouldn't it be nice to be able to do: {code} for (Element e : vector.allNonZero()) { ... {code} instead? I propose adding an Iterable allNonZero() which allow this syntactic sugar. To make it symmetric with iterateAll, let's also add Iterable all(), and implement the simply in AbstractVector. The first diff adding this is very non-invasive - new methods added to interface, implemented in the three classes from which all Vector implementations derive (AbstractVector, NamedVector, and DelegatingVector). User code should just work, unless they've implemented their own vector without subclassing one of these three (yikes). Next diff, which is more invasive, would remove "extends Iterable" from Vector, because using the foreach of a Vector itself is very rarely what the caller really means to do (it's the all-iterator, very bad for the more common sparse use case). To achieve the same effect, the caller chooses between vector.all() and vector.allNonZero(), and then they're being crystal clear what they mean. Lastly, I'd propose we make iterateAll() and iterateAllNonZero() protected methods on AbstractVector, so that we are forced to remove all the clumsy places where we do Iterator it = ... all throughout the codebase. I suspect there will be very few places left that really want the raw iterator, but if there are any, it can be gotten by calling vector.(all/allNonZero).iterator() (feature-request/api fix suggestion idea courtesy of Andy Schlaikjer, formalized as a proposal and posted up here by me, Jake Mannix) was: Currently, our codebase is littered with the following: [code] Iterator it = vector.iterateNonZero(); while (it.hasNext()) { Element e = it.next(); ... [code] wouldn't it be nice to be able to do: [code] for (Element e : vector.allNonZero()) { ... [code] instead? I propose adding an Iterable allNonZero() which allow this syntactic sugar. To make it symmetric with iterateAll, let's also add Iterable all(), and implement the simply in AbstractVector. The first diff adding this is very non-invasive - new methods added to interface, implemented in the three classes from which all Vector implementations derive (AbstractVector, NamedVector, and DelegatingVector). User code should just work, unless they've implemented their own vector without subclassing one of these three (yikes). Next diff, which is more invasive, would remove "extends Iterable" from Vector, because using the foreach of a Vector itself is very rarely what the caller really means to do (it's the all-iterator, very bad for the more common sparse use case). To achieve the same effect, the caller chooses between vector.all() and vector.allNonZero(), and then they're being crystal clear what they mean. Lastly, I'd propose we make iterateAll() and iterateAllNonZero() protected methods on AbstractVector, so that we are forced to remove all the clumsy places where we do Iterator it = ... all throughout the codebase. I suspect there will be very few places left that really want the raw iterator, but if there are any, it can be gotten by calling vector.(all/allNonZero).iterator() (feature-request/api fix suggestion idea courtesy of Andy Schlaikjer, formalized as a proposal and posted up here by me, Jake Mannix) > Vector.iterateNonZero() is super-clumsy to use: add Iterable allNonZero() > ---------------------------------------------------------------------------------- > > Key: MAHOUT-1227 > URL: https://issues.apache.org/jira/browse/MAHOUT-1227 > Project: Mahout > Issue Type: Improvement > Environment: all > Reporter: Andy Schlaikjer > Assignee: Jake Mannix > Fix For: 0.8 > > > Currently, our codebase is littered with the following: > {code} > Iterator it = vector.iterateNonZero(); > while (it.hasNext()) { > Element e = it.next(); > ... > {code} > wouldn't it be nice to be able to do: > {code} > for (Element e : vector.allNonZero()) { > ... > {code} > instead? > I propose adding an Iterable allNonZero() which allow this syntactic sugar. To make it symmetric with iterateAll, let's also add Iterable all(), and implement the simply in AbstractVector. > The first diff adding this is very non-invasive - new methods added to interface, implemented in the three classes from which all Vector implementations derive (AbstractVector, NamedVector, and DelegatingVector). User code should just work, unless they've implemented their own vector without subclassing one of these three (yikes). > Next diff, which is more invasive, would remove "extends Iterable" from Vector, because using the foreach of a Vector itself is very rarely what the caller really means to do (it's the all-iterator, very bad for the more common sparse use case). To achieve the same effect, the caller chooses between vector.all() and vector.allNonZero(), and then they're being crystal clear what they mean. > Lastly, I'd propose we make iterateAll() and iterateAllNonZero() protected methods on AbstractVector, so that we are forced to remove all the clumsy places where we do Iterator it = ... all throughout the codebase. I suspect there will be very few places left that really want the raw iterator, but if there are any, it can be gotten by calling vector.(all/allNonZero).iterator() > (feature-request/api fix suggestion idea courtesy of Andy Schlaikjer, formalized as a proposal and posted up here by me, Jake Mannix) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira