mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: cardinality vs size
Date Sat, 12 Feb 2011 18:57:55 GMT
Actually, I think that most of us understand that size refers to the
dimension of the vector (by analogy with ArrayList).

How about we go with a strong convention that size() returns dimensionality
and change the constructor args for RASV.  The real problem here is that
second argument.

Then if we need to, we can come up with an accessor that gives us back the
allocated capacity of a vector.  For DenseVector, that would be equal to
size().  For RASV it would start at the initialCapacity and grow as needed
but always be <= size() + epsilon and >= the number of non-zeros.  For some
other sparse formats, it might be equal to the current number of non-zeros.
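
To make the proposed contract concrete, here is a minimal sketch of how size()
and a separate capacity accessor could relate. Everything in it is an
assumption made for illustration: SketchVector, DenseSketch, SparseSketch and
the capacity() and nonZeros() accessors are hypothetical names, not Mahout's
actual Vector API, and the sparse class uses plain parallel arrays rather than
the hash map that backs RASV. Capacity growth is capped at the dimension here,
so the "+ epsilon" slack is zero in this toy version.

```java
import java.util.Arrays;

public class CapacitySketch {

    /** Hypothetical interface for illustration; not Mahout's Vector. */
    public interface SketchVector {
        int size();      // dimensionality, by analogy with ArrayList.size()
        int capacity();  // hypothetical accessor: currently allocated slots
        int nonZeros();  // number of stored non-zero entries
        void set(int index, double value);
    }

    /** Dense storage: capacity is always equal to size(). */
    public static final class DenseSketch implements SketchVector {
        private final double[] values;

        public DenseSketch(int dimension) { values = new double[dimension]; }

        @Override public int size() { return values.length; }
        @Override public int capacity() { return values.length; }
        @Override public int nonZeros() {
            int n = 0;
            for (double v : values) { if (v != 0.0) n++; }
            return n;
        }
        @Override public void set(int index, double value) { values[index] = value; }
    }

    /** Sparse storage: capacity starts at initialCapacity and grows as needed. */
    public static final class SparseSketch implements SketchVector {
        private final int dimension;
        private int[] indices;
        private double[] values;
        private int used;

        public SparseSketch(int dimension, int initialCapacity) {
            this.dimension = dimension;
            // Never allocate more slots than there are dimensions.
            int cap = Math.min(dimension, Math.max(0, initialCapacity));
            indices = new int[cap];
            values = new double[cap];
        }

        @Override public int size() { return dimension; }
        @Override public int capacity() { return indices.length; }
        @Override public int nonZeros() { return used; }

        @Override public void set(int index, double value) {
            if (index < 0 || index >= dimension) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            // Linear search keeps the sketch short; a real sparse vector would hash.
            int slot = -1;
            for (int i = 0; i < used; i++) {
                if (indices[i] == index) { slot = i; break; }
            }
            if (value == 0.0) {
                // Storing zero removes the entry so nonZeros() stays accurate.
                if (slot >= 0) {
                    used--;
                    indices[slot] = indices[used];
                    values[slot] = values[used];
                }
                return;
            }
            if (slot >= 0) { values[slot] = value; return; }
            if (used == indices.length) {
                // Double the storage, but never beyond the dimension.
                int newCap = Math.min(dimension, Math.max(1, indices.length * 2));
                indices = Arrays.copyOf(indices, newCap);
                values = Arrays.copyOf(values, newCap);
            }
            indices[used] = index;
            values[used] = value;
            used++;
        }
    }

    public static void main(String[] args) {
        SketchVector v = new SparseSketch(1000, 2);
        v.set(5, 1.0);
        v.set(7, 2.0);
        v.set(900, 3.0);  // third entry forces the backing arrays to grow
        System.out.println("size=" + v.size() + " capacity=" + v.capacity()
                + " nonZeros=" + v.nonZeros());
    }
}
```

With this split, a constructor like SparseSketch(1000, 2) reads unambiguously:
the first argument fixes what size() returns, and the second is only a storage
hint that callers can later inspect through capacity().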

On Sat, Feb 12, 2011 at 8:52 AM, Weishung Chung <weishung@gmail.com> wrote:

> I believe most of us understand that Vector.size() and Matrix.size() refer
> to the size of the vector or matrix, so it's not that big a deal.
> But I would recommend just renaming the size in the constructor to
> initialCapacity, which would make it clear to most of us that it refers to
> the initial capacity of the internal backing map. Just my two cents :D
>
> RandomAccessSparseVector(int cardinality, int size)
>
>
> On Sat, Feb 12, 2011 at 5:03 AM, Sebastian Schelter <ssc@apache.org>
> wrote:
>
> > You're right, I forgot about that. We'd have to rename Vector.size() to
> > Vector.dimension() to be consistent... And maybe Matrix.size() too?
> >
> > Makes the refactoring a little bit more complicated. I think we should
> > also keep Vector.size() and Matrix.size() as deprecated methods for a
> > little time so we don't break any uncommitted patches.
> >
> > What do you think?
> >
> > --sebastian
> >
> >
> > On 12.02.2011 03:29, Ted Dunning wrote:
> >
> >> It's a great idea.
> >>
> >> Changing any accessor names is a bit of a bigger deal, but still
> >> probably a good idea if we get consensus.
> >>
> >> On Fri, Feb 11, 2011 at 4:46 PM, Sebastian Schelter <ssc@apache.org>
> >> wrote:
> >>
> >>    Any objections to that? I'd go for a quick refactoring without a
> >>    jira if no one objects.
> >>
> >>
> >>
> >
>
