mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: solr-recommender, recent changes to ToItemVectorsMapper
Date Mon, 05 Aug 2013 00:54:41 GMT
On Sun, Aug 4, 2013 at 5:34 PM, Pat Ferrel <pat@occamsmachete.com> wrote:

> Actually this brings up another point that I've harped on before. It sure
> would be nice to have a vector representation where you could attach
> arbitrary data to items or vectors. Not so memory efficient but it makes
> things like ID translation and timestamping actions trivial. If these could
> be attached and survive all the Mahout jobs there would be no need for the
> in-memory hashmap I'm using to translate IDs and the actions could be
> timestamped or other metadata could be attached. At present I guess
> everyone knows that only weights are attached to actions/matrix values and
> in some cases names to rows/vectors in DRMs.
>

This is where we started, actually.  The memory cost was fairly massive for
arbitrary objects being attached to sparse matrices.  The problem is that
in long-tail situations most rows and columns have only a few non-zero
entries, so the cost of each annotation is amortized over very little data.

If we restrict our attention to text annotations, then a heavily compressed
form might well be feasible.
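
As a rough illustration of what a compressed text-annotation store could
look like, here is a minimal sketch. It is not Mahout code: the class name
CompressedAnnotations, the block_size parameter, and the choice of zlib are
all assumptions made for the example. Annotations are keyed by integer
index and packed into compressed blocks, so a lookup decompresses only one
block.

```python
import zlib


class CompressedAnnotations:
    """Hypothetical store: text annotations keyed by integer index,
    packed into zlib-compressed blocks (illustrative only)."""

    def __init__(self, annotations, block_size=1024):
        self.block_size = block_size
        self.count = len(annotations)
        self.blocks = []
        for start in range(0, len(annotations), block_size):
            chunk = annotations[start:start + block_size]
            # Assumption: annotations contain no newline characters,
            # so newline can serve as the record separator.
            raw = "\n".join(chunk).encode("utf-8")
            self.blocks.append(zlib.compress(raw, 9))

    def get(self, index):
        """Return the annotation for one index, decompressing one block."""
        if not 0 <= index < self.count:
            raise IndexError(index)
        block, offset = divmod(index, self.block_size)
        lines = zlib.decompress(self.blocks[block]).decode("utf-8").split("\n")
        return lines[offset]

    def compressed_bytes(self):
        """Total size of the compressed blocks, excluding Python overhead."""
        return sum(len(b) for b in self.blocks)


# Example: external string IDs attached to 10,000 rows.
ids = ["user-%08d" % i for i in range(10000)]
store = CompressedAnnotations(ids)
raw_bytes = sum(len(s.encode("utf-8")) for s in ids)
```

Larger blocks generally compress better but make each lookup more
expensive, so block_size is the knob that trades memory for access time.
How well this works on real IDs or timestamps depends on how repetitive the
text is.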
