mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Reading Vectors Created from a Lucene Index
Date Wed, 30 Jun 2010 13:11:09 GMT
Kris,

Can you try the patch at https://issues.apache.org/jira/secure/attachment/12448396/MAHOUT-379-lucene.patch

Thanks,
Grant

On Jun 30, 2010, at 8:53 AM, Grant Ingersoll wrote:

> 
> On Jun 30, 2010, at 8:39 AM, Grant Ingersoll wrote:
> 
>> 
>> On Jun 29, 2010, at 1:54 PM, Kris Jack wrote:
>> 
>>> Hi everyone,
>>> 
>>> I have been using mahout to generate vectors from a lucene index using:
>>> 
>>> $MAHOUT_HOME/bin/mahout lucene.vector
>>> 
>>> In doing so, mahout creates an output file that has new ids for my
>>> documents, which are completely unlike my original --idField, which is a
>>> string.  How can I relate the new ids to my original ids?  Is there a
>>> method that allows me to output the vectors with the original --idField
>>> values that appear in the lucene index rather than the new doc ids?
>> 
>> 
>> Hmm, it seems the --idField stuff has been commented out, likely with the change
>> of labels.
>> 
> 
> I've brought the issue up over on dev@, as it is a bug.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search

