lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Solt, Ill├ęs" <illes.s...@gmail.com>
Subject unique term identifiers
Date Mon, 18 Jan 2010 14:07:35 GMT
Hi,

 I am looking for a way to represent term frequency data in a vector 
space, thus using unique integer identifiers instead of string. This 
would allow feeding tools like LIBSVM from a Lucene index.

 A small example: TermFreqVector.toString() produces "{TITLE: one/3, 
two/4}". What I am looking for is "1:3 2:4", where 1 and 2 are arbitrary 
identifiers, sortedness is not an issue.

 The task can obviously be solved using some java Map, but it should be 
less efficient then using native Lucene methods.

 I am using 2.9.1, my index can be considered constant.


Thanks,
Illes Solt



 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message