lucene-pylucene-dev mailing list archives

From Andi Vajda <va...@apache.org>
Subject Re: Indexing Non-Textual Data
Date Wed, 06 Apr 2011 22:11:22 GMT

  Hi,

On Wed, 6 Apr 2011, Chris Spencer wrote:

> I'm new to PyLucene, so forgive me if this is a newbie question. I have a
> dataset composed of several thousand lists of 128 integer features, each
> list associated with a class label. Would it be possible to use Lucene as a
> classifier, by indexing the label with respect to these integer features,
> and then classify a new list by finding the most similar labels with Lucene?

I believe there is support in Lucene for indexing numeric values using a
Trie. Please ask on java-user@lucene.apache.org (subscribe first by sending
mail to java-user-subscribe@lucene.apache.org). There are many more Lucene
experts with answers there.

For example, this class may be relevant:
http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/document/NumericField.html
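
To make the trie idea concrete, here is a rough pure-Python sketch of how a
number can be indexed at several precisions so that a range lookup only has
to touch a handful of terms. It is not Lucene's code and does not use the
PyLucene API: the function names (trie_terms, build_index, split_range,
range_query) are made up for illustration, the name precision_step is only
borrowed from Lucene's precisionStep parameter, and the sketch assumes
non-negative integers below 2**32. Lucene's real implementation also handles
signed values and encodes its terms differently.

```python
from collections import defaultdict

PRECISION_STEP = 4   # bits dropped per trie level (illustrative choice)
BITS = 32            # assumed width of the indexed integers


def trie_terms(value, precision_step=PRECISION_STEP, bits=BITS):
    """Return one (shift, prefix) term per trie level for a value."""
    return [(shift, value >> shift) for shift in range(0, bits, precision_step)]


def build_index(values):
    """Map every trie term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in enumerate(values):
        for term in trie_terms(value):
            index[term].add(doc_id)
    return index


def split_range(lo, hi, precision_step=PRECISION_STEP, bits=BITS):
    """Cover the inclusive range [lo, hi] with a small set of trie terms."""
    mask = (1 << precision_step) - 1
    terms = []
    shift = 0
    while lo <= hi:
        if shift + precision_step >= bits:
            # Top level: no coarser prefix exists, so list what is left.
            terms.extend((shift, prefix) for prefix in range(lo, hi + 1))
            break
        # Peel off prefixes at the lower edge until lo is block-aligned.
        while lo <= hi and (lo & mask) != 0:
            terms.append((shift, lo))
            lo += 1
        # Peel off prefixes at the upper edge until hi ends a block.
        while lo <= hi and (hi & mask) != mask:
            terms.append((shift, hi))
            hi -= 1
        if lo > hi:
            break
        # The remaining aligned middle is covered by coarser prefixes.
        lo >>= precision_step
        hi >>= precision_step
        shift += precision_step
    return terms


def range_query(index, lo, hi):
    """Return the ids of all documents whose value lies in [lo, hi]."""
    hits = set()
    for term in split_range(lo, hi):
        hits |= index.get(term, set())
    return hits
```

For example, range_query(build_index(values), 16, 300) looks up a few dozen
terms instead of one term per integer in the range. Note that this only
gives range lookups on a single numeric field; it is not by itself a
similarity search over a 128-feature vector.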

Andi..

>
> I've been going through the PyLucene samples, but they only seem to involve
> indexing text, not continuous features (understandably). Could anyone point
> me to an example that indexes non-textual data?
>
> I think the project Lire (http://www.semanticmetadata.net/lire/) is using
> Lucene to do something similar to this, although with an emphasis on image
> features. I've dug into their code a little, but I'm not a strong Java
> programmer, so I'm not sure how they're pulling it off, nor how I might
> translate this into the PyLucene API. In your opinion, is this a practical
> use of Lucene?
>
> Regards,
> Chris
>
