lucene-java-user mailing list archives

From prasenjit mukherjee <prasen....@gmail.com>
Subject Re: using lucene to find neighbouring points in an n-dimensional space
Date Mon, 24 Oct 2011 01:33:22 GMT
Any pointers or suggestions on my approach?


On 10/22/11, prasenjit mukherjee <prasen.bea@gmail.com> wrote:
> My use case is the following:
> Given an n-dimensional vector (all coordinates non-negative), find its
> closest neighbours. I would like to try this with Lucene's default
> ranking. Here is what a typical document looks like, written as
> <term-id:term-weight> pairs (equivalently,
> <dimension-id:dimension-weight>):
>
> doc1 = 1245:15 3490:20 8856:20 etc.
>
> As the example above shows, the number of dimensions is high (~50K)
> and each vector is short (fewer than 40 entries).
>
> I am thinking of constructing a BooleanQuery in the following way
> (using doc1 as the query):
>
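> import org.apache.lucene.index.Term;
> import org.apache.lucene.search.BooleanClause;
> import org.apache.lucene.search.BooleanQuery;
> import org.apache.lucene.search.TermQuery;
>
> // One optional (SHOULD) clause per dimension present in doc1.
> BooleanQuery bq = new BooleanQuery();
> bq.add(new TermQuery(new Term("field", "1245")), BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "3490")), BooleanClause.Occur.SHOULD);
> bq.add(new TermQuery(new Term("field", "8856")), BooleanClause.Occur.SHOULD);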
>
> The problem is how to pass the dimension values (15, 20, 20, etc.)
> to the TermQuery.
>
> One solution is to add each TermQuery as many times as its dimension
> value, but I was wondering whether there is a better way to pass the
> dimension weight. I can probably do the same at indexing time, as
> latency is not an issue then.
>
> Any help is greatly appreciated.
>
> -Thanks,
> Prasenjit
>
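
The two options raised in the quoted message can be sketched in plain Java without touching the Lucene API. This is only an illustration under stated assumptions: the class and method names below (SparseVectorText, toBoostedQuery, toRepeatedTerms) are hypothetical, and the weights are assumed to be integers as in the doc1 example. The first helper writes the vector as a query string using the classic query parser's term^boost syntax, which would still have to be handed to a query parser configured with a suitable analyzer. The second repeats each dimension id as many times as its weight, giving text that could be indexed so that term frequency carries the weight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helpers; these names are illustrative and not part of Lucene. */
final class SparseVectorText {

    /** Parses "1245:15 3490:20" into {id, weight} string pairs. */
    static List<String[]> parse(String vector) {
        List<String[]> entries = new ArrayList<>();
        for (String token : vector.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty input yields no entries
            }
            String[] parts = token.split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected id:weight, got: " + token);
            }
            Integer.parseInt(parts[1]); // assumption: weights are integers
            entries.add(parts);
        }
        return entries;
    }

    /** Builds a query string such as "field:1245^15 field:3490^20". */
    static String toBoostedQuery(String field, String vector) {
        StringJoiner query = new StringJoiner(" ");
        for (String[] entry : parse(vector)) {
            query.add(field + ":" + entry[0] + "^" + entry[1]);
        }
        return query.toString();
    }

    /** Repeats each dimension id 'weight' times, e.g. "7:2 9:3" -> "7 7 9 9 9". */
    static String toRepeatedTerms(String vector) {
        StringJoiner text = new StringJoiner(" ");
        for (String[] entry : parse(vector)) {
            int weight = Integer.parseInt(entry[1]);
            for (int i = 0; i < weight; i++) {
                text.add(entry[0]);
            }
        }
        return text.toString();
    }
}
```

For doc1, toBoostedQuery("field", "1245:15 3490:20 8856:20") gives "field:1245^15 field:3490^20 field:8856^20". With the parser's default OR operator, each clause behaves like a SHOULD clause, so this matches the BooleanQuery above with a weight attached to each term. One caveat on the repetition approach: Lucene's default TF-IDF similarity dampens term frequency (it uses the square root of the frequency), so repeating a term w times does not scale its contribution linearly by w.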

