lucene-java-user mailing list archives

From Arnon Mazza <arnon...@yahoo.com>
Subject Re: Join between indexes
Date Thu, 02 Feb 2012 20:56:44 GMT
Thanks, that's a very nice feature.
 
Would it also enable joining at the docId level, meaning that one part of a document is kept
in one index and another part of the same document is kept in another index?
 
In the articles & comments example from the link, that could be, for instance:
articles index:
- docId=1: "(1) this (2) paper (3) is (4) about (5) lucene". (Numbers are positions in the doc.)
comments index:
- docId=1: "(3) very (4) recommended".
 
So that one would be able to know that the comment "very recommended" was written next to
the word "paper".
(Conceptually the query could be: articles.paper NEAR comments."very recommended").
 
Is this also part of the feature?
 
Thanks,
Arnon.
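
To make the question concrete, here is a minimal sketch of the position-level join described above, using the docId=1 data from the example. It is plain Java and does not use any Lucene API; the class name, the in-memory maps standing in for the two indexes, and the `near` method are all hypothetical. It only illustrates what "articles.paper NEAR comments.\"very recommended\"" would mean if both indexes shared docIds and a position space, and says nothing about whether the join feature discussed in this thread supports it.

```java
import java.util.List;
import java.util.Map;

public class CrossIndexNear {

    // Hypothetical stand-ins for two positional indexes: docId -> term -> positions.
    // These are plain Java maps, not Lucene data structures.
    static final Map<Integer, Map<String, List<Integer>>> ARTICLES = Map.of(
        1, Map.of("this", List.of(1), "paper", List.of(2), "is", List.of(3),
                  "about", List.of(4), "lucene", List.of(5)));

    static final Map<Integer, Map<String, List<Integer>>> COMMENTS = Map.of(
        1, Map.of("very", List.of(3), "recommended", List.of(4)));

    /** Positions of a term in one document, or an empty list if absent. */
    static List<Integer> positions(Map<Integer, Map<String, List<Integer>>> index,
                                   int docId, String term) {
        return index.getOrDefault(docId, Map.of()).getOrDefault(term, List.of());
    }

    /**
     * True if `term` in the articles index lies within `slop` positions of the
     * two-word phrase `first second` in the comments index, for the same docId.
     */
    static boolean near(int docId, String term, String first, String second, int slop) {
        for (int start : positions(COMMENTS, docId, first)) {
            // The phrase matches only if `second` directly follows `first`.
            if (!positions(COMMENTS, docId, second).contains(start + 1)) {
                continue;
            }
            int end = start + 1;
            for (int p : positions(ARTICLES, docId, term)) {
                // Distance from the term to the nearest edge of the phrase.
                int gap = p < start ? start - p : (p > end ? p - end : 0);
                if (gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "paper" is at position 2, the phrase starts at position 3.
        System.out.println(near(1, "paper", "very", "recommended", 1));  // true
        // "lucene" is at position 5, one past the phrase end, so slop 0 fails.
        System.out.println(near(1, "lucene", "very", "recommended", 0)); // false
    }
}
```

The sketch relies on docId 1 meaning the same document in both maps, which is the main assumption behind the question.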

From: Francisco A. Lozano <flozano@gmail.com>
To: java-user@lucene.apache.org 
Sent: Wednesday, February 1, 2012 7:56 PM
Subject: Re: Join between indexes

Wow, thanks for pointing this out, didn't know such a feature was in progress.

I see a mention that there are some chances this will be released in
3.6... crossing my fingers :)

Francisco A. Lozano



On Wed, Feb 1, 2012 at 17:09, Simon Willnauer
<simon.willnauer@googlemail.com> wrote:
> maybe this link will help: http://bit.ly/AhwIw6
>
> simon
>
> On Wed, Feb 1, 2012 at 3:05 PM, Arnon Mazza <arnon.ma@yahoo.com> wrote:
>> Assume we have a Lucene index over which several types of analyses are performed.
>>
>> Assume that the conclusions of some analysis require that new tokens be added to existing documents in the index.
>> For example, a repeating pattern p (sequence of words) that appears in a large part of the documents should be tagged in every document in its exact position.
>>
>> Now it is required to execute proximity queries involving standard terms and also the pattern p (e.g. find all documents in which the word "hello" is adjacent to the pattern p).
>>
>> Is there a way of achieving this without re-indexing all the documents where the pattern p was found?
>> In other words, is it possible to maintain a separate index that would keep only patterns->docIds/positions, and then join between the two indexes?
>>
>> If not, is there a plan to support this in the future ?
>>
>> Thanks,
>> Arnon.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
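
The original question in the thread (a separate index that keeps only patterns->docIds/positions, joined with the main index) can be sketched the same way. This is a plain-Java illustration under stated assumptions, not Lucene code: the `Span` class, the two in-memory maps, the sample data, and `adjacentDocs` are all hypothetical, and both structures are assumed to share docIds and token positions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class PatternSideIndex {

    /** Token span of one pattern occurrence: first and last position, inclusive. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical main index: term -> docId -> positions (sample data).
    static final Map<String, Map<Integer, List<Integer>>> MAIN = Map.of(
        "hello", Map.of(1, List.of(1), 2, List.of(7), 3, List.of(2)));

    // Hypothetical side index: pattern -> docId -> spans (sample data).
    // Only documents in which the pattern was found appear here.
    static final Map<String, Map<Integer, List<Span>>> PATTERNS = Map.of(
        "p", Map.of(1, List.of(new Span(2, 4)), 2, List.of(new Span(1, 3))));

    /** DocIds in which `term` sits immediately before or after an occurrence of `pattern`. */
    static List<Integer> adjacentDocs(String term, String pattern) {
        Map<Integer, List<Integer>> termDocs = MAIN.getOrDefault(term, Map.of());
        Map<Integer, List<Span>> patternDocs = PATTERNS.getOrDefault(pattern, Map.of());
        List<Integer> hits = new ArrayList<>();
        // Drive the join from the side index and look each docId up in the main index.
        for (Map.Entry<Integer, List<Span>> entry : patternDocs.entrySet()) {
            List<Integer> termPositions = termDocs.get(entry.getKey());
            if (termPositions == null) {
                continue; // the term does not occur in this document
            }
            for (Span span : entry.getValue()) {
                if (termPositions.contains(span.start - 1)
                        || termPositions.contains(span.end + 1)) {
                    hits.add(entry.getKey());
                    break;
                }
            }
        }
        Collections.sort(hits); // Map.of has no defined iteration order
        return hits;
    }

    public static void main(String[] args) {
        // Doc 1: "hello" at 1, pattern at 2..4, so they are adjacent.
        // Doc 2: "hello" at 7, pattern at 1..3, not adjacent. Doc 3 has no pattern.
        System.out.println(adjacentDocs("hello", "p")); // [1]
    }
}
```

Adding a newly discovered pattern in this sketch only touches the side map, which is the property the question is after; whether a real index can offer the same is exactly what the thread asks.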
