lucene-java-user mailing list archives

From Albert Vila <...@imente.com>
Subject Re: Clustering question: searching two diferent indexes
Date Wed, 23 Jun 2004 07:30:22 GMT
Thanks Otis, but can I merge two indexes that have different fields?

My big index has these fields: code, title, content, language and date. I 
add new documents to it incrementally.

The clustering index contains only the fields code and cluster. Will merging 
the big index with the clustering one preserve the order of the big 
one? For example, if I have the following indexes:
Big index
code_1, title_1, content_1, language_1, date_1
code_2, title_2, content_2, language_2, date_2
...

Clustering index
code_1, cluster_1
code_2, cluster_2
...

then the new merged index will be:

Merged index
code_1, title_1, content_1, language_1, date_1, cluster_1
code_2, title_2, content_2, language_2, date_2, cluster_2
...

If I can do that, fine, but I think the merging process uses the 
Lucene internal ID to match the documents. I want to use the code field to 
do that matching instead. Is that possible? I cannot be sure that the Lucene 
internal IDs are the same for the same codes in both indexes.
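
A minimal sketch of the matching by code described above. It uses plain
Java collections as stand-ins for the two indexes and makes no Lucene calls;
the names CodeJoin, joinByCode and doc, and the map-based representation of a
document, are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CodeJoin {

    // Attach a cluster to each big-index document by looking up its "code"
    // value. Position (internal ID) is never used, so the two sources may
    // list their documents in different orders. The order of bigDocs is kept.
    static List<Map<String, String>> joinByCode(
            List<Map<String, String>> bigDocs,
            Map<String, String> clusterByCode) {
        List<Map<String, String>> merged = new ArrayList<>();
        for (Map<String, String> doc : bigDocs) {
            Map<String, String> out = new LinkedHashMap<>(doc);
            String cluster = clusterByCode.get(doc.get("code"));
            if (cluster != null) {
                out.put("cluster", cluster); // only when a cluster is known
            }
            merged.add(out);
        }
        return merged;
    }

    // Hypothetical helper: build a tiny "document" with two fields.
    static Map<String, String> doc(String code, String title) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("code", code);
        d.put("title", title);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> big = new ArrayList<>();
        big.add(doc("code_1", "title_1"));
        big.add(doc("code_2", "title_2"));

        // The clustering data arrives in a different order on purpose.
        Map<String, String> clusters = new HashMap<>();
        clusters.put("code_2", "cluster_2");
        clusters.put("code_1", "cluster_1");

        for (Map<String, String> d : joinByCode(big, clusters)) {
            System.out.println(d);
        }
    }
}
```

The lookup table is keyed on code, so a document with no entry in the
clustering data simply comes out without a cluster field.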

Thanks again,

Albert


Otis Gospodnetic wrote:

>(re-directing to lucene-user list)
>
>Albert,
>
>If I understand your question correctly... You could run a query like
>the one you gave on both indices, but if one of them contains documents
>that have only one of those fields (cluster), then there will never be
>any matches in the second index.
>
>However, why not leave your big index alone, add documents to a new,
>smaller index, and then merge them periodically?  I may be off with
>this; it sounds like this is what you want to do, but I'm not certain I
>understood you fully.
>
>Otis
>
>--- Albert Vila <avp@imente.com> wrote:
>  
>
>>Hi all,
>>
>>I was wondering if I can search using the MultiSearcher over two 
>>different indexes at the same time (with different fields).
>>I've got one big index with the code, title, content, language, etc. 
>>fields (new documents are added incrementally). Now I have to
>>introduce a clustering field. The problem is that I have to update the
>>whole index each time the clusters change, and I don't have enough time
>>to do it (I want to check for new clusters every 10 minutes, and it
>>takes me 25 minutes to reindex the whole index).
>>A query example could be: language:0 and title:java and cluster:0
>>
>>Can I leave the big index without any changes and create a new index 
>>with only the fields code and cluster, and perform the 
>>searches using these two indexes? I think I cannot do that without 
>>changing the code. It would need a post-processing step that matches all
>>codes returned from index 1 against index 2.
>>
>>Does anyone have a solution for this problem? I would appreciate it.

-- 
Albert Vila
R&D Project Director
http://www.imente.com
902 933 242
[iMente “La información con más beneficios”]


