lucene-java-user mailing list archives

From John Smith <john_smith9...@yahoo.com>
Subject Re: ParallelReader
Date Mon, 10 Oct 2005 20:14:29 GMT
Sorry to bug people about this again and again.
I might be missing something or be totally confused, but what is the use case for a ParallelReader
if it does not address the situation where one index changes frequently (meaning deletes and
reindexing) and the other index does not change, but both have the same number of docs? Wouldn't
people want to stick with just one index in any case?
 
Any comments or response appreciated.
 
JZ

John Smith <john_smith9910@yahoo.com> wrote:
A while ago I asked what would be a good solution for the situation described
below, and I was pointed in the direction of ParallelReader. It looks like that will not work.
Thank you for alerting me to this.

So other than deleting and reindexing the document in a single index, there is no way of addressing
the situation.

JZ


Eyal wrote:
Run a search on "Lucene ParallelReader" in Google. You'll find something
Doug Cutting wrote that I believe is what you're looking for.

Eyal 


> -----Original Message-----
> From: John Smith [mailto:john_smith9910@yahoo.com] 
> Sent: Thursday, August 11, 2005 21:12 PM
> To: java-user@lucene.apache.org
> Subject: Updating existing documents in index: Solutions 
> 
> 
> Hi all
> 
> 
> 
> This is a slightly long email. Pardon me.
> 
> 
> 
> As Lucene does not allow updating an existing document in 
> the index, the only option is to delete and reindex the 
> message. When you have too many updates, this gets a little 
> cumbersome. In our case, the actual content of the 
> document being indexed does not change, but the fields around 
> the content, such as "LastReadBy" or the Folder associated 
> with it, do change. These are all fields that have been indexed 
> as a part of the original document in the index.
> 
> 
> 
> I have been contemplating putting these "commonly changing 
> fields" into one index, allowing delete and reindex on 
> this index alone, and keeping the static data in another index. 
> DocumentID will be a stored field in both the static and the 
> dynamic index, as a way of identifying the document.
> 
> 
> 
> Static index: Contains content of document indexed and 
> documentID stored.
> 
> Dynamic index: Contains all frequently changing fields about 
> the document, indexed, and documentID stored.
> 
> 
> 
> 
> 
> Questions
> 
> 
> 
> 1. First of all, is there a better solution to this problem of 
> frequently changing fields having to be reindexed?
> 
> 
> 
> 2. Let's say I go with the two-index approach.
> 
> 
> 
> Example query: Content: "Hello world" AND Folder:Folder1 AND 
> LastReadBy: jane. If we execute this query as-is against either 
> the static or the dynamic index, it will obviously fail to get hits.
> 
> 
> 
> 
> 
> Let's say I have a way of splitting my queries such that 
> all content queries go to the static (content) index only and 
> queries on other fields go to the dynamic index, so that the 
> final result is always an AND between the dynamic index result 
> set and the static index result set. For each result set, I would 
> have to retrieve the documentID and make sure the same documentID 
> appears in both result sets for it to be a match.
> 
> In cases where the result sets from both queries are really 
> huge, then even to get the number of hits, I will 
> have to retrieve each and every document from the results in 
> order to get the documentID for comparison. Queries can get 
> really slow.
> 
> 
> 
> Has anyone faced similar problems? If so, what was your solution?
> 
> Any comments/thoughts will be appreciated.
> 
> 
> 
> Thank you
> 
> JS
> 
> 
> 
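
The two-index AND described in the quoted message above comes down to intersecting the documentID values returned by the two searches. A minimal sketch of that step follows, using plain JDK collections rather than any Lucene API. The class name DocIdJoin and the method intersect are hypothetical, and the input lists are assumed to already hold the stored documentID of every hit from each index.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: intersects the documentID values
// returned by two independent searches (static index and dynamic index).
final class DocIdJoin {
    private DocIdJoin() {}

    // Returns the IDs present in both result lists, in the order of staticHits.
    static List<String> intersect(List<String> staticHits, List<String> dynamicHits) {
        Set<String> dynamicIds = new HashSet<>(dynamicHits);
        List<String> matches = new ArrayList<>();
        for (String id : staticHits) {
            // remove() returns true only the first time an ID is seen,
            // so a duplicate ID in staticHits is reported once.
            if (dynamicIds.remove(id)) {
                matches.add(id);
            }
        }
        return matches;
    }
}
```

The hash set keeps the comparison itself linear in the number of hits, but it does not remove the cost raised in the message: the documentID of every hit in both result sets still has to be loaded before even the hit count is known.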


Daniel Naber wrote:
On Monday 10 October 2005 20:24, John Smith wrote:

> My understanding is ParallelReader works for situations where you have a
> static index and a dynamic index.

That's not correct. Quoting the documentation:

It is up to you to make sure all indexes
are created and modified the same way. For example, if you add
documents to one index, you need to add the same documents in the
same order to the other indexes. Failure to do so will result in
undefined behavior.

Regards
Daniel

-- 
http://www.danielnaber.de
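
The documentation passage quoted above can be illustrated with a toy model. This is a sketch under stated assumptions, not the Lucene API: each list stands for one index, a list position stands for a document number, and the names ParallelAlignment, joinByPosition and deleteAndReindex are hypothetical. The model also assumes that a deleted entry's slot is compacted away immediately, which is a simplification of how a real index renumbers documents.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not the Lucene API: each list stands for one index and a list
// position stands for a document number.
final class ParallelAlignment {
    private ParallelAlignment() {}

    // Pairs up entries that share the same position, the way parallel
    // indexes are combined by document number.
    static List<String> joinByPosition(List<String> staticIdx, List<String> dynamicIdx) {
        List<String> joined = new ArrayList<>();
        for (int i = 0; i < staticIdx.size(); i++) {
            joined.add(staticIdx.get(i) + "+" + dynamicIdx.get(i));
        }
        return joined;
    }

    // Models delete-and-reindex on one index only: the old entry is removed
    // and the new version is appended, so every later entry shifts down by one.
    static void deleteAndReindex(List<String> idx, int docNumber, String newValue) {
        idx.remove(docNumber); // remove(int) removes by position
        idx.add(newValue);
    }
}
```

With a static list [c1, c2, c3] and a dynamic list [f1, f2, f3], calling deleteAndReindex on position 0 of the dynamic list leaves it as [f2, f3, f1b], so joining by position now pairs c1 with f2. That mismatch is the kind of outcome the documentation's warning about adding the same documents in the same order is guarding against.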



