lucene-dev mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: Distributed Indexing
Date Wed, 02 Feb 2011 03:52:05 GMT
Another use case is that N indexers operate independently, all pulling
data from the same database. Each has a separate query to get the
documents covered by its policy.
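
A minimal sketch of what "a separate query per indexer" could look like,
assuming the policy is a simple modulo split on a numeric primary key. The
table name (documents), the column names, and the modulo policy are all
hypothetical; nothing in this thread specifies a schema, and MOD() syntax
varies between databases.

```java
import java.util.ArrayList;
import java.util.List;

class IndexerQueries {
    // Builds the SELECT for one indexer out of totalIndexers.
    // Assumption: "documents", "id", "title" and "body" are made-up names.
    static String queryFor(int indexer, int totalIndexers) {
        if (totalIndexers <= 0 || indexer < 0 || indexer >= totalIndexers) {
            throw new IllegalArgumentException("indexer must be in [0, totalIndexers)");
        }
        // Each row matches exactly one indexer, so the slices do not overlap.
        return "SELECT id, title, body FROM documents WHERE MOD(id, "
                + totalIndexers + ") = " + indexer;
    }

    // Convenience: the full set of queries, one per indexer.
    static List<String> allQueries(int totalIndexers) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < totalIndexers; i++) {
            queries.add(queryFor(i, totalIndexers));
        }
        return queries;
    }

    public static void main(String[] args) {
        for (String q : allQueries(3)) {
            System.out.println(q);
        }
    }
}
```

Any other disjoint predicate (date range, source system, customer) would
serve equally well as the policy; the point is only that the indexers need
no coordination as long as their queries do not overlap.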

On Tue, Feb 1, 2011 at 12:38 PM, Upayavira <uv@odoko.co.uk> wrote:
>
> On Tue, 01 Feb 2011 19:04 +0000, "Alex Cowell" <alxcwll@gmail.com> wrote:
>
> I noticed there is a comment in the
> org.apache.solr.servlet.DirectSolrConnection class which reads, "//Find a
> way to turn List<ContentStream> into File/SolrDocument". Did anyone find a
> way to do this?
>
> Turns out that comment was left over from some experimenting one of our team
> was doing. But I suppose the question still stands.
>
> Addressing the "retrieve the unique ID from the document" issue, does it
> matter if the unique ID you do the hash on is the actual uniqueKey of the
> document? Surely as long as you generate some value unique for each document
> to index (for example, the name of the doc/stream + the current time) it
> would still distribute the documents as we expect?
>
>
> Well, one requirement I've heard for this is for it to be deterministic.
> That is, a document will always go to the same shard, and you can work out
> at any point in time where a particular document is.
>
> Once you've parsed the document to a SolrInputDocument, surely you can get
> the ID/uniqueKey out? I'll do some digging tomorrow AM.
>
> Upayavira
>
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
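
On the determinism requirement quoted above, here is a rough sketch of
hash-based routing, assuming the uniqueKey is already available as a
string. The class and method names are hypothetical and CRC32 is just a
convenient stable hash from the JDK; this is not what Solr does
internally. It also shows why a key built from the current time would not
work: the same document would hash to a different shard on each run.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class ShardRouter {
    // Maps a document's uniqueKey to a shard index in [0, numShards).
    // The same key and shard count always give the same shard.
    static int shardFor(String uniqueKey, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        CRC32 crc = new CRC32();
        // Fixed charset so the result does not depend on platform defaults.
        crc.update(uniqueKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is an unsigned 32-bit value held in a long, so the
        // remainder is never negative.
        return (int) (crc.getValue() % numShards);
    }

    public static void main(String[] args) {
        String key = "doc-42"; // hypothetical uniqueKey
        System.out.println(key + " -> shard " + shardFor(key, 4));
    }
}
```

Note that the mapping only stays stable while the number of shards stays
the same; changing numShards moves most documents.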



-- 
Lance Norskog
goksron@gmail.com

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

