lucene-dev mailing list archives

From Lance Norskog <>
Subject Re: Distributed Indexing
Date Wed, 02 Feb 2011 03:52:05 GMT
Another use case is N indexers operating independently, all pulling
data from the same database. Each runs its own query to fetch the
documents covered by its policy.
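
A rough sketch of what "a separate query per indexer" could look like,
in Java. Everything named here is an assumption made for illustration:
the `documents` table, its numeric `id` column, the column list, and the
choice of `MOD(id, N)` as the policy. A real policy could just as well be
a date range or a category filter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds one SQL query per indexer so that the
// N result sets are disjoint and together cover the whole table.
final class IndexerPolicies {

    private IndexerPolicies() {}

    // Table name, column names and the MOD-based policy are assumptions,
    // not anything taken from Solr or from this thread.
    static List<String> queriesFor(int indexerCount) {
        if (indexerCount <= 0) {
            throw new IllegalArgumentException("indexerCount must be positive");
        }
        List<String> queries = new ArrayList<>();
        for (int k = 0; k < indexerCount; k++) {
            // Indexer k pulls only the rows whose id leaves remainder k.
            queries.add("SELECT id, title, body FROM documents WHERE MOD(id, "
                    + indexerCount + ") = " + k);
        }
        return queries;
    }
}
```

Indexer k would run `IndexerPolicies.queriesFor(n).get(k)` against the
shared database and feed the rows to its own index, with no coordination
between indexers beyond agreeing on n.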

On Tue, Feb 1, 2011 at 12:38 PM, Upayavira <> wrote:
> On Tue, 01 Feb 2011 19:04 +0000, "Alex Cowell" <> wrote:
> I noticed there is a comment in the
> org.apache.solr.servlet.DirectSolrConnection class which reads, "//Find a
> way to turn List<ContentStream> into File/SolrDocument". Did anyone find a
> way to do this?
>
> Turns out that comment was left over from some experimenting one of our team
> was doing. But I suppose the question still stands.
>
> Addressing the "retrieve the unique ID from the document" issue, does it
> matter if the unique ID you do the hash on is the actual uniqueKey of the
> document? Surely as long as you generate some value unique for each document
> to index (for example, the name of the doc/stream + the current time) it
> would still distribute the documents as we expect?
>
> Well, one requirement I've heard for this is for it to be deterministic.
> That is, a document will always go to the same shard, and you can work out
> at any point in time where a particular document is.
>
> Once you've parsed the document to a SolrInputDocument, surely you can get
> the ID/uniqueKey out? I'll do some digging tomorrow AM.
> Upayavira
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
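
The determinism requirement in the quoted message can be illustrated
with a minimal sketch: hash the document's uniqueKey and take it modulo
the shard count. The class name `ShardRouter`, the use of CRC32, and the
fixed shard count are assumptions for illustration only; this is not how
Solr itself routes documents.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical router: maps a document's uniqueKey to a shard index.
// The same key always yields the same shard, on any JVM, at any time.
final class ShardRouter {

    private final int shardCount;

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shardCount = shardCount;
    }

    int shardFor(String uniqueKey) {
        CRC32 crc = new CRC32();
        // Fixed charset, so the bytes do not depend on platform defaults.
        crc.update(uniqueKey.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is in [0, shardCount).
        return (int) (crc.getValue() % shardCount);
    }
}
```

Hashing a value such as "stream name + current time" would still spread
documents across shards, but the shard could not be recomputed later from
the document alone, which is the property asked for above. Note that
changing the shard count changes where existing keys map under this
scheme.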

Lance Norskog
