lucene-dev mailing list archives

From "Upayavira" ...@odoko.co.uk>
Subject Re: Distributed Indexing
Date Wed, 02 Feb 2011 08:31:07 GMT


On Tue, 01 Feb 2011 19:52 -0800, "Lance Norskog" <goksron@gmail.com>
wrote:
> Another use case is that N indexers operate independently, all pulling
> data from the same database. Each has a separate query to get the
> documents in its policy.

But surely in this case, you are externalising the policy, and Solr
doesn't need to know about it? I.e. your indexers are deciding what goes
in what shard, not Solr?
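
To make the "externalised policy" idea concrete, here is a minimal sketch of
a client-side policy that an indexer could apply before sending a document
anywhere. It hashes the document's unique ID, so it also meets the
deterministic requirement from the quoted discussion below. The class name
ShardPolicy, the method shardFor, and the shard count are assumptions made
for illustration; none of this is Solr API.

```java
/** Hypothetical client-side shard policy; not part of Solr. */
final class ShardPolicy {
    private ShardPolicy() {}

    /**
     * Maps a document's unique ID to a shard index in [0, numShards).
     * String.hashCode() is specified in the Java API documentation, so the
     * same ID yields the same value on every JVM and every run.
     */
    static int shardFor(String uniqueId, int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(uniqueId.hashCode(), numShards);
    }

    public static void main(String[] args) {
        int numShards = 3; // assumed shard count, for illustration only
        String[] ids = {"doc-1", "doc-2", "doc-3"};
        for (String id : ids) {
            System.out.println(id + " -> shard " + shardFor(id, numShards));
        }
    }
}
```

Because the shard depends only on the ID and the shard count, anyone can
recompute where a document lives. The trade-off is that changing the shard
count remaps most documents, and an ID built from something like the current
time would not be reproducible later.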

Upayavira

> On Tue, Feb 1, 2011 at 12:38 PM, Upayavira <uv@odoko.co.uk> wrote:
> >
> > On Tue, 01 Feb 2011 19:04 +0000, "Alex Cowell" <alxcwll@gmail.com> wrote:
> >
> > > I noticed there is a comment in the
> > > org.apache.solr.servlet.DirectSolrConnection class which reads, "//Find a
> > > way to turn List<ContentStream> into File/SolrDocument". Did anyone find a
> > > way to do this?
> > >
> > > Turns out that comment was left over from some experimenting one of our team
> > > was doing. But I suppose the question still stands.
> > >
> > > Addressing the "retrieve the unique ID from the document" issue, does it
> > > matter if the unique ID you do the hash on is the actual uniqueKey of the
> > > document? Surely as long as you generate some value unique for each document
> > > to index (for example, the name of the doc/stream + the current time) it
> > > would still distribute the documents as we expect?
> >
> >
> > Well, one requirement I've heard for this is for it to be deterministic.
> > That is, a document will always go to the same shard, and you can work out
> > at any point in time where a particular document is.
> >
> > Once you've parsed the document to a SolrInputDocument, surely you can get
> > the ID/uniqueKey out? I'll do some digging tomorrow AM.
> >
> > Upayavira
> >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
> 
> 
> 
> -- 
> Lance Norskog
> goksron@gmail.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source
