lucene-java-user mailing list archives

From Ian Lea <ian....@gmail.com>
Subject Re: Performance question
Date Thu, 14 Jul 2011 11:50:35 GMT
Searching billions of anything is likely to be challenging. Mark
Miller's document at
http://www.lucidimagination.com/content/scaling-lucene-and-solr looks
well worth a read.

> -if i search on last week's index and the individual index (this needs to be
> opened at search request!?) will it be faster than using a single huge index
> for all groups, for all weeks?

Too many variables to say.


> -is IndexSearcher searcher = new
> IndexSearcher(IndexReader.open(writer, false)); read only?

Surely searchers are read only, by definition.


> -How can I give near-real-time access to an IndexWriter started in another
> application?

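Sounds impossible. A near-real-time reader is opened from the
IndexWriter instance itself, so it has to live in the same JVM as the
writer.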


> -How can I store all documents from the results? Something like AllDocs
> (equivalent to TopDocs) of an AllDocsCollector (equivalent to
> TopDocsCollector).

Not clear what you are asking here, but you can pass whatever you like
as the max doc count to the assorted search methods, and do whatever
you want with the results.  Storing all docs from search results on a
massive index doesn't sound a very clever idea.
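
If you really do need every hit, the usual approach is a custom
collector rather than a huge max doc count. Lucene hands a collector
document IDs relative to the current segment, plus a base offset for
that segment, so the collector has to add the two. The following is
only a plain-Java sketch of that bookkeeping: AllDocsSketch,
setNextSegment and docIds are hypothetical names for illustration, not
Lucene's API, and it keeps every ID in memory, which is exactly what
will hurt on a massive index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Lucene-style collector; NOT Lucene's API.
final class AllDocsSketch {
    private final List<Integer> docIds = new ArrayList<>();
    private int docBase = 0;

    // Called when the search moves to a new segment; docBase is the
    // offset of that segment's first document within the whole index.
    void setNextSegment(int docBase) {
        this.docBase = docBase;
    }

    // Called once per matching document with a segment-relative ID;
    // stores the index-wide ID.
    void collect(int segmentDocId) {
        docIds.add(docBase + segmentDocId);
    }

    List<Integer> docIds() {
        return docIds;
    }

    public static void main(String[] args) {
        AllDocsSketch c = new AllDocsSketch();
        c.setNextSegment(0);
        c.collect(3);
        c.collect(7);
        c.setNextSegment(100);
        c.collect(0);
        c.collect(42);
        System.out.println(c.docIds()); // [3, 7, 100, 142]
    }
}
```

Storing only the integer IDs, and loading stored fields later for the
few documents you actually display, keeps the memory cost per hit
small.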


> I understood that Twitter submitted their code on real-time architecture to
> Lucene, can I get my hands on that?

No idea.


--
Ian.


On Wed, Jul 13, 2011 at 10:09 AM, Mihai Caraman <caraman.mihai@gmail.com> wrote:
> Hello,
>
> My name is Mihai and I'm trying to write a java (later I'll need to port it
> to pylucene) search on billions of mentions like twitter statuses. Mentions
> are grouped by some containing keywords.
>
> I'm thinking of partitioning the index for faster results as follows:
>
>                              common index for the past week
>
> common index for earlier small groups  |  individual indexes for very large
> groups
>
> My questions are:
>
> -if i search on last week's index and the individual index (this needs to be
> opened at search request!?) will it be faster than using a single huge index
> for all groups, for all weeks?
> -is IndexSearcher searcher = new
> IndexSearcher(IndexReader.open(writer, false)); read only? if not, how can I
> build numerous near-real-time readers on the same writer (index)?
> -How can I give near-real-time access to an IndexWriter started in another
> application?
> -How can I store all documents from the results? Something like AllDocs
> (equivalent to TopDocs) of an AllDocsCollector (equivalent to
> TopDocsCollector).
>
> I understood that Twitter submitted their code on real-time architecture to
> Lucene, can I get my hands on that?
>
> Thank you in advance,
> Mihai
>
