lucene-java-user mailing list archives

From Mihai Caraman <caraman.mi...@gmail.com>
Subject Performance question
Date Wed, 13 Jul 2011 09:09:54 GMT
Hello,

My name is Mihai, and I'm trying to write a Java search (which I'll later need
to port to PyLucene) over billions of mentions, such as Twitter statuses.
Mentions are grouped by the keywords they contain.

I'm thinking of partitioning the index for faster results as follows:

              common index for the past week

common index for earlier,   |   individual indexes for
small groups                |   very large groups
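
To make the layout concrete, here is a minimal sketch of how a query could be
routed under this scheme. It is only an illustration: the class IndexRouter,
the index names, and the seven-day cutoff are assumptions made for the example,
and no Lucene calls are involved.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper: decides which partitions a query has to touch.
class IndexRouter {
    // Assumed index names, chosen for illustration only.
    public static final String RECENT = "recent-week";
    public static final String SMALL_GROUPS = "older-small-groups";

    private final Set<String> largeGroups;

    public IndexRouter(Set<String> largeGroups) {
        this.largeGroups = largeGroups;
    }

    // Returns the indexes to search for mentions of `group`
    // in the date range from `from` up to `today`.
    public List<String> indexesFor(String group, LocalDate from, LocalDate today) {
        List<String> targets = new ArrayList<>();
        // The range ends today, so the common recent index is always needed.
        targets.add(RECENT);
        LocalDate weekStart = today.minusDays(7);
        if (from.isBefore(weekStart)) {
            // Older data lives either in the group's own index (very large
            // groups) or in the common index for small groups.
            targets.add(largeGroups.contains(group) ? "group-" + group : SMALL_GROUPS);
        }
        return targets;
    }
}
```

With this layout, a query that stays within the past week touches one index,
and any other query touches exactly two.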

My questions are:

- If I search last week's index plus the individual index (which would need to
be opened at search request time!?), will it be faster than using a single
huge index for all groups and all weeks?
- Is IndexSearcher searcher = new IndexSearcher(IndexReader.open(writer, false));
read-only? If not, how can I build numerous near-real-time readers on the same
writer (index)?
- How can I give near-real-time access to an IndexWriter started in another
application?
- How can I store all documents from the results? Something like an AllDocs
(equivalent to TopDocs) from an AllDocsCollector (equivalent to
TopDocsCollector).

I understand that Twitter submitted the code for their real-time architecture
to Lucene. Can I get my hands on that?

Thank you in advance,
Mihai
