lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Scott <m.scott.ti...@gmail.com>
Subject Re: Searching documents on big index by using ParallelMultiSearcher is slow...
Date Wed, 04 Oct 2006 12:59:26 GMT
My index increases periodically.

Now 1 sec for 10G indexes.
I am worried that futurely, how about response time
  for 20G, 30G,,, and 50G indexes?

I'll try remote Hits (result set) object and the SearchMaster merges
  top N of them.

Thank you.

Erick Erickson wrote:
> OK, you're now officially beyond my competence, so I'll have to wait for
> people who actually know <G>....
> 
> Although if I read your stats right, you're getting approximately 1 sec
> response time over 10M documents on a 10G index. That's not bad at all. 
> What
> kind of response time do you need?
> 
> On 10/3/06, Scott <m.scott.tiger@gmail.com> wrote:
>>
>> Hi,
>>
>> > Well, the first question is always "are you opening/closing your
>> > IndexSearchers for each request on your remote machines?". This is
>> always a
>> > no-no. This is also a question for your single-searcher version.
>>
>> Yes I know, each search slave (RMI server) have single instance
>>   of IndexSearcher and it's open once when RMI server starts.
>>
>> > What is your performance if you only go to one server? I'd start by
>> finding
>>
>> A performance on one server with FULL index (not divided by 10)
>>   is about 2500 ms.
>> On one server with splitted index (divided by 10) is about 50 ms.
>>
>> And on ParallelMultiSearcher with 10 of remote searchable,
>>   each RemoteSearchable returns in about 50 - 100 ms,
>>   and ParallelMultiSearcher returns also 50 - 100 ms, because of
>>   threading.
>> but Hits Searcher.search(Query, Sort) responds in about 500 - 1000 ms.
>>
>> I think that Searcher.search with Sort reads all of SortFields from
>>   IndexReader and it's bottleneck.
>>
>> Are there results of high performance distributed Lucene with
>> ParallelMultiSearcher?
>> Or need hadoop?
>>
>> Erick Erickson wrote:
>> > Well, the first question is always "are you opening/closing your
>> > IndexSearchers for each request on your remote machines?". This is
>> always a
>> > no-no. This is also a question for your single-searcher version.
>> >
>> > What is your performance if you only go to one server? I'd start by
>> finding
>> > out what happens when you forget all the ParallelMultiSearcher stuff,
>> all
>> > the RMI stuff etc, and just see what your performance is on one of your
>> > index parts locally. Once that is answered, extend to RMI, then the
>> > Parallel...., at each step seeing if your performance degrades
>> > unacceptably.
>> > That'll at least give you a clue what part of the process is the 
>> biggest
>> > problem.
>> >
>> > And without knowing a LOT more about your searches, and your index, 
>> it's
>> > kind of hard to come up with solutions <G>....
>> >
>> > Best
>> > Erick
>> >
>> > On 10/3/06, Scott <m.scott.tiger@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> I have a question about ParallelMultiSearcher performance.
>> >>
>> >> I want to search documents on about 10 gigabytes of index.
>> >> (The index has 10,000,000 documents.)
>> >>
>> >> I get very slow performance using IndexSearcher with ONE index
>> normally.
>> >> Then I tried to use ParallelMultiSearcher with 10 servers of remote
>> >> searchable.
>> >>
>> >> Index:
>> >> Each search slaves have 1/10 of index.
>> >> (ONE index divided to 10 servers.)
>> >>
>> >> Search slave:
>> >> Each search slaves start remote searchable RMI server,
>> >> and wait connecting from search master.
>> >>
>> >> Search master:
>> >> The search master use Naming.lookup() to get remote searchable.
>> >> Get 10 remote searchables from each search slaves and build
>> >> ParallelMultiSearcher.
>> >> Then search.
>> >>
>> >> Any solution?
>> >>
>> >> --
>> >> Scott
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >> For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>
>> >>
>> >
>>
>> -- 
>> Scott
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 

-- 
Scott

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message