lucene-dev mailing list archives

From "Gerrit Jansen van Vuuren (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-3653) Lucene Search not scalling
Date Sun, 18 Dec 2011 02:43:31 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171760#comment-13171760 ]

Gerrit Jansen van Vuuren edited comment on LUCENE-3653 at 12/18/11 2:43 AM:
----------------------------------------------------------------------------

|There is contention, but it will not slow down your search. Please keep synchronization there.
|Every RAMFile is only opened once and then contention is gone.

I'm not clear on this one: do you mean every RAMFile is opened once per search, or will it be
reused across all searches? If the latter, all threads will always block on RAMFile. That is
not minor but major, especially taking into account that I want to move from 200 concurrent
requests to 8000.

|Not everything your profiler shows as contention is contention; only the first query will have
|some minor contention.

:) Yes indeed. That's why I've run several tests with warmups and tried to pick out the hot
spots based on call frequency, total time spent, and total time a thread was blocked on a
given monitor.
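
For anyone who wants to reproduce the "time blocked on a monitor" numbers without a
profiler, here is a minimal sketch using the JDK's ThreadMXBean contention monitoring.
The class BlockedTimeProbe, the shared LOCK and fakeSearch() are hypothetical stand-ins
and not Lucene code; fakeSearch() only imitates a search that passes through one
synchronized section.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class BlockedTimeProbe {
    private static final Object LOCK = new Object();

    // Hypothetical stand-in for a search that passes through a synchronized section.
    static void fakeSearch() {
        synchronized (LOCK) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Returns the summed blocked time (ms) of all workers, or -1 if the JVM cannot measure it.
    static long totalBlockedMillis(int threads, int searchesPerThread) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (!mx.isThreadContentionMonitoringSupported()) {
            return -1;
        }
        mx.setThreadContentionMonitoringEnabled(true);
        long[] blocked = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int idx = i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < searchesPerThread; n++) {
                    fakeSearch();
                }
                // Read this thread's own statistics while it is still alive.
                ThreadInfo info = mx.getThreadInfo(Thread.currentThread().getId());
                blocked[idx] = (info == null) ? 0 : Math.max(0, info.getBlockedTime());
            });
            workers[i].start();
        }
        long total = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            total += blocked[i];
        }
        return total;
    }
}
```

Calling BlockedTimeProbe.totalBlockedMillis(200, 100) after a warmup run gives a rough
total of how long the workers waited to enter the monitor. It says nothing about which
monitor was responsible, so it complements a profiler rather than replacing one.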

|On the time-line of your profiler output I see no improvement in speed. How much faster does
your code get?

I've not profiled for individual search speed, but rather for contention. I'm doing the tests
on my laptop first and will move these changes to a production system next week to test
total search time under load.

With the current version, i.e. with the synchronization in place, search times go up
drastically, as I've mentioned above. On our production deployment and indexes, search times
go from 3ms for single requests to 200-500ms and more on average when 200 concurrent requests
are made.
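
A comparison like the one above (single request versus 200 concurrent requests) can be
approximated with a small harness such as the sketch below. The class LatencyUnderLoad
and its averageMillis method are illustrative names, and the Runnable passed in is a
placeholder for whatever search call is being measured; none of this comes from Lucene
or from the attached App.java.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LatencyUnderLoad {
    // Runs `search` requestsPerThread times on each of `concurrency` threads
    // and returns the average latency of a single call in milliseconds.
    static double averageMillis(int concurrency, int requestsPerThread, Runnable search) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<Long>> results = new ArrayList<>();
        for (int t = 0; t < concurrency; t++) {
            results.add(pool.submit(() -> {
                start.await(); // release all workers at the same moment
                long total = 0;
                for (int n = 0; n < requestsPerThread; n++) {
                    long t0 = System.nanoTime();
                    search.run();
                    total += System.nanoTime() - t0;
                }
                return total;
            }));
        }
        start.countDown();
        long totalNanos = 0;
        try {
            for (Future<Long> f : results) {
                totalNanos += f.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return totalNanos / 1e6 / ((long) concurrency * requestsPerThread);
    }
}
```

Running averageMillis(1, 1000, search) and then averageMillis(200, 1000, search) against
the same warmed-up index shows how much per-request latency grows with concurrency. Some
growth is expected once threads outnumber cores, so the interesting comparison is the
same run with and without the synchronization.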

 
                
> Lucene Search not scalling
> --------------------------
>
>                 Key: LUCENE-3653
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3653
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Gerrit Jansen van Vuuren
>         Attachments: App.java, LUCENE-3653-VirtualMethod+AttributeSource.patch, LUCENE-3653-VirtualMethod+AttributeSource.patch,
>                      lucene-unsync.diff, profile_1_a.png, profile_1_b.png, profile_1_c.png, profile_1_d.png, profile_2_a.png,
>                      profile_2_b.png, profile_2_c.png
>
>
> I've noticed that when doing thousands of searches in a single thread, the average time
> is quite low, i.e. a few milliseconds. When adding more concurrent searches doing exactly the
> same search, the average time increases drastically.
> I've profiled the search classes and found that the whole of Lucene blocks on:
> org.apache.lucene.index.SegmentCoreReaders.getTermsReader
> org.apache.lucene.util.VirtualMethod
>   public synchronized int getImplementationDistance
> org.apache.lucene.util.AttributeSource.getAttributeInterfaces
> These cause search times to increase from a few milliseconds to up to 2 seconds when
> doing 500 concurrent searches on the same in-memory index. Note that the index is not being
> updated at all, so no refresh methods are called at any stage.
> Some questions:
>   Why do we need synchronization here?
>   There must be a non-locking solution for these; they basically cause Lucene to be
>   OK for single-threaded applications but disastrous for any concurrent implementation.
> I'll do some experiments by removing the synchronization from the methods of these classes.
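
As a general illustration of what a non-locking alternative to a synchronized lookup can
look like, the sketch below memoizes a per-class value in a ConcurrentHashMap. It is not
the code in VirtualMethod or AttributeSource and not what the attached patches do;
DistanceCache and computeDistance are hypothetical names, and computeIfAbsent assumes
Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class DistanceCache {
    private final ConcurrentMap<Class<?>, Integer> cache = new ConcurrentHashMap<>();

    // Hypothetical placeholder for an expensive per-class computation:
    // counts the classes in the superclass chain, including the class itself.
    static int computeDistance(Class<?> clazz) {
        int depth = 0;
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            depth++;
        }
        return depth;
    }

    // Hot path: a plain get() that normally does not block other readers.
    // Only the first lookup for a given class pays for the computation.
    int distance(Class<?> clazz) {
        Integer cached = cache.get(clazz);
        if (cached != null) {
            return cached;
        }
        return cache.computeIfAbsent(clazz, DistanceCache::computeDistance);
    }
}
```

With this shape, repeated calls such as distance(Integer.class) from many threads read
the cached value without queuing on a single monitor. Whether that is safe for a given
Lucene method depends on what the synchronization there actually protects.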
