lucene-dev mailing list archives

From "Gerrit Jansen van Vuuren (Issue Comment Edited) (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (LUCENE-3653) Lucene Search not scalling
Date Sun, 18 Dec 2011 01:27:30 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171742#comment-13171742
] 

Gerrit Jansen van Vuuren edited comment on LUCENE-3653 at 12/18/11 1:25 AM:
----------------------------------------------------------------------------

Thanks, I'll keep an eye on the things happening in the trunk.

Creating a single Tokenizer does help, but threads still block because of the
synchronization used in several classes.

I've made some quick changes over the last few days on lucene-3.5 and have attached the
diff (sorry, this is not an svn diff).

I agree: if anybody has to decide between concurrency and storing things twice, concurrency
wins. Eventually all the cached data will be available to all threads and the overhead goes
away, but with synchronization the overhead never goes away.
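
To illustrate the trade-off, here is a minimal sketch of a cache that accepts occasional
duplicate work instead of locking. It is not the attached diff and not a Lucene class:
RacyCache and its loader are hypothetical names, and it assumes the loader never returns
null and that computing a value twice is harmless.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical cache for illustration only; not part of Lucene. */
final class RacyCache<K, V> {
    private final ConcurrentMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed to be side-effect free and never return null

    RacyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // Hot path: once the value is cached, readers do not take a lock.
        V cached = map.get(key);
        if (cached != null) {
            return cached;
        }
        // Cold path: several threads may compute the same value concurrently
        // (the "storing things twice" cost), but only one result is kept.
        V computed = loader.apply(key);
        V winner = map.putIfAbsent(key, computed);
        return winner != null ? winner : computed;
    }
}
```

Once every key has been loaded, all threads take the read-only path, which is the sense in
which the duplicate-work overhead goes away while a synchronized method keeps paying on
every call.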

Some other points of contention are:
 RAMFile: all methods are synchronized.
 RAMInputStream: clone()
             This method came up a lot during profiling. I changed it from calling clone
to creating a new instance directly.
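
As a rough sketch of the clone() change described above: the class below is a hypothetical
stand-in, not Lucene's RAMInputStream, and the names BufferInput and copy() are made up
for illustration. Whether direct construction is actually cheaper depends on the JVM and
the class; that it helped here is based only on the profiling mentioned above.

```java
/** Hypothetical stand-in for a RAM-backed input; not Lucene's RAMInputStream. */
final class BufferInput implements Cloneable {
    private final byte[] data; // shared between copies, never modified
    private int position;      // per-instance read position

    BufferInput(byte[] data) {
        this(data, 0);
    }

    private BufferInput(byte[] data, int position) {
        this.data = data;
        this.position = position;
    }

    byte readByte() {
        return data[position++];
    }

    /** Original style: copy the instance through Object.clone(). */
    @Override
    public BufferInput clone() {
        try {
            return (BufferInput) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }

    /** Alternative: build the copy directly through a constructor. */
    BufferInput copy() {
        return new BufferInput(data, position);
    }
}
```

Both methods return an independent reader over the same underlying bytes, starting at the
current position of the source instance.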

 
 
I'll try to clean up some of the code and attach a better diff.


                
                  
> Lucene Search not scalling
> --------------------------
>
>                 Key: LUCENE-3653
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3653
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Gerrit Jansen van Vuuren
>         Attachments: App.java, LUCENE-3653-VirtualMethod+AttributeSource.patch, LUCENE-3653-VirtualMethod+AttributeSource.patch,
lucene-unsync.diff, profile_1_a.png, profile_1_b.png, profile_1_c.png, profile_1_d.png, profile_2_a.png,
profile_2_b.png, profile_2_c.png
>
>
> I've noticed that when doing thousands of searches in a single thread the average time
is quite low, i.e. a few milliseconds. When adding more concurrent searches doing exactly the
same search, the average time increases drastically.
> I've profiled the search classes and found that the whole of Lucene blocks on:
> org.apache.lucene.index.SegmentCoreReaders.getTermsReader
> org.apache.lucene.util.VirtualMethod
>   public synchronized int getImplementationDistance
> org.apache.lucene.util.AttributeSource.getAttributeInterfaces
> These cause search times to increase from a few milliseconds to up to 2 seconds when
doing 500 concurrent searches on the same in-memory index. Note that the index is not being
updated at all, so no refresh methods are called at any stage.
> Some questions:
>   Why do we need synchronization here?
>   There must be a non-locking solution for these; they basically cause Lucene to be
OK for single-threaded applications but disastrous for any concurrent implementation.
> I'll do some experiments by removing the synchronization from the methods of these classes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

