Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-dev@lucene.apache.org
Message-ID: <12163273.1171685646735.JavaMail.jira@brutus>
Date: Fri, 16 Feb 2007 20:14:06 -0800 (PST)
From: "Paul Cowan (JIRA)" <jira@apache.org>
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-806) Synchronization bottleneck in
 FieldSortedHitQueue with many concurrent readers
In-Reply-To: <9797706.1171683305520.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


     [ https://issues.apache.org/jira/browse/LUCENE-806?page=3Dcom.atlassia=
n.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Cowan updated LUCENE-806:
------------------------------

    Attachment: lucene-806.patch

Just to clarify, the specific issue here is that RuleBasedCollator (the onl=
y concrete implementation of Collator in the JDK, and the one that is alway=
s returned by Collator.getInstance(Locale)) has a synchronized compare(), m=
eaning that if you have many threads building FieldSortedHitQueues with lar=
ge results and locale-sensitive sorting, and they share a Collator, they en=
d up waiting for each other in that method (which can obviously be called t=
ens of thousands of times during a very large search). The way to get a Col=
lator is by calling Collator.getInstance(Locale), which makes it look like =
this problem is the JDK's fault; however, Collator.getInstance(Locale) actu=
ally returns a clone() of the object from the cache. The caching mechanism =
seems to be to prevent having to rebuld the rule tables, rather than the ob=
jects themselves. Therefore, the JDK version balances performance and threa=
d safety quite well.=20

On the Lucene end, though, the FieldSortedHitQueue implements its own cachi=
ng mechanism, meaning that the generated ScoreDocComparators are cached (wi=
th no way to disable this behaviour, even if you wanted to). Therefore, one=
 you've got your comparator (the unique key being {reader, fieldname, type,=
 locale, factory}), every search using sorting on the same field in the sam=
e way on the same reader will use the same Collator, possibly causing a syn=
chronization bottleneck. Even providong your own factory to the SortField d=
oesn't REALLY help, as they're cached one level 'above' that (you can work =
around this; see below)

Attached is a patch which provides a 'quick and dirty' way of dealing with =
this. NOTE: THIS PATCH IS NOT PRODUCTION QUALITY, it's just a proof of conc=
ept. If people like the idea, I'll tidy it up substantially.

This works by adding a flag, usePerThreadLocaleComparators, set by a static=
 method, to FieldSortedHitQueue. If this flag is NOT set, behaviour remains=
 the same. If it's set to true, however, createValue calls a per-thread ver=
sion of comparatorStringLocale, which returns a simple wrapper ScoreDocComp=
arator which delegates to the one created by comparatorStringLocale, using =
a ThreadLocal to make sure they're not shared between threads.

For demonstration purposes, I have added a quick demo main() method to Fiel=
dSortedHitQueue, which does a simple timing test -- 20 threads each inserti=
ng 20000 dummy documents into a 200-element FieldSortedHitQueue. Note that =
it uses CountDownLatches to coordinate the threads, so this dummy test will=
 only run under Java 5. Sorry, but as a quick demo it will do for now. By c=
hanging the values of
    final int threadCount =3D 20;
    final int docCount =3D 10000;
    final int queueSize =3D 200;
you can change the parameters I mentioned above. However, the figures seem =
to roughly the same proportion regardless of how high or low those numbers =
are, within reason; the parameters provided are  enough to spend a LOT of t=
ime waiting for locks; making them higher doesn't really make that much dif=
ference. If anything, making the queuesize larger makes the new version of =
the code (with the flag set) look better in comparison. On my dev machine (=
1.8 GHz Celeron laptop) the test as-is gives the following figures:
   Shared=3D5219ms
   PerThread=3D2140ms
this is a pretty substantial difference, and (I think) makes it worth pursu=
ing this further. Your mileage may vary, but between 2x-4x faster seems typ=
ical for anywhere above, say, 5 threads, 1000 docs, and queue size of 50.

So if people are happy for me to proceed down this path, I'm equally happy =
to tidy up and produce a cleaner, documented etc. version of this patch. Ho=
wever, the more I look at this, the more I'd like to refactor this code -- =
it's not the nicest code in Lucene, and I think it could be tidied up (pers=
onally). My proposal would be something along the lines of changing all tho=
se static methods in FieldSortedHitQueue (comparatorXXXX) to be implementat=
ions of SortComparatorSource. There'd be a StringComparatorSource, AutoComp=
aratorSource, etc. Everything in FSHQ would be made to deal with the SortCo=
mparatorSources only, abstracting out all the hard work. The logic of the c=
reate() method  could be replaced by a PerFieldComparatorSource, which prod=
uces one or more of the others depending on the field type, much as it does=
 now. Everything else could be implemented (possibly) using the Decorator p=
attern to implement new SortComparatorSources. Namely, a CachingComparatorS=
ource would use some sort of caching mechanism (possibly the FieldCacheImpl=
.Cache, as it does now, though that seems like an odd coupling) to cache th=
e SortComparatorSources produced by the PerFieldComparatorSource; then we'r=
e basically back where we are now. Along comes the PerThreadComparatorSourc=
e, which uses ThreadLocals to do basically what my patch above does. All th=
ese classes would be available externally, for people to wrap around their =
own SortComparatorSources when setting up their SortFields; if no factory i=
s provided in the SortField, things work much as they do now.

What do people think? Is the quick and dirty way (a) worthwhile, and (b) go=
od enough? Should I look at implementing the bigger, fancier patch which wi=
ll be more work and more complicated but ultimately (I think) make FieldSor=
tedHitQueues much cleaner? Or is there another alternative (for example, an=
other low-impact option would be doing something like what  I've done now, =
but instead using a ThreadLocal in a Comparator implementation, which could=
 mean that the API for FieldSortedHitQueue doesn't need to change at all). =
Or is this not worth making part of the source tree, given that there are w=
ays around it (supplying your own SortComparatorSource which manages its ow=
n ThreadLocals).  The performance gain IS substantial, though...

Apologies for length, this is quite a confusing area and I wanted to make s=
ure I hadn't forgotten anything.

> Synchronization bottleneck in FieldSortedHitQueue with many concurrent re=
aders
> -------------------------------------------------------------------------=
-----
>
>                 Key: LUCENE-806
>                 URL: https://issues.apache.org/jira/browse/LUCENE-806
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.0
>            Reporter: Paul Cowan
>            Priority: Minor
>         Attachments: lucene-806.patch
>
>
> The below is from a post by (my colleague) Paul Smith to the java-users l=
ist:
> ---
> Hi ho peoples.
> We have an application that is internationalized, and stores data from ma=
ny languages (each project has it's own index, mostly aligned with a single=
 language, maybe 2).
> Anyway, I've noticed during some thread dumps diagnosing some performance=
 issues, that there appears to be a _potential_ synchronization bottleneck =
using Locale-based sorting of Strings.  I don't think this problem is the r=
oot cause of our performance problem, but I thought I'd mention it here.  H=
ere's the stack dump of a thread waiting:
> "http-1001-Processor245" daemon prio=3D1 tid=3D0x31434da0 nid=3D0x3744 wa=
iting for monitor entry [0x2cd44000..0x2cd45f30]
>         at java.text.RuleBasedCollator.compare(RuleBasedCollator.java)
>         - waiting to lock <0x6b1e8c68> (a java.text.RuleBasedCollator)
>         at org.apache.lucene.search.FieldSortedHitQueue$4.compare(FieldSo=
rtedHitQueue.java:320)
>         at org.apache.lucene.search.FieldSortedHitQueue.lessThan(FieldSor=
tedHitQueue.java:114)
>         at org.apache.lucene.util.PriorityQueue.upHeap(PriorityQueue.java=
:120)
>         at org.apache.lucene.util.PriorityQueue.put(PriorityQueue.java:47=
)
>         at org.apache.lucene.util.PriorityQueue.insert(PriorityQueue.java=
:58)
>         at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSorte=
dHitQueue.java:90)
>         at org.apache.lucene.search.FieldSortedHitQueue.insert(FieldSorte=
dHitQueue.java:97)
>         at org.apache.lucene.search.TopFieldDocCollector.collect(TopField=
DocCollector.java:47)
>         at org.apache.lucene.search.BooleanScorer2.score(BooleanScorer2.j=
ava:291)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.ja=
va:132)
>         at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.ja=
va:110)
>         at com.aconex.index.search.FastLocaleSortIndexSearcher.search(Fas=
tLocaleSortIndexSearcher.java:90)
> .....
> In our case we had 12 threads waiting like this, while one thread had the=
 lock on the RuleBasedCollator.  Turns out RuleBasedCollator's.compare(...)=
 method is synchronized.  I wonder if a ThreadLocal based collator would be=
 better here... ?  There doesn't appear to be a reason for other threads se=
arching the same index to wait on this sort.  Be just as easy to use their =
own.  (Is RuleBasedCollator a "heavy" object memory wise?  Wouldn't have th=
ought so, per thread)
> Thoughts?
> ---
> I've investigated this somewhat, and agree that this is a potential probl=
em with a series of possible workarounds. Further discussion (including pro=
of-of-concept patch) to follow.

--=20
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org