lucene-dev mailing list archives

From "Paul Cowan (JIRA)"
Subject [jira] Commented: (LUCENE-806) Synchronization bottleneck in FieldSortedHitQueue with many concurrent readers
Date Tue, 03 Apr 2007 07:37:32 GMT


Paul Cowan commented on LUCENE-806:

Otis, you're probably right -- it may not be wise to try to kill two birds with one stone. My concern
is that if I do this a quick and dirty way, it may involve exposing an API to enable/disable
this behaviour which a subsequent refactor would then remove, and I'd obviously rather keep
the API stable.

I'm about to attach 3 patches, with varying levels of effect on the code. I'd be interested
to see what people think is the best approach given the possible refactor.

Hoss, I've had a look at your patch, and rather like it. That's kind of tackling a slightly
different problem; that's cleaning up the FieldCache (which is a great idea) whereas cleaning
up FSHQ is only incidentally related to FieldCache. It uses it (and if it was broken up, each
comparator source would be using your much cleaner API) but I think the two coexist quite
happily. I'd like to see both, in other words!

> Synchronization bottleneck in FieldSortedHitQueue with many concurrent readers
> ------------------------------------------------------------------------------
>                 Key: LUCENE-806
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.0.0
>            Reporter: Paul Cowan
>            Priority: Minor
>         Attachments: lucene-806-proposed-direction.patch, lucene-806.patch
> The below is from a post by (my colleague) Paul Smith to the java-users list:
> ---
> Hi ho peoples.
> We have an application that is internationalized, and stores data from many languages
(each project has its own index, mostly aligned with a single language, maybe 2).
> Anyway, I've noticed during some thread dumps diagnosing some performance issues, that
there appears to be a _potential_ synchronization bottleneck using Locale-based sorting of
Strings.  I don't think this problem is the root cause of our performance problem, but I thought
I'd mention it here.  Here's the stack dump of a thread waiting:
> "http-1001-Processor245" daemon prio=1 tid=0x31434da0 nid=0x3744 waiting for monitor
entry [0x2cd44000..0x2cd45f30]
>         at
>         - waiting to lock <0x6b1e8c68> (a java.text.RuleBasedCollator)
>         at$
>         at
>         at org.apache.lucene.util.PriorityQueue.upHeap(
>         at org.apache.lucene.util.PriorityQueue.put(
>         at org.apache.lucene.util.PriorityQueue.insert(
>         at
>         at
>         at
>         at
>         at
>         at
>         at
> .....
> In our case we had 12 threads waiting like this, while one thread had the lock on the
RuleBasedCollator.  Turns out RuleBasedCollator' method is synchronized.  I
wonder if a ThreadLocal based collator would be better here... ?  There doesn't appear to
be a reason for other threads searching the same index to wait on this sort.  Be just as easy
to use their own.  (Is RuleBasedCollator a "heavy" object memory wise?  Wouldn't have thought
so, per thread)
> Thoughts?
> ---
> I've investigated this somewhat, and agree that this is a potential problem with a series
of possible workarounds. Further discussion (including proof-of-concept patch) to follow.
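
The ThreadLocal-based collator idea raised in the quoted post could look roughly like the sketch below. This is only an illustration of that idea, not code from any of the attached patches: the class name PerThreadCollator and the method forLocale are hypothetical, and it uses ThreadLocal.withInitial and Map.computeIfAbsent, which need a much newer JDK than the one Lucene 2.0.0 targeted. Each thread keeps its own small map of Collator instances, so concurrent sorts never contend on one shared collator's monitor.

```java
import java.text.Collator;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: hands out one Collator per thread per locale.
final class PerThreadCollator {

    // Each thread gets its own map, so no locking is needed around the map itself.
    private static final ThreadLocal<Map<Locale, Collator>> COLLATORS =
            ThreadLocal.withInitial(HashMap::new);

    private PerThreadCollator() {
    }

    // Returns the calling thread's private Collator for the given locale,
    // creating it on first use.
    static Collator forLocale(Locale locale) {
        return COLLATORS.get().computeIfAbsent(locale, Collator::getInstance);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            Collator collator = PerThreadCollator.forLocale(Locale.FRENCH);
            // compare() locks only this thread's own collator instance.
            int result = collator.compare("cote", "côte");
            System.out.println(Thread.currentThread().getName() + " compared: " + result);
        };
        Thread first = new Thread(task, "searcher-1");
        Thread second = new Thread(task, "searcher-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```

The trade-off is memory: every searching thread holds one collator per locale it has sorted by, which is the per-thread cost the quoted post asks about. In a pooled-thread server those instances live as long as the threads do.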

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
