lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-2504) sorting performance regression
Date Sat, 19 Jun 2010 20:06:23 GMT


Yonik Seeley commented on LUCENE-2504:

My guess is that this is caused by LUCENE-2380, but I opened a separate issue since I'm not sure.
This is the same type of JVM performance issue reported by Mike in LUCENE-2143 and by me
in LUCENE-2380.

Same test index I used to test faceting: 10M doc index with 5 fields:
   -  f100000_s:  a single-valued string field with 100,000 unique values
   -  f10000_s:   a single-valued field with 10,000 unique values
   -  f1000_s:    a single-valued field with 1,000 unique values
   -  f100_s:     a single-valued field with 100 unique values
   -  f10_s:      a single-valued field with 10 unique values
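
For anyone who wants to build a comparable index, here is a minimal sketch of generating
the five field values per document. It assumes values are drawn uniformly at random, which
the description above does not state, and the class and method names (TestIndexSketch,
makeDoc) are made up for illustration. Feeding the documents to Solr or Lucene is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

final class TestIndexSketch {
    // Number of unique values per field, matching the field list above.
    static final int[] CARDINALITIES = {100000, 10000, 1000, 100, 10};

    // Builds one document as field name -> value.
    // Assumption: each value is picked uniformly at random from the field's value set.
    static Map<String, String> makeDoc(Random rnd) {
        Map<String, String> doc = new LinkedHashMap<>();
        for (int card : CARDINALITIES) {
            doc.put("f" + card + "_s", Integer.toString(rnd.nextInt(card)));
        }
        return doc;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        // Print a few sample documents; a real run would index 10M of them.
        for (int i = 0; i < 3; i++) {
            System.out.println(makeDoc(rnd));
        }
    }
}
```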

URLs I tested against Solr are of the form:

f100000_s sort only: 101 ms
sort against random field: 101 ms

f100000_s sort only: 111 ms
sort against random field: 158 ms

This is not due to garbage collection or cache effects. After you sort against a mix of fields,
the performance stays worse: you can go back to sorting against f100000_s only, and
it never recovers.

System: Ubuntu on Phenom II 4x3.0GHz, Java 1.6.0_20

So my guess is that this is caused by the ord lookup going through PagedBytes, and the JVM
not optimizing away the indirection when there is a mix of implementations.
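
To illustrate the kind of effect I am guessing at, here is a minimal standalone sketch. It
is not Lucene code: OrdReader, IntOrds, ByteOrds, ShortOrds and OrdLookupSketch are
hypothetical names standing in for an abstract lookup with several concrete
implementations. The idea is that a call site which has only ever seen one implementation
can be inlined by the JIT, while the same call site may fall back to a virtual dispatch
once it has seen several. Whether the timings actually differ, and by how much, depends on
the JVM and hardware.

```java
// Hypothetical stand-in for an abstract ord lookup; not a Lucene class.
abstract class OrdReader {
    abstract int get(int docId);
}

final class IntOrds extends OrdReader {
    private final int[] ords;
    IntOrds(int[] ords) { this.ords = ords; }
    @Override int get(int docId) { return ords[docId]; }
}

final class ByteOrds extends OrdReader {
    private final byte[] ords;
    ByteOrds(byte[] ords) { this.ords = ords; }
    @Override int get(int docId) { return ords[docId] & 0xFF; }
}

final class ShortOrds extends OrdReader {
    private final short[] ords;
    ShortOrds(short[] ords) { this.ords = ords; }
    @Override int get(int docId) { return ords[docId] & 0xFFFF; }
}

final class OrdLookupSketch {
    // The r.get(i) call below is the single call site shared by all implementations.
    static long sumOrds(OrdReader r, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += r.get(i);
        }
        return sum;
    }

    // Runs sumOrds repeatedly and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(OrdReader r, int n, int rounds) {
        long start = System.nanoTime();
        long sink = 0;
        for (int k = 0; k < rounds; k++) {
            sink += sumOrds(r, n);
        }
        long elapsed = (System.nanoTime() - start) / 1000000L;
        // Use the result so the JIT cannot discard the loop as dead code.
        if (sink == Long.MIN_VALUE) System.out.println("unreachable");
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        int[] a = new int[n];
        byte[] b = new byte[n];
        short[] c = new short[n];
        for (int i = 0; i < n; i++) {
            a[i] = i % 100;
            b[i] = (byte) (i % 100);
            c[i] = (short) (i % 100);
        }
        OrdReader ints = new IntOrds(a);

        // Phase 1: the call site has only seen IntOrds.
        timeMillis(ints, n, 200); // warm-up
        System.out.println("one impl seen:    " + timeMillis(ints, n, 200) + " ms");

        // Phase 2: expose the same call site to two more implementations.
        timeMillis(new ByteOrds(b), n, 200);
        timeMillis(new ShortOrds(c), n, 200);

        // Phase 3: back to IntOrds only; this may not return to the phase 1 time.
        System.out.println("three impls seen: " + timeMillis(ints, n, 200) + " ms");
    }
}
```

The three phases mirror the test above: sort on one field only, then sort against a mix,
then go back to the single field.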

> sorting performance regression
> ------------------------------
>                 Key: LUCENE-2504
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Yonik Seeley
>             Fix For: 4.0
> sorting can be much slower on trunk than branch_3x
