lucene-dev mailing list archives

From: "Mattmann, Chris A (388J)"
Subject: Analyzers and sorting with a custom analysis chain
Date: Sat, 03 Sep 2011 02:26:34 GMT

Hi Everyone,

I've got an analysis question related to both Lucene and Solr (sorry for the cross-posting).

I've created a custom analysis chain as part of a field type for the title field in my schema
representing businesses. I've also created an additional field called title_sort, into which I
copy the original title. My analysis chain is a variant of alphaOnlySort; however, instead of
using KeywordTokenizer and keeping all the tokens together, I use WhitespaceTokenizer, do
stopword analysis and synonym handling, and then apply a custom CombiningFilter that I wrote
(and can probably contribute back at some point) to recombine the tokens at the end of the
analysis chain.

So let's say I have "The Children's Hospital of Los Angeles" as the original title value.
After the analysis chain for the field type used in title_sort, I'm left with
childrenshospitallosangeles as the single token resulting from the chain.

So, when I go to sort the titles in Solr, I use sort=title_sort asc, and I get all kinds of
weird results when running a query. However, when I go to the analysis.jsp page in the Solr
admin, the analysis for the field type with that particular value comes out as expected
(childrenshospitallosangeles), so I am totally confused as to why the title sort isn't working.
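
To make the intended behaviour concrete, here is a minimal plain-Java sketch of what the chain described above is meant to produce. It does not use any Lucene or Solr classes and is not the actual CombiningFilter: the class name TitleSortSketch, the methods sortKey and sortTitles, and the stopword list are all assumptions for illustration, and synonym handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the title_sort chain; not the real filter code.
class TitleSortSketch {

    // Assumed stopword list; the list used in the real schema is not given here.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "the"));

    // Split on whitespace, normalize each token, drop stopwords, then
    // glue the surviving tokens back together into a single sort key.
    static String sortKey(String title) {
        StringBuilder combined = new StringBuilder();
        for (String token : title.trim().split("\\s+")) {        // whitespace tokenization
            String normalized = token.toLowerCase(Locale.ROOT)
                                     .replaceAll("[^a-z]", "");  // keep letters only
            if (normalized.isEmpty() || STOPWORDS.contains(normalized)) {
                continue;                                        // stopword removal
            }
            combined.append(normalized);                         // recombine tokens
        }
        return combined.toString();
    }

    // What an ascending sort on the key should look like: plain
    // lexicographic ordering of the combined single token.
    static List<String> sortTitles(List<String> titles) {
        List<String> sorted = new ArrayList<>(titles);
        sorted.sort(Comparator.comparing(TitleSortSketch::sortKey));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortKey("The Children's Hospital of Los Angeles"));
        System.out.println(sortTitles(Arrays.asList(
                "The Zebra Cafe",
                "Apple Store",
                "The Children's Hospital of Los Angeles")));
    }
}
```

With keys built this way, an ascending sort should order documents purely by the combined token, so a leading "The" in the original title should have no effect on where it lands.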

I'm using Solr 1.4.1 and Lucene 2.9.3. Any help would be greatly appreciated. Thank you.


Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
