lucene-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Analyzers and sorting with a custom analysis chain
Date Sat, 03 Sep 2011 02:26:34 GMT
Hi Everyone,

I've got an Analysis question related to both Lucene and Solr (sorry for the cross posting).

I've created a custom analysis chain as part of a field type for the title field in my
schema, which represents businesses. I've created an additional field called title_sort
and copy the original title into it. My analysis chain is a variant of alphaOnlySort;
however, instead of using KeywordTokenizer and keeping all the tokens together, I use
WhitespaceTokenizer, do stopword analysis and synonym handling, and then apply a custom
CombiningFilter that I wrote (and can probably contribute back at some point) to
recombine the tokens at the end of the analysis chain.

So let's say I have "The Children's Hospital of Los Angeles" as the original title value.
After the analysis chain for the field type used in title_sort, I'm left with
childrenshospitallosangeles as the single token resulting from the chain. When I go to
sort the titles in Solr, I use sort=title_sort asc, and I get all kinds of weird results
when running a query. However, when I go to the analysis.jsp page in the Solr admin, the
analysis for the field type using that particular value comes out as expected
(childrenshospitallosangeles), so I am totally confused as to why the title sort isn't
working.
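
To make the intended behavior concrete, the chain described above can be approximated in
plain Python. This is only a rough sketch and not the actual Solr filters: the stopword
set and the sort_key name are assumptions made for illustration, the synonym step is left
out, and the letters-only stripping stands in for what an alphaOnlySort-style field type
does.

```python
import re

# Assumed stopword list, for illustration only; the real schema's list is not shown.
STOPWORDS = {"the", "of", "a", "an", "and"}

def sort_key(title: str) -> str:
    """Approximate the described chain: split on whitespace, lowercase,
    drop stopwords, strip non-letters, then recombine into a single token."""
    tokens = title.split()                                # stand-in for WhitespaceTokenizer
    tokens = [t.lower() for t in tokens]                  # lowercase each token
    tokens = [t for t in tokens if t not in STOPWORDS]    # stopword removal
    tokens = [re.sub(r"[^a-z]", "", t) for t in tokens]   # keep letters only
    return "".join(t for t in tokens if t)                # stand-in for the CombiningFilter

print(sort_key("The Children's Hospital of Los Angeles"))
# prints childrenshospitallosangeles
```

Sorting a list of titles with sorted(titles, key=sort_key) gives the ordering I would
expect sort=title_sort asc to produce.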

I'm using Solr 1.4.1 and Lucene 2.9.3. Any help would be greatly appreciated. Thank you.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

