lucene-solr-user mailing list archives

From Chris Hostetter
Subject Re: Minimum Should Match the other way round
Date Thu, 08 Apr 2010 22:59:37 GMT

: However, I got some doubts on this: What about queries that should be
: filtered with the WordDelimiterFilter. This could make a large difference to
: a non-delimiter-filtered MAX_LEN *and* it has got a protwords param. I
: can't instantiate a new WordDelimiterFilter every time I do a query, so how
: can I put my already instantiated Filters into a cache for such use cases?
: I think solving this problem perhaps would also lead to a possibility to
: make multiword synonyms at query-time possible. 

I honestly don't understand what you're asking here (or most of the back
and forth in the rest of this thread, for that matter).

The QParser shouldn't really need to know anything about the analyzer used
by the field -- it can just delegate to the superclass for the parsing,
then look at the number of clauses in the resulting BooleanQuery (or the
number of Terms in the PhraseQuery, depending on exactly how you want the
parsing rules to work) and then add a filter on the numeric field.
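
As a rough illustration of that last step, the sketch below turns a clause
count into a range filter in Lucene/Solr query syntax. It uses only the Java
standard library. The field name "token_count" and the rule "a document may
have at most as many tokens as the query has clauses" are assumptions made
for the example; the exact rule depends on how you want the parsing to work.
In a real QParser the count would come from the parsed BooleanQuery or
PhraseQuery rather than being passed in by hand.

```java
public final class TokenCountFilterSketch {

    // Hypothetical name of the field that stores each document's token count.
    static final String COUNT_FIELD = "token_count";

    /**
     * Builds a range clause that keeps only documents whose stored token
     * count is at most queryClauseCount (an assumed rule, see above).
     */
    static String buildFilter(int queryClauseCount) {
        if (queryClauseCount < 0) {
            throw new IllegalArgumentException("clause count must be non-negative");
        }
        // Open lower bound, inclusive upper bound.
        return COUNT_FIELD + ":[* TO " + queryClauseCount + "]";
    }

    public static void main(String[] args) {
        // A query that parsed into three clauses.
        System.out.println(buildFilter(3)); // prints token_count:[* TO 3]
    }
}
```

The resulting string could be applied as a filter alongside the main query;
if the count field is a padded string rather than a true numeric field, the
upper bound would need the same padding.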

Where the analyzer matters is in creating that numeric field at index time
... hence my suggestion of having an analyzer chain that exactly matches 
the field you are interested in, but ending with a TokenCountingFilter -- 
it can take care of creating the "numeric-ish" (padded) field value when 
the docs are indexed.
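
To show why the value needs padding, here is a minimal sketch of the
"numeric-ish" encoding, using only the Java standard library. The width of 5
digits is an assumption; TokenCountingFilter itself is not shown, only the
string it would have to emit. Padding makes lexicographic order agree with
numeric order, which is what a range filter on a string field relies on.

```java
import java.util.Locale;

public final class PaddedCountSketch {

    // Assumed fixed width; must be large enough for the longest field value.
    static final int WIDTH = 5;

    /** Left-pads a token count with zeros so that string order equals numeric order. */
    static String pad(int tokenCount) {
        if (tokenCount < 0 || String.valueOf(tokenCount).length() > WIDTH) {
            throw new IllegalArgumentException("count out of range: " + tokenCount);
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", tokenCount);
    }

    public static void main(String[] args) {
        System.out.println(pad(7));   // prints 00007
        System.out.println(pad(12));  // prints 00012
        // Unpadded, "7" sorts after "12"; padded, the order is numeric.
        System.out.println("7".compareTo("12") > 0);           // prints true
        System.out.println(pad(7).compareTo(pad(12)) < 0);     // prints true
    }
}
```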

Alternatively: you can do this in an UpdateProcessor, w/o the
TokenCountingFilter ... the upside is that if you do it this way you can
create a true numeric field and you don't have to copy/paste the analyzer
chain -- the downside being that your UpdateProcessor will have to
"pre-analyze" the field (redundantly, since the UpdateHandler will do it
again later) in order to count the tokens.

But even if you go the UpdateProcessor route, you still don't need to
"cache" any Filters or anything -- just ask the IndexSchema for the
FieldType, ask the FieldType for its Analyzer, and analyze.
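
The shape of that UpdateProcessor step can be sketched without any Solr
classes. In the example below the document is a plain Map and the analyzer
is a plain function; both are stand-ins invented for illustration, as are
the field names. In Solr the analyzer would be the one obtained from the
schema's FieldType, so the count reflects the real analysis chain
(WordDelimiterFilter, protwords and all) instead of the crude splitting
used here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public final class CountTokensSketch {

    // Stand-in for the field's real analyzer: lowercase, split on non-alphanumerics.
    static final Function<String, List<String>> STAND_IN_ANALYZER = text ->
            Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{Alnum}]+"))
                  .filter(token -> !token.isEmpty())
                  .collect(Collectors.toList());

    /**
     * "Pre-analyzes" textField and stores the number of tokens as a true
     * integer in countField. A missing text field counts as zero tokens.
     */
    static void addTokenCount(Map<String, Object> doc, String textField,
                              String countField,
                              Function<String, List<String>> analyzer) {
        Object value = doc.get(textField);
        int count = (value == null) ? 0 : analyzer.apply(value.toString()).size();
        doc.put(countField, count);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "Wi-Fi PowerShot SD500");
        addTokenCount(doc, "title", "title_token_count", STAND_IN_ANALYZER);
        // Tokens: wi, fi, powershot, sd500
        System.out.println(doc.get("title_token_count")); // prints 4
    }
}
```

Nothing is cached here: the analyzer is simply looked up and applied for
each document as it passes through.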

