lucene-dev mailing list archives

From Eric Pugh <ep...@opensourceconnections.com>
Subject Re: Should analysis.jsp honor maxFieldLength
Date Tue, 24 Aug 2010 18:29:15 GMT
I created a patch file at https://issues.apache.org/jira/browse/SOLR-2086.  I went with the
simplest approach since I didn't want to confuse things by adding extra filters
to the chain the user created.  However, either approach would work!

On Aug 24, 2010, at 12:18 PM, Robert Muir wrote:

> 
> On Tue, Aug 24, 2010 at 12:03 PM, Eric Pugh <epugh@opensourceconnections.com> wrote:
> Hi all,
> 
> I have maxFieldLength set to 10000 in solrconfig.xml, but was playing around with a really
> large document (the King James Bible) in analysis.jsp. I hacked analysis.jsp to show me
> the number of terms at each filter, and the headers, but without turning everything on by
> checking the verbose checkbox.
> 
> My results, shown in this screenshot: http://img.skitch.com/20100824-t36rq45i2wfimwyd53gwiqebdy.png
> seem to confirm that maxFieldLength is NOT honored by analysis.jsp.
> 
> 
> Separate from whether or not analysis.jsp should do this (I happen to think the closer
> to "reality" it is, the better), I think the easiest implementation would be to wrap the entire
> stream with LimitTokenCountFilter:
> 
> /**
>  * This TokenFilter limits the number of tokens while indexing. It is
>  * a replacement for the maximum field length setting inside {@link org.apache.lucene.index.IndexWriter}.
>  */
>  
> If I remember correctly, it's not exactly the same as maxFieldLength, but it's pretty close.
> 
> -- 
> Robert Muir
> rcmuir@gmail.com
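
For anyone following the thread, here is a rough sketch of the "wrap the entire stream" idea quoted above. It deliberately does not use the Lucene API: LimitedTokens and analyze are hypothetical stand-ins, and whitespace splitting stands in for a real analyzer chain. It only shows the shape of the technique, which is to leave the user's chain untouched and cap the number of tokens at the very end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class TokenLimitSketch {

    /**
     * Hypothetical stand-in for a token-limiting filter (not the Lucene class):
     * passes through at most maxTokens tokens from the wrapped stream.
     */
    static final class LimitedTokens implements Iterator<String> {
        private final Iterator<String> input;
        private final int maxTokens;
        private int emitted = 0;

        LimitedTokens(Iterator<String> input, int maxTokens) {
            this.input = input;
            this.maxTokens = maxTokens;
        }

        @Override
        public boolean hasNext() {
            // Stop as soon as the cap is reached, even if input has more tokens.
            return emitted < maxTokens && input.hasNext();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            emitted++;
            return input.next();
        }
    }

    /** Assumed helper: whitespace tokenization stands in for the real analyzer chain. */
    static List<String> analyze(String text, int maxTokens) {
        String trimmed = text.trim();
        List<String> raw = trimmed.isEmpty()
                ? new ArrayList<String>()
                : Arrays.asList(trimmed.split("\\s+"));

        // Wrap the whole stream once, at the end of the chain.
        Iterator<String> limited = new LimitedTokens(raw.iterator(), maxTokens);

        List<String> out = new ArrayList<>();
        while (limited.hasNext()) {
            out.add(limited.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Cap a short sample at 5 tokens and print what survives.
        System.out.println(analyze("In the beginning God created the heaven and the earth", 5));
    }
}
```

In analysis.jsp the cap would presumably come from the maxFieldLength value in solrconfig.xml rather than a hard-coded number; that wiring is an assumption and is not shown here.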

-----------------------------------------------------
Eric Pugh | Principal | OpenSource Connections, LLC | 434.466.1467 | http://www.opensourceconnections.com
Co-Author: Solr 1.4 Enterprise Search Server available from http://www.packtpub.com/solr-1-4-enterprise-search-server
Free/Busy: http://tinyurl.com/eric-cal








