lucene-java-commits mailing list archives

From Apache Wiki <>
Subject [Lucene-java Wiki] Update of "LuceneCaveats" by RenaudWaldura
Date Fri, 29 Jun 2007 22:17:27 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The following page has been changed by RenaudWaldura:

  Dissimilar or incompatible analyzers lead to mysterious search
  behavior. See: ["LuceneFAQ"], "Why am I getting no hits / incorrect hits?".
- === Large documents are truncated by default ===
+ === Documents are truncated by default ===
- The indexer will be default truncate documents to {{{IndexWriter.DEFAULT_MAX_FIELD_LENGTH}}}
+ The indexer by default truncates documents to {{{IndexWriter.DEFAULT_MAX_FIELD_LENGTH}}}
- or 10,000 terms in Lucene 2.0. This limit 
+ or 10,000 terms in Lucene 2.0. 
- can easily be changed with 
- [
+ Rule of thumb: an average page of English text contains about 250 words. (Source: [
Google Answers].) This means only about 40 pages are indexed by default. If any of your documents
are longer than this (and you want them indexed), you should raise the limit with [
  === Stopwords are removed ===

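The page arithmetic in the changed text can be checked with a short sketch. This is an illustration only, not part of the wiki page: the class and method names are hypothetical, and it assumes one word equals one indexed term, which an analyzer may not preserve. The constants are the two figures quoted above (10,000 terms, about 250 words per page).

```java
public class TruncationEstimate {
    // Default term limit quoted in the text for Lucene 2.0.
    static final int DEFAULT_MAX_FIELD_LENGTH = 10_000;
    // Rule-of-thumb words per page of English text, as quoted in the text.
    static final int WORDS_PER_PAGE = 250;

    // Roughly how many pages fit under a given term limit
    // (assumes one word == one term, which is only an approximation).
    static int pagesIndexed(int maxFieldLength, int wordsPerPage) {
        return maxFieldLength / wordsPerPage;
    }

    // Term limit needed to cover a document of the given page count.
    // multiplyExact throws ArithmeticException on int overflow instead of wrapping.
    static int limitForPages(int pages, int wordsPerPage) {
        return Math.multiplyExact(pages, wordsPerPage);
    }

    public static void main(String[] args) {
        System.out.println(pagesIndexed(DEFAULT_MAX_FIELD_LENGTH, WORDS_PER_PAGE)); // prints 40
        System.out.println(limitForPages(300, WORDS_PER_PAGE));                     // prints 75000
    }
}
```

A 300-page document would therefore need a limit of roughly 75,000 terms. In Lucene 2.x that value would most likely be passed to {{{IndexWriter.setMaxFieldLength(int)}}}; the method name is not given in the text above, so check it against the Javadoc for your version.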