lucene-java-user mailing list archives

From Uwe Schindler <...@thetaphi.de>
Subject Re: How to Perform a Full Text Search on a Number with Leading Zeros or Decimals?
Date Fri, 28 Jun 2013 19:39:42 GMT
You can add a PatternReplaceFilter (http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceFilter.html)
to your analysis chain to replace tokens consisting only of digits with their variant that has the leading zeroes removed.
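
As a rough sketch of the replacement itself (not a complete analyzer), the regular expression below strips leading zeroes from digit-only strings using plain java.util.regex. The class name LeadingZeroDemo, the method stripLeadingZeros and the exact pattern are illustrative assumptions, not something prescribed by Lucene. The same Pattern and an empty replacement string could then be handed to PatternReplaceFilter, which rewrites each token's text.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex idea carries over to Lucene.
class LeadingZeroDemo {

    // Matches a run of leading zeroes, but only when the rest of the
    // string is one or more digits up to the end. The lookahead keeps
    // at least one digit, so "0000" becomes "0" rather than "".
    static final Pattern LEADING_ZEROES = Pattern.compile("^0+(?=\\d+$)");

    // Applies the replacement to a single token's text.
    // In Lucene 4.x, the equivalent wiring would be roughly:
    //   new PatternReplaceFilter(tokenStream, LEADING_ZEROES, "", false)
    static String stripLeadingZeros(String token) {
        return LEADING_ZEROES.matcher(token).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(stripLeadingZeros("00000012345"));
        System.out.println(stripLeadingZeros("abc007"));
    }
}
```

Because the filter would run at both index time and query time when it is part of the analyzer, a search for either 12345 or 00000012345 would end up looking for the same term. Tokens that are not purely numeric are left untouched by this pattern.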

Uwe



Jack Krupansky <jack@basetechnology.com> wrote:
>The user could use a regular expression query to match the numbers, but
>otherwise, you will have to write some specialized token filter to
>recognize numeric tokens and generate extra tokens at the same position
>for each token variant that you want to search for.
>
>-- Jack Krupansky
>
>-----Original Message----- 
>From: Todd Hunt
>Sent: Friday, June 28, 2013 2:18 PM
>To: java-user@lucene.apache.org
>Subject: How to Perform a Full Text Search on a Number with Leading
>Zeros or Decimals?
>
>I have an application that is indexing the text from various reports
>and forms that are generated from our core system.  The reports will
>contain dollar amounts and various indexes that contain all numbers,
>but have leading zeros.
>
>If a document contains the following text that is stored in one Lucene
>document field:
>
>"Account 00000012345 owes $321.98"
>
>What analyzer can be used to index this text and allow the user to find
>this document by searching on:
>
>12345
>
>OR
>
>321
>
>???
>
>We are currently using a StandardAnalyzer which works well for most of
>our use cases, but not one like this.
>
>I realize that I could create my own token filter to convert any text
>that can be represented by an Integer or Long, with leading zeros or
>not, to a normal-looking integer without leading zeros.  But I'd prefer
>to reuse an existing analyzer or technique to achieve the same results.
>
>Thank you.
>
>
>

--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de