lucene-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 26763] New: - [PATCH] Language guesser contribution
Date Sat, 07 Feb 2004 18:57:51 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26763>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=26763

           Summary: [PATCH] Language guesser contribution
           Product: Lucene
           Version: unspecified
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Enhancement
          Priority: Other
         Component: Other
        AssignedTo: lucene-dev@jakarta.apache.org
        ReportedBy: halleux.jf@skynet.be


Hello,

I'd like to contribute this language guesser to Lucene. 

It contains language-guessing interfaces and classes, trigram-specific 
classes, and some language reference files that I generated myself using the 
included trigram file generation utility. I have included a unit test as well.
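
For readers unfamiliar with trigram-based guessing, here is a minimal sketch of 
the general idea. It is not the code from the attached patch: the class name, 
the method names, the tiny in-memory reference texts, and the use of cosine 
similarity are all assumptions made for illustration. Each language is 
represented by counts of its overlapping three-character sequences, and a 
sample is assigned to the language whose profile it resembles most.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the classes contributed in the patch.
class TrigramGuesserSketch {

    /** Counts overlapping character trigrams in lower-cased, whitespace-normalized text. */
    static Map<String, Integer> trigramCounts(String text) {
        String normalized = text.toLowerCase().replaceAll("\\s+", " ").trim();
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 3 <= normalized.length(); i++) {
            counts.merge(normalized.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    /** Cosine similarity between two trigram count vectors (an assumed scoring choice). */
    static double similarity(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            double v = e.getValue();
            normA += v * v;
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += v * other;
            }
        }
        for (int v : b.values()) {
            normB += (double) v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /** Returns the key of the best-matching profile, or null if nothing overlaps. */
    static String guess(String text, Map<String, Map<String, Integer>> profiles) {
        Map<String, Integer> sample = trigramCounts(text);
        String best = null;
        double bestScore = 0.0;
        for (Map.Entry<String, Map<String, Integer>> p : profiles.entrySet()) {
            double score = similarity(sample, p.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = p.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy reference texts; real reference files would be built from large corpora.
        Map<String, Map<String, Integer>> profiles = new HashMap<>();
        profiles.put("en", trigramCounts(
            "the quick brown fox jumps over the lazy dog and the cat sat on the mat"));
        profiles.put("fr", trigramCounts(
            "le renard brun saute par dessus le chien paresseux et le chat est assis sur le tapis"));

        System.out.println(guess("the dog and the cat", profiles));
        System.out.println(guess("le chat et le chien", profiles));
    }
}
```

With reference texts this small the result is only indicative; guessing quality 
depends heavily on the size and cleanliness of the reference data.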

I haven't tested guessing quality or performance extensively, but I believe 
both are adequate for a first pass.

I thought about writing a custom Analyzer for this, but realized that this 
wouldn't be the way to go: the language decision should probably be left to 
the developer, especially when the Analyzer is used to tokenize a query.

Have fun,

Jean-François Halleux

