lucene-java-user mailing list archives

From "Dino Korah" <dcko...@gmail.com>
Subject EmailAddressAnalyzer & TokenStreams
Date Wed, 20 Aug 2008 16:58:11 GMT
Hi guys,
 
If I am to tokenize an email address like
"John Smith" <J.Smith@london.gb.world.net> into
 
    [J.Smith@london.gb.world.net]
    [John]
    [Smith]
    [J.Smith]
    [london.gb.world.net]
    [gb.world.net]
    [world.net]
    [world]
    [net]
 
is it possible to give each of these tokens a different position increment?
If so, could you please show me how, using the same example, with the
increment written against each token?
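
To make the question concrete, here is a rough sketch in plain Java of the token list above with one possible set of increments attached. It does not use any Lucene classes; the names EmailTokenSketch, Tok and tokenize are made up for illustration, and the increments themselves are only an assumption (the domain and its shorter suffixes are stacked on one position by giving them an increment of 0, which in Lucene means "same position as the previous token").

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; no Lucene classes are involved.
class EmailTokenSketch {

    // A token text plus its position increment relative to the previous token.
    static final class Tok {
        final String text;
        final int posInc;

        Tok(String text, int posInc) {
            this.text = text;
            this.posInc = posInc;
        }

        @Override
        public String toString() {
            return "[" + text + "] +" + posInc;
        }
    }

    // Splits input of the form "Display Name" <local@domain> into tokens.
    // The increments chosen here are an assumption, not a recommendation.
    static List<Tok> tokenize(String input) {
        List<Tok> out = new ArrayList<>();
        int lt = input.indexOf('<');
        int gt = input.lastIndexOf('>');
        boolean bracketed = lt >= 0 && gt > lt;
        String address = bracketed ? input.substring(lt + 1, gt).trim() : input.trim();
        String name = bracketed ? input.substring(0, lt).replace("\"", "").trim() : "";

        // The full address comes first.
        out.add(new Tok(address, 1));

        // Each word of the display name advances one position.
        for (String word : name.split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(new Tok(word, 1));
            }
        }

        int at = address.indexOf('@');
        if (at < 0) {
            return out; // not an address with a domain part
        }
        out.add(new Tok(address.substring(0, at), 1)); // local part

        // The domain, then each shorter suffix stacked on the same position.
        String suffix = address.substring(at + 1);
        boolean first = true;
        while (true) {
            out.add(new Tok(suffix, first ? 1 : 0));
            first = false;
            int dot = suffix.indexOf('.');
            if (dot < 0) {
                return out; // single-label domain, nothing left to split
            }
            String rest = suffix.substring(dot + 1);
            if (rest.indexOf('.') < 0) {
                // Two labels left: emit them individually, as in the list above.
                out.add(new Tok(suffix.substring(0, dot), 0));
                out.add(new Tok(rest, 1));
                return out;
            }
            suffix = rest;
        }
    }
}
```

Calling EmailTokenSketch.tokenize("\"John Smith\" <J.Smith@london.gb.world.net>") yields the nine tokens listed above, in the same order, each carrying its increment. In a real Lucene Tokenizer the same values would be set on each token as its position increment.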
 
Also, could you please point me to some skeleton code for a custom Tokenizer?
 
Many Thanks
Dino
 
