lucenenet-user mailing list archives

From Simon Svensson <si...@devhost.se>
Subject Re: Tokenize a string
Date Fri, 15 Jun 2012 12:23:57 GMT
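    var analyzer = new StandardAnalyzer(Version.LUCENE_29);
    var textReader = new StringReader("hola mi nombre es Vicente");

    // The field name is arbitrary here; StandardAnalyzer treats all fields alike.
    var tokenStream = analyzer.TokenStream("field", textReader);
    var terms = new List<String>();

    // The attribute instance is reused; it is updated on each IncrementToken() call.
    var termAttribute = (TermAttribute)tokenStream.GetAttribute(typeof(TermAttribute));
    while (tokenStream.IncrementToken()) {
        terms.Add(termAttribute.Term());
    }

    // terms = { "hola", "mi", "nombre", "es", "vicente" }
    // (StandardAnalyzer lowercases tokens, hence "vicente")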

On 2012-06-15 14:01, vicente garcia wrote:
> Hi, I have a quick question.
>
> I'd like to tokenize a string. Something like this:
>
> StandardAnalyzer analyzer = new StandardAnalyzer("hola mi nombre es Vicente");
>
> List<string> tokens = analyzer.GetTokens();
>
> And tokens is: [hola] [mi] [nombre] [es] [Vicente]
>
> Is this possible?
>
> Thanks :)
>
>
