lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: Recover special terms from StandardTokenizer
Date Fri, 11 Dec 2009 09:09:19 GMT
How about getting the original token stream and then converting c++ to
cplusplus, or applying any other such transform? Or perhaps you might look at
using/extending (in the non-Java sense) some other tokenizer!
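
One way to picture the c++ to cplusplus idea is as a plain string rewrite applied to the raw text before it reaches the tokenizer. What follows is only a rough sketch under that assumption, not Lucene API: the class name SpecialTermRewriter, the rewrite method, and the REWRITES table are made up for illustration, and the only mapping taken from this thread is c++ to cplusplus. The same rewrite would have to be applied to the query text at search time so that both sides agree.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): rewrites special terms into
// plain alphanumeric words that a standard tokenizer keeps intact.
public class SpecialTermRewriter {

    // Illustrative mapping table; only the c++ entry comes from the thread.
    private static final Map<String, String> REWRITES = new LinkedHashMap<String, String>();
    static {
        REWRITES.put("c++", "cplusplus");
    }

    // Replace each special term with its substitute, ignoring case.
    // The lookarounds keep the match from firing inside a longer word
    // (for example "abc++") or a longer run of plus signs ("c+++").
    public static String rewrite(String text) {
        String result = text;
        for (Map.Entry<String, String> e : REWRITES.entrySet()) {
            Pattern p = Pattern.compile(
                "(?i)(?<![\\p{L}\\p{N}])"
                + Pattern.quote(e.getKey())
                + "(?![\\p{L}\\p{N}+])");
            result = p.matcher(result)
                      .replaceAll(Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }
}
```

The rewritten string would then be handed to the analyzer in place of the original text, at both index time and query time.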

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Fri, Dec 11, 2009 at 11:00 AM, Weiwei Wang <ww.wang.cs@gmail.com> wrote:

> Hi, all,
>     I designed an FTP search engine based on Lucene. I made a few
> modifications to the StandardTokenizer.
> My problem is:
>  C++ is tokenized as c by StandardTokenizer, and I want to recover it from
> the TokenStream that StandardTokenizer produces.
>
> What should I do?
>
> --
> Weiwei Wang
> Alex Wang
> 王巍巍
> Room 403, Mengmin Wei Building
> Computer Science Department
> Gulou Campus of Nanjing University
> Nanjing, P.R.China, 210093
>
> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>
