lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Weiwei Wang <ww.wang...@gmail.com>
Subject Re: Recover special terms from StandardTokenizer
Date Fri, 11 Dec 2009 13:43:21 GMT
Thanks, Koji

On Fri, Dec 11, 2009 at 7:59 PM, Koji Sekiguchi <koji@r.email.ne.jp> wrote:

> MappingCharFilter can be used to convert c++ to cplusplus.
>
> Koji
>
> --
> http://www.rondhuit.com/en/
>
>
>
> Anshum wrote:
>
>> How about getting the original token stream and then converting c++ to
>> cplusplus or anyother such transform. Or perhaps you might look at
>> using/extending(in the non java sense) some other tokenized!
>>
>> --
>> Anshum Gupta
>> Naukri Labs!
>> http://ai-cafe.blogspot.com
>>
>> The facts expressed here belong to everybody, the opinions to me. The
>> distinction is yours to draw............
>>
>>
>> On Fri, Dec 11, 2009 at 11:00 AM, Weiwei Wang <ww.wang.cs@gmail.com>
>> wrote:
>>
>>
>>
>>> Hi, all,
>>>    I designed a ftp search engine based on Lucene. I did a few
>>> modifications to the StandardTokenizer.
>>> My problem is:
>>>  C++ is tokenized as c from StandardTokenizer and I want to recover it
>>> from
>>> the TokenStream from StandardTokenizer
>>>
>>> What should I do?
>>>
>>> --
>>> Weiwei Wang
>>> Alex Wang
>>> 王巍巍
>>> Room 403, Mengmin Wei Building
>>> Computer Science Department
>>> Gulou Campus of Nanjing University
>>> Nanjing, P.R.China, 210093
>>>
>>> Homepage: http://cs.nju.edu.cn/rl/weiweiwang
>>>
>>>
>>>
>>
>>
>>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Weiwei Wang
Alex Wang
王巍巍
Room 403, Mengmin Wei Building
Computer Science Department
Gulou Campus of Nanjing University
Nanjing, P.R.China, 210093

Homepage: http://cs.nju.edu.cn/rl/weiweiwang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message