lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Reply Split Search Word
Date Mon, 08 Aug 2005 12:51:13 GMT
On Aug 8, 2005, at 7:44 AM, Karthik N S wrote:
>    I  would like to reformat the Question  slightly ,
>    Words without double Quotes may also be present in the String.
>    Also I have to apply the STOP - Analyzer  to filter out common  
> English
> words appearing within.
>
> Do u mind giving me a bit of src hint for the same...
> [ I am googled out of ideas ]

Well, the Analyzers chapter in Lucene in Action will give you all the  
basics you need to write custom pieces of an analyzer :)

A custom Tokenizer is where you'll want to start with, perhaps,  
though there are other possibilities such as tokenizing quotes  
separately and then having a filter buffer when they are encountered  
and combine multiple tokens into one when within quotes.  The only  
hints I have would be examples from within Lucene's codebase itself.   
You'll need to maintain some state (whether within quotes or not),  
but otherwise it should be relatively straightforward.

To provide many more hints would require implementing it myself, I'm  
afraid.

     Erik


>
>
> with regards
> karthik
>
>
> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: Monday, August 08, 2005 4:23 PM
> To: java-user@lucene.apache.org
> Subject: Re: Reply Split Search Word
>
>
> To have an analyzer split that string into 1-5 as you have listed
> will require you write a custom Analyzer to tokenize with double
> quotes in mind like that.
>
>      Erik
>
> On Aug 8, 2005, at 12:06 AM, Karthik N S wrote:
>
>
>> Hi
>>
>> Luceners
>>
>> Apologies.....
>>
>> As I have already replied,Using Analysis I have tried on all  
>> Analyzers
>> (including Standard Analyzer)
>> But not able to achive the required COMPLETS WORD Split.
>>
>> My I/p String would be a lengthy one as below
>>
>> String sKey = "\"" + "Dough Cutting" + "\"" +  "  " +  "Otis
>> Gospodnetic"  +
>> "   " +
>>               "\"" + "Erik Hatcher" + "\""  + "  " +    "Authors of
>> " + "\"" +
>> "Lucene In Action" + "\"";
>>
>> The required split of complete words should return
>>
>>    1) "Dough Cutting"
>>    2) Otis Gospodnetic
>>    3) "Erik Hatcher"
>>    4) Authors of
>>    5) "Lucene In Action"
>>
>> Plz Note :- Words with "\"" are complete split words....
>>
>> I am shure some Analyzer code inside Lucene is handling this task.
>>
>>
>> som how can one achive this task..
>>
>> with regards
>> Karthik
>>
>> -----Original Message-----
>> From: Mordo, Aviran (EXP N-NANNATEK) [mailto:aviran.mordo@lmco.com]
>> Sent: Friday, August 05, 2005 7:58 PM
>> To: java-user@lucene.apache.org
>> Subject: RE: Split Search Word
>>
>>
>> The StandardAnalyzer should work just fine with it, It will break the
>> search string to 5 search terms.
>>
>> HTH
>>
>> Aviran
>> http://www.aviransplace.com
>>
>>   _____
>>
>> From: Karthik N S [mailto:karthik@controlnet.co.in]
>> Sent: Friday, August 05, 2005 1:57 AM
>> To: LUCENE
>> Subject: Split Search Word
>>
>>
>>
>> Hi Luceners
>>
>> Apologies.....
>>
>> I  have along Search String as given below...
>>
>>
>>
>> SearchWord =  "\"" + "Dough Cutting" + "\"" +  "  " +  "Otis
>> Gospodnetic"  +  "   " + "\"" + "Erik Hatcher" + "\""  + "  " +
>>                            "Authors of " + "\"" + "Lucene In Action"
>> +"\"";
>>
>> And prior to searching the Index ,I need the Words to be Split.
>>
>> SearchWord   =
>>
>>    1)   "\"" + "Dough Cutting" + "\""
>>    2)   "Otis Gospodnetic"
>>    3)  "\"" + "Erik Hatcher" + "\""
>>    4)  "Authors of "
>>    5) "\"" +"Lucene In Action" +"\""
>>
>> I am shure some Analyzer within Lucene is performin the task.
>> So some body please Tell me Howto
>>
>> [ I already used Analysis/Paralysis code to check ,but no help ]
>>
>>
>>
>>
>> WITH WARM REGARDS
>> HAVE A NICE DAY
>> [ N.S.KARTHIK]
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message