lucene-java-user mailing list archives

From yamo93 <yam...@gmail.com>
Subject Re: Question on ElisionFilter with d'
Date Thu, 26 Jul 2012 08:10:49 GMT
Hi,

Sorry, I forgot the most important detail: I am using Lucene 3.6.

Here is my code:

    tokenStream = new ElisionFilter(Version.LUCENE_36, tokenStream);

I looked at the source code of ElisionFilter: DEFAULT_ARTICLES does not 
contain "d" or "c", which it would need in order to handle terms like 
"d'une" or "c'est".

A possible workaround would be to call this constructor 
ElisionFilter(Version matchVersion, TokenStream input, Set<?> articles).
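
To illustrate why the article set matters, here is a minimal plain-Java sketch of set-driven elision stripping. It does not use Lucene and is not the ElisionFilter implementation; the class name ElisionSketch, the method stripElision, and both article sets are hypothetical and chosen only for illustration (the actual contents of DEFAULT_ARTICLES should be checked in the 3.6 source). The point is only that a prefix is removed when, and only when, it is in the set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class ElisionSketch {

    // Hypothetical article set WITHOUT "d" and "c" (illustrative, not copied from Lucene).
    static final Set<String> WITHOUT_D_C =
        new HashSet<String>(Arrays.asList("l", "m", "t", "qu", "n", "s", "j"));

    // The same hypothetical set extended with "d" and "c".
    static final Set<String> WITH_D_C =
        new HashSet<String>(Arrays.asList("l", "m", "t", "qu", "n", "s", "j", "d", "c"));

    // Removes the text up to and including the first apostrophe,
    // but only if that text (lowercased) is in the article set.
    static String stripElision(String term, Set<String> articles) {
        int idx = -1;
        for (int i = 0; i < term.length(); i++) {
            char ch = term.charAt(i);
            if (ch == '\'' || ch == '\u2019') { // ASCII or typographic apostrophe
                idx = i;
                break;
            }
        }
        if (idx > 0) {
            String prefix = term.substring(0, idx).toLowerCase(Locale.ROOT);
            if (articles.contains(prefix)) {
                return term.substring(idx + 1);
            }
        }
        return term; // no apostrophe, or prefix not in the set: leave unchanged
    }

    public static void main(String[] args) {
        System.out.println(stripElision("d'une", WITHOUT_D_C));   // d'une
        System.out.println(stripElision("d'une", WITH_D_C));      // une
        System.out.println(stripElision("c'est", WITH_D_C));      // est
        System.out.println(stripElision("l'avion", WITHOUT_D_C)); // avion
    }
}
```

With Lucene itself, the equivalent step would be to build a set that includes "d" and "c" and pass it as the articles argument of the three-argument constructor quoted above.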

But I don't understand why "d" and "c" are not among the default 
articles.

Yann.

On 07/26/2012 03:52 AM, Jack Krupansky wrote:
> The filter should work (remove the letter and apostrophe).
>
> Could you supply an exact code fragment that shows the literal term, 
> the code invoking the filter, and the exact literal output?
>
> And, which release of Lucene?
>
> -- Jack Krupansky
>
> -----Original Message----- From: yamo93
> Sent: Wednesday, July 25, 2012 9:56 AM
> To: java-user@lucene.apache.org
> Subject: Re: Question on ElisionFilter with d'
>
> Thanks for replying,
>
> The problem is that the filter doesn't remove d' (or c' either).
> Should I open an issue in JIRA?
>
> On 07/25/2012 04:36 PM, Ian Lea wrote:
>> I bet it's expected.  From http://en.wikipedia.org/wiki/Elision_(French)
>>
>> In written French, elision (both phonetic and orthographic) is
>> obligatory for the following words:
>> ...
>>
>> the preposition de
>>   ...
>>   Le père d'Albert vient d'arriver.
>>
>>
>>
>> So surely the removal of d' is correct.
>>
>>
>> -- 
>> Ian.
>>
>>
>> On Wed, Jul 25, 2012 at 2:01 PM, yamo93 <yamo93@gmail.com> wrote:
>>> Hello,
>>>
>>> I'm using ElisionFilter to index French text.
>>> The filter works but ignores the letter d followed by an apostrophe
>>> (example: d'une).
>>>
>>> Is this expected behaviour, or is it a bug?
>>>
>>> Regards,
>>> Yann.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>


