lucene-solr-user mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: Is there an EdgeSingleFilter already?
Date Sun, 17 Mar 2013 17:27:19 GMT
Hi xavier,

Cool, thanks for the feedback, I'll commit later today (unless somebody objects), so it will
be part of the Lucene/Solr 4.3 release.

Steve

On Mar 17, 2013, at 1:21 PM, xavier jmlucjav <jmlucjav@gmail.com> wrote:

> Steve, worked like a charm.
> thanks!
> 
> 
> On Sun, Mar 17, 2013 at 7:37 AM, Steve Rowe <sarowe@gmail.com> wrote:
> 
>> See https://issues.apache.org/jira/browse/LUCENE-4843
>> 
>> Let me know if it works for you.
>> 
>> Steve
>> 
>> On Mar 16, 2013, at 5:35 PM, xavier jmlucjav <jmlucjav@gmail.com> wrote:
>> 
>>> I read your reply too fast, so I thought you meant configuring the
>>> LimitTokenPositionFilter. I see you mean I have to write one, ok...
>>> 
>>> 
>>> 
>>> On Sat, Mar 16, 2013 at 10:33 PM, xavier jmlucjav <jmlucjav@gmail.com> wrote:
>>> 
>>>> Steve,
>>>> 
>>>> Yes, I want only "one", "one two", and "one two three", but nothing else.
>>>> Cool, if this can be achieved without Java code, even better. I'll check
>>>> that filter.
>>>> 
>>>> I need this for building a field used for suggestions; the user
>>>> specifically wants to match only from the edge.
>>>> 
>>>> thanks!
>>>> 
>>>> On Sat, Mar 16, 2013 at 10:22 PM, Steve Rowe <sarowe@gmail.com> wrote:
>>>> 
>>>>> Hi xavier,
>>>>> 
>>>>> It's not clear to me what you want.  Is the "edge" you're referring to
>>>>> the beginning of a field? E.g. raw text "one two three four" with
>>>>> EdgeShingleFilter configured to produce unigrams, bigrams and trigrams
>>>>> would produce "one", "one two", and "one two three", but nothing else?
>>>>> 
>>>>> If so, I suspect writing a LimitTokenPositionFilter (which would stop
>>>>> emitting tokens after the token position exceeds a specified limit)
>>>>> would be better than subclassing ShingleFilter.  You could use
>>>>> LimitTokenCountFilter as a model, especially its "consumeAllTokens"
>>>>> option. I think this would make a nice addition to Lucene.
>>>>> 
>>>>> Also, what do you plan to use this for?
>>>>> 
>>>>> Steve
>>>>> 
>>>>> On Mar 16, 2013, at 5:02 PM, xavier jmlucjav <jmlucjav@gmail.com> wrote:
>>>>>> Hi,
>>>>>> 
>>>>>> I need to use shingles but only keep the ones that start from the edge.
>>>>>> 
>>>>>> I want to confirm there is no way to get this feature without
>>>>>> subclassing ShingleFilter, because I thought someone would have
>>>>>> already encountered this use case....
>>>>>> 
>>>>>> thanks
>>>>>> xavier
>>>>> 
>>>>> 
>>>> 
>> 
>> 
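
The approach discussed in this thread (shingle first, then stop emitting
tokens once the token position exceeds a limit) can be sketched in plain
Python. This is a simplified model, not Lucene code: the names `shingles`,
`limit_token_position`, and `edge_shingles` are hypothetical, and the sketch
assumes each shingle carries the 1-based position of its first word, so a
limit of 1 keeps only shingles that start at the edge.

```python
from typing import Iterable, Iterator, List, Tuple

# (term, position); position is the 1-based index of the shingle's first word.
# This is an assumption of the sketch, not a statement about Lucene internals.
Token = Tuple[str, int]


def shingles(words: List[str], max_size: int) -> Iterator[Token]:
    """Yield every shingle of 1..max_size words, grouped by start position."""
    for start in range(len(words)):
        for size in range(1, max_size + 1):
            if start + size <= len(words):
                yield " ".join(words[start:start + size]), start + 1


def limit_token_position(stream: Iterable[Token], max_position: int) -> Iterator[Token]:
    """Pass tokens through until one exceeds max_position, then stop."""
    for term, position in stream:
        if position > max_position:
            break  # later tokens only have higher positions, so stop early
        yield term, position


def edge_shingles(text: str, max_size: int = 3) -> List[str]:
    """Return only the shingles that begin at the first word of the text."""
    stream = shingles(text.split(), max_size)
    return [term for term, _ in limit_token_position(stream, max_position=1)]


if __name__ == "__main__":
    print(edge_shingles("one two three four"))
```

With the input "one two three four" and shingles of up to three words, the
sketch keeps "one", "one two", and "one two three" and drops everything that
starts at a later word.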

