lucene-dev mailing list archives

From Karl Wettin <karl.wet...@gmail.com>
Subject Re: [jira] Commented: (LUCENE-1320) ShingleMatrixFilter, a three dimensional permutating shingle filter
Date Thu, 04 Sep 2008 11:39:21 GMT
Right, but that's sort of a hassle :)

I'll see what I can do.


On 4 Sep 2008, at 04:36, Grant Ingersoll wrote:

> Or just remove the generics, right?
>
> On Sep 3, 2008, at 5:09 PM, Karl Wettin (JIRA) wrote:
>
>>
>>   [ https://issues.apache.org/jira/browse/LUCENE-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628132#action_12628132 ]
>>
>> Karl Wettin commented on LUCENE-1320:
>> -------------------------------------
>>
>> OK. Either remove it or place it in some alternative contrib
>> module? The first choice is obviously the easiest.
>>
>>> ShingleMatrixFilter, a three dimensional permutating shingle filter
>>> -------------------------------------------------------------------
>>>
>>>               Key: LUCENE-1320
>>>               URL: https://issues.apache.org/jira/browse/LUCENE-1320
>>>           Project: Lucene - Java
>>>        Issue Type: New Feature
>>>        Components: contrib/analyzers
>>>  Affects Versions: 2.3.2
>>>          Reporter: Karl Wettin
>>>          Assignee: Karl Wettin
>>>          Priority: Blocker
>>>           Fix For: 2.4
>>>
>>>       Attachments: LUCENE-1320.txt, LUCENE-1320.txt, LUCENE-1320.txt
>>>
>>>
>>> Backed by a column-focused matrix that creates all permutations of
>>> shingle tokens in three dimensions, i.e. it handles multi-token
>>> synonyms.
>>> It could, for instance, in some cases be used to replace 0-slop phrase
>>> queries with something speedier.
>>> {code:java}
>>> Token[][][]{
>>> {{hello}, {greetings, and, salutations}},
>>> {{world}, {earth}, {tellus}}
>>> }
>>> {code}
>>> passes the following test with 2- to 3-grams:
>>> {code:java}
>>> assertNext(ts, "hello_world");
>>> assertNext(ts, "greetings_and");
>>> assertNext(ts, "greetings_and_salutations");
>>> assertNext(ts, "and_salutations");
>>> assertNext(ts, "and_salutations_world");
>>> assertNext(ts, "salutations_world");
>>> assertNext(ts, "hello_earth");
>>> assertNext(ts, "and_salutations_earth");
>>> assertNext(ts, "salutations_earth");
>>> assertNext(ts, "hello_tellus");
>>> assertNext(ts, "and_salutations_tellus");
>>> assertNext(ts, "salutations_tellus");
>>> {code}
>>> Contains tests of varying complexity that demonstrate offsets,
>>> position increments, payload boost calculation and construction of a
>>> matrix from a token stream.
>>> The matrix attempts to hog as little memory as possible by seeking
>>> no more than maximumShingleSize columns forward in the stream and
>>> clearing up unused resources (columns and unique token sets). It can
>>> still be optimized quite a bit though.
>>
>> -- 
>> This message is automatically generated by JIRA.
>> -
>> You can reply to this email to add a comment to the issue online.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
>
>
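
For readers of the quoted issue description, here is a minimal sketch of the permutation it describes. It is not the ShingleMatrixFilter code from the patch: it uses plain strings instead of Token, ignores offsets, position increments and payloads, builds the whole matrix in memory, and assumes every column has at least one row. The class name ShingleMatrixSketch and the method shingles are hypothetical. It picks one row per column (first column varying fastest), flattens the picked rows into a token list, and emits each not-yet-seen shingle of minSize to maxSize tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ShingleMatrixSketch {

    /**
     * matrix[column][row][token]. Returns the unique shingles in the order
     * they are first produced. Assumes each column has at least one row.
     */
    public static List<String> shingles(String[][][] matrix, int minSize, int maxSize) {
        Set<String> seen = new LinkedHashSet<>(); // keeps first-seen order, drops repeats
        if (matrix.length == 0) {
            return new ArrayList<>();
        }
        int[] rowIndex = new int[matrix.length]; // currently selected row per column
        while (true) {
            // Flatten the selected row of every column into one token sequence.
            List<String> tokens = new ArrayList<>();
            for (int c = 0; c < matrix.length; c++) {
                tokens.addAll(Arrays.asList(matrix[c][rowIndex[c]]));
            }
            // Emit every shingle of minSize..maxSize tokens from each start position.
            for (int start = 0; start < tokens.size(); start++) {
                for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                    seen.add(String.join("_", tokens.subList(start, start + size)));
                }
            }
            // Advance to the next row combination, first column fastest.
            int c = 0;
            while (c < matrix.length && ++rowIndex[c] == matrix[c].length) {
                rowIndex[c] = 0;
                c++;
            }
            if (c == matrix.length) {
                break; // every combination has been visited
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        String[][][] matrix = {
            {{"hello"}, {"greetings", "and", "salutations"}},
            {{"world"}, {"earth"}, {"tellus"}}
        };
        for (String shingle : shingles(matrix, 2, 3)) {
            System.out.println(shingle);
        }
    }
}
```

With the matrix from the issue description and sizes 2 to 3, this sketch yields the same twelve shingles, in the same order, as the assertNext calls quoted above. Whether the actual filter iterates in exactly this way is an assumption on my part.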



