lucene-java-user mailing list archives

From Danil ŢORIN <torin...@gmail.com>
Subject Re: Small Vocabulary
Date Tue, 07 Aug 2012 11:15:50 GMT
To avoid wildcard queries, you can write a TokenFilter that emits
both tokens, "ADJ" and "ADJ:brown", at the same position.
That way you can use your index for both kinds of lookup without a wildcard query.
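
As a rough illustration of the idea, here is a minimal, library-free sketch in plain Java. The class TagTokenExpander, its Token type, and the expand method are hypothetical names made up for this example and are not part of Lucene. It assumes the incoming terms already have the form "TAG:word" and only models the position increments:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, library-free sketch of the token expansion; not Lucene's API.
class TagTokenExpander {

    /** A term plus its position increment relative to the previous token. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /**
     * For each input term of the form "TAG:word", emits "TAG:word" with
     * increment 1, then the bare "TAG" with increment 0, so both terms
     * share one position. Terms without a tag prefix pass through unchanged.
     */
    static List<Token> expand(List<String> terms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, 1));
            int colon = term.indexOf(':');
            if (colon > 0) {
                // Extra token stacked on the same position as the full term.
                out.add(new Token(term.substring(0, colon), 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("ADJ:brown");
        input.add("NOUN:fox");
        for (Token t : expand(input)) {
            System.out.println(t);
        }
    }
}
```

In an actual Lucene TokenFilter the same logic would live in incrementToken(), with the extra token's PositionIncrementAttribute set to 0. A plain term query for "ADJ" then matches any adjective, and one for "ADJ:brown" matches the specific word.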


On Tue, Aug 7, 2012 at 12:31 PM, Carsten Schnober
<schnober@ids-mannheim.de> wrote:
> Hi Danil,
>
>>> Just transform your input like "brown fox" into "ADJ:brown|<your
>>> payload> NOUN:fox|<other payload>"
>>
>> I understand that this denotes "ADJ" and "NOUN" to be interpreted as the
>> actual token and "brown" and "fox" as payloads (followed by <other
>> payload>), right?
>
> Sorry for replying to myself, but I've realised only now that you
> probably meant to replace the full token string ("brown") by "ADJ:brown"
> and use the payload otherwise, right? Regarding incoming queries, this
> method makes it necessary to perform a Wildcard query (e.g. "NOUN:*")
> when I am not interested in the actual text ("brown") -- which may
> happen more or less frequently -- am I right? However, this might be an
> acceptable trade-off...
> Best regards,
> Carsten
>
>
> --
> Institut für Deutsche Sprache | http://www.ids-mannheim.de
> Projekt KorAP                 | http://korap.ids-mannheim.de
> Tel. +49-(0)621-43740789      | schnober@ids-mannheim.de
> Korpusanalyseplattform der nächsten Generation
> Next Generation Corpus Analysis Platform
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


