lucene-java-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: What is "flexible indexing" in Lucene 4.0 if it's not the ability to make new postings codecs?
Date Thu, 13 Dec 2012 22:03:11 GMT
I should not have added that note. The OpenNLP patch gives a concrete
example of adding an annotation to text.
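
To make the token-versus-span distinction raised in the quoted reply
concrete, here is a minimal sketch in plain Java. It is not Lucene or
OpenNLP API: the class and method names (AnnotationSketch, TokenAnnotation,
SpanAnnotation, tagTokens, labelsCovering) are hypothetical, and the tags
are passed in by hand rather than produced by a real tagger. It only shows
that a label stored "at the same position" as a token carries one position,
while a span label needs both a start and an end.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only; none of these types are Lucene or OpenNLP classes.
public class AnnotationSketch {

    /** Token-level annotation: one label at exactly one token position. */
    static final class TokenAnnotation {
        final int position;
        final String label;

        TokenAnnotation(int position, String label) {
            this.position = position;
            this.label = label;
        }
    }

    /** Span-level annotation: one label over positions start..end, inclusive. */
    static final class SpanAnnotation {
        final int start;
        final int end;
        final String label;

        SpanAnnotation(int start, int end, String label) {
            if (end < start) {
                throw new IllegalArgumentException("end must be >= start");
            }
            this.start = start;
            this.end = end;
            this.label = label;
        }

        boolean covers(int position) {
            return start <= position && position <= end;
        }
    }

    /** Pairs each token position with the tag supplied for it by the caller. */
    static List<TokenAnnotation> tagTokens(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("one tag per token is required");
        }
        List<TokenAnnotation> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            out.add(new TokenAnnotation(i, tags[i]));
        }
        return out;
    }

    /** Returns the labels of every span that covers the given position. */
    static List<String> labelsCovering(List<SpanAnnotation> spans, int position) {
        List<String> labels = new ArrayList<>();
        for (SpanAnnotation span : spans) {
            if (span.covers(position)) {
                labels.add(span.label);
            }
        }
        return labels;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "quick", "fox", "runs"};
        String[] tags = {"DT", "JJ", "NN", "VBZ"};
        List<TokenAnnotation> partsOfSpeech = tagTokens(tokens, tags);

        // A noun-phrase span over the first three tokens.
        List<SpanAnnotation> spans = new ArrayList<>();
        spans.add(new SpanAnnotation(0, 2, "NP"));

        for (TokenAnnotation t : partsOfSpeech) {
            System.out.println(tokens[t.position] + " " + t.label
                    + " " + labelsCovering(spans, t.position));
        }
    }
}
```

A token annotation here has no way to say where a phrase ends, which is
the gap Glen points out below; the span type records that explicitly.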

On 12/13/2012 01:54 PM, Glen Newton wrote:
> It is not clear this is exactly what is needed/being discussed.
>
> From the issue:
> "We are also planning a Tokenizer/TokenFilter that can put parts of
> speech as either payloads (PartOfSpeechAttribute?) on a token or at
> the same position."
>
> This adds it to a token, not a span. 'same position' does not suggest
> it also records the end position.
>
> -Glen
>
> On Thu, Dec 13, 2012 at 4:45 PM, Lance Norskog <goksron@gmail.com> wrote:
>> Part-of-speech tagging is available now, in the indexer.
>>
>> LUCENE-2899 adds OpenNLP to the Lucene & Solr codebase. It does
>> part-of-speech tagging, chunking and Named Entity Recognition. OpenNLP is
>> an Apache project for natural-language processing.
>>
>> Some parts are in Solr that could be in Lucene.
>>
>> https://issues.apache.org/jira/browse/lucene-2899
>>
>>
>> On 12/12/2012 12:02 PM, Wu, Stephen T., Ph.D. wrote:
>>>>> Is there any (preliminary) code checked in somewhere that I can look at,
>>>>> that would help me understand the practical issues that would need to be
>>>>> addressed?
>>>> Maybe we can make this more concrete: what new attribute are you
>>>> needing to record in the postings and access at search time?
>>> For example:
>>>    - part of speech of a token.
>>>    - syntactic parse subtree (over a span).
>>>    - semantically normalized phrase (to canonical text or ontological
>>> code).
>>>    - semantic group (of a span).
>>>    - coreference link.
>>>
>>> stephen
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>
>



