lucene-dev mailing list archives

From "Martin Braun (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (LUCENE-6061) Add Support for something different than Strings in Highlighting (FastVectorHighlighter)
Date Mon, 17 Nov 2014 18:05:33 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-6061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14214909#comment-14214909 ]

Martin Braun edited comment on LUCENE-6061 at 11/17/14 6:04 PM:
----------------------------------------------------------------

Ok, here it goes:

I do part-of-speech analysis on marked-up texts. Each word in the text is marked up
as <prefix>P<stem>S<suffix>.
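
As a rough illustration of the markup described above, here is a minimal sketch that splits one marked-up word into its three parts. It assumes that 'P' and 'S' are literal separator characters and that the word parts themselves never contain them (for example because they are lowercase); the comment does not spell this out. The class name MarkedUpWord is hypothetical and not taken from the repository.

```java
// Hypothetical helper, not from the LuceneBeanExtension repository.
// Assumption: 'P' ends the prefix, 'S' ends the stem, and neither
// character occurs inside the word parts themselves.
final class MarkedUpWord {
    final String prefix;
    final String stem;
    final String suffix;

    MarkedUpWord(String prefix, String stem, String suffix) {
        this.prefix = prefix;
        this.stem = stem;
        this.suffix = suffix;
    }

    /** Splits a token of the form <prefix>P<stem>S<suffix>. */
    static MarkedUpWord parse(String token) {
        int p = token.indexOf('P');
        int s = token.indexOf('S', p + 1);
        if (p < 0 || s < 0) {
            throw new IllegalArgumentException("not marked up: " + token);
        }
        return new MarkedUpWord(
                token.substring(0, p),
                token.substring(p + 1, s),
                token.substring(s + 1));
    }

    /** The word as it would appear without any markup. */
    String original() {
        return prefix + stem + suffix;
    }
}
```

With this sketch, parsing "unPhappySness" yields the prefix "un", the stem "happy" and the suffix "ness"; a word without a prefix, such as "PgoSes", yields an empty prefix.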

I have the following fields:

- one field stores the original version of the text
- one field stores the text with only the prefixes of words
- one field stores the text with only the stems of words
- one field stores the text with only the suffixes of words

I keep all attribute data present on the fields but remove all unnecessary tokens (e.g. words
with no prefix are dropped from the prefix field).

I want to be able to search on the prefix field and highlight the match in the original version
of the text, because the prefix field on its own is not of much help to the user.
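
The idea in the two paragraphs above can be sketched without Lucene: if each token in the prefix field keeps the character offsets of its word in the original text, a match on the prefix field can be mapped straight back to the original. This is only a sketch under assumptions. The Token class, both method names, the single-space word separator and the <b> tags are all hypothetical, and in a real index the offsets would come from the analysis chain's attributes rather than from arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, Lucene-free illustration of "search one field, highlight another".
final class PrefixFieldSketch {

    /** A prefix token plus the offsets of its whole word in the original text. */
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /** Builds the prefix-field tokens: words without a prefix are dropped, offsets are kept. */
    static List<Token> prefixTokens(String[] words, String[] prefixes) {
        List<Token> tokens = new ArrayList<>();
        int offset = 0;
        for (int i = 0; i < words.length; i++) {
            int end = offset + words[i].length();
            if (!prefixes[i].isEmpty()) {
                tokens.add(new Token(prefixes[i], offset, end));
            }
            offset = end + 1; // assumption: words are separated by exactly one space
        }
        return tokens;
    }

    /** Wraps the original-text span of every token equal to the query in <b> tags. */
    static String highlight(String original, List<Token> tokens, String query) {
        StringBuilder out = new StringBuilder();
        int last = 0;
        for (Token t : tokens) {
            if (t.text.equals(query)) {
                out.append(original, last, t.start)
                   .append("<b>").append(original, t.start, t.end).append("</b>");
                last = t.end;
            }
        }
        return out.append(original.substring(last)).toString();
    }
}
```

For the original text "unhappy dog rerun" with prefixes "un", "" and "re", the prefix field holds only two tokens, and a query for "re" marks the whole word "rerun" in the original text.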

My utility version of the FastVectorHighlighter in the GitHub repo supports this behaviour:
it builds on the FastVectorHighlighter (the code doing the heavy lifting is currently copied
from it) and adds support for rendering highlights into arbitrary objects via the ObjectEncoder<T>
interface.
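
To make "rendering highlights into arbitrary objects" more concrete, here is a minimal sketch of a generic encoder. It is not the ObjectEncoder<T> interface from the repository, whose actual signature may differ; FragmentEncoder<T>, Span and render are hypothetical names, and the spans are assumed to be sorted and non-overlapping.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real ObjectEncoder<T> in the repository may look different.
final class ObjectHighlightSketch {

    /** Turns one piece of text into a caller-chosen type T. */
    interface FragmentEncoder<T> {
        T encode(String fragment, boolean highlighted);
    }

    /** A highlighted region [start, end) in the original text. */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Renders the text as a list of T, one element per plain or highlighted piece. */
    static <T> List<T> render(String text, List<Span> spans, FragmentEncoder<T> encoder) {
        List<T> out = new ArrayList<>();
        int last = 0;
        for (Span s : spans) { // assumption: sorted, non-overlapping spans
            if (s.start > last) {
                out.add(encoder.encode(text.substring(last, s.start), false));
            }
            out.add(encoder.encode(text.substring(s.start, s.end), true));
            last = s.end;
        }
        if (last < text.length()) {
            out.add(encoder.encode(text.substring(last), false));
        }
        return out;
    }
}
```

A String-producing encoder reproduces the usual behaviour, while any other T (a UI node, a DTO, a styled text run) can be produced by swapping the encoder, which is the point of making the return type generic.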


was (Author: s4ke):
Ok, here it goes:

I do part-of-speech analysis on marked-up texts. Each word in the text is marked up
as <prefix>P<stem>S<suffix>.

I have the following fields:

- one field stores the original version of the text
- one field stores the text with only the prefixes of words
- one field stores the text with only the stems of words
- one field stores the text with only the suffixes of words

I keep all attribute data present on the fields but remove all unnecessary tokens (e.g. words
with no prefix are dropped from the prefix field).

I want to be able to search on the prefix field and highlight the match in the original version
of the text, because the prefix field on its own is not of much help to the user.

My utility version of the FastVectorHighlighter in the GitHub repo supports this behaviour.

> Add Support for something different than Strings in Highlighting (FastVectorHighlighter)
> ----------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6061
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6061
>             Project: Lucene - Core
>          Issue Type: Wish
>          Components: core/search, modules/highlighter
>    Affects Versions: Trunk
>            Reporter: Martin Braun
>            Priority: Critical
>              Labels: FastVectorHighlighter, Highlighter, Highlighting
>             Fix For: 4.10.2, 5.0, Trunk
>
>
> In my application I need Highlighting and I stumbled upon the really neat FastVectorHighlighter.
> One problem appeared though. It lacks a way to render the Highlights into something different
> than Strings, so I rearranged some of the code to support that:
> https://github.com/Hotware/LuceneBeanExtension/blob/master/src/main/java/de/hotware/lucene/extension/highlight/FVHighlighterUtil.java
> Is there a specific reason to only support String[] as a return type? If not, I would
> be happy to write a new class that supports rendering into a generic Type and rewire that
> into the existing class (or just do it as an addition and leave the current class be).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

