lucene-java-user mailing list archives

From "Uwe Schindler" <>
Subject RE: How to store custom token attribute in Lucene Index ?
Date Wed, 04 Jun 2014 17:54:15 GMT
At the moment, the only attribute you can store in the index is the PayloadAttribute. In general,
the way to go is to add another TokenFilter at the end of your indexing chain that serializes
all of those attributes into a single payload. On the search side, there are several ways to
access the payloads (all position-related queries, such as span queries, can use them), but in
most cases you will have to write a custom query.
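
As an illustration of the serialization step, here is a minimal sketch in plain Java with no
Lucene dependency. The class RdfPayloadCodec, its node type codes, and the byte layout are
assumptions made up for this example; they are not part of Lucene or of the original question.
It packs the node type, language, and datatype into one byte array and reads them back.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not a Lucene class): packs three RDF attributes
// into a single byte array that could be stored as one payload.
final class RdfPayloadCodec {
    // Illustrative node type codes, assumed for this sketch.
    static final byte URI = 0;
    static final byte BNODE = 1;
    static final byte PLAIN_LITERAL = 2;
    static final byte LANG_LITERAL = 3;
    static final byte TYPED_LITERAL = 4;

    // Assumed layout: 1 byte node type, then two UTF-8 strings,
    // each prefixed with an unsigned 16-bit length.
    static byte[] encode(byte nodeType, String language, String datatype) {
        byte[] lang = utf8(language);
        byte[] dt = utf8(datatype);
        ByteBuffer buf = ByteBuffer.allocate(1 + 2 + lang.length + 2 + dt.length);
        buf.put(nodeType);
        buf.putShort((short) lang.length);
        buf.put(lang);
        buf.putShort((short) dt.length);
        buf.put(dt);
        return buf.array();
    }

    static Decoded decode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.wrap(payload);
        byte nodeType = buf.get();
        String language = readString(buf);
        String datatype = readString(buf);
        return new Decoded(nodeType, language, datatype);
    }

    // A null value is stored as an empty string.
    private static byte[] utf8(String s) {
        byte[] b = (s == null ? "" : s).getBytes(StandardCharsets.UTF_8);
        if (b.length > 0xFFFF) {
            throw new IllegalArgumentException("value too long for a 16-bit length prefix");
        }
        return b;
    }

    private static String readString(ByteBuffer buf) {
        int len = buf.getShort() & 0xFFFF; // read the length as unsigned
        byte[] b = new byte[len];
        buf.get(b);
        return new String(b, StandardCharsets.UTF_8);
    }

    static final class Decoded {
        final byte nodeType;
        final String language;
        final String datatype;

        Decoded(byte nodeType, String language, String datatype) {
            this.nodeType = nodeType;
            this.language = language;
            this.datatype = datatype;
        }
    }
}
```

Inside the final TokenFilter, the resulting bytes would be wrapped in a BytesRef and set on the
PayloadAttribute in incrementToken(); a custom query would call decode() on the payload it reads
back.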

Please note: payloads are stored per position, so a payload is saved for every occurrence of
a term (if the same term occurs 5 times in a document, 5 payloads are stored in the index,
one for each position).

It is currently not possible to attach a payload to a term as a whole (for the case where a
term always has the same payload). If you want that in your index, you can instead add another
TokenFilter at the end that appends your attribute to the term (like "term#customAttribute").
When querying, the query analyzer does the same and so produces the same term/attribute
combination. In that case, the default queries work (they just have to use the analyzer to
produce the correct term to query for).
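
The term-suffix idea can be sketched in plain Java as well. TermAttributeSuffix is a
hypothetical helper, not a Lucene class, and the rule that the attribute value must not contain
the separator is an assumption added for this example so that a composed term can be split
again unambiguously. The point is only that index-time and query-time analysis must call the
same function, so both sides produce identical term strings.

```java
// Hypothetical helper (not a Lucene class): builds and splits
// "term#customAttribute" strings.
final class TermAttributeSuffix {
    static final char SEPARATOR = '#';

    // Appends the attribute to the term. The attribute may not contain the
    // separator (an assumption of this sketch); the term itself may.
    static String compose(String term, String attribute) {
        if (attribute.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("attribute must not contain '" + SEPARATOR + "'");
        }
        return term + SEPARATOR + attribute;
    }

    // Splits at the last separator and returns {term, attribute}.
    static String[] split(String composed) {
        int i = composed.lastIndexOf(SEPARATOR);
        if (i < 0) {
            throw new IllegalArgumentException("no separator in: " + composed);
        }
        return new String[] { composed.substring(0, i), composed.substring(i + 1) };
    }
}
```

For example, compose("paris", "fr") at index time and compose("paris", "fr") at query time both
give "paris#fr", so an ordinary term query matches, while compose("paris", "en") gives a
different term and does not.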


Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: Stephane Fellah []
> Sent: Wednesday, June 04, 2014 7:35 PM
> To:
> Subject: How to store custom token attribute in Lucene Index ?
> Hi,
> I want to create a Lucene analyzer for RDF nodes. RDF nodes can have
> multiple types (uri, bnode, plain literal, plain literal with language, typed
> literal with datatype). While analyzing the term, I want to create a
> RDFNodeTypeAttribute, LanguageAttribute and DatatypeAttribute to store
> respectively the type of RDF node, the language of the literal and the
> datatype attribute. My question is how these attributes can be stored in
> Lucene index. Do I have to write a custom Codecs ? Do I have to use the
> PayloadAttribute ? How can I leverage these attributes once stored in the
> index for my search ?
> Thank you for your help
> --
> Stephane Fellah
> Chief  Knowledge Scientist
> Image Matters LLC
> +(571) 502 8478
