lucene-java-user mailing list archives

From José Tomás Atria <jtat...@gmail.com>
Subject Indexing strategies for metadata fields
Date Wed, 24 May 2017 17:30:03 GMT
Hello all,

I'm trying to come up with a reasonable indexing strategy for my documents'
metadata, and I'm seeing some odd, undocumented behaviour.

My original approach was to build fields like these:

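import org.apache.lucene.document.FieldType;
import org.apache.lucene.index.DocValuesType;
import org.apache.lucene.index.IndexOptions;

FieldType ft = new FieldType();
ft.setDocValuesType( DocValuesType.SORTED ); // doc values, for sorting/faceting
ft.setIndexOptions( IndexOptions.DOCS );     // indexed (doc ids only), for searching
ft.setStored( true );                        // stored, for value retrieval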

thinking that it would be useful to have the doc's metadata available for
searching, sorting and faceting and for value retrieval from doc numbers.
However, for some mysterious reason, having fields with the above
configuration resulted in an empty index after merging segments:

$ ls indexDir
write.lock segments_1

even though I could see the index writer creating index files in the dir
during indexing.

Regardless of which combination of FieldType options I used, adding
those fields to documents when indexing always produced the same empty
index after merge/commit, but I would have to test more thoroughly to be
sure of this.

So, first of all: *any clue why the merge is wiping out the index when, and
only when, these docValues/stored fields are added?* Should I try to
reproduce it and file a bug?

Second: *is it possible to have one field that is docValues, stored and
indexed, all at once?* If not, why not? And shouldn't FieldType warn you
about this (like Field warns about non-stored, non-indexed fields)?

Finally: *if this is not possible, what is the suggested strategy to make a
given metadata field accessible from documents and useful for
sorting/faceting?* Adding it twice?
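
To be concrete about what "adding it twice" would mean, here is a minimal
sketch, assuming the stock StringField and SortedDocValuesField classes
registered under the same field name. The class MetadataFields and the
method addMetadata are names I made up for illustration; they are not part
of Lucene, and I have not confirmed that this is the recommended approach.
Faceting through the facet module may need its own field types, which this
does not cover.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.SortedDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.util.BytesRef;

// Hypothetical helper (not part of Lucene): adds one metadata value twice.
public class MetadataFields {

    public static void addMetadata(Document doc, String name, String value) {
        // Indexed as a single token and stored: searchable, and the value
        // can be read back from the stored document.
        doc.add(new StringField(name, value, Field.Store.YES));

        // Doc values under the same field name: a per-document value that
        // can be used for sorting.
        doc.add(new SortedDocValuesField(name, new BytesRef(value)));
    }
}
```

Each metadata value would then go through addMetadata( doc, "author", "jta" )
before the document is handed to the index writer.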

thanks!
jta


-- 

sent from a phone. please excuse terseness and tpyos.

