lucene-java-user mailing list archives

From Hans Lund <ha.l...@gmail.com>
Subject Re: FieldValueQuery
Date Thu, 08 Dec 2016 15:42:52 GMT
That would certainly be a solution, but it has the drawback of doubling the
number of indexed fields per document.
Looking at the field stats where this is needed, we have around 600 fields
per "document", most of them already having doc values. Adding 600 new
fields instead of 15 BinaryDocValuesField instances also seems like a
not-so-pretty solution.
Adding to the complexity: document creation is done in a pluggable manner,
so fields can be added by someone else's code ;-)

As of now I have just extended the IndexWriter: it analyzes the
IndexableFields during updateDocument and adds a BinaryDocValuesField where
needed. It works, but looping through the fields to collect the field names
that need a doc value ... hmm.

A better approach could be extending the Document, letting the iterator do
the inspection and emit the needed marker fields?
(It would make testing the StringField vs. BinaryDocValuesField strategy
very simple ;-))
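
One caveat for the Document idea: Document is a final class in Lucene, so in
practice "extending" it would mean wrapping it. IndexWriter.addDocument and
updateDocument accept any Iterable<? extends IndexableField>, so a wrapper of
that shape could be passed in directly. Below is a minimal, Lucene-free sketch
of the iterator logic only. MarkerAppendingFields, SimpleField and the
"marker_" prefix used in the usage note are hypothetical names made up for
illustration; in real code F would presumably be IndexableField, the predicate
would check the field's doc values type, and the marker factory would create
an empty BinaryDocValuesField.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical stand-in for a Lucene field: a name plus a doc-values flag. */
final class SimpleField {
    final String name;
    final boolean hasDocValues;

    SimpleField(String name, boolean hasDocValues) {
        this.name = name;
        this.hasDocValues = hasDocValues;
    }
}

/**
 * Streams the wrapped fields unchanged, then appends one marker per distinct
 * field name that matched the predicate.
 */
final class MarkerAppendingFields<F> implements Iterable<F> {
    private final Iterable<? extends F> fields;
    private final Predicate<? super F> needsMarker;
    private final Function<? super F, String> nameOf;
    private final Function<String, ? extends F> markerFor;

    MarkerAppendingFields(Iterable<? extends F> fields,
                          Predicate<? super F> needsMarker,
                          Function<? super F, String> nameOf,
                          Function<String, ? extends F> markerFor) {
        this.fields = fields;
        this.needsMarker = needsMarker;
        this.nameOf = nameOf;
        this.markerFor = markerFor;
    }

    @Override
    public Iterator<F> iterator() {
        final Iterator<? extends F> source = fields.iterator();
        // Names are de-duplicated so a multi-valued field yields a single marker.
        final Set<String> pending = new LinkedHashSet<>();
        return new Iterator<F>() {
            private Iterator<String> markerNames; // created once the source is drained

            @Override
            public boolean hasNext() {
                if (source.hasNext()) {
                    return true;
                }
                if (markerNames == null) {
                    markerNames = pending.iterator();
                }
                return markerNames.hasNext();
            }

            @Override
            public F next() {
                if (source.hasNext()) {
                    F field = source.next();
                    if (needsMarker.test(field)) {
                        pending.add(nameOf.apply(field));
                    }
                    return field;
                }
                if (markerNames == null) {
                    markerNames = pending.iterator();
                }
                // Throws NoSuchElementException when nothing is left.
                return markerFor.apply(markerNames.next());
            }
        };
    }
}
```

With the stand-in type this would be used roughly as
new MarkerAppendingFields<SimpleField>(doc, f -> !f.hasDocValues, f -> f.name,
n -> new SimpleField("marker_" + n, true)). The inspection happens in the same
pass in which the fields are consumed, so no separate loop over the fields is
needed, and swapping the marker factory is all it takes to compare the
StringField and BinaryDocValuesField strategies.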

What are the drawbacks of having such a 'marker' doc value that has no
actual value?

Hans Lund




On Thu, Dec 8, 2016 at 2:51 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Unlike for doc values fields, Lucene does not store this information
> (which documents have a given indexed field) efficiently and so there
> is no query for it.
>
> If this is important to you, you could add another field for each
> indexed field?  E.g. if the document has field foo, you would also
> index has_field_foo e.g. as a StringField with the same text token
> like "1".  Then at search time you can do a TermQuery on
> has_field_foo:1.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Wed, Dec 7, 2016 at 8:39 AM, Hans Lund <ha.lund@gmail.com> wrote:
> > Hi All
> >
> > As far as I can see FieldValueQuery ends up with fetching Bits from
> > DocValues.
> >
> > But I need similar functionality for fields without doc values, such as
> > StringField and TextField, and was wondering if someone has had the
> > same issue and found a good solution.
> >
> > I'm also having trouble figuring out what the purpose of the query is
> > from a usage perspective, as it is a highly specialized query for questions
> > like: find docs that can be sorted on field "foo".
> >
> > For now I've worked around it by extending the IndexWriter and, within the
> > addDocument method, creating a new BinaryDocValuesField
> > with an empty BytesRef for every IndexableField whose DocValuesType ==
> > DocValuesType.NONE.
> >
> > It works but is not a pretty solution. Are there any alternatives?
> >
> > /Hans Lund
>
