lucene-java-user mailing list archives

From José Tomás Atria <jtat...@gmail.com>
Subject Re: Single string automaton causes NPE on Terms.intersect( CompiledAutomaton, BytesRef term )
Date Fri, 25 Mar 2016 23:11:51 GMT
Ok, digging a little more, I found that the problem mentioned above seems
to be caused by FieldReader overriding the
intersect( CompiledAutomaton, BytesRef )
<https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/index/Terms.html#intersect(org.apache.lucene.util.automaton.CompiledAutomaton,%20org.apache.lucene.util.BytesRef)>
method in Terms.

The overridden method in Terms checks whether the compiled automaton is
AUTOMATON_TYPE.NORMAL and, if it isn't, throws an IllegalArgumentException
that instructs one to use CompiledAutomaton.getTermsEnum( Terms ) instead:
    if (compiled.type != CompiledAutomaton.AUTOMATON_TYPE.NORMAL) {
      throw new IllegalArgumentException("please use CompiledAutomaton.getTermsEnum instead");
    }

which, of course, works perfectly, so I'm doing that now and the problem is
no more.

However, the overriding method in FieldReader just assumes that the compiled
automaton is AUTOMATON_TYPE.NORMAL, which causes the above NPE, because the
runAutomaton of a non-normal CompiledAutomaton is set to null in the
CompiledAutomaton constructor, lines 191 to 209:

IntsRef singleton = Operations.getSingleton(automaton);

if (singleton != null) {
  // matches a fixed string
  type = AUTOMATON_TYPE.SINGLE;
  commonSuffixRef = null;
  runAutomaton = null; // <- HERE!
  this.automaton = null;
  this.finite = null;

  if (isBinary) {
    term = StringHelper.intsRefToBytesRef(singleton);
  } else {
    term = new BytesRef(UnicodeUtil.newString(singleton.ints,
        singleton.offset, singleton.length));
  }
  sinkState = -1;
  return;
}

Not to pretend I have any idea of what I'm talking about, but given that
the user has relatively little control over which implementation of Terms
they get at runtime (this user at least), shouldn't the overriding method in
FieldReader also check the AUTOMATON_TYPE and throw an equally informative
IllegalArgumentException instead, just for the sake of consistency?
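
To make the suggestion concrete, here is a minimal sketch of the guard I
have in mind. It is a toy model, not Lucene code: GuardSketch, Type and
Compiled are hypothetical stand-ins for FieldReader, AUTOMATON_TYPE and
CompiledAutomaton, and the real intersect method returns a TermsEnum rather
than a String. The only point is that the type is checked before
runAutomaton is touched.

```java
import java.util.Objects;

public class GuardSketch {
    // Hypothetical stand-in for CompiledAutomaton.AUTOMATON_TYPE (two values only).
    enum Type { SINGLE, NORMAL }

    // Hypothetical stand-in for CompiledAutomaton.
    static final class Compiled {
        final Type type;
        final Object runAutomaton; // null for SINGLE, mirroring the constructor quoted above

        private Compiled(Type type, Object runAutomaton) {
            this.type = type;
            this.runAutomaton = runAutomaton;
        }

        static Compiled single() { return new Compiled(Type.SINGLE, null); }
        static Compiled normal() { return new Compiled(Type.NORMAL, new Object()); }
    }

    // Sketch of an overriding intersect() that validates the type up front,
    // so callers get an informative exception instead of an NPE.
    static String intersect(Compiled compiled) {
        if (compiled.type != Type.NORMAL) {
            throw new IllegalArgumentException(
                "please use CompiledAutomaton.getTermsEnum instead");
        }
        // Past the guard, runAutomaton is non-null in this toy model.
        Objects.requireNonNull(compiled.runAutomaton);
        return "intersected";
    }

    public static void main(String[] args) {
        System.out.println(intersect(Compiled.normal()));
        try {
            intersect(Compiled.single());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this, passing a single-string automaton fails with the
same message that the method in Terms already produces.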

Sorry if all of the above is a little off topic for this list :)

Best,
jta


On Fri, Mar 25, 2016 at 4:33 PM, José Tomás Atria <jtatria@gmail.com> wrote:

> Hello again!
>
> I'm playing around some more with Lucene's automata, and I've bumped into
> something unexpected but can't figure out if it's a bug or an error on my
> part.
>
> Briefly: Is it possible to use a single string automaton (i.e. the result
> of Automata.makeString( String ) ) to intersect a Terms instance? I keep
> getting NPEs on every attempt at doing this, e.g. this code:
>
> // where "term" is a term known to exist in someField
> CompiledAutomaton ca = new CompiledAutomaton( Automata.makeString( "term" ) );
> Terms terms = leafReader.terms( someField );
> TermsEnum tEnum = terms.intersect( ca, null );
>
> results in:
> Exception in thread "main" java.lang.NullPointerException
> at org.apache.lucene.codecs.blocktree.IntersectTermsEnum.<init>(IntersectTermsEnum.java:127)
> at org.apache.lucene.codecs.blocktree.FieldReader.intersect(FieldReader.java:185)
>
> I assume I'm doing something wrong (I am aware that using an automaton for
> a single term may be a bad idea, but bear with me), but the fact that it's
> throwing an NPE prompted me to come and ask...
>
> Maybe there's a problem with encodings?
>
> Any help greatly appreciated.
> jta.
>
> --
> entia non sunt multiplicanda praeter necessitatem
>



-- 
entia non sunt multiplicanda praeter necessitatem
