lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Single string automaton causes NPE on Terms.intersect( CompiledAutomaton, BytesRef term )
Date Tue, 29 Mar 2016 15:52:39 GMT
Oh, you just create a free Jira account here:
https://issues.apache.org/jira/browse/LUCENE

Then open an issue, take the code fragment you have that hits the
NPE, and attach it to the issue.  We can iterate from there.

A "test case" is just a chunk of code added into one of our test
source files (there are many, look under e.g. lucene/core/src/test/...
*.java) that "fails" when run, and then passes (no exception) when we
fix the issue.
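
To illustrate that fail-then-pass pattern, here is a rough,
self-contained sketch. It deliberately does not use Lucene: every class
and method name in it (FailThenPassSketch, Compiled, intersectUnguarded,
intersectGuarded, thrownBy) is a hypothetical stand-in, and only the
exception message is taken from the thread below. It assumes the
eventual fix is a type check like the one in Terms, which has not been
decided yet.

```java
public class FailThenPassSketch {

    // Hypothetical stand-in for the automaton type; not Lucene's enum.
    enum Type { NORMAL, SINGLE }

    // Hypothetical stand-in for a compiled automaton: only NORMAL
    // instances carry a non-null runAutomaton.
    static final class Compiled {
        final Type type;
        final int[] runAutomaton;

        Compiled(Type type) {
            this.type = type;
            this.runAutomaton = (type == Type.NORMAL) ? new int[] {0} : null;
        }
    }

    // "Before the fix": dereferences runAutomaton without checking the type.
    static int intersectUnguarded(Compiled c) {
        return c.runAutomaton.length;
    }

    // "After the fix" (assumed): rejects non-NORMAL input up front.
    static int intersectGuarded(Compiled c) {
        if (c.type != Type.NORMAL) {
            throw new IllegalArgumentException(
                "please use CompiledAutomaton.getTermsEnum instead");
        }
        return c.runAutomaton.length;
    }

    // Returns the simple class name of whatever the call throws, or "none".
    static String thrownBy(Runnable call) {
        try {
            call.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Compiled single = new Compiled(Type.SINGLE);
        // The unguarded path fails with an NPE ...
        System.out.println(thrownBy(() -> intersectUnguarded(single)));
        // ... while the guarded path fails with an informative exception.
        System.out.println(thrownBy(() -> intersectGuarded(single)));
    }
}
```

A real test case would make the same kind of check against an actual
index and the real Terms.intersect call, inside one of the existing
test classes.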

Mike McCandless

http://blog.mikemccandless.com


On Mon, Mar 28, 2016 at 3:48 PM, José Tomás Atria <jtatria@gmail.com> wrote:
> Hi Mike,
>
> I'd be happy to, but I have never used JIRA before and I don't entirely
> understand what you mean by adding a test case as a patch (academic
> programmer here, we are notoriously ignorant of established development
> practices :P).
>
> thanks!
> jta
>
> On Fri, Mar 25, 2016 at 7:54 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Hi José,
>>
>> Can you please open a Jira issue about this, and add a test case as a
>> patch, if you can?  I think it's bad you hit an NPE!  Not sure how
>> best to fix it, but we can iterate on the issue.
>>
>> Thanks!
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Fri, Mar 25, 2016 at 7:11 PM, José Tomás Atria <jtatria@gmail.com>
>> wrote:
>> > Ok, digging a little more, I found that the problem mentioned above seems
>> > to be caused by FieldReader overriding the intersect( CompiledAutomaton,
>> > BytesRef ) method in Terms:
>> > https://lucene.apache.org/core/5_5_0/core/org/apache/lucene/index/Terms.html#intersect(org.apache.lucene.util.automaton.CompiledAutomaton,%20org.apache.lucene.util.BytesRef)
>> >
>> > The overridden method checks whether the compiled automaton is
>> > AUTOMATON_TYPE.NORMAL and, if it is not, throws an IllegalArgumentException
>> > that instructs one to use CompiledAutomaton.getTermsEnum( Terms ) instead:
>> >     if (compiled.type != CompiledAutomaton.AUTOMATON_TYPE.NORMAL) {
>> >       throw new IllegalArgumentException("please use CompiledAutomaton.getTermsEnum instead");
>> >     }
>> >
>> > which, of course, works perfectly, so I'm doing that now and the problem
>> > is no more.
>> >
>> > However, the method in FieldReader just assumes that the compiled
>> > automaton is AUTOMATON_TYPE.NORMAL, which causes the above NPE, because
>> > the runAutomaton of a non-normal CompiledAutomaton is set to null in the
>> > constructor, lines 191 to 209:
>> >
>> > IntsRef singleton = Operations.getSingleton(automaton);
>> >
>> > if (singleton != null) {
>> >   // matches a fixed string
>> >   type = AUTOMATON_TYPE.SINGLE;
>> >   commonSuffixRef = null;
>> >   runAutomaton = null; // <- HERE!
>> >   this.automaton = null;
>> >   this.finite = null;
>> >
>> >   if (isBinary) {
>> >     term = StringHelper.intsRefToBytesRef(singleton);
>> >   } else {
>> >     term = new BytesRef(UnicodeUtil.newString(singleton.ints,
>> >         singleton.offset, singleton.length));
>> >   }
>> >   sinkState = -1;
>> >   return;
>> > }
>> >
>> > Not to pretend I have any idea of what I'm talking about, but given that
>> > the user has relatively little control over which implementation of Terms
>> > we get at runtime (this user at least), shouldn't the overriding method in
>> > FieldReader also check the AUTOMATON_TYPE and throw an equally informative
>> > IllegalArgumentException instead, just for the sake of consistency?
>> >
>> > Sorry if all of the above is a little off topic for this list :)
>> >
>> > Best,
>> > jta
>> >
>> >
>> > On Fri, Mar 25, 2016 at 4:33 PM, José Tomás Atria <jtatria@gmail.com> wrote:
>> >
>> >> Hello again!
>> >>
>> >> I'm playing around some more with Lucene's automata, and I've bumped into
>> >> something unexpected but can't figure out if it's a bug or an error on my
>> >> part.
>> >>
>> >> Briefly: is it possible to use a single string automaton (i.e. the result
>> >> of Automata.makeString( String ) ) to intersect a Terms instance? I keep
>> >> getting NPEs on every attempt at doing this, e.g. with this code:
>> >>
>> >> // where "term" is a term known to exist in someField
>> >> CompiledAutomaton ca = new CompiledAutomaton( Automata.makeString( "term" ) );
>> >> Terms terms = leafReader.terms( someField );
>> >> TermsEnum tEnum = terms.intersect( ca, null );
>> >>
>> >> results in:
>> >> Exception in thread "main" java.lang.NullPointerException
>> >>   at org.apache.lucene.codecs.blocktree.IntersectTermsEnum.<init>(IntersectTermsEnum.java:127)
>> >>   at org.apache.lucene.codecs.blocktree.FieldReader.intersect(FieldReader.java:185)
>> >>
>> >> I assume I'm doing something wrong (I am aware that using an automaton
>> >> for a single term may be a bad idea, but bear with me), but the fact that
>> >> it's throwing an NPE prompted me to come and ask...
>> >>
>> >> Maybe there's a problem with encodings?
>> >>
>> >> Any help greatly appreciated.
>> >> jta.
>> >>
>> >> --
>> >> entia non sunt multiplicanda praeter necessitatem
>> >>
>> >
>> >
>> >
>> > --
>> > entia non sunt multiplicanda praeter necessitatem
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> --
> entia non sunt multiplicanda praeter necessitatem
