jackrabbit-dev mailing list archives

From "Stefan Guggisberg" <stefan.guggisb...@gmail.com>
Subject Re: MatchAllScorer calculateDocFilter() flaw
Date Mon, 06 Aug 2007 11:55:57 GMT
hi ard,

On 8/6/07, Ard Schrijvers <a.schrijvers@hippo.nl> wrote:
> I guess since attachments seem to be removed, I should file a JIRA issue for patches?

yes, posting a jira issue is the preferred way of reporting bugs and
sending patches.

thanks!
stefan

>
> Regards Ard
>
> >
> > Hello,
> >
> > First of all, I am not sure what the policy is for
> > reporting flaws/bugs in Jackrabbit. Is creating a JIRA
> > issue the way to go, or mailing the dev list like I am doing now?
> >
> > Anyway, there is a flaw in
> > MatchAllScorer.calculateDocFilter(). When you have just two
> > nodes with different properties, like "myprop" and
> > "myprop2", and you run the query String xpath =
> > "//*[@myprop]", you get both nodes back (to be precise, you
> > get every node that has a property whose name starts with "myprop").
> >
> >
> > You can reproduce it by changing the
> > SimpleQueryTest.testIsNotNull() a little:
> >
> > Change
> >
> > bar.setProperty("text", "the quick brown fox jumps over the
> > lazy dog.");
> >
> > to
> >
> > bar.setProperty("mytextwhichstartswithmytext", "the quick
> > brown fox jumps over the lazy dog.");
> >
> > Now the test with xpath =
> > "//*[@jcr:primaryType='nt:unstructured' and @mytext]"; fails
> > because it returns 2 results. I tested both the trunk and tag
> > 1.3.1, and both have the same problem. I have attached
> > MatchAllScorer.java.patch to this mail, or should I create a
> > JIRA issue for this?
> >
> > Furthermore, I would like to discuss a different
> > implementation for the MatchAllScorer, because IMHO the
> > current calculateDocFilter() becomes slow pretty fast (see
> > the code at the bottom of this email: if you have
> > 100,000 docs with a "mytext" property and you query [@mytext],
> > the loop below is executed at least 100,000 times). I think
> > it might be out of scope for the user list, or is the
> > user list the place to discuss something like this?
> >
> > Regards Ard
> >
> > -----------------------------------------------------------------------
> >
> > TermEnum terms = reader.terms(new Term(FieldNames.PROPERTIES, field));
> >         try {
> >             TermDocs docs = reader.termDocs();
> >             try {
> >                 while (terms.term() != null
> >                         && terms.term().field() == FieldNames.PROPERTIES
> >                         && terms.term().text().startsWith(field)) {
> >                     docs.seek(terms);
> >                     while (docs.next()) {
> >                         docFilter.set(docs.doc());
> >                     }
> >                     terms.next();
> >                 }
> >             } finally {
> >                 docs.close();
> >             }
> >         } finally {
> >             terms.close();
> >         }
> >
> > -----------------------------------------------------------------------
> >
> > --
> >
> > Hippo
> > Oosteinde 11
> > 1017WT Amsterdam
> > The Netherlands
> > Tel  +31 (0)20 5224466
> > -------------------------------------------------------------
> > a.schrijvers@hippo.nl / ard@apache.org / http://www.hippo.nl
> > --------------------------------------------------------------
> >
>
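
The prefix problem described in the quoted message can be reproduced without Lucene. The following is only a minimal sketch: the class PrefixMatchSketch, the method collect, and the separator SEP are hypothetical names, and the term layout (property name, then a separator character, then the value) is an assumption made for illustration, not a statement about how Jackrabbit actually stores its terms. A sorted map stands in for the term enumeration and a BitSet for docFilter.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixMatchSketch {
    // Assumed (hypothetical) separator between property name and value in a term.
    static final char SEP = '\uFFFF';

    // Sets the doc ids of every term whose text starts with the given prefix,
    // walking the sorted terms from the prefix onward like the quoted loop does.
    static BitSet collect(SortedMap<String, int[]> terms, String prefix) {
        BitSet docFilter = new BitSet();
        for (Map.Entry<String, int[]> e : terms.tailMap(prefix).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            for (int doc : e.getValue()) {
                docFilter.set(doc);
            }
        }
        return docFilter;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> terms = new TreeMap<>();
        terms.put("myprop" + SEP + "a", new int[] {0});  // doc 0 has "myprop"
        terms.put("myprop2" + SEP + "b", new int[] {1}); // doc 1 has "myprop2"

        // Bare property name as prefix: "myprop2" matches as well.
        System.out.println(collect(terms, "myprop"));       // prints {0, 1}
        // Name plus separator as prefix: only "myprop" matches.
        System.out.println(collect(terms, "myprop" + SEP)); // prints {0}
    }
}
```

Under that assumed layout, including the separator in the prefix is enough to exclude property names that merely begin with the requested name. It does not address the performance concern, since every matching document is still visited once.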
