jackrabbit-users mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Slashes in wildcard query string do not work
Date Thu, 26 Feb 2009 21:13:12 GMT
Hi,

wildcards in jcr:contains are a bit tricky ;)

On Thu, Feb 26, 2009 at 10:47, PThiemann
<philipp.thiemann@googlemail.com> wrote:
> When searching for the following I do get a correct result:
> //element(*,
> custom:file)[jcr:contains(custom:extendedProperties/@XYZ,'F/OSAM')]/custom:extendedProperties/rep:excerpt(.)

as with content that gets indexed, 'F/OSAM' is also analyzed
before it is evaluated by the query handler. the result of the
analysis depends on the configured analyzer. the default
implementation will create two tokens: 'f' and 'osam'.
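
a rough illustration of that analysis step, as a toy model in Python. it assumes the analyzer does nothing more than lowercase the text and split it on non-alphanumeric characters; the real Lucene analyzer is more involved, and `toy_analyze` is a hypothetical helper, not part of Jackrabbit or Lucene.

```python
import re

def toy_analyze(text):
    """Toy stand-in for an analyzer: lowercase, then split on non-alphanumerics."""
    # Assumption: the slash is treated as a plain token separator.
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

print(toy_analyze("F/OSAM"))  # ['f', 'osam']
```

the same splitting applies to the indexed property value and to a query string without wildcards, which is why the first query finds the node.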

> When searching for the next query string I do not get a result. Although
> using wildcards in my query:
> //element(*,
> custom:file)[jcr:contains(custom:extendedProperties/@XYZ,'F/OS*')]/custom:extendedProperties/rep:excerpt(.)

here the wildcard prevents the use of the analyzer because it is
impossible to run an analyzer on just a prefix of many possible
strings. the resulting query will search for tokens that start with
'f/os'. obviously neither 'f' nor 'osam' matches here.

> Now there is the strange thing. When I search (leaving out the /) for the
> following I can see my result again.
> //element(*, custom:file)[jcr:contains(custom:extendedProperties/@XYZ,'F
> OS*')]/custom:extendedProperties/rep:excerpt(.)

this in turn creates two tokens again for searching: 'f' and 'os*',
which both match the tokens that were indexed.
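
the three cases can be compared side by side with a small sketch. it is a simplified model in Python, not Jackrabbit's query handler: it assumes an analyzer that lowercases and splits on non-alphanumerics, that terms separated by whitespace must all match, and that a term ending in `*` bypasses analysis and is used as a raw prefix. `toy_analyze` and `toy_contains` are hypothetical names.

```python
import re

def toy_analyze(text):
    """Toy analyzer: lowercase, then split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def toy_contains(indexed_value, query):
    """Return True if every whitespace-separated query term matches a token."""
    tokens = toy_analyze(indexed_value)
    for term in query.lower().split():
        if term.endswith("*"):
            # Wildcard term: not analyzed, compared as a raw prefix.
            prefix = term[:-1]
            if not any(tok.startswith(prefix) for tok in tokens):
                return False
        else:
            # Plain term: analyzed the same way as the indexed content.
            if not all(sub in tokens for sub in toy_analyze(term)):
                return False
    return True

for query in ("F/OSAM", "F/OS*", "F OS*"):
    print(query, toy_contains("F/OSAM", query))
```

in this model only 'F/OS*' fails, because no indexed token starts with the unanalyzed prefix 'f/os'.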

> Is the slash not indexed by lucene or do I have to escape the slash for
> Jackrabbit for not being recognized as path delimiter?

this is basically a limitation when you use a wildcard in the
jcr:contains clause.

as a rule of thumb you should avoid jcr:contains when your search
includes any special character.

regards
 marcel
