jackrabbit-users mailing list archives

From Ard Schrijvers <a.schrijv...@onehippo.com>
Subject Re: Problems with hyphen in JSR-170 XPath query using jcr:contains
Date Mon, 30 Aug 2010 07:30:22 GMT
Hello,

On Fri, Aug 27, 2010 at 9:06 PM, H. Wilson <wilsonh@randdss.com> wrote:
>  OK, well I got the spaces part figured out, and will post it for anyone who
> needs it. Putting quotes around the spaces unfortunately did not work.
>  During testing, I determined that if you performed the following query for
> the exact fullName property:
>
>    filter.addContains ("@fullName", "'"
>        + Text.escapeIllegalXpathSearchChars (".North.South.East.West Land")
>        + "'");
>
> It would return nothing. But tweak it a little and add a wildcard, and it
> would return results:
>
>   filter.addContains ("@fullName", "'"
>       + Text.escapeIllegalXpathSearchChars (".North.South.East.West Lan*")
>       + "'");

This does not make sense; see below.

>
> But since I did not want to throw in wild cards where they might not be
> wanted, if a search string contained spaces, did not contain wild cards and
> the user was not concerned with case sensitivity, I used the fn:lower-case.
> So I ended up with the following excerpt (our clients wanted options for
> case-sensitive and case-insensitive searching).
>
> public OurParameter[] getOurParameters (boolean performCaseSensitiveSearch,
>         String searchTerm, String srchField) {
>   // srchField in this case was "fullName"
>
>   .....
>
>   if (performCaseSensitiveSearch) {
>
>       // jcr:like for case-sensitive matching
>       filter.orJCRExpression ("jcr:like(@" + srchField + ", '"
>           + Text.escapeIllegalXpathSearchChars (searchTerm) + "')");
>
>   }
>   else {
>
>       // only use fn:lower-case if there are spaces and NO wildcards
>       if (searchTerm.contains (" ") && !searchTerm.contains ("*")
>               && !searchTerm.contains ("?")) {
>
>           filter.addJCRExpression ("fn:lower-case(@" + srchField + ") = '"
>               + Text.escapeIllegalXpathSearchChars (searchTerm.toLowerCase ())
>               + "'");
>
>       }
>       else {
>
>           // jcr:contains for case-insensitive matching
>           filter.addContains (srchField,
>               Text.escapeIllegalXpathSearchChars (searchTerm));
>
>       }
>
>   }

This looks to me like a workaround for the real problem rather than a
fix, because the behaviour you describe just doesn't make sense to me.
Can you inspect the tokens that are created by your analyser? Make sure
you inspect them both during indexing (just store something) and during
searching (just search in the property). I am quite sure you'll see the
issue then. Perhaps it is something in
Text.escapeIllegalXpathSearchChars, though it seems that it should
leave spaces untouched.
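
To show the kind of mismatch I would look for when comparing the two
sets of tokens, here is a minimal sketch in plain Java. It is not
Jackrabbit or Lucene code, and the class and method names are made up
for the illustration. It rests on two assumptions that I have not
verified against your setup: the index keeps the whole property value
as one token (as KeywordAnalyzer is documented to do), and the query
side splits the search text on whitespace and requires every piece to
match.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class TokenMismatchSketch {

    // Assumption: keyword-style indexing, the whole value is one token.
    public static List<String> indexTokens(String value) {
        return Collections.singletonList(value);
    }

    // Assumption: the query text is split on whitespace before matching.
    public static List<String> queryTerms(String query) {
        return Arrays.asList(query.trim().split("\\s+"));
    }

    // A value matches only if every query term equals an indexed token.
    public static boolean matches(String indexedValue, String query) {
        List<String> tokens = indexTokens(indexedValue);
        for (String term : queryTerms(query)) {
            if (!tokens.contains(term)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] names = {
            ".North.South.East.WestLand",
            ".North.South.East.West_Land",
            ".North.South.East.West Land"
        };
        // Search each name with its own exact text, as in the report.
        for (String name : names) {
            System.out.println(name + " -> " + matches(name, name));
        }
    }
}
```

Under those assumptions the first two names match and the one with the
space does not, which is the pattern you reported. If the real tokens
at index time and at search time differ in some other way, the cause
is elsewhere.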

Regards Ard


>   ....
>
> }
>
>
> Hope that helps anyone who needs it.
>
> H. Wilson
>
>>> OK so it looks like I have one other issue. Using the configuration as
>>> posted below and sticking to my previous examples, with the addition of
>>> one with whitespace. With the following three in our repository:
>>>
>>>   .North.South.East.WestLand
>>>   .North.South.East.West_Land
>>>   .North.South.East.West Land    //yes that's a space
>>>
>>> ...using a jcr:contains, with exact name search with NO wild cards: the
>>> first two return properly, but the last one yields no result.
>>>
>>>   filter.addContains("@fullName", "'"
>>>       + org.apache.jackrabbit.util.Text.escapeIllegalXpathSearchChars(
>>>             ".North.South.East.West Land")
>>>       + "'");
>>
>> I think the space in a contains is seen as an AND by the
>> Jackrabbit/Lucene QueryParser. I should test this, however, as I am
>> not sure. Perhaps you can put quotes around it, though I am not sure
>> whether that works.
>>
>> Regards Ard
>>
>>> According to the Lucene documentation, KeywordAnalyzer should be creating
>>> one token, plus combined with escaping the Illegal Characters (i.e.
>>> spaces), shouldn't this search work? Thanks again.
>>>
>>> H. Wilson
>
