jackrabbit-users mailing list archives

From Gary Long <l...@magillem.com>
Subject Re: How to handle the colon character within fulltext search?
Date Fri, 25 Jun 2010 13:35:31 GMT
On 25/06/2010 14:19, Ard Schrijvers wrote:
> Hello Gary,
>
> in the end, the part in the contains function gets delegated to the
> Lucene QueryParser. So, you can use Lucene query syntax in contains,
> for example query time boosting like 'myterm^10' (unless it gets
> swallowed by the xpath/sql parser of jackrabbit, like the ~ fuzzy
> char).
>
> Anyways, a colon means in lucene query parser that you search within a
> specific field, see [1] at *Fields*
>
> At the end of that page, it is explained how to escape special chars ( use \ )
>
> However, prefixing it with a wildcard again does not seem to work when
> I test it. I did not test it directly against lucene, so it is hard to
> say whether this is a lucene queryparser constraint in combination with
> query expansion for the wildcard or a jackrabbit issue.
>
> That said, I think in the end you do not want to use the prefix
> wildcard anyways: you'll run into terrible performance and memory
> usage problems. This is a general problem of inverted indexes (which
> you can circumvent by indexing every term inverted as well...but that
> is not done by jackrabbit of course)
>
> Anyways, the working solution to your problem is to use 'like'. You
> are not doing a free text search actually (free text is on lucene
> terms, not on sentences)
>
> The xpath equivalent that works is for example:
>
> //*[jcr:like(@myprop, 'my:colon having sentence')]
>
> Though again, the jcr:like has bad scaling wrt performance and memory
>
> Regards Ard
>
> [1] http://lucene.apache.org/java/2_4_0/queryparsersyntax.html
>
> On Fri, Jun 25, 2010 at 1:59 PM, Gary Long<long@magillem.com>  wrote:
>    
>> On 25/06/2010 12:17, Alexander Klimetschek wrote:
>>      
>>> On Fri, Jun 25, 2010 at 11:42, Gary Long<long@magillem.com>    wrote:
>>>
>>>        
>>>> Hello there :)
>>>>
>>>> I'm using the fulltext search feature of Jackrabbit and I'm facing a
>>>> little problem with the colon character (:). For example, if I search
>>>> for a mail whose subject is "Tr : Tr : your response", I can't find it.
>>>> If I search for "your response" the e-mail is found.
>>>>
>>>> my sql query is :
>>>>
>>>> SELECT * FROM mnt:resource WHERE (contains(jcr:text, '*tr: tr: your
>>>> response*') OR contains(jcr:name, '*tr: tr: your response*'));
>>>>
>>>>          
>>> You should escape the query for the contains/jcr:contains function
>>> using the Text.escapeIllegalXpathSearchChars helper from
>>> jackrabbit-jcr-commons:
>>>
>>> http://wiki.apache.org/jackrabbit/EncodingAndEscaping#Escaping_values_in_queries
>>>
>>> Regards,
>>> Alex
>>>
>>>
>>>        
>> I tried this method but it didn't do anything : /
>>
>> Here is my code :
>>
>> String param = "Tr: Tr: your response";
>> String escapedParam =
>>     org.apache.jackrabbit.util.Text.escapeIllegalXpathSearchChars(param);
>> String query = "SELECT * FROM mnt:resource WHERE (contains(jcr:text, '*"
>>     + escapedParam + "*') OR contains(jcr:name, '*" + escapedParam + "*'))";
>>
>> In debug mode, I looked at the value of textQuery in the query and it is
>> still "Tr: Tr your response". The colon character doesn't seem to be
>> escaped. : /
>>
>> Regards,
>> Gary
>>
>>
>>
>>
>>      

Hello :)

I'll try to use xpath instead of sql to run the query but there is
something I'm not sure about: while using xpath, is it possible to
specify multiple jcr:like or multiple jcr:contains constraints in a
single query?

I read the documentation on [1] but there is no specific example.

How would you translate the following sql query into xpath:

SELECT * FROM mnt:resource WHERE (contains(jcr:text, 'my:sentence') OR 
contains(jcr:name, 'my:sentence'))
AND jcr:path LIKE '/projects*'
AND jcr:type <> null;

I have the beginning:
/jcr:root/project//element(*,mnt:resource)[jcr:contains(@jcr:text, 'my:sentence')]
... and I don't know how to write the OR :-\ ?!
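
For reference, here is a minimal sketch of how the complete query string might be assembled in Java. It rests on two assumptions that are not confirmed in this thread: that the standard xpath `or` operator is accepted between two jcr:contains calls inside one predicate, and that backslash-escaping the Lucene special characters (as described on the page Ard linked) passes through the xpath parser unchanged. The class and method names (XPathContainsSketch, escapeForLucene, quoteForXPath, buildQuery) are hypothetical and not part of Jackrabbit.

```java
class XPathContainsSketch {

    // Characters the Lucene query parser treats as syntax
    // (taken from the "escaping special characters" list on the Lucene syntax page).
    private static final String LUCENE_SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Prefix every Lucene special character with a backslash.
    static String escapeForLucene(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (LUCENE_SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Wrap the text in single quotes, doubling embedded single quotes
    // so the result is a valid xpath string literal.
    static String quoteForXPath(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    // Assumption: 'or' combines the two jcr:contains constraints in one predicate.
    // The path and node type are copied from the partial query above.
    static String buildQuery(String searchText) {
        String literal = quoteForXPath(escapeForLucene(searchText));
        return "/jcr:root/project//element(*, mnt:resource)"
             + "[jcr:contains(@jcr:text, " + literal + ")"
             + " or jcr:contains(@jcr:name, " + literal + ")]";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("my:sentence"));
    }
}
```

The resulting string could then be passed to QueryManager.createQuery(query, Query.XPATH). The jcr:path and jcr:type constraints of the sql version are not covered by this sketch.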

Thank you for your help :)

Regards,
Gary
