jackrabbit-users mailing list archives

From KÖLL Claus <C.KO...@TIROL.GV.AT>
Subject AW: FullText Search Problem
Date Fri, 30 Nov 2007 07:07:01 GMT
Hi,

thanks for the information.
It would be good if someone else could jump in and check whether this is a bug.
I am not trying to achieve anything special; the exception comes from daily work.
Somebody tried to search for this and reported the exception to me.
I know that  >>1) 'test!' is equal to 'test'
but the end users do not :-)


So either I filter some characters out of the search string, or Jackrabbit should handle
it.
I think the second option would be better.
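
For the first option, a minimal sketch of such a filter in Java could look like this. The class SearchTermSanitizer and its methods are hypothetical names (not part of Jackrabbit or JCR), and the decision to replace everything except letters and digits is an assumption. It would also strip wildcard and phrase syntax, so it only fits simple keyword searches.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Jackrabbit or the JCR API. */
final class SearchTermSanitizer {

    // Assumption: anything that is not a Unicode letter or digit is treated
    // as a delimiter, roughly like the analyzer does at index time.
    private static final Pattern NON_WORD = Pattern.compile("[^\\p{L}\\p{N}]+");

    private SearchTermSanitizer() {
    }

    /** Replaces each run of non-letter, non-digit characters with one space and trims the result. */
    static String sanitize(String userInput) {
        return NON_WORD.matcher(userInput).replaceAll(" ").trim();
    }

    /**
     * Builds the jcr:contains predicate for the sanitized term.
     * Single quotes are removed by sanitize(), so no quote escaping is needed here.
     */
    static String containsClause(String userInput) {
        return "jcr:contains(., '" + sanitize(userInput) + "')";
    }
}
```

With this, containsClause("test!") gives jcr:contains(., 'test'). An input made up only of special characters becomes an empty term, so the caller would have to skip the search in that case.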

BR,
claus

-----Original Message-----
From: Ard Schrijvers [mailto:a.schrijvers@hippo.nl] 
Sent: Thursday, 29 November 2007 20:40
To: users@jackrabbit.apache.org
Subject: RE: FullText Search Problem




> hi users,
> 
> i want to make a fulltext search like this ...
> /jcr:root/tirolgvat[1]//element(*, nt:base)[jcr:contains(., 'test!')]
> 
> then i get this exception...
> javax.jcr.RepositoryException: Exception building query: 
> org.apache.jackrabbit.core.query.lucene.fulltext.ParseException:
> Encountered "<EOF>" at line 1, column 6.

Yes, you are correct. It seems that in LuceneQueryBuilder at

Object visit(TextsearchQueryNode node, Object data) {

it breaks at

Query context = parser.parse(query.toString()); 

where the parser is o.a.j.core.query.lucene.fulltext.QueryParser. It
seems to break on strings ending with a "!". Unfortunately, I do not have
any insight into how the QueryParser works. Perhaps somebody else knows
where to look in the QueryParser.

OTOH, besides this possibly being a bug, what are you trying to achieve
with your query? "jcr:contains(., 'test!')", even if it did not
break, would simply return the same as "jcr:contains(., 'test')". This is
because the query is eventually parsed with a Lucene analyzer, and
strings are tokenized on "!" (at least if your analyzer treats "!" as a
delimiter, which the default analyzer in Jackrabbit does, and which you
are probably using). So assuming you use
org.apache.lucene.analysis.standard.StandardAnalyzer (see the workspace
config [1]):

1) 'test!' is equal to 'test' 
2) 'te!st' is equal to 'te' OR 'st' (whether the terms are combined with
OR or AND depends on the default operator setting, though)
3) 'te#st' is equal to 'te' OR 'st'

You might think this is strange, but you have to realize that your text is
also indexed with this same analyzer.
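
To illustrate the three cases above, here is a simplified sketch of delimiter-based tokenization in Java. It is not the StandardAnalyzer implementation, only an approximation under the assumption that every non-letter, non-digit character acts as a delimiter and that tokens are lowercased. The class name SimpleTokenizerSketch is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only; an approximation, not Lucene's StandardAnalyzer. */
final class SimpleTokenizerSketch {

    private SimpleTokenizerSketch() {
    }

    /** Splits text on every character that is not a letter or digit and lowercases the tokens. */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                // A delimiter such as '!' or '#' ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Under these assumptions, 'test!' and 'test' both produce the single token test, while 'te!st' and 'te#st' both produce the two tokens te and st. The same happens to the indexed text, which is why the punctuation makes no difference to what is found.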

Hope this is clear,

Regards Ard

[1] http://jackrabbit.apache.org/doc/config.html

> 
> I know the problem is the "!" sign.
> 
> I tried encoding it first with the ISO9075 class; the query 
> then works, but I get no results.
> 
> Any hints are welcome :-)
> 
> BR,
> claus
> 
> 
