chemistry-dev mailing list archives

From Jens Hübel <>
Subject Text Search Parser added
Date Tue, 05 Jul 2011 14:30:24 GMT
Hi, Chemistries


just a quick note: yesterday I checked in the code for parsing text search queries in
a CONTAINS statement. Please check whether it breaks anything on your servers.


The text search parser is implemented as a completely separate parser and lexer in a separate
grammar. Using it is optional. You can configure the parser so that you get either the
CONTAINS string literal as before or a parsed tree. There are some new support methods that help
with unescaping. The text search parser is integrated with our parsing framework for simpler
query integration.
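
To illustrate the difference between the two modes: in compatibility mode a consumer sees only
the raw CONTAINS string, while in parsed mode it gets a structure roughly like the one built
below. This is only a minimal sketch and not the checked-in grammar. The class and method names
(TextSearchSketch, Term, parse) are hypothetical, and the syntax it accepts (double-quoted
phrases, a leading '-' for negation, the keyword OR, backslash escapes) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Chemistry code base.
final class TextSearchSketch {

    /** One search term: a word or phrase, optionally negated with a leading '-'. */
    static final class Term {
        final String text;
        final boolean negated;

        Term(String text, boolean negated) {
            this.text = text;
            this.negated = negated;
        }

        @Override
        public String toString() {
            return (negated ? "-" : "") + text;
        }
    }

    /**
     * Parses a text search expression into OR-separated alternatives.
     * Outer list: alternatives (OR). Inner list: terms that must all match (AND).
     */
    static List<List<Term>> parse(String expr) {
        List<List<Term>> alternatives = new ArrayList<>();
        List<Term> current = new ArrayList<>();
        int i = 0;
        int n = expr.length();
        while (i < n) {
            char c = expr.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            boolean negated = false;
            if (c == '-' && i + 1 < n) {
                negated = true;
                c = expr.charAt(++i);
            }
            boolean phrase = (c == '"'); // assumed phrase delimiter
            if (phrase) {
                i++; // skip the opening quote
            }
            StringBuilder text = new StringBuilder();
            while (i < n) {
                c = expr.charAt(i);
                if (c == '\\' && i + 1 < n) {
                    // Unescape: keep the character after the backslash literally.
                    text.append(expr.charAt(i + 1));
                    i += 2;
                    continue;
                }
                if (phrase ? c == '"' : Character.isWhitespace(c)) {
                    break;
                }
                text.append(c);
                i++;
            }
            if (phrase && i < n) {
                i++; // skip the closing quote
            }
            if (!phrase && !negated && text.toString().equals("OR")) {
                // An unquoted OR closes the current AND group and starts a new one.
                alternatives.add(current);
                current = new ArrayList<>();
            } else {
                current.add(new Term(text.toString(), negated));
            }
        }
        alternatives.add(current);
        return alternatives;
    }
}
```

With this sketch, parsing foo -bar OR "big apple" gives two alternatives: one requiring foo and
excluding bar, and one requiring the phrase big apple.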


One component that needs review is the JCR connector. Integrating the parser broke some tests,
so I changed the code to use the compatibility mode. In case the JCR connector can benefit,
I added a code template showing how to integrate the full text parser. This needs to be completed.
If this does not make sense for the JCR connector, please remove the code I added.


The InMemory server uses the full text parser and can now do a (very simplistic) full text
search. It does not do any kind of preprocessing, so it only makes sense for plain text
files. If you store HTML content and search for 'body', you will get a hit for every document.
It does not build any kind of index; it uses a grep-like search, so don't expect great
performance. Currently there is no ranking implemented. See the unit tests for details.
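
The behaviour described above can be pictured with a small sketch. It is not the InMemory
server code: GrepSearchSketch and search are hypothetical names, and the case-insensitive
comparison is an assumption. It only shows why a scan over raw content without preprocessing
matches markup and costs a full pass over every document per query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a grep-like search; not taken from the InMemory server.
final class GrepSearchSketch {

    /**
     * Returns the ids of all documents whose raw content contains the term.
     * No index and no ranking: every document is scanned on every query, and
     * hits come back in the iteration order of the map.
     */
    static List<String> search(Map<String, String> contentById, String term) {
        String needle = term.toLowerCase(Locale.ROOT); // assumed: case-insensitive match
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : contentById.entrySet()) {
            // Raw content is searched as-is, so HTML tags count as text.
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }
}
```

A document stored as <html><body>Hello</body></html> is returned for the term body although
the word never appears in its visible text, while a plain text file containing only Hello is not.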




