jackrabbit-users mailing list archives

From Orlando Palis <orlando.pa...@gmail.com>
Subject Re: jackrabbit 2.6.0 Full Text Search
Date Wed, 12 Jun 2013 21:51:47 GMT
*The following JCR-SQL2 queries don't work either:*

5) SELECT * FROM [nt:resource] as resource WHERE CONTAINS(resource.*, 'This')

6) SELECT * FROM [nt:resource] as resource WHERE CONTAINS(resource.*, 'This*')

7) SELECT * FROM [nt:resource] as resource WHERE CONTAINS(resource.*, '*This*')
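
For reference, here is a minimal sketch of how statements like the ones above could be assembled in Java before being run. The class and method names (FullTextQuery, quote, contains) are made up for illustration and are not part of Jackrabbit; the only JCR-SQL2 rule relied on is that a single quote inside a string literal is written as two single quotes.

```java
public final class FullTextQuery {
    private FullTextQuery() {}

    /** Wraps a value in single quotes, doubling any embedded single quote. */
    public static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    /**
     * Builds a statement of the form
     * SELECT * FROM [nodeType] AS selector WHERE CONTAINS(selector.*, 'term').
     * nodeType and selector are hypothetical parameters, inserted as-is.
     */
    public static String contains(String nodeType, String selector, String term) {
        return "SELECT * FROM [" + nodeType + "] AS " + selector
                + " WHERE CONTAINS(" + selector + ".*, " + quote(term) + ")";
    }

    public static void main(String[] args) {
        // Same shape as query 5 above.
        System.out.println(contains("nt:resource", "resource", "This"));
    }
}
```

The resulting string would then typically be handed to QueryManager.createQuery(statement, Query.JCR_SQL2) on the session's workspace. This only rules out quoting mistakes in the statement; it says nothing about whether the binary content was indexed.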



On Thu, Jun 6, 2013 at 5:58 PM, Orlando Palis <orlando.palis@gmail.com> wrote:

> Hi Folks,
>
> I'm new to Jackrabbit and I'm trying out full-text search using Jackrabbit
> 2.6.0 (with Tika 1.3). I have a custom node type that allows me to store
> some custom properties and multiple HTML files (stored as binary). I have
> the following configuration:
>
> *workspace.xml:*
>
> <?xml version="1.0" encoding="UTF-8"?>
> <Workspace name="default">
>         <!--
>             virtual file system of the workspace:
>             class: FQN of class implementing the FileSystem interface
>         -->
>         <FileSystem class="org.apache.jackrabbit.core.fs.db.OracleFileSystem">
>             <param name="dataSourceName" value="ds1"/>
>             <param name="schemaObjectPrefix" value="fs_${wsp.name}_"/>
>         </FileSystem>
>         <!--
>             persistence manager of the workspace:
>             class: FQN of class implementing the PersistenceManager interface
>         -->
>         <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.OraclePersistenceManager">
>             <param name="dataSourceName" value="ds1"/>
>             <param name="schemaObjectPrefix" value="pm_${wsp.name}_"/>
>         </PersistenceManager>
>         <!--
>             Search index and the file system it uses.
>             class: FQN of class implementing the QueryHandler interface
>         -->
>         <SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>             <param name="path" value="${wsp.home}/index"/>
>             <param name="analyzer" value="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
>             <param name="queryClass" value="org.apache.jackrabbit.core.query.QueryImpl"/>
>             <param name="excerptProviderClass" value="org.apache.jackrabbit.core.query.lucene.DefaultHTMLExcerpt"/>
>             <param name="supportHighlighting" value="true"/>
>             <param name="tikaConfigPath" value="${wsp.home}/tika-config.xml"/>
>         </SearchIndex>
> </Workspace>
>
>
> *tika-config.xml:*
>
> <?xml version="1.0" encoding="UTF-8"?>
> <properties>
>     <mimeTypeRepository resource="/org/apache/tika/mime/tika-mimetypes.xml" magic="false"/>
>     <parsers>
>            <parser name="parse-html" class="org.apache.tika.parser.html.HtmlParser">
>                <mime>text/html</mime>
>                <mime>application/xhtml+xml</mime>
>                <mime>application/x-asp</mime>
>            </parser>
>     </parsers>
> </properties>
>
> *JCR-SQL2 queries tested:*
>
> 1) SELECT * FROM [nt:file] as file WHERE CONTAINS(file.*, 'This')
>
> 2) SELECT * FROM [nt:file] as file WHERE CONTAINS(file.*, 'This*')
>
> 3)
> SELECT file.*, resource.* FROM [nt:file] AS file
> INNER JOIN [nt:resource] AS resource ON ISCHILDNODE(resource, file)
> WHERE resource.[jcr:mimeType] = 'text/html'
> AND CONTAINS(file.*, 'This')
>
> 4)
> SELECT file.*, resource.* FROM [nt:file] AS file
> INNER JOIN [nt:resource] AS resource ON ISCHILDNODE(resource, file)
> WHERE resource.[jcr:mimeType] = 'text/html'
> AND CONTAINS(file.*, 'This*')
>
> *Result:*
> None of these queries returns any rows. If I remove the CONTAINS() clause,
> all of the queries above return rows, and for queries #3 and #4 I can see
> in the log dump that the field resource.[jcr:data] contains the text
> ("This") I am searching for. I've also tried deleting the index folder so
> that the repository is re-indexed, but full-text search still does not
> work.
>
> What am I missing? In addition, is there any documentation on how to
> configure Tika (tika-config.xml)?
>
>
> Thanks and Regards,
> Orlando
>
