jackrabbit-users mailing list archives

From Torsten Stolpmann <stolpm...@verit.de>
Subject Re: jcr sql2 - contains() full text search not working
Date Tue, 10 Jul 2012 20:43:02 GMT
Hi Carl,

AFAIK the Jackrabbit repository *should* autodetect the content type of
your binary data and choose the right indexer for your files
automatically with a standard configuration.

It might help if you post your repository.xml here so others may have a
look.

Does your classpath contain all required dependent libraries (tika-core,
tika-parsers)? What's your Jackrabbit version?
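
In case it helps with the classpath question, here is a minimal sketch of a
check for the two libraries. The class name TikaClasspathCheck and the two
probe class names are assumptions chosen for illustration (one class that
normally ships in tika-core, one that normally ships in tika-parsers 1.x);
adjust them to whatever your Tika version actually contains.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TikaClasspathCheck {

    /** Returns true if the named class is visible to this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, TikaClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed probe classes, one per library (hypothetical choice for illustration)
        Map<String, String> probes = new LinkedHashMap<>();
        probes.put("tika-core", "org.apache.tika.Tika");
        probes.put("tika-parsers", "org.apache.tika.parser.pdf.PDFParser");

        for (Map.Entry<String, String> probe : probes.entrySet()) {
            String status = isOnClasspath(probe.getValue()) ? "found" : "MISSING";
            System.out.println(probe.getKey() + ": " + status + " (" + probe.getValue() + ")");
        }
    }
}
```

Note that the result only reflects the class loader the check runs under, so
for a WAR deployment it would have to run inside the web application (for
example from a JSP or servlet), not from a separate command line.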

Hope this helps,

Torsten

On 10.07.2012 22:26, Furst, Carl wrote:
> Thanks, Torsten..
> 
> Checked Luke... and by golly... the field was not there. Why? jcr:data
> is a binary property. Can't do fulltext searching on binary fields.
> 
> DOH!
> Carl Furst
> 
> 
> 
> 
> 
> On 6/27/12 2:29 PM, "Torsten Stolpmann"<stolpmann@verit.de>  wrote:
> 
>> It might be a good idea to check if jackrabbit/lucene has created the
>> correct search indices, just to be sure. Luke
>> (http://code.google.com/p/luke/) might help you with this.
>>
>> By default, index files are kept in repository/index, if I recall
>> correctly.
>>
>> Torsten
>>
>> On 27.06.2012 20:06, Furst, Carl wrote:
>>> So given the below, I tried 'inclu*' and 'include*' and still got no
>>> results, so I'm going to start looking into some of these possible
>>> reasons why:
>>>
>>>
>>> https://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F
>>>
>>> Of course it could just be that the parser is not parsing the '*'.
>>>
>>> Thanks again,
>>>
>>>
>>>
>>> Carl Furst
>>>
>>>
>>>
>>>
>>>
>>> On 6/27/12 1:59 PM, "Furst, Carl"<Carl.Furst@mlb.com>   wrote:
>>>
>>>> Thanks Torsten,
>>>>
>>>> So even using JQOM would not help here. I'll read up more on Lucene and
>>>> find out more. My main stumbling block here was where the query was
>>>> being executed: at the Derby level or at the Lucene level.
>>>>
>>>> This has cleared that part of it up for me as well.
>>>>
>>>> Thanks again,
>>>>
>>>> Carl Furst
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 6/27/12 1:50 PM, "Torsten Stolpmann"<stolpmann@verit.de>   wrote:
>>>>
>>>>> Hi Carl,
>>>>>
>>>>> By default, the underlying Lucene implementation does not match
>>>>> leading wildcards, for performance reasons. See also:
>>>>>
>>>>> https://wiki.apache.org/lucene-java/LuceneFAQ#What_wildcard_search_support_is_available_from_Lucene.3F
>>>>>
>>>>> So just matching '*' will not work, but e.g. 'i*' might give you the
>>>>> results you were looking for.
>>>>>
>>>>> Sadly enough, I did not find any reference to this in the Jackrabbit
>>>>> documentation.
>>>>>
>>>>> Took me quite a while to find that too.
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Torsten
>>>>>
>>>>> On 27.06.2012 17:19, Furst, Carl wrote:
>>>>>> I'm probably missing something here, but everything I've read so far
>>>>>> leads me to believe this should work.
>>>>>>
>>>>>> I have nodes in a repository of type nt:folder and nt:file. nt:file has
>>>>>> a child node jcr:content of type nt:resource, which has a child property
>>>>>> called jcr:data.
>>>>>>
>>>>>> There are many cases where the jcr:data column has the word 'include'
>>>>>> in it. They are JSP files so, yes, I know this word exists in several
>>>>>> files.
>>>>>>
>>>>>> So here's the sql I use:
>>>>>>
>>>>>> select * from [nt:resource] where  contains([jcr:data], 'include');
>>>>>>
>>>>>> Here's the sql that is returned from q.getStatement() :
>>>>>>
>>>>>> SELECT [nt:resource].* FROM [nt:resource] WHERE
>>>>>> CONTAINS([nt:resource].[jcr:data], 'include');
>>>>>>
>>>>>> Here is a sample text in jcr:data to search on.
>>>>>>
>>>>>> <%@ include file="..."
>>>>>>
>>>>>>
>>>>>> ... More jsp here..
>>>>>> <%/jsp:include...
>>>>>>
>>>>>> Yet it doesn't find it. I feel I'm missing something. Do I need to add
>>>>>> a "searchable" mixin or something?
>>>>>>
>>>>>> Any ideas why this is not being found?
>>>>>>
>>>>>> It used to be that the CND file for the Jackrabbit node types was
>>>>>> readily available from Apache. Does anyone know where I can find the
>>>>>> CND file for the Jackrabbit node types?
>>>>>>
>>>>>> jcr:content is unstructured, but I explicitly make the type nt:resource
>>>>>> (otherwise the statement would not be parsed and the Query object would
>>>>>> throw an error like "table not found," right? Because the type is a
>>>>>> table). So the type is right. The field is right. The search is not
>>>>>> working.
>>>>>>
>>>>>>
>>>>>> I'm using Jackrabbit without any special configuration. Just the WAR in
>>>>>> a simple Tomcat deployment. So it's sitting on top of Derby and Lucene.
>>>>>>
>>>>>>
>>>>>> Any help would be appreciated.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Carl Furst
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> **********************************************************
>>>>>>
>>>>>> MLB.com: Where Baseball is Always On
>>>>>>
>>>>>
>>>>
>>>
>>
> 

