jackrabbit-users mailing list archives

From rfr <ere...@gmail.com>
Subject Re: Strange fulltext search behaviour
Date Thu, 28 Feb 2013 13:57:12 GMT
Replying to myself with further analysis.

The data I'm searching for is a "code" for a "file". So I assumed it
may be an analyzer problem.

I have thus configured lucene to index the property with the keyword
analyzer in my indexConfiguration.xml file:

<analyzers>
	<analyzer class="org.apache.lucene.analysis.KeywordAnalyzer">
		<property>gns:numeroLabel</property>
	</analyzer>
</analyzers>

After reindexing my content, I browsed my indexes with Luke and saw
that each code is indexed as a single token, as I expected.

I then tried to perform a search inside Luke, configuring it to
search on "gns:numeroLabel" and to use the KeywordAnalyzer.

I searched for L*640 or L\/*640 and Luke found some documents (note
that in Luke, you have to escape the forward slash).
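
A minimal sketch of that escaping, assuming the forward slash is the
only character that needs it and that wildcards must stay untouched.
The class SlashEscaper and the method escapeSlashes are hypothetical
names for illustration, not part of Luke, Lucene, or Jackrabbit:

```java
// Hypothetical helper for illustration only; not a Luke, Lucene, or Jackrabbit API.
final class SlashEscaper {
    private SlashEscaper() {}

    /** Prefixes every forward slash with a backslash; wildcards (* and ?) are left as-is. */
    static String escapeSlashes(String term) {
        // String.replace works on literal text, so no regex quoting is needed here.
        return term.replace("/", "\\/");
    }

    public static void main(String[] args) {
        System.out.println(escapeSlashes("L/*640")); // prints L\/*640
    }
}
```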

I went back into my code, searched for L*640, and found no results :(

I think the slashes are really causing the problem, but I can't
identify where. The one thing I'm not sure of is whether Jackrabbit
actually uses the KeywordAnalyzer to analyze my query, which is a
fullTextSearch() on the property.
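
For anyone who wants to reproduce this, a sketch of an equivalent
statement, assuming a JCR-SQL2 CONTAINS() constraint on the property
instead of the fullTextSearch() call mentioned above. The node type
nt:base is a placeholder, and ContainsQuery.build is a hypothetical
helper that only assembles the statement text:

```java
// Hypothetical helper for illustration only: it builds the statement string
// and does not touch a repository.
final class ContainsQuery {
    private ContainsQuery() {}

    /** Builds a JCR-SQL2 full-text query restricted to a single property. */
    static String build(String nodeType, String property, String searchExpr) {
        // Single quotes inside a JCR-SQL2 string literal are escaped by doubling them.
        String literal = searchExpr.replace("'", "''");
        return "SELECT * FROM [" + nodeType + "] AS n"
                + " WHERE CONTAINS(n.[" + property + "], '" + literal + "')";
    }

    public static void main(String[] args) {
        // nt:base is an assumed node type; the property name comes from the configuration above.
        System.out.println(build("nt:base", "gns:numeroLabel", "L/*640"));
    }
}
```

The resulting string could be handed to QueryManager.createQuery()
with Query.JCR_SQL2 and the hit count compared with what Luke
returns for the same term.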

Thanks for your help!


Regards,

Fred

On Tue, Feb 26, 2013 at 2:16 PM, rfr <erefer@gmail.com> wrote:
> Hello!
>
> In our application, some nodes have "record numbers" in the form
> [A-Z]/([A-Z]|[0-9])+/[0-9]{9}
>
> For example:
>
> N/620/000002032
> M/AKA/000000235
>
> or
>
> L/AMA/0000000100
>
> If I perform a full text search on these values, the search finds my nodes.
>
> If I try a search like N/*2032 or M/AK*235, I'm also able to retrieve my nodes.
>
> But for an unknown reason, if I search for L/* or even
> L/AMA/000000?00, the system does not find any node and does not
> even seem to search for anything.
>
> The record number is stored in a string property and I'm sure the
> nodes are indexed, since other queries on the same nodes work for
> other search tokens.
>
> It is as if the "L/..." format is causing trouble for the indexer
> or the search code.
>
> Any pointers? Can someone test this behaviour to see if it is reproducible?
>
> Thanks a lot!
>
> Regards,
>
> Fred
