lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shlomy Reinstein <srein...@gmail.com>
Subject Re: Searching for a strict prefix
Date Mon, 24 May 2010 13:54:18 GMT
Hi,

Thanks for the help.
BTW, if anyone is interested: I tried the same with KeywordAnalyzer -
added a field with value "m_szName", and tried to find it using
"FieldName:m_szN*" but failed. Someone in the Lucene IRC channel
showed me why - QueryParser, by default, lowercases all expanded terms
(e.g. those with '*' in them), so it actually looked for "m_szn*".

Thanks again,
Shlomy

On Mon, May 24, 2010 at 4:37 PM, Ian Lea <ian.lea@gmail.com> wrote:
> I bet it's that underscore in m_sz.  Different analyzers do different
> things with different punctuation characters.  I can never remember
> which does exactly what - it'll be in the javadocs or Lucene In Action
> or somewhere on the web.
>
> You can check what exactly has been indexed by using Luke - always a good idea.
>
> In this case you probably want an analyzer that just downcases the tag
> names.  Easy enough to build your own if there isn't an existing one
> that meets your needs.
>
>
> --
> Ian.
>
>
> On Mon, May 24, 2010 at 1:14 PM, Shlomy Reinstein <sreinst1@gmail.com> wrote:
>> Hi,
>>
>> Thanks. By "strict prefix", I meant a prefix of the name
>> (case-insensitive). What you suggest ("tagname: updateC*") was the
>> first thing I tried, but it happens to work only partially. In my
>> case, I have a lot of names beginning with "m_sz<something>", e.g.
>> "m_szComment", "m_szName". Trying a query like "tagname: m_szC*" will
>> not find the "m_szComment" value. I just wrote the wrong example here.
>>
>> Shlomy
>>
>> On Mon, May 24, 2010 at 11:32 AM, Ian Lea <ian.lea@gmail.com> wrote:
>>> StandardAnalyzer should work fine, mark the field as indexed, no need
>>> to store it unless you want to retrieve it for display.
>>>
>>> Query via QueryParser using "tagname: updateC*" or programatically via
>>> PrefixQuery.
>>>
>>>
>>> Although I'm not sure exactly what you mean by "strict prefix".  If
>>> you mean that the prefix should match exactly on case, punctuation
>>> etc. maybe use KeywordAnalyzer instead.
>>>
>>>
>>>
>>> --
>>> Ian.
>>>
>>>
>>> On Sun, May 23, 2010 at 3:03 PM, Shlomy Reinstein <sreinst1@gmail.com>
wrote:
>>>> Hi,
>>>>
>>>> I have a Lucene index that contains source code tags (a tag can be any
>>>> named source code element - function, class, variable). Each document
>>>> contains a field with the tag name and some additional information.
>>>> I'd like to be able to perform strict prefix queries. E.g. if I have a
>>>> tag named "updateChildren", I'd like to be able to write a query like
>>>> "updateC*" and get the list of tags that have this prefix
>>>> (updateChildren being one of them).
>>>>
>>>> Is there a way to do this? Please suggest how I should store the field
>>>> for this purpose - which analyzer to use, whether to keep it stored,
>>>> etc.
>>>>
>>>> Thanks,
>>>> Shlomy
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message