lucene-dev mailing list archives

From <karl.wri...@nokia.com>
Subject RE: Solr query question
Date Wed, 28 Apr 2010 23:13:05 GMT
That's certainly an option, and I had thought of it already, but the downside is that you won't
be able to search for documents that *aren't* indexed via LCF under that model.  Which is
why I wanted to try to make the other approach fly.

FWIW, I was also told by a colleague that, because this is a *negative* query, you don't have
this problem because Solr does some kind of optimization that prevents the wildcard from being
expanded.  I am hoping that this is more than just an urban legend. ;-)

Karl

________________________________________
From: ext Earwin Burrfoot [earwin@gmail.com]
Sent: Wednesday, April 28, 2010 6:52 PM
To: dev@lucene.apache.org
Subject: Re: Solr query question

The best way to match documents that have no values for a specific
field, is to have a special term in that (or another) field, that you
add to the index when, well, a document has no values for that field.
Let's call this term - NULL. You then directly match on it with a
TermFilter/Query.
With your approach, if your field has lots of unique terms, you're in
for one slow query.
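
To make the cost difference concrete, here is a minimal sketch using a toy in-memory inverted index in plain Java. It is not Lucene or Solr code: the class MissingFieldDemo, its methods, and the sentinel spelling "NULL" are all hypothetical names chosen for illustration. It only models the idea that a naive wildcard has to visit every unique term in the field, while the sentinel needs a single term lookup. It says nothing about whatever optimizations Solr may apply to negative queries.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy single-field inverted index (term -> doc ids). Illustration only, not Lucene. */
class MissingFieldDemo {
    // Hypothetical sentinel term written for documents that have no value in the field.
    static final String NULL_TERM = "NULL";

    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private int maxDoc = 0;

    // Number of real terms the last wildcard-style scan had to visit.
    int lastTermsVisited = 0;

    /** Indexes one document; a document without values gets the sentinel term instead. */
    void add(String... values) {
        int docId = maxDoc++;
        if (values.length == 0) {
            values = new String[] { NULL_TERM };
        }
        for (String v : values) {
            postings.computeIfAbsent(v, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Sentinel approach: a single term lookup, independent of the number of unique terms. */
    Set<Integer> missingViaSentinel() {
        return new TreeSet<>(postings.getOrDefault(NULL_TERM, new TreeSet<>()));
    }

    /** Naive negated wildcard: union the postings of every real term, then complement. */
    Set<Integer> missingViaWildcard() {
        Set<Integer> hasValue = new HashSet<>();
        lastTermsVisited = 0;
        for (Map.Entry<String, Set<Integer>> e : postings.entrySet()) {
            if (e.getKey().equals(NULL_TERM)) {
                continue; // the sentinel is not a real value
            }
            hasValue.addAll(e.getValue());
            lastTermsVisited++;
        }
        Set<Integer> missing = new TreeSet<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!hasValue.contains(doc)) {
                missing.add(doc);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        MissingFieldDemo idx = new MissingFieldDemo();
        idx.add("token-a", "token-b"); // doc 0
        idx.add();                     // doc 1: no values
        idx.add("token-c");            // doc 2
        idx.add();                     // doc 3: no values
        System.out.println("sentinel: " + idx.missingViaSentinel());
        System.out.println("wildcard: " + idx.missingViaWildcard());
        System.out.println("terms visited by wildcard: " + idx.lastTermsVisited);
    }
}
```

Both methods return the same documents (1 and 3 in the example), but the wildcard-style scan touches one postings list per unique term, which is the part that grows with the field's vocabulary.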

On Thu, Apr 29, 2010 at 02:40,  <karl.wright@nokia.com> wrote:
> Adding to the getFilters() list seems reasonable - although, to be fair, my code does
> seem to work as intended when the component is added "last".  I'll do some experimentation
> and see which model works most consistently.
>
> TermRangeQuery doesn't seem to map readily to the functionality I'm looking for.  I basically
> want to match documents that have no values for a specific field.  TermRange implies that
> I know the potential set of values, and I don't at that point (unless I can use null or something
> for the start/end strings in the range?)  Plus it is a Query, not a Filter, so if I use it
> I'd also need to wrap it with FilterWrappedQuery or some such, no?  Is there any benefit in
> using Query objects over Filter objects, or vice versa?  (I *am* trying to be sure that the
> security filter does not affect relevance, for what it's worth...)
>
> Karl
>
> ________________________________________
> From: ext Erik Hatcher [erik.hatcher@gmail.com]
> Sent: Wednesday, April 28, 2010 5:54 PM
> To: connectors-dev@incubator.apache.org
> Cc: dev@lucene.apache.org
> Subject: Re: Solr query question
>
> Rather than rewriting the original query, add a filter query (fq param
> on the HTTP interface).  I think in the API you'll be using
> rb.getFilters() and adding a filter to the List it returns.
>
> Running your component last won't work (will it?), as it needs to be
> run before the "query" component to take effect.
>
> Re: WildcardFilter - I think you want TermRangeQuery there instead.
>
>        Erik
>
>
>
> On Apr 28, 2010, at 5:35 PM, <karl.wright@nokia.com> wrote:
>
>> Turns out that, for the standard requestHandler, running this
>> SearchComponent first causes its rewritten query to be lost.
>> Running last fixed the problem.  (I'd *love* to know why that would
>> be necessary.)
>>
>> But I'd still like comment as to whether the WildcardFilter
>> construct is expected to be efficient in this context, or not. ;-)
>>
>> Karl
>>
>>
>> ________________________________________
>> From: Wright Karl (Nokia-S/Cambridge)
>> Sent: Wednesday, April 28, 2010 2:57 PM
>> To: connectors-dev@incubator.apache.org
>> Subject: Solr query question
>>
>> Hi Solr-knowledgeable folks,
>>
>> The LCF Solr SearchComponent plugin I'm developing doesn't quite
>> work.  The query I'm trying to do is:
>>
>> -(allow_token_document:*) and -(deny_token_document:*) and <the
>> user's search>
>>
>> The result I'm seeing is that everything in the user's search
>> matches, unlike what I see in the admin UI, where the above query
>> works perfectly.
>>
>> The code I'm using to do the negative wildcard searches is as follows:
>>
>>  public void prepare(ResponseBuilder rb) throws IOException
>>  {
>>      BooleanFilter bf = new BooleanFilter();
>>
>>      // No authenticated user name; only return 'public' documents
>>      // (those with no security tokens at all)
>>      // That query is:
>>      // (fieldAllowShare is empty AND fieldDenyShare is empty AND
>>      // fieldAllowDocument is empty AND fieldDenyDocument is empty)
>>
>>      // We're trying to map to:  -(fieldAllowShare:*) , which should
>>      // be pretty efficient in Solr because it is negated.  If this turns
>>      // out not to be so, then we should have the SolrConnector inject a
>>      // special token into these fields when they otherwise would be empty,
>>      // and we can trivially match on that token.
>>
>>      bf.add(new FilterClause(new WildcardFilter(new Term(fieldAllowDocument,"*")),
>>                              BooleanClause.Occur.MUST_NOT));
>>      bf.add(new FilterClause(new WildcardFilter(new Term(fieldDenyDocument,"*")),
>>                              BooleanClause.Occur.MUST_NOT));
>>
>>      // Concatenate with the user's original query.
>>      FilteredQuery query = new FilteredQuery(rb.getQuery(),bf);
>>      rb.setQuery(query);
>>  }
>>
>>
>> Any hints welcome!
>> Karl
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>



--
Kirill Zakharenko/Кирилл Захаренко (earwin@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785


