lucene-solr-user mailing list archives

From Jamie Johnson <>
Subject Re: Using payloads and user provided data in score
Date Wed, 22 Jul 2015 23:12:50 GMT
Looks like this may be what I'm looking for


I have not tried this yet, but it looks promising.

Assuming this works, and thinking about your suggestion, I would need to
rewrite the user's query with the appropriate fields. Are there any utilities
for doing this? I'd be looking to rewrite a fielded query like +field:value
possibly to something like +( field.secure2:value)

Again, thanks for the suggestions.
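
To make the rewrite concrete, here is a minimal sketch of what I have in mind, done at the app level in Python. The per-level field naming (field.secure1, field.secure2) and the helper names are assumptions for illustration only, and it handles only simple whitespace-separated field:value clauses, not phrases, ranges, or nested groups.

```python
import re

# Matches a simple fielded clause such as "+title:solr" or "body:lucene".
# Phrases, ranges, and nested groups are deliberately not handled.
CLAUSE = re.compile(r'^([+-]?)([A-Za-z_][\w.]*):(\S+)$')


def rewrite_clause(clause, levels):
    """Expand field:value into one sub-clause per authorized level.

    `levels` is a hypothetical list of field suffixes the user may see.
    """
    match = CLAUSE.match(clause)
    if match is None:
        return clause  # unfielded term: leave it alone
    if not levels:
        raise ValueError("user has no authorized levels for fielded queries")
    prefix, field, value = match.groups()
    expanded = " ".join(f"{field}.{level}:{value}" for level in levels)
    return f"{prefix}({expanded})"


def rewrite_query(query, levels):
    """Rewrite every simple fielded clause in a whitespace-separated query."""
    return " ".join(rewrite_clause(c, levels) for c in query.split())


print(rewrite_query("+title:solr payloads", ["secure1", "secure2"]))
```

With the levels secure1 and secure2, the clause +title:solr becomes +(title.secure1:solr title.secure2:solr), and the unfielded term is passed through untouched. A real implementation would more likely walk a parsed query tree than split on whitespace.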
On Jul 22, 2015 5:20 PM, "Jamie Johnson" <> wrote:

> I answered my own question: it looks like the field infos are always read
> within the IndexSearcher, so that cost is already being paid.
> I would potentially have to duplicate information in multiple fields if it
> was present at multiple authorization levels. Is there a limit to the
> number of fields within a document? I'm also concerned this might skew my
> search results, as terms that had more authorizations would appear in more
> fields and would result in more matches on a query. I'll play with this a
> little, but I am still wondering about my original question.
> On Wed, Jul 22, 2015 at 4:45 PM, Jamie Johnson <> wrote:
>> I had thought about this in the past, but thought it might be too
>> expensive. I guess in a search component I could look up all of the fields
>> that are in the index and only run queries against fields the user should
>> be able to see once I know what is in the index (this is what you're
>> suggesting, right?).
>> My concern would be that the number of fields per document would grow too
>> large to support this. Our controls aren't simple like user or admin; they
>> are complex combinations of authorizations, so I would think there might be
>> a large number of fields generated using this approach. Would retrieving
>> all field infos from Solr on each request, to see what the user should be
>> able to query, be expensive?
>> On Wed, Jul 22, 2015 at 4:19 PM, Erick Erickson <>
>> wrote:
>>> Why don't you handle it all at the app level? Here's what I mean:
>>> I'm assuming that you're using edismax here, but the same principle
>>> applies if not.
>>> Your handler (say the "/select" handler) has a "qf" parameter which
>>> defines the fields that are searched over in the absence of a field
>>> qualifier, e.g.
>>> q=whatever&qf=title,description
>>> causes the search term to be looked for in the two fields "title" and
>>> "description".
>>> You can also set up the qf fields in the "/select" handler as one of
>>> the items in the <defaults> section....
>>> But, the qf param in the <defaults> section is just that... a default.
>>> So individual queries can override it. What I have in mind is that you'd
>>> look up the user's field-access list and append that list as necessary
>>> to the query and just pass it on through.
>>> Things to watch out for:
>>> 1> if the user specifies a field, you'll have to strip that off if
>>> they don't have rights,
>>> i.e. q=field1:whatever whenever
>>> ignores the qf parameter for "whatever" but does respect the qf param
>>> for "whenever".
>>> 2> If you have some kind of date field, say, that you want to facet
>>> over, you'd have to control that.
>>> 3> if you have a "bag of words" where you use copyField to add a bunch
>>> of fields' data to an uber-field, then the user can infer some things
>>> from that info, so you probably do want to be careful about what
>>> copyFields you use.
>>> Best,
>>> Erick
>>> On Wed, Jul 22, 2015 at 12:21 PM, Jamie Johnson <>
>>> wrote:
>>> > I am looking for a way to prevent fields that users shouldn't be able to
>>> > know exist from contributing to the score. The goal is to provide a way to
>>> > essentially hide certain fields from requests based on an access level
>>> > provided on the query. I have managed to make terms that users shouldn't
>>> > be able to see not impact the score by implementing a custom Similarity
>>> > class that looks at the terms' payloads and returns 0 for the score if they
>>> > shouldn't know the field exists. The issue, however, is that I don't have
>>> > access to the request at this point, so getting the user's access level is
>>> > proving problematic. Is there a way to get the current request that is
>>> > being processed via some thread-local variable or something similar that
>>> > Solr maintains? If not, is there another approach that I could be using to
>>> > access information from the request within my Similarity implementation?
>>> > Any thoughts on this would be greatly appreciated.
>>> >
>>> > -Jamie
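
For reference, the app-level approach Erick describes above (look up the user's field-access list and override qf on each request) might be sketched roughly like this. The access-level names, the FIELD_ACCESS mapping, and the helper name are assumptions for illustration; the sketch also assumes edismax, whose "uf" parameter limits which fields a user may name explicitly and so covers watch-out item 1 to some degree.

```python
from urllib.parse import urlencode

# Hypothetical mapping from an access level to the fields it may search.
FIELD_ACCESS = {
    "public": ["title"],
    "analyst": ["title", "description"],
}


def build_select_params(user_query, access_level):
    """Build /select parameters with qf restricted to permitted fields."""
    fields = FIELD_ACCESS.get(access_level, [])
    if not fields:
        raise PermissionError(f"no searchable fields for {access_level!r}")
    allowed = " ".join(fields)  # whitespace-separated field list
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": allowed,  # fields searched for unqualified terms
        "uf": allowed,  # fields the user may qualify explicitly
    }


params = build_select_params("whatever", "analyst")
print(urlencode(params))
```

The app would send these parameters to the handler in place of whatever the user supplied, so the <defaults> qf is always overridden. Faceting and copyField targets (items 2 and 3) would still need separate handling.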
