lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Delete / filter / hide query results
Date Thu, 15 Jan 2009 22:21:09 GMT

: can't be part of a field or something like this. So let's say that the only
: way to know if a user has access rights is by calling something like
: accessRights(sessionID, docID) where docID is stored in a field.

first tip: 'stored' values are going to be really inefficient to deal
with on every request; at a bare minimum make sure this field is indexed
and make all of your custom code access it using the FieldCache.

: I then decided to use a custom SearchComponent called right after index
: querying (before faceting is done) but for what I have read it's not a good
: idea to delete results because they are stored in more than one place and it
: could break the caching system (I suppose that if I delete results for user
: A they will be deleted for user B too if he makes the same query although he
: does have access rights). Anyway I don't really understand where results are
: stored in ResponseBuilder; DocSet / DocList are pretty obscure.

A DocSet is an unordered set of documents -- in the context of a query
it's the set of all documents matching that query.  A DocList is an
ordered (sub-)list of documents with some metadata about the whole list --
in the context of a query it's the "page" of documents being returned to
the user; i.e. docs 11-20 of 5478.  (this is all covered pretty well in
the docs)
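
To make the distinction concrete, here is a minimal plain-Java sketch. It
does not use Solr's actual classes: `DocPageDemo`, `Page`, `page` and
`asSet` are hypothetical names invented for illustration, with a BitSet
standing in for the DocSet and a small value class standing in for the
DocList.

```java
import java.util.Arrays;
import java.util.BitSet;

final class DocPageDemo {

    /** Ordered page of doc ids plus metadata about the whole result (stand-in for a DocList). */
    static final class Page {
        final int offset;        // zero-based position of the first doc on this page
        final int[] docs;        // doc ids on this page, in ranked order
        final int totalMatches;  // size of the complete result, not of the page

        Page(int offset, int[] docs, int totalMatches) {
            this.offset = offset;
            this.docs = docs;
            this.totalMatches = totalMatches;
        }

        String describe() {
            return "docs " + (offset + 1) + "-" + (offset + docs.length) + " of " + totalMatches;
        }
    }

    /** Cuts one page out of the ranked list of all matching doc ids. */
    static Page page(int[] ranked, int offset, int rows) {
        int from = Math.min(offset, ranked.length);
        int to = Math.min(from + rows, ranked.length);
        return new Page(from, Arrays.copyOfRange(ranked, from, to), ranked.length);
    }

    /** Unordered set of every matching doc id (stand-in for a DocSet). */
    static BitSet asSet(int[] ranked) {
        BitSet set = new BitSet();
        for (int doc : ranked) {
            set.set(doc);
        }
        return set;
    }
}
```

With 5478 matching ids, `DocPageDemo.page(ranked, 10, 10).describe()`
yields "docs 11-20 of 5478", while `asSet(ranked)` holds all 5478 ids with
no ordering information.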

if you want to modify the DocList/DocSet included in the query response,
it's fairly easy to do -- the key is just that you shouldn't modify the
existing DocSet/DocList objects because they are probably stored in the
cache, but you are free to construct new instances and replace the ones in
the response ... the FacetComponent will use your replacement DocSet when
it runs.
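
The "copy, don't mutate" rule can be sketched in plain Java as follows.
This is an illustration only, not Solr code: a BitSet stands in for the
cached DocSet, and `CachedDocSetDemo` and `withoutDenied` are hypothetical
names.

```java
import java.util.BitSet;

final class CachedDocSetDemo {

    /**
     * Returns a NEW set holding the cached matches minus the denied docs.
     * The cached instance is only read, so other users who hit the same
     * cache entry still see the full result.
     */
    static BitSet withoutDenied(BitSet cached, BitSet denied) {
        BitSet copy = (BitSet) cached.clone(); // work on a private copy
        copy.andNot(denied);                   // clear every denied doc id in the copy
        return copy;
    }
}
```

The component would then put the returned copy into the response in place
of the cached object, leaving the cache entry untouched.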

note that applying your access control to the DocSet will be easy: because
it's a complete set of unordered docs, you can remove anything you want.
but the DocList has a lot more interesting use cases to worry about.  if
the DocList is 11-20 of 5478 total matches, and you want to remove 2, you
have to go search for what the next 2 would be to make sure you still
return 10.  but you also have to worry about whether the original 11-20
that the QueryComponent generated were right in the first place.  when the
user made his first request for 1-10, your security component might have
pulled out 3, but the QueryComponent didn't know that when it picked
11-20, so you are already off by 3 from where you should be.
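
The drift can be reproduced with a small plain-Java sketch. Everything
here is an assumption made for illustration (the class and method names,
and the use of a simple ranked list in place of a real index): one method
cuts the page first and filters afterwards, the other filters first and
then cuts the page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class PagingDriftDemo {

    /** Post-processing: take the raw page, then drop denied docs. Pages can come up short. */
    static List<Integer> pageThenFilter(List<Integer> ranked, Set<Integer> denied, int start, int rows) {
        List<Integer> page = new ArrayList<>();
        int end = Math.min(start + rows, ranked.size());
        for (int i = Math.min(start, end); i < end; i++) {
            int doc = ranked.get(i);
            if (!denied.contains(doc)) {
                page.add(doc);
            }
        }
        return page;
    }

    /** Query-time filtering: drop denied docs first, then cut the page from what is left. */
    static List<Integer> filterThenPage(List<Integer> ranked, Set<Integer> denied, int start, int rows) {
        List<Integer> visible = new ArrayList<>();
        for (int doc : ranked) {
            if (!denied.contains(doc)) {
                visible.add(doc);
            }
        }
        int from = Math.min(start, visible.size());
        int to = Math.min(from + rows, visible.size());
        return new ArrayList<>(visible.subList(from, to));
    }
}
```

With docs 1 to 30 ranked in order and docs 2, 5, 9 and 12 denied, the
second page (start 10, rows 10) is docs 15 to 24 when filtering first, but
only nine docs starting at 11 when filtering afterwards. The first three
of those nine (11, 13, 14) really belong on the user's first page.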

this is why post-processing access control tends to be a bad idea (beyond 
just extra goodies like faceting) ... things get a lot cleaner if you 
ensure your access controls get applied at query time.

you should consider implementing your access controls as a new type of
query and using it as a filter ... with the new ValueSource parser hooks
you could implement your logic as a "function" that takes a sessionId as
input and reuse all of the existing query code.
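
As a rough sketch of why filtering at query time keeps everything
consistent, the plain-Java model below applies the access check before
anything else looks at the results. It does not use the Solr query or
ValueSource APIs; `AccessRights` is a hypothetical stand-in for the
accessRights(sessionID, docID) call from the question, and the other names
are invented for illustration.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

final class QueryTimeAccessFilterDemo {

    /** Hypothetical stand-in for the accessRights(sessionID, docID) call. */
    interface AccessRights {
        boolean allowed(String sessionId, int docId);
    }

    /** Applies the access check as a filter over the raw query matches. */
    static BitSet visibleMatches(BitSet queryMatches, AccessRights rights, String sessionId) {
        BitSet visible = new BitSet();
        for (int doc = queryMatches.nextSetBit(0); doc >= 0; doc = queryMatches.nextSetBit(doc + 1)) {
            if (rights.allowed(sessionId, doc)) {
                visible.set(doc);
            }
        }
        return visible;
    }

    /** Facet counts taken from the filtered set, so hidden docs never contribute. */
    static Map<String, Integer> facetCounts(BitSet visible, String[] categoryByDoc) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int doc = visible.nextSetBit(0); doc >= 0; doc = visible.nextSetBit(doc + 1)) {
            counts.merge(categoryByDoc[doc], 1, Integer::sum);
        }
        return counts;
    }
}
```

Because paging and faceting both start from the filtered set, neither has
to be corrected after the fact.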


-Hoss

