lucene-solr-user mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: indexing documents (or pieces of a document) by access controls
Date Tue, 12 Jun 2007 14:27:35 GMT
Hello Nate,

IMHO, you will not be able to do this in Solr unless you accept pretty hard constraints on
your ACLs (I will get back to this in a moment). IMO, it is not possible to index documents
along with ACLs. ACLs can be very fine grained, and as for the thing you describe, ACL-specific
parts of a document... well, I wouldn't know how you would index this. Imagine you change the
ACL for a specific user: how do you know what to re-index and what not? Or suppose you add a
user? I really do not think it is possible with fine-grained ACLs.

You should also realize that you are trying to find an answer to an extremely complex problem:
authorisation in an index. (I am trying to develop faceted navigation in combination with
authorisation in a Lucene index in Jackrabbit, but I think this is not the place to discuss
it.)

So, in your case, if you want to use Solr with some form of ACLs, I think you can basically
only manage this if:

1) your ACLs are some sort of paths in a hierarchical structure, and you index the hierarchical
structure along with the content. Then when querying you have to include the folders that the
user is allowed to see

2) you keep a bitset for each user recording which documents that user is allowed to see (but
you even have ACLs inside documents). Also, keeping bitsets up to date for many users is almost
impossible, because

- Lucene document ids may change after merging segments
- updating documents might mean updating many, many bitsets if you have many users
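
To illustrate the first bullet, here is a toy model in plain Python. It does not use
Lucene's API; the list `index`, the keys like "doc-a" and the helper names are all made up
for illustration. The only assumption carried over from Lucene is that an internal document
id is a position that can shift when deleted documents are dropped during a merge.

```python
def build_bitset(index, allowed_keys):
    """Return an int used as a bitset: bit i is set when the document
    at internal id i (its position in `index`) is readable."""
    bits = 0
    for doc_id, key in enumerate(index):
        if key in allowed_keys:
            bits |= 1 << doc_id
    return bits


def visible(index, bits):
    """List the stable keys of the documents whose bit is set."""
    return [key for doc_id, key in enumerate(index) if (bits >> doc_id) & 1]


# Hypothetical index: the position in the list plays the role of the internal id.
index = ["doc-a", "doc-b", "doc-c", "doc-d"]
allowed = {"doc-b", "doc-d"}          # what this user may read, by stable key
bits = build_bitset(index, allowed)   # bits 1 and 3 are set

# Simulated merge: the deleted "doc-a" is dropped and every later id shifts down.
merged = ["doc-b", "doc-c", "doc-d"]

stale = visible(merged, bits)                          # old bitset, new ids
fresh = visible(merged, build_bitset(merged, allowed)) # rebuilt from stable keys
```

In this toy model the stale bitset now exposes "doc-c", which the user was never allowed
to read, and hides both documents the user should see. Rebuilding from stable keys fixes
that, but it has to be repeated for every user.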

For these reasons, I do not think you can achieve what you want with Solr, unless you are
going to work with something like updating the index and ACL bitsets once a day.
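
To make option 1 a bit more concrete, here is a minimal sketch in Python of how the folder
restriction could be expressed as a query string. The field name `ancestor_folders` and the
helper names are hypothetical, and the sketch assumes each document is indexed with one
value per folder above it, so that a user who may see a folder also matches everything
below it. It only builds the string; sending it to Solr (for example as a filter query) is
left out.

```python
def ancestor_folders(doc_path):
    """Return every folder above a document, outermost first.
    These are the values to index in the (hypothetical) folder field."""
    parts = [p for p in doc_path.strip("/").split("/") if p]
    # The last component is the document itself, so stop before it.
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]


def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_filter_query(allowed_folders, field="ancestor_folders"):
    """Build a query restricting results to the folders a user may see.
    Returns None when the user may see nothing, so the caller can skip the search."""
    if not allowed_folders:
        return None
    terms = " OR ".join(quote_term(f) for f in sorted(set(allowed_folders)))
    return f"{field}:({terms})"


# Example with made-up paths.
indexed_values = ancestor_folders("/sales/emea/report.xml")
query = build_filter_query(["/sales/emea", "/hr"])
```

Here `indexed_values` is `["/sales", "/sales/emea"]` and `query` is
`ancestor_folders:("/hr" OR "/sales/emea")`. Note that this only covers whole documents;
it does nothing for ACLs on parts of a document.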

Regards Ard


Can anyone give me some advice on breaking a document up and indexing it
by access control lists.  What we have are xml documents that are
transformed based on the user viewing it.  Some users might see all of
the document, while others may see a few fields, and yet others see
nothing at all.  The access control lists may be a role the user belongs
to, it may be a list of groups, or even a combination of the two.

I can transform the xml to the plain text that I want to index, and key
it off of the acls and then pass along a list of acls that the user
issuing a query belongs to when searching.  But I guess I'm not really
sure how to do this the best way.

Anyone have any thoughts?

Thanks!
Nate




