Reply-To: dev@lucene.apache.org
Date: Tue, 27 Apr 2010 23:21:51 +0200
Subject: RE: FW: Solr and LCF security at query time

Ok, not hearing back from Peter, I've done some Solr research and written some code that might work. The approach I've taken is most similar to SOLR-1834, other than the LCF-centric logic. Hopefully there will be a chance to try this out in a full end-to-end way on the weekend, after which I will submit it to the Solr team (where I think it most naturally would be built and delivered).

What it's going to need is either a static or dynamic schema addition to define __ALLOW_TOKEN__document, __DENY_TOKEN__document, __ALLOW_TOKEN__share, and __DENY_TOKEN__share fields. These should be string, multivalued fields (I think). It would be great if these could be made a default part of Solr; similarly, it would be good if the new search component was predelivered with Solr and mentioned (even if commented out) in the example solrconfig.xml file. The only other thing that needs to be done to hook up the search component is to include a configuration parameter describing the base URL of the LCF authority service. Plus, as I said earlier, we still don't have a canned solution for authentication yet - although I feel that will be straightforward.

Comments welcome...
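For illustration, the query-side restriction that a search component of this kind would apply can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the component itself: the helper names `quote_term` and `build_security_filter` are hypothetical, only the document-level fields are covered, and how the share-level fields should combine with them is left open.

```python
def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_security_filter(tokens,
                          allow_field="__ALLOW_TOKEN__document",
                          deny_field="__DENY_TOKEN__document"):
    """Build a Lucene-syntax clause from a user's access tokens.

    The clause requires at least one matching allow token (+) and
    prohibits any matching deny token (-). Hypothetical helper: the
    field names come from this message, the rest is an assumption.
    """
    if not tokens:
        # With no tokens nothing can be allowed; let the caller decide
        # how to return an empty result.
        raise ValueError("no access tokens supplied")
    allow = " ".join(f"{allow_field}:{quote_term(t)}" for t in tokens)
    deny = " ".join(f"{deny_field}:{quote_term(t)}" for t in tokens)
    return f"+({allow}) -({deny})"
```

Calling `build_security_filter(["myAD:S-1-1-0"])` gives `+(__ALLOW_TOKEN__document:"myAD:S-1-1-0") -(__DENY_TOKEN__document:"myAD:S-1-1-0")`. The share-level fields could presumably be handled by a second call with the other two field names, but that combination is an assumption rather than something settled in this thread.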
Karl
________________________________________
From: Wright Karl (Nokia-S/Cambridge)
Sent: Tuesday, April 27, 2010 8:20 AM
To: connectors-dev@incubator.apache.org; dev@lucene.apache.org
Cc: connectors-user@incubator.apache.org; lucene-dev@apache.org
Subject: RE: FW: Solr and LCF security at query time

Hi Peter,

I finally had a moment to review the SOLR-1872 and SOLR-1834 contributions in detail, and have a couple of Solr-related questions.

Both contributions rely on a SearchComponent to work their magic. However, it also appears that each modifies the user query in a different way. 1834 uses MUST, MUST_NOT, and SHOULD filter items, while 1872 uses standard AND and OR filterquery clauses. Both of them are constructed using Solr FilterQuery objects. Here are my questions:

(1) I am not conversant enough with Solr yet to know the difference between the different kinds of clause structure. Do you know if there is a difference? For example, is there any possibility that AND/OR clauses can permit documents to be seen that should not be seen? (MUST and MUST_NOT sound a lot more definite...)

(2) Are Solr FilterQuery objects applied to constructing the query that will be sent to Lucene? Or are they applied by Solr after-the-fact to the resultset? Or, is it a combination of the two, depending on the details of your actual filter clause?

I also haven't heard much from you in the last week or so - have you thought further about what you intend to do, and can you let me know whether you are still interested in developing an LCF plugin for Solr?

Thanks,
Karl

-----Original Message-----
From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
Sent: Thursday, April 22, 2010 12:23 PM
To: dev@lucene.apache.org
Cc: connectors-dev@incubator.apache.org; connectors-user@incubator.apache.org; lucene-dev@apache.org
Subject: Re: FW: Solr and LCF security at query time

Hi Karl,

See inline...
On Thu, Apr 22, 2010 at 4:57 PM, wrote:

> Hi Peter,
>
> The authority connectors don't perform authentication at this time. In fact, LCF has nothing to do with authentication at all - just authorization. The reason for this is that it is almost never the case that somebody wants to provide multiple credentials in order to be able to see their results. Most enterprises who have multiple repositories authenticate against AD and then map AD user names to repository user names in order to access those repositories. If you noted my earlier posts from this morning, you may have noted that I'm looking at recommending JAAS plus Sun's Krb5 login module for handling the "authenticate against AD" case, which would cover some 95%+ of the real world authentication needed out there.

I did read your earlier post regarding this, and I totally agree with you - this is best handled 'upstream'. In fact, I use a JAAS plugin in other places in the product (not Solr) for authentication.

> Yes, the idea is to store SIDs in Solr at index time. I don't know enough about Solr to know what kinds of issues this might entail, but Lucene certainly has a model of metadata that's pretty flexible, so I don't think this would be difficult at all. Erik Hatcher also seemed to confirm my suspicions that this would not be a problem.

It's certainly not a problem to store this data in Solr. The problem is more that you don't really *want* to store this data at index time. There are lots of reasons for not wanting to 'hard-code' SID data with documents in the index. Here are just a few:

* What happens if/when you want to add explicit user access to some [group of] documents? (i.e. not via a group)
* What happens if you need to revoke or change a user's or group's access?
* It's difficult to move/replicate the index to another domain
* For AD, SIDs are generally not meant to be stored long term outside of AD, as they can be changed (this doesn't happen often, but it can happen after an AD rebuild, domain type upgrade, data recovery etc.)

These and other scenarios mean re-indexing the stored data. When the index is huge, this is non-trivial (time-wise). There are not uncommon scenarios where user/group access control can change multiple times in one day.

There might be a way of storing acl data in a payload or similar, but I'm not sure how that would work across millions of [arbitrarily grouped] documents (I'm not familiar enough with payloads to know if this would be a good or bad idea).

> This is exactly why I think that we need to do the authentication upstream of the authority world.

Agreed.

> If Solr handles arbitrary document metadata, then I think we could just use that feature. But you know more about it than me, at this point. It would be great to get an overview of potential ways of doing this.

Payloads, maybe?

> For your particular task, it sounds like you are trying to read from NTFS and apply security after-the-fact with some acl specification file. In that case, I'd write a repository connector that was based on the file system connector (already part of the stable of connectors for LCF) which reads ACL information from your acl.xml file. Or, if you prefer a UI for specifying ACL information, you could extend the connector so that security is configured in the UI without having an external acl.xml file at all - which would be a nice addition to the existing file system connector. (Repository connections and jobs are configured internally in LCF by XML documents stored in the database, so they can be arbitrarily structured. I'm happy to help you figure out how to do this if this is what you decide to do.)
For my particular requirements, there are no files - the data is generated from the network and stored. After the fact, there is no persistent location of this data other than in Solr.

Storing the acl info using the connector sounds very interesting. Could be worth looking at in more detail. Thanks!

> I think we still need to add in the authentication piece to make this all work for you, so perhaps you can describe how you expect a user to interact with your system, so I can understand your design issues.
>
> Thanks,
> Karl
>
>
> -----Original Message-----
> From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> Sent: Thursday, April 22, 2010 11:32 AM
> To: dev@lucene.apache.org
> Cc: connectors-user@incubator.apache.org; lucene-dev@apache.org; connectors-dev@incubator.apache.org
> Subject: Re: FW: Solr and LCF security at query time
>
> Hi Karl,
>
> Thanks very much for your detailed explanation - really good!
>
> As I've thought through some of the implications, I've added comments below, so I hope they don't seem too jumbled...
>
> I suppose on the 'authority' side, it works kind of as I envisioned it would.
>
> For general Solr access control, there are two layers of security that need to be addressed:
> 1. Authentication - make sure the incoming query is from a valid user, and the passed-in credentials (hash, certificate etc.) are correct
> 2. Query filtering - potentially reduce the number/type of returned results based on the allow/deny metadata for the authenticated user
>
> I can see how the LCF auth connector works for 2., but can it do 1. as well?
> It would be good if this could somehow be integrated into any container (Tomcat/Jetty et al) authentication that might be configured (probably related to your previous post). In many ways, it could/should be that the Authority (AD) part of the connector should only be concerned with 1. and not 2. (see below).
>
> So, on the repository side, there is also an LCF connector that 'closes the loop' to provide the 'what is it I'm trying to control' side of things. I understand that LCF doesn't do the mapping - it delegates this task to the caller, but provides both sides of the equation (authority & repository).
>
> >>>>>
> - Each file in DirectoryA will have the following __ALLOW_TOKEN__document attributes inside Solr: "myAD:S-123-456-76890", and "myAD:S-23-64-12345".
> - Each file in DirectoryB will have the following __ALLOW_TOKEN__document attributes inside Solr: "myAD:S-123-456-76890"
> <<<<<
> I think this is the bit that is worrying me - is this storing the SIDs into Solr at document index time? This would be a problem for a whole load of reasons, but maybe I'm missing something here? (see below for a possible alternative)
>
> Basically, what I'm getting at here is that the allow/deny values need to be stored in one of three places:
> 1. In the authority (e.g. inside AD)
> 2. In the document metadata (index-time)
> 3. In external storage (e.g. acl.xml, NTFS etc.)
>
> 1. Extending AD is pretty much out, as this causes too many interop problems
> 2. 'Hard-coding' acl information in the index makes it non-portable, resistant to changes, etc.
> 3. acl.xml is coupled with a Solr instance, but is easily ported/replicated. Storing/retrieving acl information from the source (e.g. NTFS) is problematic, as the source may not be accessible (it may not even exist).
>
> I believe 3. or a variant is the way to go on the repo side, which means the LCF Authority connector is mainly for Authentication (see above), which is what you want from AD et al integration.
> The problem that arises from 'pluggable' authentication is that, if you're not using a certificate, you have to start with a password, but the connector only has access to the password hash (unless the pwd is sent in the query url).
> I don't know of a way to confirm identities in AD using only the username and hash (AD does the hash compare). I believe this is where container-based integration will likely work better.
>
> So that I can confirm my understanding...a scenario might be like this:
>
> We have an AD connector that fetches the SIDs and we can read them etc.
> For my environment, where there are no 'files' (there's only a transient network stream), we have an LCF 'Solr Field Filter Query' connector that decides which Filter Queries to apply (allow and deny) for the passed in SID(s).
>
> For another environment, let's say, NTFS, there might be an 'NTFS' connector that would provide some kind of mapping of files/folders to SID(s). Since Solr wouldn't intrinsically know about this, the acl information would need to be stored somewhere in the index. This would mean extending the Solr schema and storing metadata at index time.
> The alternative is to re-use the 'Solr Field Filter Query' connector for this as well (and any other document types that might be read in). This keeps the index 'clean' of acl-specific metadata, and allows for in-place changes and easy cross-document/index/instance access control.
>
>
> If the above interpretation is [roughly] correct (please let me know if I've got this wrong!), this would reduce down to having:
> 1. One or more LCF Authority connectors (e.g. AD, Documentum, etc.) (possibly/partly at the container level)
> 2. At least an LCF Repository connector for 'acl.xml'
> 3. Optional other LCF Repository connectors
>
> It sounds like you've now finished the first half of 1. by adding the ability to get the required auth data from a Solr api call. The other half of 1. will be implementing the LCF interface in the SolrACLSecurity class, to effectively replace the 'user', 'group' and 'password' bits of acl.xml.
>
> Does the above sound like an accurate interpretation?
> Just trying to get a good picture of what work needs doing, where it goes, etc.
>
> Many thanks!
> Peter
>
>
> On Thu, Apr 22, 2010 at 2:52 PM, wrote:
>
> > >>>>>>
> > What is the relationship between stored data (documents) and authorities' access/deny attributes? (do you have any examples of what an access_token value might contain?)
> > <<<<<<
> >
> > Documents have access/deny attributes; authorities simply provide the list of tokens that belong to an authenticated user. Thus, there's no access/deny for an authority; that's attached to the document (as it is in real-world repositories).
> >
> > Let's run a quick example, using Active Directory and a Windows file system. Suppose that you have a directory with documents in it, call it DirectoryA, and the directory allows read access to the following SIDs:
> >
> > S-123-456-76890
> > S-23-64-12345
> >
> > These SIDs correspond to Active Directory groups, let's call them Group1 and Group2, respectively.
> >
> > DirectoryB also has documents in it, and those documents have just the SID S-123-456-76890 attached, because only Group1 can read its contents.
> >
> > Now, pretend that someone has created an LCF Active Directory authority connection (in the LCF UI), which is called "myAD", and this connection is set up to talk to the governing AD domain controller for this Windows file system. We now know enough to describe the document indexing process:
> >
> > - Each file in DirectoryA will have the following __ALLOW_TOKEN__document attributes inside Solr: "myAD:S-123-456-76890", and "myAD:S-23-64-12345".
> > - Each file in DirectoryB will have the following __ALLOW_TOKEN__document attributes inside Solr: "myAD:S-123-456-76890"
> >
> > Now, suppose that a user (let's call him "Peter") is authenticated with the AD domain controller.
> > Peter belongs to Group2, so his SIDs are (say):
> >
> > S-1-1-0 (the 'everyone' SID)
> > S-323-999-12345 (his own personal user SID)
> > S-23-64-12345 (the SID he gets because he belongs to group 2)
> >
> > We want to look up the documents in the search index that he can see. So, we ask the LCF authority service what his tokens are, and we get back:
> >
> > "myAD:S-1-1-0", "myAD:S-323-999-12345", and "myAD:S-23-64-12345"
> >
> > The documents we should return in his search are the ones that match his search criteria, have at least one ALLOW token in common with his tokens, and have no DENY token in common with his tokens (there aren't any DENY tokens involved in this example). So only files that have one of his three tokens as an ALLOW attribute would be returned.
> >
> > Note that what we are attempting to do is enforce AD's security with the search results we present. There is no need to define a whole new security mechanism, because AD already has one that people use.
> >
> > >>>>>>
> > One of the key requirements I've worked to adhere to in SOLR-1872 is to ensure there are no security or other dependencies of indexed data with any external repository - most notably the file system. There are many reasons for wanting this, but one of the main ones is that Solr-stored data is not always based on file data (or accessible file data). In fact, in my particular case, almost none of the indexed data comes from files.
> > <<<<<<
> >
> > LCF is all about abstracting from repositories. It's not specifically about a file system, although that is a convenient example.
> > If you are building your own kind of repository with your own security setup, that's fine - but in the LCF world you'd need to create an authority connector for your repository (which maybe reads your acl.xml file), as well as a repository connector (which hands documents to LCF and provides it with the access tokens that make security work). Of course, you can do something much lighter that doesn't include LCF at all if you are just integrating a custom repository of your own, but it sounded like you were interested in the broader problem here.
> >
> > So, LCF doesn't do "acl mapping" at all. It relies on its various connectors to work cooperatively to define access tokens in a way that is consistent from authority connector to repository connector for a given repository kind. Anybody can write a connector, so the beauty of all this is that you can build a system where data from many disparate sources is indexed, and security for each is simultaneously enforced.
> >
> > Karl
> >
> >
> > ------------------------------
> > *From:* ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> > *Sent:* Thursday, April 22, 2010 9:24 AM
> > *To:* dev@lucene.apache.org
> > *Cc:* connectors-user@incubator.apache.org; lucene-dev@apache.org; connectors-dev@incubator.apache.org
> > *Subject:* Re: FW: Solr and LCF security at query time
> >
> > Hi Karl,
> >
> > Thanks very much for the diagram -
> > Sorry about all the questions, but this raises a few new ones...
> >
> > What is the relationship between stored data (documents) and authorities' access/deny attributes? (do you have any examples of what an access_token value might contain?)
> >
> > One of the key requirements I've worked to adhere to in SOLR-1872 is to ensure there are no security or other dependencies of indexed data with any external repository - most notably the file system.
> > There are many reasons for wanting this, but one of the main ones is that Solr-stored data is not always based on file data (or accessible file data). In fact, in my particular case, almost none of the indexed data comes from files.
> >
> > This is one reason why SOLR-1872 uses filter queries for its access/deny tokens - so that all the required information for access control completely resides within the Solr index itself.
> > Is the LCF architecture acl 'mapping' between Solr fields (queries) and users, some external 'repository' (files) and users, or arbitrary data (e.g. either of these)?
> >
> > I hope that makes sense...
> >
> > Thanks!
> > Peter
> >
> >
> > On Thu, Apr 22, 2010 at 10:25 AM, wrote:
> >
> >> Hi Peter,
> >>
> >> I've attached a diagram that is not in the wiki as of yet, and I'll try to answer your questions.
> >>
> >> >>>>>>
> >> Are the ACCESS_TOKEN and DENY_TOKEN values whatever have been stored for a particular user in the underlying acl store (e.g. Active Directory)? How does AD and/or LCF handle storing such data in its schema? (does AD need its schema extended?) Presumably, any such AD fields would need to be queried for effective rights in order to cater for group membership allows and denies.
> >> <<<<<<
> >>
> >> The ACCESS_TOKEN and DENY_TOKEN values are, in one sense, arbitrary strings that represent a contract between an LCF authority connection and the LCF repository connection that picks up the documents (from wherever). These tokens thus have no real meaning outside of LCF. You must regard them as opaque.
> >>
> >> The contract, however, states that if you use the LCF authority service to obtain tokens for an authenticated user, you will get back a set that is CONSISTENT with the tokens that were attached to the documents LCF sent to Solr for indexing in the first place.
> >> So, you don't have to worry about it, and that's kind of the idea. So you imagine the following flow:
> >>
> >> (1) Use LCF to fetch documents and send them to Solr
> >> (2) When searching, use the LCF authority service to get the desired user's access tokens
> >> (3) Either filter the results, or modify the query, to be sure the access tokens all match up properly
> >>
> >> For the AD authority, the LCF access tokens consist, in part, of the user's SIDs. For other authorities, the access tokens are wildly different. You really don't want to know what's in them, since that's the job of the LCF authority to determine. ;-)
> >>
> >> LCF is not, by the way, joined at the hip with AD. However, in practice, most enterprises in the world use some form of AD single signon for their web applications, and even if they're using some repository with its own idea of security, there's a mapping between the AD users and the repository's users. Doing that mapping is also the job of the LCF authority for that repository.
> >>
> >> Hope this helps. Also, I'm not expecting time miracles here, so don't sweat the schedule.
> >>
> >>
> >> Karl
> >>
> >>
> >> ________________________________________
> >> From: ext Peter Sturge [peter.sturge@googlemail.com]
> >> Sent: Thursday, April 22, 2010 4:27 AM
> >> To: dev@lucene.apache.org
> >> Cc: connectors-user@incubator.apache.org; lucene-dev@apache.org; connectors-dev@incubator.apache.org
> >> Subject: Re: FW: Solr and LCF security at query time
> >>
> >> Hi Karl,
> >>
> >> Thanks for the quick turnaround.
> >> I'm in the middle of a product release for us, so I fear I won't be as quick as you... :-)
> >>
> >> I couldn't find a simple flow diagram or similar for LCF with regards to security (probably looking in the wrong place).
> >> Perhaps you could help on these questions...?
> >>
> >> In SOLR-1872, the allows and denies are stored (in acl.xml) as sub-queries, which are then used as filter queries in a user's search.
> >>
> >> Are the ACCESS_TOKEN and DENY_TOKEN values whatever have been stored for a particular user in the underlying acl store (e.g. Active Directory)? How does AD and/or LCF handle storing such data in its schema? (does AD need its schema extended?) Presumably, any such AD fields would need to be queried for effective rights in order to cater for group membership allows and denies.
> >>
> >> I guess I'm just trying to understand the architectural flow/storage/retrieval of data in the various parts of the system, but I admit, I need to do more research on this.
> >> After our product release, when I get a few more spare cycles, I can look at it in more detail.
> >>
> >> Many thanks!
> >> Peter
> >>
> >>
> >> On Thu, Apr 22, 2010 at 1:02 AM, <karl.wright@nokia.com> wrote:
> >> Hi Peter,
> >>
> >> I just committed the promised changes to the LCF Solr output connector.
> >>
> >> ACL metadata will now be posted to the Solr HTTP interface along with the document as the two following fields:
> >>
> >> __ACCESS_TOKEN__document
> >> __DENY_TOKEN__document
> >>
> >> There will, of course, potentially be multiple values for each of these two fields.
> >>
> >> Hope this helps,
> >> Karl
> >>
> >> ________________________________
> >> From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> >> Sent: Tuesday, April 20, 2010 6:51 PM
> >> To: connectors-user@incubator.apache.org
> >> Subject: Re: FW: Solr and LCF security at query time
> >>
> >> Hi Karl,
> >>
> >> Thanks for the info. I'll have a look at the link and try to take in as much sugar as my insulin levels will handle...
> >> It sounds like the necessary interface(s) are already in LCF - just a matter of implementing them in the SOLR-1872 plugin.
> >> I'll need to digest the LCF stuff to get to grips with it... please bear with me while I do that...
> >>
> >> When you say:
> >> The LCF solr output connection doesn't yet do this, but it is trivial for me to make that happen.
> >> Do you mean a mechanism by which solr.war can get url et al info from its parent container (Tomcat, Jetty etc.), or have I misinterpreted this?
> >>
> >>
> >> Thanks,
> >> Peter
> >>
> >>
> >> On Tue, Apr 20, 2010 at 11:05 PM, <karl.wright@nokia.com> wrote:
> >> Hi Peter,
> >>
> >> I'm the principal committer for LCF, but I don't know as much about Solr as I ought to, so it sounds like a potentially productive collaboration.
> >>
> >> LCF does exactly what you are looking for - the only issue at all is that you need to fetch a URL from a webapp to get what you are looking for. The "plugs" are all inside LCF for different kinds of repositories. Here's a link that might help with drinking the LCF "koolaid", as it were:
> >> https://cwiki.apache.org/confluence/display/CONNECTORS/Lucene+Connectors+Framework+concepts
> >>
> >> The url would be something like this (on a locally installed tomcat-based LCF instance):
> >>
> >> http://localhost:8080/lcf-authority-service/UserACLs?username=someusername@somedomain.com
> >>
> >> ... and this fetch returns something like:
> >>
> >> TOKEN:xxxxxxx
> >> TOKEN:yyyyyyy
> >> TOKEN:zzzzzzz
> >> ....
> >>
> >> ... which represent the amalgamated tokens for all of the defined authorities, and by some strange coincidence ( ;-) ) are compatible with certain pieces of metadata that have been passed into Solr with each document - one set of Allow tokens, and a second set of Deny tokens.
> >> The LCF solr output connection doesn't yet do this, but it is trivial for me to make that happen.
> >>
> >> Does this sound plausible to you?
> >>
> >> Karl
> >>
> >>
> >> ________________________________
> >> From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> >> Sent: Tuesday, April 20, 2010 5:41 PM
> >> To: connectors-user@incubator.apache.org; dev@lucene.apache.org
> >> Subject: Re: FW: Solr and LCF security at query time
> >>
> >> Hi Karl,
> >>
> >> Integrating LCF to get external token support for SOLR-1872 sounds very interesting indeed. I don't know anything about LCF, but one of the things I was planning for SOLR-1872 is to make acl.xml (or rather its behaviour) 'pluggable' - i.e. it would just be one of a series of plugins that could be used for obtaining back-end authentication information.
> >>
> >> If you're good with LCF, perhaps we could work together to build this in. One of the first things would be defining an interface that would be as easy as possible to plug LCF into. Have you any suggestions/insight on this front?
> >>
> >> Many thanks,
> >> Peter
> >>
> >>
> >> On Tue, Apr 20, 2010 at 4:08 PM, <karl.wright@nokia.com> wrote:
> >> SOLR-1872 looks exactly like what I was envisioning, from the search query perspective, although instead of the acl xml file you specify, LCF stipulates you would dynamically query the lcf-authority-service servlet for the access tokens themselves. That would get you support for AD, Documentum, LiveLink, Meridio, and Memex for free. It seems likely that this component could be modified to work with LCF with minor effort.
> >>
> >> The missing component still seems to be AD authentication, which needs a solution.
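As a rough sketch of what querying the lcf-authority-service servlet involves on the client side, the following Python builds the UserACLs URL and parses a response of the `TOKEN:` form described in this thread. The function names `user_acls_url` and `parse_tokens` are hypothetical, and ignoring any line that does not start with `TOKEN:` is an assumption about how other response lines should be treated.

```python
from urllib.parse import urlencode

TOKEN_PREFIX = "TOKEN:"


def user_acls_url(base_url: str, username: str) -> str:
    """Build the authority-service URL for a user (hypothetical helper)."""
    query = urlencode({"username": username})
    return f"{base_url.rstrip('/')}/UserACLs?{query}"


def parse_tokens(body: str) -> list:
    """Extract access tokens from a response with one 'TOKEN:...' per line.

    Lines without the prefix are skipped; that is an assumption, since
    the thread only shows TOKEN lines.
    """
    tokens = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith(TOKEN_PREFIX):
            # Strip only the leading prefix: tokens such as
            # "myAD:S-1-1-0" contain further colons of their own.
            tokens.append(line[len(TOKEN_PREFIX):])
    return tokens
```

The fetch itself is left out so that the sketch stays independent of any particular HTTP client; the body returned by a GET on `user_acls_url(...)` would be passed to `parse_tokens`.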
> >>
> >> Karl
> >>
> >> ________________________________
> >> From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
> >> Sent: Tuesday, April 20, 2010 10:44 AM
> >> To: dev@lucene.apache.org
> >> Subject: Re: FW: Solr and LCF security at query time
> >>
> >> If you want to do this completely within Solr, have a look at SOLR-1834 and SOLR-1872. These use a SearchComponent plugin for Solr.
> >>
> >> Thanks,
> >> Peter
> >>
> >>
> >> On Tue, Apr 20, 2010 at 1:25 PM, <karl.wright@nokia.com> wrote:
> >> FYI
> >>
> >> ________________________________
> >> From: Wright Karl (Nokia-S/Cambridge)
> >> Sent: Tuesday, April 20, 2010 8:16 AM
> >> To: 'dominique.bejean@eolya.fr'
> >> Cc: 'solr-dev@apache.org'; 'connectors-dev@incubator.apache.org'; 'connectors-user@incubator.apache.org'
> >> Subject: RE: Solr and LCF security at query time
> >>
> >> Dominique,
> >>
> >> Yes, I am aware of this ticket and contribution. Luckily LCF establishes a powerful multi-repository security model, even though it doesn't yet do the final step of enforcing that model at the search end. LCF allows you to define multiple authorities to operate against disparate repositories, and use the appropriate authority to secure any given document. The Solr people are aware of this design, which addresses the issues raised by SOLR-1834 very nicely. However, as I said before, time is a problem, and the work still needs to be done.
> >>
> >> I suggest you read up on the actual security model of LCF, and perhaps experiment with that and the SOLR-1834 contribution, to see if there is common ground. One thing we've learned at MetaCarta is that post-filtering for security purposes is expensive, and it is better to modify the queries themselves to restrict the results, if possible.
> >> I'm not sure which approach SOLR-1834 takes, although it
> >> sounds like it might be the filtering approach. Still, it would be
> >> better than nothing.
> >>
> >> Please let me know what you find out.
> >>
> >> Thanks,
> >> Karl
> >>
> >> ________________________________
> >> From: ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
> >> Sent: Tuesday, April 20, 2010 8:03 AM
> >> To: Wright Karl (Nokia-S/Cambridge)
> >> Cc: connectors-user@incubator.apache.org;
> >> connectors-dev@incubator.apache.org
> >> Subject: Re: Solr and LCF security at query time
> >>
> >> Karl,
> >>
> >> Thank you for your reply.
> >>
> >> I did some research today and found this:
> >> http://freesurf001.appspot.com/issues.apache.org/jira/browse/SOLR-1834
> >> http://demo.findwise.se:8880/SolrSecurity/
> >>
> >> The Solr security model has to be able to filter a result list with
> >> items coming from various sources at the same time (livelink,
> >> documentum, file system, ...). Big subject :)
> >>
> >> Dominique
> >>
> >> On 20/04/10 13:34, karl.wright@nokia.com wrote:
> >> Hi Dominique,
> >>
> >> At the moment, in order to enforce the LCF security model within
> >> Lucene/Solr, you will need to build this functionality into
> >> whatever client you are using to display the Lucene search results.
> >> Specifically, you would need to take the following steps:
> >>
> >> (1) Have your users access your search client through Apache.
> >> (2) Use the Apache module mod_auth_kerb, combined with LCF's
> >> mod_authz_annotate, to cause authorization HTTP headers to be
> >> transmitted to the client webapp.
> >> (3) Have your client webapp alter whatever queries it is doing, to
> >> add an appropriate query clause for each of the access tokens
> >> transmitted in the headers.
> >>
> >> (This is how it is done at MetaCarta.)
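
Step (3) above can be sketched roughly as follows. This is only an illustration, not the MetaCarta or LCF implementation: the function names and the field names allow_token and deny_token are hypothetical, and it assumes the webapp has already extracted the user's access tokens from the headers (the header names are not given in this thread). It also assumes a document should match when it carries at least one of the user's tokens in its allow field and none of them in its deny field.

```python
from typing import Iterable


def quote_term(token: str) -> str:
    """Wrap a token in double quotes for Lucene query syntax,
    escaping backslashes and embedded double quotes."""
    escaped = token.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_security_filter(
    tokens: Iterable[str],
    allow_field: str = "allow_token",  # hypothetical field name
    deny_field: str = "deny_token",    # hypothetical field name
) -> str:
    """Build a query clause that requires one of the user's tokens in the
    allow field and excludes documents listing any of them in the deny field."""
    tokens = list(tokens)
    if not tokens:
        # What a user with no tokens may see is a policy decision,
        # so this sketch leaves it to the caller.
        raise ValueError("no access tokens supplied")
    allow = " OR ".join(f"{allow_field}:{quote_term(t)}" for t in tokens)
    deny = " OR ".join(f"{deny_field}:{quote_term(t)}" for t in tokens)
    return f"+({allow}) -({deny})"
```

The resulting string would be combined with the user's own query (for example as an additional filter) before it is sent to the search engine, so restricted documents never enter the result set instead of being removed afterwards.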
> >>
> >> Alternatively, you may find a way to do this completely with a web
> >> application under a Java app server such as Tomcat. I have not yet
> >> done the research to find out whether this is a feasible alternative.
> >> Effectively, what you need something like mod_auth_kerb to do is to
> >> authenticate your user against Active Directory, or whatever the
> >> authenticator ought to be. JAAS may be helpful here.
> >>
> >> There are, of course, intentions to fill out the missing pieces
> >> more completely and transparently via a Solr search plugin and/or
> >> filter. What has been lacking is time. If you are in a position to
> >> do development in this area, we're happy to have any assistance you
> >> might provide.
> >>
> >> Thanks,
> >> Karl
> >> ________________________________
> >> From: ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
> >> Sent: Tuesday, April 20, 2010 5:06 AM
> >> To: connectors-user@incubator.apache.org
> >> Subject: Solr and LCF security at query time
> >>
> >> Hi,
> >>
> >> I don't see in the LCF wiki how Solr and LCF work together at query
> >> time in order to remove from the result list the items the user is
> >> not allowed to access.
> >>
> >> In
> >> http://cwiki.apache.org/CONNECTORS/lucene-connectors-framework-concepts.html,
> >> I just see these sentences:
> >>
> >> "Once all these documents and their access tokens are handed to
> >> the search engine, it is the search engine's job to enforce
> >> security by excluding inappropriate documents from the search
> >> results. For Lucene, this infrastructure is expected to be built on
> >> top of Lucene's generic metadata abilities, but has not been
> >> implemented at this time."
> >>
> >> I am not sure I understand. Does this mean that for the moment, it
> >> is not possible for Solr to apply security by using an Authority
> >> Connector?
> >>
> >> Dominique
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: dev-help@lucene.apache.org