From: karl.wright@nokia.com
Reply-To: dev@lucene.apache.org
Date: Wed, 21 Apr 2010 00:05:19 +0200
Subject: RE: FW: Solr and LCF security at query time

Hi Peter,

I'm the principal committer for LCF, but I don't know as much about Solr as I ought to, so it sounds like a potentially productive collaboration.

LCF does exactly what you are looking for - the only issue at all is that you need to fetch a URL from a webapp to get what you are looking for. The "plugs" are all inside LCF for different kinds of repositories. Here's a link that might help with drinking the LCF "koolaid", as it were: https://cwiki.apache.org/confluence/display/CONNECTORS/Lucene+Connectors+Framework+concepts

The url would be something like this (on a locally installed tomcat-based LCF instance):

http://localhost:8080/lcf-authority-service/UserACLs?username=someusername@somedomain.com

... and this fetch returns something like:

TOKEN:xxxxxxx
TOKEN:yyyyyyy
TOKEN:zzzzzzz
....

... which represent the amalgamated tokens for all of the defined authorities, and by some strange coincidence ( ;-) ) are compatible with certain pieces of metadata that have been passed into Solr with each document - one set of Allow tokens, and a second set of Deny tokens. The LCF solr output connection doesn't yet do this, but it is trivial for me to make that happen.

Does this sound plausible to you?
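For readers who want to try the fetch described above, here is a minimal sketch in Python. The URL and the `TOKEN:` line format come from the message; the function names, the choice to ignore lines without the `TOKEN:` prefix, and the UTF-8 decoding are assumptions made for illustration, not part of LCF.

```python
from typing import List
from urllib.parse import urlencode
from urllib.request import urlopen

# URL taken from the message above; adjust host and port for a real deployment.
AUTHORITY_URL = "http://localhost:8080/lcf-authority-service/UserACLs"


def build_acl_url(username: str, base_url: str = AUTHORITY_URL) -> str:
    """Return the UserACLs URL for one user, with the username URL-encoded."""
    return base_url + "?" + urlencode({"username": username})


def parse_access_tokens(body: str) -> List[str]:
    """Extract token values from lines of the form 'TOKEN:value'.

    Assumption: lines without the 'TOKEN:' prefix are ignored.
    """
    tokens = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("TOKEN:"):
            tokens.append(line[len("TOKEN:"):])
    return tokens


def fetch_access_tokens(username: str, base_url: str = AUTHORITY_URL) -> List[str]:
    """Fetch and parse the tokens for one user (needs a running LCF instance)."""
    with urlopen(build_acl_url(username, base_url)) as response:
        # Assumption: the servlet response can be decoded as UTF-8 text.
        return parse_access_tokens(response.read().decode("utf-8"))
```

Only `fetch_access_tokens` touches the network; the parsing and URL-building helpers can be exercised without an LCF instance.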
Karl

________________________________
From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
Sent: Tuesday, April 20, 2010 5:41 PM
To: connectors-user@incubator.apache.org; dev@lucene.apache.org
Subject: Re: FW: Solr and LCF security at query time

Hi Karl,

Integrating LCF to get external token support for SOLR-1872 sounds very interesting indeed. I don't know anything about LCF, but one of the things I was planning for SOLR-1872 is to make acl.xml (or rather its behaviour) 'pluggable' - i.e. it would just be one of a series of plugins that could be used for obtaining back-end authentication information.

If you're good with LCF, perhaps we could work together to build this in. One of the first things would be defining an interface that would be as easy as possible to plug LCF into. Have you any suggestions/insight on this front?

Many thanks,
Peter

On Tue, Apr 20, 2010 at 4:08 PM, <karl.wright@nokia.com> wrote:

SOLR-1872 looks exactly like what I was envisioning, from the search query perspective, although instead of the acl.xml file you specify, LCF stipulates that you would dynamically query the lcf-authority-service servlet for the access tokens themselves. That would get you support for AD, Documentum, LiveLink, Meridio, and Memex for free. It seems likely that this component could be modified to work with LCF with minor effort.

The missing component still seems to be AD authentication, which needs a solution.

Karl

________________________________
From: ext Peter Sturge [mailto:peter.sturge@googlemail.com]
Sent: Tuesday, April 20, 2010 10:44 AM
To: dev@lucene.apache.org
Subject: Re: FW: Solr and LCF security at query time

If you want to do this completely within Solr, have a look at SOLR-1834 and SOLR-1872. These use a SearchComponent plugin for Solr.
Thanks,
Peter

On Tue, Apr 20, 2010 at 1:25 PM, <karl.wright@nokia.com> wrote:

FYI

________________________________
From: Wright Karl (Nokia-S/Cambridge)
Sent: Tuesday, April 20, 2010 8:16 AM
To: 'dominique.bejean@eolya.fr'
Cc: 'solr-dev@apache.org'; 'connectors-dev@incubator.apache.org'; 'connectors-user@incubator.apache.org'
Subject: RE: Solr and LCF security at query time

Dominique,

Yes, I am aware of this ticket and contribution. Luckily LCF establishes a powerful multi-repository security model, even though it doesn't yet do the final step of enforcing that model at the search end. LCF allows you to define multiple authorities to operate against disparate repositories, and use the appropriate authority to secure any given document. The Solr people are aware of this design, which addresses the issues raised by SOLR-1834 very nicely. However, as I said before, time is a problem, and the work still needs to be done.

I suggest you read up on the actual security model of LCF, and perhaps experiment with that and the SOLR-1834 contribution, to see if there is common ground. One thing we've learned at MetaCarta is that post-filtering for security purposes is expensive, and it is better to modify the queries themselves to restrict the results, if possible. I'm not sure which approach SOLR-1834 takes, although it sounds like it might be the filtering approach. Still, it would be better than nothing.

Please let me know what you find out.

Thanks,
Karl

________________________________
From: ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
Sent: Tuesday, April 20, 2010 8:03 AM
To: Wright Karl (Nokia-S/Cambridge)
Cc: connectors-user@incubator.apache.org; connectors-dev@incubator.apache.org
Subject: Re: Solr and LCF security at query time

Karl,

Thank you for your reply.
I did some research today and found this:

http://freesurf001.appspot.com/issues.apache.org/jira/browse/SOLR-1834
http://demo.findwise.se:8880/SolrSecurity/

The Solr security model has to be able to filter a result list with items coming from various sources at the same time (LiveLink, Documentum, file system, ...). Big subject :)

Dominique

On 20/04/10 13:34, karl.wright@nokia.com wrote:

Hi Dominique,

At the moment, in order to enforce the LCF security model within Lucene/Solr, you will need to build this functionality into whatever client you are using to display the Lucene search results. Specifically, you would need to take the following steps:

(1) Have your users access your search client through Apache.
(2) Use the Apache module mod_auth_kerb, combined with LCF's mod_authz_annotate, to cause authorization HTTP headers to be transmitted to the client webapp.
(3) Have your client webapp alter whatever queries it is doing, to add an appropriate query clause for each of the access tokens transmitted in the headers.

(This is how it is done at MetaCarta.)

Alternatively, you may find a way to do this completely with a web application under a Java app server such as Tomcat. I have not yet done the research to find out whether this is a feasible alternative. Effectively, what you need something like mod_auth_kerb to do is to authenticate your user against Active Directory, or whomever the authenticator ought to be. JAAS may be helpful here.

There are, of course, intentions to fill out the missing pieces more completely and transparently via a Solr search plugin and/or filter. What has been lacking is time. If you are in a position to do development in this area, we're happy to have any assistance you might provide.
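Step (3) above, adding a query clause for each access token, might be sketched as follows in Python. This is an illustration under stated assumptions, not the MetaCarta or LCF implementation: the field names `allow_token` and `deny_token` are hypothetical, the rule "at least one matching Allow token and no matching Deny token" is one reading of the Allow/Deny token sets mentioned in this thread, and the sketch relies on `-*:*` matching no documents when used as a Solr filter query.

```python
def quote_token(token):
    """Wrap a token in double quotes, escaping backslashes and quotes."""
    escaped = token.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_security_filter(tokens, allow_field="allow_token", deny_field="deny_token"):
    """Build a Lucene-syntax filter clause from a user's access tokens.

    A document passes if it carries at least one of the user's tokens in
    allow_field and none of them in deny_field. Both field names are
    hypothetical; use whatever fields the documents were indexed with.
    """
    if not tokens:
        # No tokens: exclude everything rather than expose unsecured results.
        return "-*:*"
    group = "(" + " OR ".join(quote_token(t) for t in tokens) + ")"
    return "+{0}:{1} -{2}:{1}".format(allow_field, group, deny_field)
```

The resulting string would typically be passed as a filter query alongside the user's own query, so that restriction happens inside the search engine rather than by post-filtering the result list.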
Thanks,
Karl

________________________________
From: ext Dominique Bejean [mailto:dominique.bejean@eolya.fr]
Sent: Tuesday, April 20, 2010 5:06 AM
To: connectors-user@incubator.apache.org
Subject: Solr and LCF security at query time

Hi,

I don't see in the LCF wiki how Solr and LCF work together at query time in order to remove from the result list the items the user is not allowed to access.

In http://cwiki.apache.org/CONNECTORS/lucene-connectors-framework-concepts.html, I just see these sentences:

"Once all these documents and their access tokens are handed to the search engine, it is the search engine's job to enforce security by excluding inappropriate documents from the search results. For Lucene, this infrastructure is expected to be built on top of Lucene's generic metadata abilities, but has not been implemented at this time."

I am not sure I understand. Does this mean that, for the moment, it is not possible for Solr to apply security by using an Authority Connector?

Dominique