manifoldcf-user mailing list archives

From "Gonzalez, Pablo" <>
Subject RE: Problem with manifold
Date Mon, 05 Nov 2012 09:13:50 GMT

By 'modifying the component itself' do you mean to write a subclass of ManifoldCFSearchComponent?

-----Original Message-----
From: Karl Wright [] 
Sent: Friday, November 2, 2012 14:47
Subject: Re: Problem with manifold

If you don't get anywhere with the debug component, you can try modifying the component itself
to print the incoming query and the modified query.  You might also want to look at the ManifoldCF
component tests, which create a handler internally and ran successfully when the component
was released.  If you create a similar handler and that works, then you can try to figure
out what the differences are.
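
A minimal sketch of such a comparison handler for solrconfig.xml, assuming the mcfSecurity search component from the configuration quoted further down is still declared. The handler name "/mcfDebug" is hypothetical, and this is not the handler the ManifoldCF component tests create. It places the component in first-components rather than last-components, so the two orderings can be compared, and it turns on Solr's debugQuery parameter by default so the response carries the debug section with the parsed query. Whether the filters added by the plugin show up in that section may depend on the Solr version.

```xml
<!-- Hypothetical comparison handler; the name "/mcfDebug" is an assumption. -->
<requestHandler name="/mcfDebug" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">text</str>
    <!-- Ask Solr's built-in debug component to report how the query was parsed -->
    <str name="debugQuery">true</str>
  </lst>
  <!-- Run the security component before the default components,
       instead of after them as in the last-components setup -->
  <arr name="first-components">
    <str>mcfSecurity</str>
  </arr>
</requestHandler>
```

Calling http://[host]:8080/solr/mcfDebug?q=*%3A*&AuthenticatedUserName=user@domain and comparing its output with that of the existing handler should show whether the ordering changes the query that is actually executed.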


On Fri, Nov 2, 2012 at 8:29 AM, Gonzalez, Pablo <> wrote:
> Well, it went wrong. I will crawl again just in case, and if that doesn't go well, I will search the Internet for the debug component you mentioned earlier.
> -----Original Message-----
> From: Gonzalez, Pablo
> Sent: Friday, November 2, 2012 12:03
> To:
> Subject: RE: Problem with manifold
> OK, I already had the fields in my schema.xml. This is the piece of code regarding them:
>    <field name="allow_token_document" type="string" indexed="true" stored="false" multiValued="true"/>
>    <field name="deny_token_document" type="string" indexed="true" stored="false" multiValued="true"/>
>    <field name="allow_token_share" type="string" indexed="true" stored="false" multiValued="true"/>
>    <field name="deny_token_share" type="string" indexed="true" stored="false" multiValued="true"/>
> So, just to make it clear, what you are suggesting is to cut the piece of code that contains my request handler, paste it in another part of the solrconfig.xml file, and try this a number of times. I will try that, and I'll tell you whether it went right or wrong.
> -----Original Message-----
> From: Karl Wright []
> Sent: Friday, November 2, 2012 11:38
> To:
> Subject: Re: Problem with manifold
> Actually, from your log it is clear that ManifoldCF can be reached fine from your Solr instance, so please disregard that question.
> The only other potential issue has to do with Solr search component ordering.  This is a bit of black magic, because other Solr components may modify the request in ways which are potentially incompatible with the ManifoldCF plugin.  So if you are sure your fields are all correct, you might want to play around with the ordering of your components to see if that makes any difference.
> There used to be a debug component you could also use which would print out the (full) query and the results returned - that may also be useful.
> Thanks,
> Karl
> On Fri, Nov 2, 2012 at 6:25 AM, Karl Wright <> wrote:
>> Hi Pablo,
>> The first thing that I notice is that, as you have this configured, 
>> you need four fields declared in your schema as indexable fields:
>> allow_token_document
>> deny_token_document
>> allow_token_share
>> deny_token_share
>> Do you have these fields declared, and did you have them all declared 
>> when you performed the crawl?
>> Second, the way it is configured, the machine that is running Solr 
>> must be the same as the machine running ManifoldCF (because you used 
>> a localhost url).  Is this true?
>> Thanks,
>> Karl
>> On Fri, Nov 2, 2012 at 5:43 AM, Gonzalez, Pablo 
>> <> wrote:
>>> Hello, Mr Wright, and thank you for such a fast response. Well, I am trying to connect mcf and solr via a SearchComponent. For this I added the apache-solr-mcf-3.6-SNAPSHOT.jar that comes in the file solr-integration to the lib folder of the deployment of the solr webapp in tomcat. Then I changed solrconfig.xml, adding this piece of code:
>>> <!-- LCF document security enforcement component --> 
>>> <searchComponent name="mcfSecurity"
>>> class="org.apache.solr.mcf.ManifoldCFSearchComponent">
>>> <str name="AuthorityServiceBaseURL">http://localhost:8345/mcf</str>
>>> </searchComponent>
>>> <requestHandler name="/search" class="solr.SearchHandler"
>>> default="true">
>>>     <!-- default values for query parameters can be specified, these
>>>          will be overridden by parameters in the request
>>>       -->
>>>    <!--  <lst name="defaults">
>>>        <str name="echoParams">explicit</str>
>>>        <int name="rows">10</int>
>>>        <str name="df">text</str>
>>>      </lst>-->
>>> <arr name="last-components">
>>> <str>mcfSecurity</str>
>>> </arr>
>>> <!--a bunch of comments-->
>>> </requestHandler>
>>> Last thing, I didn't write any additional Java code. I thought it wasn't necessary.
>>> Thanks,
>>> Pablo
>>> -----Original Message-----
>>> From: Karl Wright []
>>> Sent: Friday, November 2, 2012 10:21
>>> To:
>>> Subject: Re: Problem with manifold
>>> The ManifoldCF Solr plugin operates by requesting access tokens from ManifoldCF (which seems to be working fine), and using those to modify the incoming Solr search expression to limit the results according to those access tokens.
>>> There are two ways (and two independent classes) you can configure to perform this modification.  One of these classes functions as a query parser plugin.  The other functions as a search component.  Obviously, for either one to work right, the Solr configuration has to work properly too.  Can you provide details as to (a) which one you are using, and (b) what the configuration details are, e.g. the appropriate clauses from solrconfig.xml?
>>> Thanks,
>>> Karl
>>> On Fri, Nov 2, 2012 at 4:57 AM, Gonzalez, Pablo <> wrote:
>>>> Hello,
>>>> I don't know if you already got this message, but anyway here I go:
>>>> I have been trying to connect ManifoldCF to Solr. I have a file 
>>>> system in a remote server, protected by active directory.
>>>> I have configured a manifold job to import only a part of the 
>>>> documents under the file system. In fact, I do the importing 
>>>> process from a file which only contains 2 documents, in order to 
>>>> make it easier to see what is happening and draw conclusions. 
>>>> Afterwards the documents are output to the solr server.
>>>> I have created a request handler called "selectManifold" to "connect"
>>>> manifold and solr. Then I call it via
>>>> http://[host]:8080/solr/selectManifold?indent=on&version=2.2&q=*%3A*&fq=&start=0&rows=10&fl=*%2Cscore&wt=&explainOther=&hl.fl=&AuthenticatedUserName=user@domain
>>>> When doing this, tomcat's log (catalina.out) writes this:
>>>> oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent prepare
>>>> Información: Trying to match docs for user 'user@domain'
>>>> oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent getAccessTokens
>>>> Información: For user 'user@domain', saw authority response AUTHORIZED:Auth+active+directory+para+el+file+system
>>>> (this one is the active directory I'm currently using for the job)
>>>> oct 31, 2012 2:40:33 PM org.apache.solr.mcf.ManifoldCFSearchComponent getAccessTokens
>>>> Información: For user 'user@domain', saw authority response AUTHORIZED:ad
>>>> (this one isn't)
>>>> oct 31, 2012 2:40:33 PM org.apache.solr.core.SolrCore execute
>>>> Información: [] webapp=/solr path=/selectManifold params={explainOther=&fl=*,score&indent=on&start=0&q=*:*&hl.fl=&wt=&fq=&version=2.2&rows=10&AuthenticatedUserName=user@domain} hits=0 status=0 QTime=183
>>>> So, it effectively connects and gets my user's tokens. In fact, if 
>>>> I go to http://[host]/mcf/UserACLs?username=user@domain, this is 
>>>> the result:
>>>> AUTHORIZED:Auth+active+directory+para+el+file+system
>>>> TOKEN:active_dir:S-1-5-32-545
>>>> TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1111
>>>> TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-513
>>>> TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1113
>>>> TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
>>>> TOKEN:active_dir:S-1-5-21-2039231098-2614715072-2050932820-1107
>>>> TOKEN:active_dir:S-1-1-0
>>>> TOKEN:ad:S-1-5-32-545
>>>> TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1111
>>>> TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-513
>>>> TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1113
>>>> TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1110
>>>> TOKEN:ad:S-1-5-21-2039231098-2614715072-2050932820-1107
>>>> TOKEN:ad:S-1-1-0
>>>> Moreover, if I go to http://[host]:8080/solr/admin/schema.jsp and 
>>>> search for the allow_token_document field, it says that
>>>> active_dir:S-1-5-21-2039231098-2614715072-2050932820-1110
>>>> (which appeared in the list of UserACLs) has frequency 2 (remember 
>>>> I only have 2 documents indexed). And still, when I call
>>>> http://[host]:8080/solr/selectManifold?indent=on&version=2.2&q=*%3A*&fq=&start=0&rows=10&fl=*%2Cscore&wt=&explainOther=&hl.fl=&AuthenticatedUserName=user@domain
>>>> it says no result has been found. Do you know why that could be?
>>>> One final thing: when I call
>>>> &fq=&start=0&rows=10&fl=*%2Cscore&wt=&explainOther=&hl.fl=
>>>> with the default handler (that is, without manifold), it gives me 
>>>> a result with the 2 documents I indexed. Sorry for the long post, but 
>>>> I wanted you to have all the data.
>>>> Pablo
