manifoldcf-user mailing list archives

From Paul Farrell <pfarr...@funnelback.com>
Subject Re: Manifold/Alfresco seeding and security
Date Tue, 20 Oct 2015 15:36:40 GMT
Hi,

Having had to go back to basics and re-install my Alfresco instance, I can confirm that the
AMP file for the alfresco indexer web scripts does actually install without error. There must
have been an issue with my previous Alfresco instance. 

Having said that, the Alfresco WebScript connector fails. The failure is down to the ‘Context’
setting (see below):



When you attempt to save the configuration of the WebScript connector, Manifold clearly tries
to check the connection. It seems to do this by making an API call (/auth/resolve/admin).
The issue is with what Manifold prepends to the start of that path. 
If I leave the setting as above then Manifold reports:

“The Web Script /alfresco/service/api/node/auth/resolve/admin has responded with a status of 404 - Not Found.”

In other words, it builds the full path as “/alfresco/service/api/node/auth/resolve/admin”.

For my Alfresco Community 5.0 instance, I get to that same web script via the URL “/alfresco/service/auth/resolve/admin”
i.e. without the ‘/api/node’.

Somewhere, Manifold is assuming that ‘/api/node’ belongs in the path. In other
words, there is nothing I can put into that box to prevent it. 
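
To make the difference between the two paths concrete, here is a minimal Python sketch. It is not ManifoldCF code: the ‘Context’ value of /alfresco/service is an assumption inferred from the error message, the ‘/api/node’ prefix is what the connector appears to add, and join_path and the constant names are hypothetical, for illustration only.

```python
def join_path(*segments: str) -> str:
    """Join URL path segments with single slashes and one leading slash."""
    parts = [s.strip("/") for s in segments if s.strip("/")]
    return "/" + "/".join(parts)


# Assumption: the value entered in the connector's 'Context' box.
CONTEXT = "/alfresco/service"
# Assumption: the prefix the connector appears to insert on its own.
ASSUMED_PREFIX = "/api/node"
# The web script used for the connection check, as quoted in this message.
WEBSCRIPT = "/auth/resolve/admin"

# Path Manifold reports calling (responds with 404 on my instance).
attempted = join_path(CONTEXT, ASSUMED_PREFIX, WEBSCRIPT)
# Path at which the same web script answers on Alfresco Community 5.0.
working = join_path(CONTEXT, WEBSCRIPT)

print("attempted:", attempted)
print("working:  ", working)
```

If the sketch reflects what happens internally, no value of the ‘Context’ setting can produce the working path, because the extra segment is added after it.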

Paul

> On 20 Oct 2015, at 12:56, Karl Wright <daddywri@gmail.com> wrote:
> 
> Hmm.  What file was missing?  Maurizio maintains the indexer plugin; I feel certain he'd
want to know.
> 
> Karl
> 
> 
> On Tue, Oct 20, 2015 at 7:53 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
> Hi guys,
> 
> Just to let you know what’s going on - for informational purposes more than anything.
> 
> I initially tried taking the AMP file provided in the MCF plugins directory (0.7.0) and
tried to install it into Alfresco but got a message saying a file was missing.
> 
> Instead, I cloned the repository on GitHub for the alfresco-indexer project and then
built it on my local machine. This generated the AMP file (0.7.2). 
> 
> I was able to successfully install the AMP file onto my Alfresco instance. 
> 
> As it happens I now cannot log into Alfresco Share ('bad credentials or server not available'
> message) but that is something I can work on. Apparently the installation of some AMP files
> has been known to cause this issue. 
> 
> So, progress to a point!
> 
> Paul Farrell
> Senior Search Consultant
>  
> 109-123 Clifton Street, London EC2A 4LD
> T +44 (0) 207 183 6865 | funnelback.com <http://www.funnelback.com/>
> 
> UNITED KINGDOM | AUSTRALIA | NEW ZEALAND | POLAND | UNITED STATES
> 
> Connect with us: LinkedIn <http://www.linkedin.com/company/funnelback> - Twitter
<https://twitter.com/funnelback>
> 
> Funnelback UK Ltd is a limited liability company registered in England & Wales. Registered
address: Zetland House 109-123, Clifton Street, London. EC2A 4LD. Company registration number:
07004264.
> 
>> On 20 Oct 2015, at 12:36, Rafa Haro <rharoapache@gmail.com> wrote:
>> 
>> Hi, 
>> 
>> At the Alfresco side, hope this helps:
>> 
>> http://docs.alfresco.com/4.1/tasks/amp-install.html
>> 
>> Cheers
>> 
>> 
>> 
>> 
>> 
>> On Tue, Oct 20, 2015 at 1:13 PM, Karl Wright <daddywri@gmail.com> wrote:
>> 
>> The AMP file is actually shipped as part of the binary MCF distribution.  You can
find it under "plugins".
>> 
>> Karl
>> 
>> 
>> On Tue, Oct 20, 2015 at 6:42 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>> Hi all,
>> 
>> Hopefully this will be my only request for information today. 
>> I’m afraid this is a bit of a newbie question but I have managed to get the Manifold
>> UI to now show ‘Alfresco Webscripts’ as a connector. The only bit I am missing now is
>> to install the AMP file in Alfresco.
>> 
>> I realise that this is slightly outside of the Manifold remit but I wondered if anyone
>> can advise how I build the AMP file from the repository (https://github.com/maoo/alfresco-indexer)?
>> I have cloned the repository to my local drive but, having never worked with Maven, am at
>> a loss as to how to generate the AMP file that I then need to install into Alfresco. 
>> 
>> Many thanks,
>> 
>>> On 19 Oct 2015, at 17:36, Karl Wright <daddywri@gmail.com> wrote:
>>> 
>>> The only way you can have such a reduced list of connectors is if somebody commented
out many connectors in your connectors.xml, or removed them from the database table where
they are registered by hand.
>>> 
>>> Karl
>>> 
>>> 
>>> On Mon, Oct 19, 2015 at 12:33 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>> After a good deal of time clicking around I came to the same conclusion - that
there is no way of telling from the UI!!
>>> 
>>> Having dug a bit deeper I believe I may actually have the Alfresco WebScript
>>> connectors installed. At least the 0.7.0 version. I notice in the ‘lib’ directory that
>>> I have ‘alfresco-indexer-webscripts-0.7.0.amp’.
>>> 
>>> Looking in the ‘connectors.xml’ file I can also see the line :
>>> 
>>> <repositoryconnector name="Alfresco Webscript" class="org.apache.manifoldcf.crawler.connectors.alfrescowebscript.AlfrescoConnector"/>
>>> 
>>> You can imagine my excitement!
>>> 
>>> The only thing I am missing is the option in the UI. When I click to create a
new repo connection I get:  CMIS, Dropbox, Generic, GoogleDrive, HDFS, Jira, Meridio, RSS,
Sharepoint. 
>>> 
>>> Perhaps it is too much to hope that I can make a simple change to enable
>>> this repo connection?
>>> 
>>> Thanks for all the help everyone 
>>> 
>>> 
>>> 
>>>> On 19 Oct 2015, at 17:26, Karl Wright <daddywri@gmail.com> wrote:
>>>> 
>>>> Hah; there's not a way to inquire in the UI, if that's what you mean.  But
if you see "Alfresco webscript" in the list of repository connection types, you've got a version
that supports that connector.
>>>> 
>>>> Thanks,
>>>> Karl
>>>> 
>>>> 
>>>> On Mon, Oct 19, 2015 at 12:17 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>> Thanks Rafa.
>>>> 
>>>> As an aside, is there an easy way to identify which version of ManifoldCF
you are on?
>>>> 
>>>> Cheers
>>>> 
>>>>> On 19 Oct 2015, at 16:54, Rafa Haro <rharo@apache.org> wrote:
>>>>> 
>>>>> Hi Paul, 
>>>>> 
>>>>> All you need to do is to install this webscript <https://github.com/maoo/alfresco-indexer>
>>>>> within your Alfresco instance. The connector itself is already part of the most recent versions
>>>>> of ManifoldCF.
>>>>> 
>>>>> Cheers,
>>>>> Rafa
>>>>> 
>>>>> On Mon, Oct 19, 2015 at 5:29 PM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>> Ok, thanks again guys. 
>>>>> 
>>>>> The Webscript connector it is. 
>>>>> 
>>>>> I realise I am asking a lot here but are there any easy-to-follow guidelines
>>>>> on how to get this Webscript connector installed?  I see there is a GitHub page here
>>>>> (https://github.com/maoo/alfresco-webscript-manifold-connector) which discusses it
>>>>> (although it directs you to a repository of files). 
>>>>> 
>>>>> I am just keen to make sure that any steps I follow to try and get this
>>>>> Webscript connector installed and working are up-to-date, reliable steps. I would hate to waste
>>>>> time with out-of-date information. 
>>>>> 
>>>>> Thanks all
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 19 Oct 2015, at 16:23, Muhammed Olgun <mh.olgun@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> I suggest that you use the Alfresco Webscript connector, as Karl mentioned.
>>>>>> Web services are very slow compared to other services, and I've also checked that Alfresco's CMIS
>>>>>> web services do not return a change token (maybe there is something that I don't know).
>>>>>> 
>>>>>> By the way, the current version of the CMIS connector is not aware of the change
>>>>>> token. I would write a patch for you if Alfresco supported the change token property.
>>>>>> 
>>>>>> Thanks!
>>>>>> Muhammed 
>>>>>> On Mon, 19 Oct 2015 at 18:11, Karl Wright <daddywri@gmail.com> wrote:
>>>>>> Hi Paul,
>>>>>> 
>>>>>> The Alfresco Webscript connector is a wholly different connector
that has no relation to the CMIS connector.  It requires an Alfresco webscript plugin be installed
on your Alfresco server to work, though.
>>>>>> 
>>>>>> Hope that helps.
>>>>>> 
>>>>>> Karl
>>>>>> 
>>>>>> 
>>>>>> On Mon, Oct 19, 2015 at 10:32 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>> Hi Muhammed/Karl,
>>>>>> 
>>>>>> Firstly, thank you so much for taking the time to reply. It is very
>>>>>> much appreciated. 
>>>>>> 
>>>>>> Currently I am using AtomPub for my CMIS repository connection.
>>>>>> I have just read something which may shed a little light on this. The post says that change
>>>>>> tokens are not passed via AtomPub connections
>>>>>> (https://forums.alfresco.com/forum/developer-discussions/alfresco-api/cmis-change-log-token-problem-using-opencmis-03282011-1758).
>>>>>> If true, this would explain why ManifoldCF may be unable to determine a change in Alfresco.
>>>>>> 
>>>>>> It looks like I have two possible options left open to me (correct
>>>>>> me if I’m wrong):
>>>>>> 
>>>>>> 1. I look to use ‘Web Services’ instead of ‘AtomPub’ for the connection mechanism
>>>>>> 2. I upgrade ManifoldCF so that I can use the ‘Web Scripts’ connector
>>>>>>    (or is this the same as the ‘Web Services’ connection mentioned above?)
>>>>>> 
>>>>>> Thanks again,
>>>>>> 
>>>>>> Paul
>>>>>> 
>>>>>>> On 19 Oct 2015, at 15:12, Muhammed Olgun <mh.olgun@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> Repositories should tell ManifoldCF when their documents are
>>>>>>> updated. The current CMIS connector reindexes a document only if the latest version of the
>>>>>>> document has changed, not when the document is merely updated.
>>>>>>> 
>>>>>>> There is a change token property in the CMIS specification, and it
>>>>>>> should change when a document is updated so that ManifoldCF can tell that the document was updated,
>>>>>>> but implementing the change token property is optional.  I've checked Alfresco's CMIS web site
>>>>>>> and seen that they don't set the change token.
>>>>>>> 
>>>>>>> I think, there is nothing we can do at this point.
>>>>>>> 
>>>>>>> On Mon, 19 Oct 2015 at 15:59, Karl Wright <daddywri@gmail.com> wrote:
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> This looks like a bug in the CMIS connector to me; usually the
>>>>>>> document version string the connector constructs should be adequate to detect all changes.
>>>>>>> Can you create a ticket?  https://issues.apache.org/jira, project ManifoldCF.  Please
>>>>>>> include what version of MCF you are using here.  FWIW, this may in fact be a bug in the
>>>>>>> Alfresco CMIS implementation, but we'll have to have some back and forth before I can
>>>>>>> determine that for sure.
>>>>>>> 
>>>>>>> In the meantime, have you considered using the Alfresco Webscript
connector?  It's the preferred way to do Alfresco indexing, although there have been issues
reported having to do with running it on some configurations of Alfresco.  I'm not entirely
sure what the problem is there; maybe a version dependency of some kind.
>>>>>>> 
>>>>>>> Karl
>>>>>>> 
>>>>>>> 
>>>>>>> On Mon, Oct 19, 2015 at 7:43 AM, Paul Farrell <pfarrell@funnelback.com> wrote:
>>>>>>> Hi Everyone,
>>>>>>> 
>>>>>>> Hoping someone may be able to advise.
>>>>>>> 
>>>>>>> I am currently using Manifold, together with a CMIS connector,
to retrieve and index content from an Alfresco repository.
>>>>>>> 
>>>>>>> All is going well apart from, what I would call, the ‘incremental
crawl’.
>>>>>>> 
>>>>>>> The main issue I am having is that the modification of a document’s
>>>>>>> security settings, in Alfresco, is not being picked up in the next Manifold crawl. As an example
>>>>>>> I have a document ‘TestDoc1’ which has users A and B as Consumers. I run a crawl in Manifold
>>>>>>> and it picks up the documents fine.  The security is set as expected. I then remove ‘User A’
>>>>>>> from the security of that document and re-run the Manifold crawl. User A can still see
>>>>>>> the document in the local search engine.
>>>>>>> 
>>>>>>> It is as if Manifold is not treating the security update as a
‘modification’ and is therefore not refreshing it. Note that if I go into the Output Connections,
edit and save the relevant output connection and then click ‘Remove all associated documents’,
the next time I crawl, the changes are picked up. It is clear that Manifold is just not updating
whatever internal record it has for this item.
>>>>>>> 
>>>>>>> Any ideas?
>>>>>>> 
>>>>>>> Many thanks.
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 

