manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Question about obtaining metadata values via CMIS connector => ElasticSearch
Date Mon, 11 May 2015 13:42:32 GMT
Hi Deanna,

I have contacted the author of the plugin, who works for Alfresco.  In
ManifoldCF we distribute only the AMP binary, so Maurizio would be the
right guy to answer any source questions.

Thanks,
Karl


On Mon, May 11, 2015 at 9:27 AM, Delapasse, Deanna <
ddelapasse@oceaneering.com> wrote:

> The Alfresco Webscripts connector requires an AMP installed into the
> Alfresco server to provide the webscripts the connector calls.  The
> connector's author pointed me to his GitHub source code, but it isn't
> working for me as-is (installs ok, but the included webscripts aren't
> accessible).  Are the AMP sources available from MCF?  And do you know the
> last Alfresco version that anyone used it with? Possibly I will need to
> tweak it to work with my Alfresco 4.2.f.
>
> thanks!
> Deanna
>
>
> On Wed, May 6, 2015 at 11:19 AM, Karl Wright <daddywri@gmail.com> wrote:
>
>> Here's the key finding:
>>
>> "Ok, the problem is because you only get to write the seeding query. The
>> query that fetches individual documents is hardwired.  I believe it is set
>> in opencmis in fact."
>>
>>
>> So basically, for the CMIS connector, you aren't writing the query that
>> finds the document data and metadata; you are writing the query that finds
>> the set of documents to index.  And the query you *need* to modify is in
>> fact baked into some jar in Apache Chemistry, which greatly limits the CMIS
>> connector's utility for indexing metadata.
>>
>>
>> Is there any way you can use one of the two native Alfresco connectors we supply?
>>
>>
>> Karl
>>
>>
>>
>> On Wed, May 6, 2015 at 12:10 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Hi Deanna,
>>>
>>> I vaguely recall that Apache Chemistry (which the CMIS connector relies
>>> on) running against Alfresco has some limitations where metadata is
>>> concerned.  I'm pretty sure there was an email exchange posted somewhere,
>>> so you might be able to dig it up here:
>>>
>>> http://www.mail-archive.com/user@manifoldcf.apache.org/index.html
>>>
>>> I'll look around and see.
>>>
>>> The other potential problem is your ElasticSearch configuration.  I
>>> don't know a lot about this myself.  I think it makes sense to try to
>>> figure out on which end the problem lies; if you can see in some log what
>>> actually gets posted to ElasticSearch for each document, that would help.
>>>
>>> Karl
>>>
>>>
>>> On Wed, May 6, 2015 at 11:42 AM, Delapasse, Deanna <
>>> ddelapasse@oceaneering.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to use ManifoldCF to crawl my Alfresco repo (via the CMIS
>>>> connector) and push the results into ElasticSearch.  My users want to
>>>> search metadata (including custom) and content. I followed some tutorials
>>>> and got it running quickly BUT...regardless of my ElasticSearch mapping the
>>>> only CMIS metadata entity I can find in my indexed results is cmis:objectId.
>>>>
>>>> I have tried using various CMIS queries (with 'select * ...' and with
>>>> 'select cmis:name, cmis:lastModifiedBy, ...').  I have verified my queries
>>>> and they definitely return metadata, but the data doesn't appear in
>>>> ElasticSearch.   I tried a simple attachment mapping and also a mapping
>>>> where I specifically list some of the cmis properties. Regardless of
>>>> mapping, my indexes look like this:
>>>>
>>>>
>>>> {
>>>>    "_index":"test",
>>>>    "_type":"file",
>>>>    "_id":"http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content/10.0.txt?id=2555a540-a5b3-4c27-90f6-c89b6742bd4f%3B1.0",
>>>>    "_version":2,
>>>>    "_score":1,
>>>>    "_source":{
>>>>       "cmis:objectId":"2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
>>>>       "allow_token_document":"__nosecurity__",
>>>>       "deny_token_document":"__nosecurity__",
>>>>       "allow_token_share":"__nosecurity__",
>>>>       "deny_token_share":"__nosecurity__",
>>>>       "allow_token_parent":"__nosecurity__",
>>>>       "deny_token_parent":"__nosecurity__",
>>>>       "file":{
>>>>          "_content_type":"text/plain",
>>>>          "_name":"10.0.txt",
>>>>          "_content":"DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo="
>>>>       }
>>>>    }
>>>> }
>>>>
>>>> The ES results are good and I can search perfectly by content &
>>>> cmis:objectId.  I have enabled debugging and no errors appear in the log.
>>>> *What do I have to DO to get cmis:name, cmis:lastModifiedBy and other
>>>> properties to appear?*
>>>>
>>>> Thanks in advance!  This product is very simple to use and has
>>>> potential to be a HUGE help to us!!!
>>>>
>>>> Deanna
>>>>
>>>
>>>
>>
>
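
Karl's suggestion above, to check what actually reaches ElasticSearch for
each document, can be sketched in Python.  This is only an illustration:
summarize_source is a hypothetical helper, not part of ManifoldCF or
ElasticSearch, and the _source dict is the one Deanna posted, copied in by
hand rather than fetched from a live index.  It lists which cmis: properties
made it into the indexed document and decodes the base64 attachment content.

```python
import base64


def summarize_source(source, expected_props):
    """Report which CMIS properties an indexed document's _source contains.

    Hypothetical helper for illustration; not a ManifoldCF or ElasticSearch API.
    """
    # Every top-level key with the cmis: prefix that was actually indexed.
    present = sorted(k for k in source if k.startswith("cmis:"))
    # Properties we hoped to see but that are absent from the document.
    missing = [p for p in expected_props if p not in source]
    # The attachment content is stored base64-encoded under file._content.
    raw = source.get("file", {}).get("_content", "")
    text = base64.b64decode(raw).decode("utf-8", errors="replace")
    return {"present": present, "missing": missing, "content": text}


# The _source object from the message above, copied by hand (an assumption:
# in practice it would be read from the index or from a connector log).
source = {
    "cmis:objectId": "2555a540-a5b3-4c27-90f6-c89b6742bd4f;1.0",
    "allow_token_document": "__nosecurity__",
    "deny_token_document": "__nosecurity__",
    "allow_token_share": "__nosecurity__",
    "deny_token_share": "__nosecurity__",
    "allow_token_parent": "__nosecurity__",
    "deny_token_parent": "__nosecurity__",
    "file": {
        "_content_type": "text/plain",
        "_name": "10.0.txt",
        "_content": "DQpJIGFtIGFuIEFsZnJlc2NvIGZpbGUuDQo=",
    },
}

report = summarize_source(
    source, ["cmis:objectId", "cmis:name", "cmis:lastModifiedBy"]
)
print(report)
```

If the missing list is non-empty for a document taken straight from the
index, the properties never arrived, which points at the connector side
rather than at the ElasticSearch mapping.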
