manifoldcf-user mailing list archives

From Karl Wright <daddy...@gmail.com>
Subject Re: Reading and posting plain text, rather than encoded files
Date Thu, 27 Aug 2015 20:58:49 GMT
Hi,

ManifoldCF's connectors are general purpose; they are intended to work with
*any* repository or output.  So in general, connectors in MCF cannot
interpret or generate content that is Alfresco or ElasticSearch specific.

You are welcome to convert these documents to ManifoldCF's means of
managing documents, RepositoryDocument, in your repository connector, and
then convert them back in your output connector.  Or, if you want to write
proprietary connectors that exchange a specific JSON format with each
other, you can.  But do not expect ManifoldCF's other connectors and
transformers to work with such documents in any meaningful way.
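
As a rough illustration of the round trip described above, here is a minimal
sketch. It assumes the JSON is carried as an opaque UTF-8 byte stream with the
repository metadata attached as named fields. GenericDocument is a
hypothetical stand-in written for this example, not ManifoldCF's actual
RepositoryDocument class, and toGeneric/fromGeneric are made-up helper names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class RoundTripSketch {

    // Hypothetical stand-in for a general-purpose document: an opaque binary
    // stream plus named metadata fields. This is NOT the real ManifoldCF class.
    public static final class GenericDocument {
        public final InputStream binary;
        public final long length;
        public final Map<String, String> fields = new LinkedHashMap<>();

        GenericDocument(byte[] content) {
            this.binary = new ByteArrayInputStream(content);
            this.length = content.length;
        }
    }

    // Repository connector side (assumed): wrap the JSON text as bytes and
    // carry the repository's metadata alongside it as fields.
    public static GenericDocument toGeneric(String json, Map<String, String> metadata) {
        GenericDocument doc = new GenericDocument(json.getBytes(StandardCharsets.UTF_8));
        doc.fields.putAll(metadata);
        return doc;
    }

    // Output connector side (assumed): read the stream back into the original
    // JSON text. A connector written this way only works when paired with a
    // repository connector that is known to supply UTF-8 JSON.
    public static String fromGeneric(GenericDocument doc) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = doc.binary.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the sketch is that the JSON survives the trip unchanged only
because both ends agree on what the bytes mean; a general-purpose connector
in between sees nothing but a binary stream and a set of fields.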

Karl


On Thu, Aug 27, 2015 at 4:29 PM, Shinichiro Abe <shinichiro.abe.1@gmail.com>
wrote:

> Hi,
>
> I'm working on this at
> https://issues.apache.org/jira/browse/CONNECTORS-1234
>
> Regards,
> Shinichiro Abe
>
> > On 2015/08/28 4:47, Jeff Potts <jeffpotts01@gmail.com> wrote:
> >
> > I've spent a very short time playing with ManifoldCF. Cool project,
> thank you for contributing it.
> >
> > I can read binary files from a source repo like Alfresco 5.0.d and post
> them to Elasticsearch 1.7.2 successfully.
> >
> > Now I'm wondering if the rest of my use cases can be achieved with
> ManifoldCF...
> >
> > Use case 1: Read JSON from a file system, post to Elasticsearch as-is
> >
> > When I tried to use the file system repository and the Elasticsearch
> output, I noticed that the file is being encoded and stored in ES in the
> _content property. What I'd rather do is have the file posted to ES as-is,
> for example when the file is already a JSON document in the format my ES
> type mapping expects. These files are 15k to 30k of nested object JSON.
> >
> > Use case 2: Read JSON from Alfresco, post it to Elasticsearch along with
> object metadata
> >
> > In a slight twist on the first, I'd like to store JSON documents in a
> repository, like Alfresco, and then read the metadata from the Alfresco
> object and merge it with the JSON stored in the content and post that to
> Elasticsearch as a JSON string, not as an encoded blob.
> >
> > I didn't see anything covering these in the docs but I may have missed
> it.
> >
> > Jeff
>
>
