manifoldcf-user mailing list archives

From Jeff Potts <>
Subject Re: Reading and posting plain text, rather than encoded files
Date Thu, 27 Aug 2015 22:25:47 GMT
I mentioned those as examples only. I am not asking to do anything repository-specific. In
the first case I want to post the plain text of a file to an output as-is, with no wrappers
added by ManifoldCF and no encoding.

In the second case I want to merge some properties on a document with the existing JSON from
the file content and then post that to the output as-is.
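
A minimal sketch of the merge in the second case, in plain Python rather than ManifoldCF connector code. The function name merge_properties and the sample property names are hypothetical, and letting keys from the file content win over document properties on a name clash is an assumption, not something settled in this thread.

```python
import json


def merge_properties(content: str, properties: dict) -> str:
    """Merge document properties into the JSON held in the file content.

    Assumption: the file content is a JSON object, and on a key clash
    the value from the file content is kept.
    """
    doc = json.loads(content)
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    # Properties go first so that keys from the file content override them.
    merged = {**properties, **doc}
    # The result is a plain JSON string, ready to post to the output as-is.
    return json.dumps(merged)


if __name__ == "__main__":
    file_content = '{"title": "from file", "nested": {"x": 1}}'
    props = {"title": "from repository", "author": "jeff"}
    print(merge_properties(file_content, props))
```

The output connector would then send the returned string unchanged, rather than wrapping or encoding it.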


> On Aug 27, 2015, at 3:58 PM, Karl Wright <> wrote:
> Hi,
> ManifoldCF's connectors are general purpose; they are intended to work with *any* repository
> or output.  So in general, connectors in MCF cannot interpret or generate content that is
> Alfresco or ElasticSearch specific.
> You are welcome to convert these documents to ManifoldCF's means of managing documents,
> RepositoryDocument, in your repository connector, and then convert them back in your output
> connector.  Or, if you want to write specific proprietary connectors that communicate in a
> specific format of JSON, you can.  But do not expect ManifoldCF's suite of other connectors
> and transformers to work with this in any meaningful way.
> Karl
>> On Thu, Aug 27, 2015 at 4:29 PM, Shinichiro Abe <>
>> Hi,
>> I’m work in progress at
>> Regards,
>> Shinichiro Abe
>> > On 2015/08/28 4:47, Jeff Potts <> wrote:
>> >
>> > I've spent a very short time playing with ManifoldCF. Cool project, thank you
>> > for contributing it.
>> >
>> > I can read binary files from a source repo like Alfresco 5.0.d and post them
>> > to Elasticsearch 1.7.2 successfully.
>> >
>> > Now I'm wondering if the rest of my use cases can be achieved with ManifoldCF...
>> >
>> > Use case 1: Read JSON from a file system, post to Elasticsearch as-is
>> >
>> > When I tried to use the file system repository and the Elasticsearch output,
>> > I noticed that the file is being encoded and stored in ES in the _content property. What I'd
>> > rather do is have the file posted to ES as-is, such as if the file is already a JSON document
>> > in the expected format for my type mapping in ES. These files are 15k to 30k of nested object
>> >
>> > Use case 2: Read JSON from Alfresco, post it to Elasticsearch along with object
>> >
>> > In a slight twist on the first, I'd like to store JSON documents in a repository,
>> > like Alfresco, and then read the metadata from the Alfresco object and merge it with the JSON
>> > stored in the content and post that to Elasticsearch as a JSON string, not as an encoded blob.
>> >
>> > I didn't see anything covering these in the docs but I may have missed it.
>> >
>> > Jeff
