manifoldcf-user mailing list archives

From Shinichiro Abe <shinichiro.ab...@gmail.com>
Subject Re: Reading and posting plain text, rather than encoded files
Date Thu, 27 Aug 2015 20:29:59 GMT
Hi,

I’m working on this at https://issues.apache.org/jira/browse/CONNECTORS-1234

Regards,
Shinichiro Abe

> On 2015/08/28 4:47, Jeff Potts <jeffpotts01@gmail.com> wrote:
> 
> I've spent a very short time playing with ManifoldCF. Cool project, thank you for contributing it.
> 
> I can read binary files from a source repo like Alfresco 5.0.d and post them to Elasticsearch 1.7.2 successfully.
> 
> Now I'm wondering if the rest of my use cases can be achieved with ManifoldCF...
> 
> Use case 1: Read JSON from a file system, post to Elasticsearch as-is
> 
> When I tried to use the file system repository and the Elasticsearch output, I noticed that the file is being encoded and stored in ES in the _content property. What I'd rather do is have the file posted to ES as-is, such as if the file is already a JSON document in the expected format for my type mapping in ES. These files are 15k to 30k of nested object JSON.
> 
> Use case 2: Read JSON from Alfresco, post it to Elasticsearch along with object metadata
> 
> In a slight twist on the first, I'd like to store JSON documents in a repository, like Alfresco, and then read the metadata from the Alfresco object and merge it with the JSON stored in the content and post that to Elasticsearch as a JSON string, not as an encoded blob.
> 
> I didn't see anything covering these in the docs but I may have missed it.
> 
> Jeff
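
To make the two document shapes in the use cases concrete, here is a minimal sketch in Python. It is not ManifoldCF code and does not reflect how any connector is implemented. The function names, the sample metadata key, the choice of base64 for the "encoded" form, and the rule that content keys win over metadata keys are all assumptions made for illustration. Posting the result to Elasticsearch is left out.

```python
import base64
import json


def as_encoded_blob(raw: bytes) -> dict:
    """Shape described in use case 1: the file content ends up encoded
    under the _content property. Base64 is an assumption here."""
    return {"_content": base64.b64encode(raw).decode("ascii")}


def as_merged_json(raw: bytes, metadata: dict) -> dict:
    """Shape wanted in use case 2: parse the stored JSON and merge the
    repository metadata into it, giving one plain JSON object."""
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object at the top level")
    merged = dict(doc)
    # Hypothetical convention: keys already present in the content are
    # kept, so metadata never overwrites the stored JSON.
    for key, value in metadata.items():
        merged.setdefault(key, value)
    return merged


# Sample data, invented for this sketch.
raw = b'{"title": "Report", "sections": [{"id": 1}]}'
metadata = {"cm:name": "report.json", "title": "ignored"}

blob = as_encoded_blob(raw)
merged = as_merged_json(raw, metadata)
print(json.dumps(merged, sort_keys=True))
```

With an empty metadata dictionary, `as_merged_json` reduces to use case 1: the parsed file is passed through unchanged.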

