manifoldcf-user mailing list archives

From Tony Edgin
Subject Re: Determining document model passed to search engine
Date Mon, 11 Feb 2013 19:55:29 GMT
Thanks for the speedy response!

I eventually want to index the contents of our local website with Elastic
Search.

I would use the Web repository connector with the no authority connector
and the Elasticsearch output connector.  Would you mind letting me know the
names and meanings of the metadata that gets passed to Elastic Search?

Thanks again.

On Mon, Feb 11, 2013 at 12:45 PM, Karl Wright wrote:

> So let me get this clear - you are looking to find out what the
> names/meanings are of the metadata that gets passed to the output
> connector, for a given repository connection?
>
> If this is what you are looking for, I'm afraid that while at one
> point the end-user documentation described this pretty accurately, it
> is now significantly out of date.  While it's not terribly hard to
> compile this information from source code etc., the work definitely
> needs to be repeated by somebody.
>
> If you want to ask this question about a specific connector, I can
> certainly try to answer it, though.  If you want to contribute either
> the information or a documentation patch, this would be great too.
>
> Karl
>
> On Mon, Feb 11, 2013 at 2:38 PM, Tony Edgin wrote:
> > I'm sure this is documented somewhere, and I apologize in advance for not
> > being able to find it.
> >
> > How do I determine the model or schema of the document passed to the
> > search engine by a given job?
> >
> > For instance, I'm running a job that crawls a directory on my local file
> > system and passes it to Elastic Search.  Interrogating Elastic Search, I
> > can determine that the document has three fields, "file", "type" and
> > "uri", all strings.  How would I have known that in advance?
> >
> > Thanks for any help.
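
The after-the-fact inspection described in the quoted question (asking Elastic Search which fields a document ended up with) can be sketched roughly as follows. This is only an illustration: it assumes the index mapping has already been fetched (for example from the index's `_mapping` endpoint) and parsed into a Python dict. The index and type names in `sample_mapping` are hypothetical, the three fields simply mirror the ones reported in the question, and `field_types` is an illustrative helper, not part of ManifoldCF or any Elastic Search client.

```python
def field_types(mapping):
    """Collect {field_name: type} from every "properties" block in a mapping dict."""
    found = {}

    def walk(node, prefix=""):
        if not isinstance(node, dict):
            return
        props = node.get("properties")
        if isinstance(props, dict):
            # A "properties" block maps field names to their definitions.
            for name, spec in props.items():
                if isinstance(spec, dict):
                    path = prefix + name
                    if isinstance(spec.get("type"), str):
                        found[path] = spec["type"]
                    # Object fields may carry their own nested "properties".
                    walk(spec, path + ".")
        else:
            # Wrapper levels (index name, type name, "mappings") are skipped over.
            for value in node.values():
                walk(value, prefix)

    walk(mapping)
    return found


# Hypothetical mapping; the index and type names are made up for illustration.
sample_mapping = {
    "myindex": {
        "mytype": {
            "properties": {
                "file": {"type": "string"},
                "type": {"type": "string"},
                "uri": {"type": "string"},
            }
        }
    }
}

for name, kind in sorted(field_types(sample_mapping).items()):
    print(name, kind)
```

Because the helper searches for `properties` blocks rather than expecting a fixed nesting, it should tolerate differences in how the mapping response is wrapped, but it only reports what is already indexed. It does not say in advance which metadata a given connector will send.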