manifoldcf-user mailing list archives

From Rafa Haro <rh...@apache.org>
Subject Re: Metadata Adjuster transformer
Date Thu, 16 Apr 2015 07:44:08 GMT
Hi Timo, 

If you are using the Tika transformer, it is probably also extracting the document type as
a general metadata field, and you can manipulate that one in the metadata adjuster.

Cheers,
Rafa


On 15 April 2015 at 21:24:17, Karl Wright (daddywri@gmail.com) wrote:

Hi Timo,

Yes, you can do that, but not with the current metadata adjuster.  It does not allow you
to access the core fields.

Karl


On Wed, Apr 15, 2015 at 3:16 PM, Timo Selvaraj <timo.selvaraj@gmail.com> wrote:
Thanks Karl.

Can I create a new meta field contenttype and add the value HTML based on the mime type value
in the core field?

Timo

On Apr 15, 2015, at 3:13 PM, Karl Wright <daddywri@gmail.com> wrote:

Hi Timo,

The metadata adjuster currently does not give you access to the core document fields, only
to the document's general metadata.  Basically, anything that ManifoldCF uses to make crawling
decisions is not accessible or modifiable by the adjuster, because it's not general
metadata.

That includes the document's file name, content/mime type, length, creation date, and modification
date.

Technically it is possible to build a document transformer which would copy internal fields
like those described into general metadata fields that could then be manipulated with the
metadata adjuster.  Some connectors already supply such general metadata fields, but it is
by no means a consistent practice.
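
As a rough illustration of that copying step, here is a minimal sketch in plain Java. It does not use the ManifoldCF transformation connector API. The class ContentTypeCopier, its methods, and the label table are all hypothetical, and the document is modeled simply as a core mime type plus a map of general metadata fields. It only shows the idea of deriving a "contenttype" general metadata field from the core mime type.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in, NOT a ManifoldCF class: the document is modeled as a
// core mime type string plus a map of general metadata fields.
final class ContentTypeCopier {

    // Illustrative mapping only (an assumption); extend it for the mime types you crawl.
    private static final Map<String, String> LABELS = Map.of(
        "text/html", "HTML",
        "application/pdf", "PDF",
        "text/plain", "TEXT");

    // Strip parameters such as "; charset=UTF-8", then look up a label.
    // Returns null when the mime type is missing or has no mapping.
    static String labelFor(String mimeType) {
        if (mimeType == null) {
            return null;
        }
        String base = mimeType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return LABELS.get(base);
    }

    // Return a copy of the general metadata with a "contenttype" field added
    // when the core mime type has a known label; otherwise the copy is unchanged.
    static Map<String, List<String>> withContentType(
            String coreMimeType, Map<String, List<String>> metadata) {
        Map<String, List<String>> result = new HashMap<>(metadata);
        String label = labelFor(coreMimeType);
        if (label != null) {
            result.put("contenttype", List.of(label));
        }
        return result;
    }
}
```

A real transformer would have to read the mime type from the document object that ManifoldCF passes along the pipeline and write the new field into that document's metadata before handing it on. Those details are left out here on purpose.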

Karl


On Wed, Apr 15, 2015 at 2:49 PM, Timo Selvaraj <timo.selvaraj@gmail.com> wrote:
Hi,

I need to change the incoming meta data into a specified format.

I want to change 

"Content-Type":"text/html"
to

"contenttype":"HTML"
Has anyone done something similar with the metadata adjuster?

Thanks,
Timo



