manifoldcf-user mailing list archives

From Kambiz Niktabar <nikta...@yahoo.com>
Subject Re: Date normalization & URL mapping
Date Tue, 18 Nov 2014 15:15:17 GMT
Hi Karl,

Thanks for your quick reply. I'm using MCF 1.7.1, and below is the Solr log entry for one
specific document pushed by the file system connector:

349529202 [qtp1191043673-15] INFO  org.apache.solr.update.processor.LogUpdateProcessor  –
[core1] webapp=/solr path=/update/extract params={literal.deny_token_document=MyGroup:DEAD_AUTHORITY&literal.size=373&literal.stream_name=Indexing_test1.doc&literal.createdOn=Fri+Sep+26+04:23:31+BRT+2014&literal.id=file://///server1/shared/Kanik/Indexing_test1.doc&resource.name=Indexing_test1.doc&literal.allow_token_document=MyGroup:S-1-5-21-220523388-1085031214-725345543-1306383&literal.allow_token_document=MyGroup:S-1-5-32-544&literal.xisourcetype=FileServer&wt=xml&version=2.2&literal.xifilename=Indexing_test1.doc&literal.attributes=32&literal.Content-Type=application/rtf&literal.lastModified=Tue+Nov+18+12:04:48+BRT+2014&literal.shareName=shared}
{add=[file://///server1/shared/Kanik/Indexing_test1.doc (1485122413007994880)]} 0 15


I see the URL Mapping tab in a job created on a Windows share repository connection.
By the way, how should the request be created?

Regards
Kambiz


________________________________
 From: Karl Wright <daddywri@gmail.com>
To: "user@manifoldcf.apache.org" <user@manifoldcf.apache.org>; Kambiz Niktabar <niktabar@yahoo.com>

Sent: Tuesday, November 18, 2014 3:29 PM
Subject: Re: Date normalization & URL mapping
 


Hi Kambiz,

What version of MCF are you using?  In 1.7, the file system connector sets the RepositoryDocument's
modifiedDate field, which the Solr output connector formats as ISO 8601:

      if ( modifiedDateAttributeName != null )
      {
        Date date = document.getModifiedDate();
        if ( date != null )
        {
          outputDoc.addField( modifiedDateAttributeName, DateParser.formatISO8601Date( date ) );
        }
      }
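
For comparison, here is a minimal standalone sketch of the same idea outside ManifoldCF. It
does not use MCF's DateParser; the class name SolrDateSketch and both method names are made up
for illustration. It formats a java.util.Date as an ISO 8601 UTC timestamp of the kind Solr
date fields accept, and also shows how a Date.toString()-style value like the one in the log
could be parsed first. Note that resolving zone abbreviations such as BRT depends on the JVM's
time zone data, so the usage line sticks to GMT.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of ManifoldCF or Solr.
public class SolrDateSketch {

    /** Formats a Date as an ISO 8601 UTC timestamp, e.g. 2014-11-18T15:15:17Z. */
    public static String toIso8601(Date date) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat iso = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ENGLISH);
        iso.setTimeZone(TimeZone.getTimeZone("UTC"));
        return iso.format(date);
    }

    /** Parses the Date.toString() layout (EEE MMM dd HH:mm:ss zzz yyyy) and reformats it. */
    public static String normalize(String dateToString) {
        SimpleDateFormat in =
            new SimpleDateFormat("EEE MMM dd HH:mm:ss zzz yyyy", Locale.ENGLISH);
        try {
            return toIso8601(in.parse(dateToString));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + dateToString, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("Tue Nov 18 15:15:17 GMT 2014"));
    }
}
```

Values copied from a Solr request log are URL-encoded (spaces appear as "+"), so they would
need decoding before being passed to normalize().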


As for mapping URLs in the file system connector, it does not have that kind of feature
at this time.  This is something you will need to request if you need it.  Where are you seeing
a URL mapping tab in either the Solr or file system connectors?
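
To make the requested mapping concrete, here is a rough sketch of the kind of prefix rewrite
being asked about, written as plain Java that could run outside ManifoldCF (for example in
client code that reads the id back from Solr). The class SharePathMapper and its map() method
are hypothetical and are not an MCF feature or API; the prefix and drive letter are taken from
the examples in this thread.

```java
// Hypothetical helper, not part of ManifoldCF.
public class SharePathMapper {

    /**
     * Replaces a leading share prefix with a drive root and converts the
     * remaining forward slashes to backslashes. Ids that do not start with
     * the prefix are returned unchanged.
     */
    public static String map(String id, String sharePrefix, String driveRoot) {
        if (!id.startsWith(sharePrefix)) {
            return id;
        }
        String rest = id.substring(sharePrefix.length());
        return driveRoot + rest.replace('/', '\\');
    }

    public static void main(String[] args) {
        String id = "file://///server1/shared/Kanik/Indexing_test1.doc";
        System.out.println(map(id, "file://///server1/shared/", "H:\\"));
    }
}
```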

Karl




On Tue, Nov 18, 2014 at 9:01 AM, Kambiz Niktabar <niktabar@yahoo.com> wrote:

>Hello,
>
>Actually I have two questions this time:
>	1. Is there any way to handle date normalization as part of document processing in
>ManifoldCF? I tried running the file system connector and want to map the last modified
>field to a field in Solr, but in the Solr log it looks like this:
>lastModified=Fri+Sep+26+05:51:08+BRT+2014, which Solr does not accept. How can I normalize
>the date and time to a format Solr accepts?
>	2. How can I use the URL mapping tab to replace part of a URL with something else, as
>below:
>e.g. \\server1\shared\test\file.doc  -> H:\test\file.doc
>
>
>Regards
>Kambiz Niktabar