manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1197) FileSystem output connector error with some file names
Date Fri, 08 May 2015 07:19:00 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14534014#comment-14534014 ]

Karl Wright commented on CONNECTORS-1197:
-----------------------------------------

Hi Andrea,

It is not possible to simply detect a failure and then modify the document name, for
several reasons.  One is that Java does not give us good feedback about exactly what is
wrong with the filename.  Another is that the connector also has to handle document
deletion, which has an entirely different error structure.

Your only choices are therefore the following:
(1) A special "windows" mode, which does an entirely different character mapping and where
no attempt is made to be wget compliant at all;
(2) Skipping any files whose names cause hard errors on write.
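
To make the two choices concrete, here is a minimal sketch of what they might look like. It is
not the connector's actual code: the class name WindowsNameMapper and both methods are
hypothetical, and the percent-encoding scheme is only an assumption about what a "windows"
mode could do.  It also ignores other Windows rules, such as reserved device names (CON, NUL)
and trailing dots or spaces.

```java
import java.util.Locale;

/** Hypothetical helper; not part of ManifoldCF. */
final class WindowsNameMapper {
    // Characters that Windows rejects in a file or directory name.
    private static final String ILLEGAL = "<>:\"/\\|?*";

    private WindowsNameMapper() {}

    /** Choice (2): report whether a single path segment would fail on write, so it can be skipped. */
    static boolean hasIllegalChar(String segment) {
        for (int i = 0; i < segment.length(); i++) {
            char c = segment.charAt(i);
            // Control characters below 0x20 are rejected as well.
            if (c < 0x20 || ILLEGAL.indexOf(c) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Choice (1): map a single path segment to a name Windows accepts. */
    static String mapSegment(String segment) {
        StringBuilder sb = new StringBuilder(segment.length());
        for (int i = 0; i < segment.length(); i++) {
            char c = segment.charAt(i);
            // '%' is escaped too, so that two different inputs never map to the same name.
            if (c < 0x20 || c == '%' || ILLEGAL.indexOf(c) >= 0) {
                sb.append(String.format(Locale.ROOT, "%%%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this mapping, the last segment of the failing path in this ticket would have its '?'
replaced by an escape sequence, while '&' and '=' would be left alone because Windows accepts
them.  Names produced this way would no longer match what wget writes, which is the
trade-off in choice (1).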

Thanks.

> FileSystem output connector error with some file names
> ------------------------------------------------------
>
>                 Key: CONNECTORS-1197
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1197
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: File system connector
>    Affects Versions: ManifoldCF 2.1
>         Environment: Windows 7 64 bit
>            Reporter: Andrea
>            Assignee: Karl Wright
>
> I'm having some problems running a job that starts from a web crawl and writes through a
> file system output connector.
> The job terminates with an error like the following (I think it may depend on special
> characters in the file name):
> Error: Could not create file 'E:\ManifoldCF\http\nypost.com\2015\05\06\bloombergs-the-man-to-beat-hillary-for-democratic-nomination?msg=fail&shared=email':
> E:\ManifoldCF\http\nypost.com\2015\05\06\bloombergs-the-man-to-beat-hillary-for-democratic-nomination?msg=fail&shared=email
> (The filename, directory name, or volume label syntax is incorrect)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
