manifoldcf-dev mailing list archives

From "Vinay (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-1494) Error crawling file system with file names having special characters.
Date Thu, 08 Feb 2018 09:26:00 GMT
Vinay created CONNECTORS-1494:
---------------------------------

             Summary: Error crawling file system with file names having special characters.
                 Key: CONNECTORS-1494
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1494
             Project: ManifoldCF
          Issue Type: Bug
          Components: File system connector
    Affects Versions: ManifoldCF 2.9.1
            Reporter: Vinay


I am crawling a file system mounted on a Linux machine, so the repository connection is of type
"File System". ManifoldCF does not pick up files whose names contain certain special
characters.

File ex: 2GHz_XY-SCDMA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf

exception: java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
    at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
    at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""
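
Note on the trace: the failing frames are Long.<init> and Long.parseLong, called from DocumentFilter$SpecPacker.<init> during getPipelineDescription, and the input string is empty. This suggests that an empty numeric value reached the parser while the pipeline description was being built, which may or may not be related to the file name itself. The sketch below only reproduces the JDK behavior. EmptyLongDemo, describeFailure, and parseLongOrEmpty are hypothetical names, not ManifoldCF code, and the guard is one possible way to handle a blank value, not the project's fix.

```java
import java.util.OptionalLong;

public class EmptyLongDemo {
    // Returns the exception message if raw cannot be parsed as a long,
    // or null if it parses successfully.
    static String describeFailure(String raw) {
        try {
            Long.parseLong(raw);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    // Hypothetical guard (an assumption, not ManifoldCF code):
    // treat null or blank input as "no value" instead of throwing.
    static OptionalLong parseLongOrEmpty(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(raw.trim()));
    }

    public static void main(String[] args) {
        // Parsing an empty string fails in the same way as in the report.
        System.out.println(describeFailure(""));
        // The guarded variant reports a missing value instead.
        System.out.println(parseLongOrEmpty("").isPresent());
        System.out.println(parseLongOrEmpty("1024").getAsLong());
    }
}
```

If the first line printed matches the message in the log, it would be worth checking whether any numeric field in the job's transformation specification was left blank.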



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
