manifoldcf-dev mailing list archives

From "Vinay (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CONNECTORS-1494) Error crawling file system with file names having special characters.
Date Thu, 08 Feb 2018 10:39:00 GMT

     [ https://issues.apache.org/jira/browse/CONNECTORS-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated CONNECTORS-1494:
------------------------------
    Description: 
I am crawling a file system mounted on a Linux machine, so the Repository Connection is of type
"File System". ManifoldCF does not pick up some files whose names contain special
characters.

File ex: a_XY-SMnA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf

exception: java.lang.NumberFormatException: For input string: ""
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
     at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
     at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
     at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
     at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
     at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
 FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""

  was:
I am crawling a file system mounted on a Linux machine, so the Repository Connection is of type
"File System". ManifoldCF does not pick up some files whose names contain special
characters.

File ex: 2GHz_XY-SCDMA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf

exception: java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
    at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
    at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
    at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
    at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""
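
For readers following the trace: it ends in Long.parseLong being handed an empty string from DocumentFilter$SpecPacker. The sketch below only reproduces that JDK behaviour in isolation and shows one possible defensive parse. It is not ManifoldCF code; the class name EmptyLongParseDemo, the helper parseLongOrDefault, and the fallback value -1 are all hypothetical, and the trace alone does not say why the value was empty.

```java
public class EmptyLongParseDemo {

    // Hypothetical helper (not part of ManifoldCF): returns the fallback
    // when the value is null, blank, or not a valid decimal long.
    static long parseLongOrDefault(String value, long fallback) {
        if (value == null || value.trim().isEmpty()) {
            return fallback;
        }
        try {
            return Long.parseLong(value.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    // Returns the exception message of an unguarded parse,
    // or null if the parse succeeds.
    static String failureMessage(String value) {
        try {
            Long.parseLong(value);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Unguarded parse of "" fails with the same message as in the log.
        System.out.println(failureMessage(""));
        // Guarded parse falls back instead of throwing.
        System.out.println(parseLongOrDefault("", -1L));
        System.out.println(parseLongOrDefault("1024", -1L));
    }
}
```

Whether a silent fallback is appropriate depends on what the empty field means in the job specification; failing with a clearer error message may be the better choice.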


> Error crawling file system with file names having special characters.
> ---------------------------------------------------------------------
>
>                 Key: CONNECTORS-1494
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1494
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: File system connector
>    Affects Versions: ManifoldCF 2.9.1
>            Reporter: Vinay
>            Priority: Major
>
> I am crawling a file system mounted on a Linux machine, so the Repository Connection is
> of type "File System". ManifoldCF does not pick up some files whose names contain
> special characters.
> File ex: a_XY-SMnA_ABC_Uuޓࠚϯmӣܼ˵Ҫȳ_֚3ҿؖúشԃԫхրҠë.pdf
> exception: java.lang.NumberFormatException: For input string: ""
>      at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_151]
>      at java.lang.Long.parseLong(Long.java:601) ~[?:1.8.0_151]
>      at java.lang.Long.<init>(Long.java:965) ~[?:1.8.0_151]
>      at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter$SpecPacker.<init>(DocumentFilter.java:513) ~[?:?]
>      at org.apache.manifoldcf.agents.transformation.documentfilter.DocumentFilter.getPipelineDescription(DocumentFilter.java:76) ~[?:?]
>      at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.getTransformationDescription(IncrementalIngester.java:503) ~[mcf-agents.jar:?]
>      at org.apache.manifoldcf.crawler.system.PipelineSpecification.<init>(PipelineSpecification.java:47) ~[mcf-pull-agent.jar:?]
>      at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:308) [mcf-pull-agent.jar:?]
>  FATAL 2018-02-07T23:47:15,927 (Worker thread '2') - Error tossed: For input string: ""



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
