lucene-solr-user mailing list archives

From Bustaa <bus...@gmail.com>
Subject TikaEntityProcessor + multivalue field as url source
Date Wed, 29 Jan 2014 11:03:01 GMT
Hello Solr Users,

I'm trying to get Tika's "BinFileDataSource" to take the filenames
from a multivalue field (array), but I'm getting the following
exception:

Debug output from running dataimport (shortened):


          "query",
          "<<< LONG SQL-QUERY >>>",
          "time-taken",
          "0:0:0.11",
          null,
          "----------- row #1-------------",
          "di_description",
          "asdad",
          "di_longtitle",
          "",
          "di_file",
          "fileadmin/user_upload/dateien/abc/file1.pdf,fileadmin/user_upload/dateien/abc/file2.pdf",
          "di_title",
          "test",
          "di_date",
          "2014-01-30T00:00:00Z",
          "di_notes",
          "",
          null,
          "---------------------------------------------",
          "transformer:script:PrependPath",
          [
            null,
            "---------------------------------------------",
            "di_description",
            "asdad",
            "di_longtitle",
            "",
            "di_file",
            [
              "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf",
              "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf"
            ],
            "di_title",
            "test",
            "di_date",
            "2014-01-30T00:00:00Z",
            "di_notes",
            "",
            null,
            "---------------------------------------------",
            "entity:binaryImport",
            [
              "query",
              "[/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf]",
              "EXCEPTION",
              "java.lang.RuntimeException: java.io.FileNotFoundException: Could not find file: [/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf] <<< MORE STACKTRACE >>>",
              "time-taken",
              "0:0:0.1"
            ]
          ]
        ]
      ]

Is there a way to get Tika's "BinFileDataSource" to accept the
multiple values, or is there a workaround? (The CMS we are using saves
the file names comma-separated in one big text field.)
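
To illustrate what seems to be going wrong, here is a minimal standalone Java sketch. It assumes (judging only from the debug output above, not from the Solr source) that the string form of the whole list ends up being used as a single file path. The class and method names are hypothetical and are not part of Solr or Tika; the base path is the one shown in the transformer output.

```java
import java.util.ArrayList;
import java.util.List;

public class MultiValuePathSketch {

    // Hypothetical helper: split the comma-separated CMS field and
    // prepend a base path to every non-empty entry.
    static List<String> toAbsolutePaths(String commaSeparated, String basePath) {
        List<String> paths = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                paths.add(basePath + trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> paths = toAbsolutePaths(
                "fileadmin/user_upload/dateien/abc/file1.pdf,"
                        + "fileadmin/user_upload/dateien/abc/file2.pdf",
                "/Users/b/Sites/");

        // The string form of the whole list is "[path1, path2]", which has
        // the same shape as the "file" named in the FileNotFoundException.
        System.out.println("As one string: " + String.valueOf(paths));

        // What is actually wanted: one separate path per file.
        for (String path : paths) {
            System.out.println("Single path:   " + path);
        }
    }
}
```

If that assumption holds, each path would have to reach the binary entity as its own row rather than as one multivalue field.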

Thanks in advance,

Sam
