Date: Wed, 29 Jan 2014 12:03:01 +0100
Subject: TikaEntityProcessor + multivalue field as url source
From: Bustaa
To: solr-user@lucene.apache.org

Hello Solr Users,

I'm trying to get Tika's "BinFileDataSource" to take the file names from a multivalued field (array), but I'm getting the exception shown in the output that follows.

Debug output from running dataimport (shortened):

  "query", "<<< LONG SQL-QUERY >>>",
  "time-taken", "0:0:0.11",
  null, "----------- row #1-------------",
  "di_description", "asdad",
  "di_longtitle", "",
  "di_file", "fileadmin/user_upload/dateien/abc/file1.pdf,fileadmin/user_upload/dateien/abc/file2.pdf",
  "di_title", "test",
  "di_date", "2014-01-30T00:00:00Z",
  "di_notes", "",
  null, "---------------------------------------------",
  "transformer:script:PrependPath", [
    null, "---------------------------------------------",
    "di_description", "asdad",
    "di_longtitle", "",
    "di_file", [
      "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf",
      "/Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf"
    ],
    "di_title", "test",
    "di_date", "2014-01-30T00:00:00Z",
    "di_notes", "",
    null, "---------------------------------------------",
    "entity:binaryImport", [
      "query", "[/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file2.pdf]",
      "EXCEPTION", "java.lang.RuntimeException: java.io.FileNotFoundException: Could not find file: [/Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf, /Users/b/Sites/fileadmin/user_upload/dateien/abc/file1.pdf] <<< MORE STACKTRACE >>>",
      "time-taken", "0:0:0.1"
    ]
  ]
  ]
  ]

So the whole array, rendered as one bracketed string, is being used as a single file name.

Is there a way to get Tika's "BinFileDataSource" to accept multiple values, or is there a workaround? (The CMS we are using saves the files comma-separated in one big text field.)

Thanks in advance,
Sam
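
To illustrate what the debug output suggests is happening, here is a minimal standalone sketch in plain Java. It does not use any Solr or Tika API; the class `FilePathSplitter` and the method `toAbsolutePaths` are hypothetical names made up for this example, and the assumption is that the transformer splits the CMS value on commas and prepends a base directory. The point is only that a list rendered as a string becomes "[a, b]", which is not a valid file name, while each element on its own is.

```java
import java.util.ArrayList;
import java.util.List;

public class FilePathSplitter {

    // Hypothetical helper (not part of Solr): split the comma-separated
    // value stored by the CMS and prepend a base directory to each entry.
    public static List<String> toAbsolutePaths(String commaSeparated, String baseDir) {
        List<String> paths = new ArrayList<>();
        for (String part : commaSeparated.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {   // skip empty entries, e.g. from a trailing comma
                paths.add(baseDir + trimmed);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<String> paths = toAbsolutePaths(
                "fileadmin/user_upload/dateien/abc/file1.pdf,"
                        + "fileadmin/user_upload/dateien/abc/file2.pdf",
                "/Users/b/Sites/");

        // The list's string form is "[path1, path2]"; it has the same shape
        // as the "query" value in the debug output and is not a file name.
        System.out.println("As one string: " + String.valueOf(paths));

        // Each element by itself is a usable path; a workaround would need
        // to hand these to the data source one at a time.
        for (String path : paths) {
            System.out.println("Single path:   " + path);
        }
    }
}
```

How to hand each path to the entity separately depends on the data import configuration and is not shown here.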