From: Mathias Lux
To: solr-user@lucene.apache.org
Subject: Re: DataImport Handler, writing a new EntityProcessor
Date: Thu, 19 Dec 2013 12:30:23 +0100

Hi!

Thanks for all the advice! I finally did it. The most annoying error,
which took me the better part of a day to figure out, was that the
state variable here had to be reset:

https://bitbucket.org/dermotte/liresolr/src/d27878a71c63842cb72b84162b599d99c4408965/src/main/java/net/semanticmetadata/lire/solr/LireEntityProcessor.java?at=master#cl-56

The EntityProcessor is part of this image search plugin if anyone is
interested: https://bitbucket.org/dermotte/liresolr/

:) It's always the small things that are hard to find

cheers and thanks,
Mathias

On Wed, Dec 18, 2013 at 7:26 PM, P Williams wrote:

> Hi Mathias,
>
> I'd recommend testing one thing at a time. See if you can get it to work
> for one image before you try a directory of images. Also try testing using
> the solr-testframework using your ide (I use Eclipse) to debug rather than
> your browser/print statements. Hopefully that will give you some more
> specific knowledge of what's happening around your plugin.
>
> I also wrote an EntityProcessor plugin to read from a properties
> file.
> Hopefully that'll give you some insight about this kind of Solr plugin and
> testing them.
>
> Cheers,
> Tricia
>
>
> On Wed, Dec 18, 2013 at 3:03 AM, Mathias Lux wrote:
>
>> Hi all!
>>
>> I've got a question regarding writing a new EntityProcessor, in the
>> same sense as the Tika one.
>> My EntityProcessor should analyze jpg
>> images and create document fields to be used with the LIRE Solr plugin
>> (https://bitbucket.org/dermotte/liresolr). Basically I've taken the
>> same approach as the TikaEntityProcessor, but my setup just indexes
>> the first of 1000 images. I'm using a FileListEntityProcessor to get
>> all JPEGs from a directory and then I'm handing them over (see [2]).
>> My code for the EntityProcessor is at [1]. I've tried to use the
>> DataSource as well as the filePath attribute, but it ends up all the
>> same. However, the FileListEntityProcessor is able to read all the
>> files according to the debug output, but I'm missing the link from the
>> FileListEntityProcessor to the LireEntityProcessor.
>>
>> I'd appreciate any pointer or help :)
>>
>> cheers,
>> Mathias
>>
>> [1] LireEntityProcessor http://pastebin.com/JFajkNtf
>> [2] dataConfig http://pastebin.com/vSHucatJ
>>
>> --
>> Dr. Mathias Lux
>> Klagenfurt University, Austria
>> http://tinyurl.com/mlux-itec
>>

--
PD Dr. Mathias Lux
Klagenfurt University, Austria
http://tinyurl.com/mlux-itec
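
The state-reset problem mentioned at the top of this message can be illustrated with a small sketch. It does not use the real DataImportHandler classes and is not the code from LireEntityProcessor: `StateResetSketch`, `OneRowProcessor`, and `index` are hypothetical stand-ins. The sketch assumes that one processor instance is reused for every row coming from the parent entity, that `init` is called once per parent row, and that `nextRow` is called until it returns null. Under those assumptions, a "done" flag that is never reset yields a document only for the first file, which would match the "just indexes the first of 1000 images" symptom.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Solr classes.
class StateResetSketch {

    /** Emits one row per file and remembers that with a 'done' flag. */
    static class OneRowProcessor {
        private final boolean resetInInit;
        private boolean done = false;
        private String currentFile;

        OneRowProcessor(boolean resetInInit) {
            this.resetInInit = resetInInit;
        }

        // Assumption: called once for every row handed down by the parent entity.
        void init(String file) {
            currentFile = file;
            if (resetInInit) {
                done = false; // the reset that is easy to forget
            }
        }

        // Assumption: called repeatedly until it returns null.
        String nextRow() {
            if (done) {
                return null;
            }
            done = true;
            return "doc for " + currentFile;
        }
    }

    /** Drives a single, reused processor instance over all files. */
    static List<String> index(List<String> files, boolean resetInInit) {
        OneRowProcessor processor = new OneRowProcessor(resetInInit);
        List<String> docs = new ArrayList<>();
        for (String file : files) {
            processor.init(file);
            for (String row = processor.nextRow(); row != null; row = processor.nextRow()) {
                docs.add(row);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.jpg", "b.jpg", "c.jpg");
        System.out.println(index(files, false).size()); // prints 1
        System.out.println(index(files, true).size());  // prints 3
    }
}
```

Without the reset, the flag set while handling the first file is still set when the second file arrives, so `nextRow` returns null immediately and no error is reported.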