Date: Fri, 4 Nov 2011 17:03:51 +0000 (UTC)
From: "Luca Cavanna (Updated) (JIRA)"
To: dev@lucene.apache.org
Message-ID: <1478588871.347.1320426231882.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (SOLR-1499) SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ

[ https://issues.apache.org/jira/browse/SOLR-1499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated SOLR-1499:
-------------------------------
    Attachment:
SOLR-1499.patch

I attached a new version of the patch. I cleaned up the code and added a new core to the example-DIH folder to show how the SolrEntityProcessor works. The only problem I see is that the example requires one more Solr instance to be running, and its address needs to be specified in the solr-data-config.xml file.
I also have some doubts about the condition
if (root) { solrQuery.setQuery(queryString); }
inside the SolrEntityProcessor#init method, but I haven't yet had the time to write a specific test.
Please let me know if you have any more suggestions!

> SolrEntityProcessor - DIH EntityProcessor that queries an external Solr via SolrJ
> ---------------------------------------------------------------------------------
>
>                 Key: SOLR-1499
>                 URL: https://issues.apache.org/jira/browse/SOLR-1499
>             Project: Solr
>          Issue Type: New Feature
>          Components: contrib - DataImportHandler
>            Reporter: Lance Norskog
>             Fix For: 3.5, 4.0
>
>         Attachments: SOLR-1499.core.rev1182017.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.patch, SOLR-1499.tests.rev1182017.patch
>
>
> The SolrEntityProcessor queries an external Solr instance. The Solr documents returned are unpacked and emitted as DIH fields.
> The SolrEntityProcessor uses the following attributes:
> * solr='http://localhost:8983/solr/sms'
> ** This gives the URL of the target Solr instance.
> *** Note: the connection to the target Solr uses the binary SolrJ format.
> * query='Jefferson&sort=id+asc'
> ** This gives the base query string to use with Solr. It can include any standard Solr request parameter. This attribute is processed under the variable resolution rules and can be driven in an inner stage of the indexing pipeline.
> * rows='10'
> ** This gives the number of rows to fetch per request.
> ** The SolrEntityProcessor always fetches every document that matches the request.
> * fields='id,tag'
> ** This selects the fields to be returned from the Solr request.
> ** These must also be declared as <field> elements.
> ** As with all fields, template processors can be used to alter the contents to be passed downwards.
> * timeout='30'
> ** This limits the query to 30 seconds. This can be used as a fail-safe to prevent the indexing session from freezing up. By default the timeout is 5 minutes.
> Limitations:
> * Solr errors are not handled correctly.
> * Loop control constructs have not been tested.
> * Multi-valued returned fields have not been tested.
> The unit tests give examples of how to use it as the root entity and an inner entity.
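
For reference, a data-config entry using the attributes listed in the description might look roughly like the sketch below. This is only an illustration under stated assumptions: the attribute names (solr, query, rows, fields, timeout) are copied from the description as written and may not match what the patch finally commits; the entity name "sms" and the choice to make it the single root entity are hypothetical; and the <field> children follow the usual DIH convention for declaring fields. One detail that is certain: inside an XML attribute the & in the query string has to be written as &amp;.

```xml
<dataConfig>
  <document>
    <!-- Hypothetical root entity; "sms" is an illustrative name only. -->
    <entity name="sms"
            processor="SolrEntityProcessor"
            solr="http://localhost:8983/solr/sms"
            query="Jefferson&amp;sort=id+asc"
            rows="10"
            fields="id,tag"
            timeout="30">
      <!-- Each field requested via fields= is also declared here. -->
      <field column="id" name="id"/>
      <field column="tag" name="tag"/>
    </entity>
  </document>
</dataConfig>
```

With rows="10" the processor would page through the result set ten documents at a time, since the description says every matching document is always fetched. The address in solr= has to point at the second, already running Solr instance.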