Date: Fri, 20 Apr 2012 15:04:40 +0000 (UTC)
From: "Colin Hebert (Created) (JIRA)"
To: dev@lucene.apache.org
Message-ID: <1152850840.10000.1334934280539.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (SOLR-3386) ExtractingRequestHandler applies fname settings to literals

ExtractingRequestHandler applies fname settings to literals
-----------------------------------------------------------

Key: SOLR-3386
URL:
https://issues.apache.org/jira/browse/SOLR-3386
Project: Solr
Issue Type: Bug
Components: contrib - Solr Cell (Tika extraction)
Affects Versions: 3.5
Reporter: Colin Hebert
Priority: Minor

SolrContentHandler.addLiterals() calls SolrContentHandler.addField(), which in turn resolves the field name with SolrContentHandler.findMappedName().
That lookup makes sense for SolrContentHandler.addMetadata() [and others], because the user has no other way to set the names of those fields. With literals, however, the field name is given explicitly by the user, so it shouldn't be changed at all (applying unknownFieldPrefix or defaultField might be acceptable, but even that doesn't seem quite normal).

----

I ran into this issue with the following use case:
I have a schema containing a "title" field which is mandatory and holds only one value.
My documents have an internal title which is used as the value of the "title" field.
When I send one of these documents (an HTML document) and it contains a "title" metadata entry, I get an exception because I have multiple values for my "title" field (as I would expect).
To fix that I used "fname.title=tika_title", so that the title provided by Tika is kept under another name.
The mapping is applied to the literal as well, so both titles (the original one I pass manually, and the metadata one) are now named "tika_title", and I get an exception because "title" hasn't been provided at all.

----

An easy workaround for this bug is to send the literal as "my_title" and add the following fnames: "fname.my_title=title&fname.title=tika_title".
A small switcheroo which puts the correct value back in the expected field.

----

One way to fix this is to extract the first blocks of SolrContentHandler.addField() into a separate method (or to put the lowerNames check in SolrContentHandler.findMappedName()) and to call that separate method (or findMappedName()) _before_ calling SolrContentHandler.addField().

--
This message is automatically generated by JIRA.
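The behaviour described in the report, the workaround, and the suggested direction for a fix can be modelled with a small, self-contained Java sketch. This is not the Solr source: the class FnameLiteralSketch, the buildDocument() method and its mapLiterals flag are hypothetical, and findMappedName() here is only a one-level rename that stands in for the fname settings named in the report. The sketch assumes that metadata and literals are both added to the same document and that a mapping is applied once per field name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model of the reported behaviour; NOT the actual Solr classes.
final class FnameLiteralSketch {

    // Hypothetical stand-in for the fname lookup: one level of renaming,
    // falling back to the original name when no mapping is configured.
    static String findMappedName(Map<String, String> fnameMap, String name) {
        return fnameMap.getOrDefault(name, name);
    }

    // Appends a value to a (possibly new) field of the document.
    static void addValue(Map<String, List<String>> doc, String field, String value) {
        doc.computeIfAbsent(field, key -> new ArrayList<>()).add(value);
    }

    // mapLiterals = true models the reported behaviour (literals are renamed too);
    // mapLiterals = false models one reading of the proposed fix (literals keep their name).
    static Map<String, List<String>> buildDocument(Map<String, String> fnameMap,
                                                   Map<String, String> metadata,
                                                   Map<String, String> literals,
                                                   boolean mapLiterals) {
        Map<String, List<String>> doc = new TreeMap<>();
        metadata.forEach((name, value) ->
                addValue(doc, findMappedName(fnameMap, name), value));
        literals.forEach((name, value) ->
                addValue(doc, mapLiterals ? findMappedName(fnameMap, name) : name, value));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = Map.of("title", "HTML title");

        // Reported behaviour: the literal "title" is renamed along with the metadata.
        System.out.println(buildDocument(
                Map.of("title", "tika_title"),
                metadata, Map.of("title", "Internal title"), true));

        // Workaround from the report: send the literal as "my_title" and map it back.
        System.out.println(buildDocument(
                Map.of("my_title", "title", "title", "tika_title"),
                metadata, Map.of("my_title", "Internal title"), true));

        // Sketch of the fix: literals bypass the mapping entirely.
        System.out.println(buildDocument(
                Map.of("title", "tika_title"),
                metadata, Map.of("title", "Internal title"), false));
    }
}
```

In this model the first call leaves the document without a "title" field, while the second and third calls both end up with the internal title in "title" and the extracted one in "tika_title", which is the outcome the report expects.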