Date: Mon, 3 Jun 2013 16:35:25 +0000 (UTC)
From: "Jack Krupansky (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-4859) MinFieldValueUpdateProcessorFactory and MaxFieldValueUpdateProcessorFactory don't do numeric comparison for numeric fields

    [ https://issues.apache.org/jira/browse/SOLR-4859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673289#comment-13673289 ]

Jack Krupansky commented on SOLR-4859:
--------------------------------------

bq. I have a patch to fix JsonLoader

I'm glad to hear that, but we still have the XML loader, CSV loader, and SolrCell. I would prefer a fix to the mutating field processor code itself.

I mean, how un-obvious is it that a processor labeled "Min/Max Field Value" should be able to handle numeric string values when (if) the type of the field is known? Clearly a bad design was chosen.
Why can't we correct that bad design decision and eliminate the need for workaround approaches like requiring users to explicitly convert types?

But... if this really bad design decision really is cast in concrete... any ETA on the explicit conversion processors? As well as an update to the Javadoc to highlight the pitfalls of min/max for non-JSON field values.

> MinFieldValueUpdateProcessorFactory and MaxFieldValueUpdateProcessorFactory don't do numeric comparison for numeric fields
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-4859
>                 URL: https://issues.apache.org/jira/browse/SOLR-4859
>             Project: Solr
>          Issue Type: Bug
>          Components: update
>    Affects Versions: 4.3
>            Reporter: Jack Krupansky
>
> MinFieldValueUpdateProcessorFactory and MaxFieldValueUpdateProcessorFactory are advertised as supporting numeric comparisons, but this doesn't work - only string comparison is available - and doesn't seem possible, although the unit tests show it is possible at the unit test level.
> The problem is that numeric processing is dependent on the SolrInputDocument containing a list of numeric values, but at least with both the current XML and JSON loaders, only string values can be loaded.
> Test scenario.
> 1. Use Solr 4.3 example.
> 2. Add following update processor chain to solrconfig:
> {code}
>   <updateRequestProcessorChain name="max-only-num">
>     <processor class="solr.MaxFieldValueUpdateProcessorFactory">
>       <str name="fieldName">sizes_i</str>
>     </processor>
>     <processor class="solr.LogUpdateProcessorFactory" />
>     <processor class="solr.RunUpdateProcessorFactory" />
>   </updateRequestProcessorChain>
> {code}
> 3. Perform this update request:
> {code}
>   curl "http://localhost:8983/solr/update?commit=true&update.chain=max-only-num" \
>   -H 'Content-type:application/json' -d '
>   [{"id": "doc-1",
>     "title_s": "Hello World",
>     "sizes_i": [200, 999, 101, 199, 1000]}]'
> {code}
> Note that the values are JSON integer values.
> 4. Perform this query:
> {code}
>   curl "http://localhost:8983/solr/select/?q=*:*&indent=true&wt=json"
> {code}
> Shows this result:
> {code}
>   "response":{"numFound":1,"start":0,"docs":[
>       {
>         "id":"doc-1",
>         "title_s":"Hello World",
>         "sizes_i":999,
>         "_version_":1436094187405574144}]
>   }}
> {code}
> sizes_i should be 1000, not 999.
> Alternative update tests:
> {code}
>   curl "http://localhost:8983/solr/update?commit=true&update.chain=max-only-num" \
>   -H 'Content-type:application/json' -d '
>   [{"id": "doc-1",
>     "title_s": "Hello World",
>     "sizes_i": 200,
>     "sizes_i": 999,
>     "sizes_i": 101,
>     "sizes_i": 199,
>     "sizes_i": 1000}]'
> {code}
> and
> {code}
>   curl "http://localhost:8983/solr/update?commit=true&update.chain=max-only-num" \
>   -H 'Content-type:application/xml' -d '
>   <add>
>     <doc>
>       <field name="id">doc-1</field>
>       <field name="title_s">Hello World</field>
>       <field name="sizes_i">42</field>
>       <field name="sizes_i">128</field>
>       <field name="sizes_i">-3</field>
>     </doc>
>   </add>'
> {code}
> In XML, of course, there is no way for the input values to be anything other than strings ("text".)
> The JSON loader does parse the values with their type, but immediately converts the values to strings:
> {code}
>     private Object parseSingleFieldValue(int ev) throws IOException {
>       switch (ev) {
>         case JSONParser.STRING:
>           return parser.getString();
>         case JSONParser.LONG:
>         case JSONParser.NUMBER:
>         case JSONParser.BIGNUMBER:
>           return parser.getNumberChars().toString();
>         case JSONParser.BOOLEAN:
>           return Boolean.toString(parser.getBoolean()); // for legacy reasons, single values s are expected to be strings
>         case JSONParser.NULL:
>           parser.getNull();
>           return null;
>         case JSONParser.ARRAY_START:
>           return parseArrayFieldValue(ev);
>         default:
>           throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "Error parsing JSON field value. Unexpected "+JSONParser.getEventString(ev) );
>       }
>     }
>     private List parseArrayFieldValue(int ev) throws IOException {
>       assert ev == JSONParser.ARRAY_START;
>
>       ArrayList lst = new ArrayList(2);
>       for (;;) {
>         ev = parser.nextEvent();
>         if (ev == JSONParser.ARRAY_END) {
>           return lst;
>         }
>         Object val = parseSingleFieldValue(ev);
>         lst.add(val);
>       }
>     }
>   }
> {code}
> Originally, I had hoped/expected that the schema type of the field would determine the type of min/max comparison - integer for a *_i field in my case.
> The comparison logic for min:
> {code}
> public final class MinFieldValueUpdateProcessorFactory extends FieldValueSubsetUpdateProcessorFactory {
>   @Override
>   @SuppressWarnings("unchecked")
>   public Collection pickSubset(Collection values) {
>     Collection result = values;
>     try {
>       result = Collections.singletonList
>         (Collections.min(values));
>     } catch (ClassCastException e) {
>       throw new SolrException
>         (BAD_REQUEST,
>          "Field values are not mutually comparable: " + e.getMessage(), e);
>     }
>     return result;
>   }
> {code}
> Which seems to be completely dependent only on the type of the input values, not the field type itself.
> It would be nice to at least have a comparison override: compareNumeric="true".
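
For illustration only, here is a minimal standalone sketch (not Solr code) of why step 4 of the test scenario returns 999. It assumes the loader hands the processor plain strings, so Collections.max compares them lexicographically. The class name MinMaxComparisonDemo and both method names are hypothetical, and parsing with Long.parseLong is just one assumed way a numeric-aware comparison could look, not how Solr implements or plans to implement it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Solr.
final class MinMaxComparisonDemo {

  // Mimics what the processor sees when every value is a String:
  // Collections.max falls back to String.compareTo (lexicographic order).
  public static String maxAsStrings(List<String> values) {
    return Collections.max(values);
  }

  // Assumed alternative: parse each value as a long before comparing.
  // Throws NumberFormatException if a value is not an integer literal.
  public static long maxAsNumbers(List<String> values) {
    List<Long> parsed = new ArrayList<Long>(values.size());
    for (String v : values) {
      parsed.add(Long.parseLong(v.trim()));
    }
    return Collections.max(parsed);
  }

  public static void main(String[] args) {
    // Same values as the sizes_i array in the update request from step 3.
    List<String> sizes = Arrays.asList("200", "999", "101", "199", "1000");
    System.out.println(maxAsStrings(sizes)); // lexicographic maximum
    System.out.println(maxAsNumbers(sizes)); // numeric maximum
  }
}
```

With the values from step 3, the string comparison picks "999" because '9' sorts after '1' and '2', while the numeric comparison picks 1000, which is the value the report expects for sizes_i.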