Date: Sun, 24 Aug 2014 06:11:11 +0000 (UTC)
From: "Denis Shishlyannikoc (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Subject: [jira] [Commented] (SOLR-6385) Strange behavior on indexing document with wrong date format

    [ https://issues.apache.org/jira/browse/SOLR-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108281#comment-14108281 ]

Denis Shishlyannikoc commented on SOLR-6385:
--------------------------------------------

Erick Erickson, can you be more specific when talking about other JIRAs? Thanks.

> Strange behavior on indexing document with wrong date format
> ------------------------------------------------------------
>
>                 Key: SOLR-6385
>                 URL: https://issues.apache.org/jira/browse/SOLR-6385
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 4.7.2
>         Environment: Solr server in Windows 7, solrj
>            Reporter: Denis Shishlyannikoc
>            Priority: Critical
>
> Hello.
> I have only recently started working with Solr and do not have much experience with it yet, so some of the problems described here may be due to lack of knowledge.
> Excuse me for that.
> Problems that I saw:
> 1) I use solrj to index a collection of SolrInputDocuments.
> To do this I call the add(Collection) method of a CloudSolrServer object.
> Just for fun, I tried to index one of the documents with an incorrect date:
> I took the valid Solr date value of one of these SolrInputDocuments and changed the "T" symbol in it to "K".
> (this date is defined in schema.xml as
> )
> Solr failed to index the collection and returned a SolrServerException.
> In addition, some documents of the collection were indexed correctly, while the document with the problematic date failed to be indexed, together with several valid (from all points of view) SolrInputDocuments of this collection.
> It looks like Solr went through the documents in the collection, indexing them one by one, threw an exception on the document with the problematic date, and then did not index any of the valid documents that came after it.
> 2) After the failure described in 1), Solr kept the document with the problematic date in some queue and tried to reindex it again (roughly one attempt every 3-5 minutes; I did not measure the exact interval), showing the same (failed to parse date) exception in the logs! After a Solr server restart the issue is gone: no more attempts to reindex the document with the problematic date.
> Questions to be answered
> 1) What is the default behavior of Solr when indexing problematic field values?
> For example, for a date field: I expect Solr to index a null date (instead of not indexing the whole document), write a warning to the logs, and return some indication of the problem in the UpdateResponse.
> Maybe Solr's behavior on invalid field values should be configurable (defined in some XML element in the schema).
> 2) While indexing a collection of documents, should Solr index all valid documents (and not return on the first problem, as it does now)?
> If I index a collection of documents, I expect Solr to index all valid (from all points of view) documents and report in the UpdateResponse the indexing status of every problematic document that was not indexed.
> 3) Why does Solr try to reindex the problematic document? This looks like a bug that can create useless load on the server.
> If this behavior is by design, how can I force Solr to stop reindexing such problem documents (without restarting the Solr server)?
> Where can I read about it?
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
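
Regarding problem 1) in the quoted description: one possible client-side workaround is to check date values before the batch is handed to add(Collection), so that a single malformed date does not abort the rest of the batch. The following is only a rough sketch under several assumptions. It uses plain Java 8+ maps instead of SolrInputDocument so that it runs without solrj on the classpath. The class name DateFieldPrecheck, the Split holder, and the field name "created" are hypothetical. Instant.parse only approximates the date syntax Solr accepts and is not Solr's own parser.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of solrj: splits a batch into documents whose
// date field parses and documents that should be held back and reported.
public class DateFieldPrecheck {

    /** Holder for the two halves of a batch. */
    public static final class Split {
        public final List<Map<String, Object>> valid = new ArrayList<>();
        public final List<Map<String, Object>> rejected = new ArrayList<>();
    }

    /** True if the value is a string that parses as an ISO-8601 UTC instant. */
    public static boolean isValidDate(Object value) {
        if (!(value instanceof String)) {
            return false;
        }
        try {
            Instant.parse((String) value); // e.g. 2014-08-24T06:11:11Z
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** Separates documents with a malformed date from the rest of the batch. */
    public static Split split(List<Map<String, Object>> docs, String dateField) {
        Split result = new Split();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(dateField);
            // Assumption: a missing date is acceptable; only malformed values are rejected.
            if (value == null || isValidDate(value)) {
                result.valid.add(doc);
            } else {
                result.rejected.add(doc);
            }
        }
        return result;
    }

    /** Builds a small stand-in document with an id and a date value. */
    private static Map<String, Object> doc(String id, String created) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("id", id);
        d.put("created", created); // "created" is a made-up field name
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> batch = new ArrayList<>();
        batch.add(doc("1", "2014-08-24T06:11:11Z"));
        batch.add(doc("2", "2014-08-24K06:11:11Z")); // "T" replaced by "K", as in the report
        batch.add(doc("3", "2014-08-25T00:00:00Z"));

        Split split = split(batch, "created");
        System.out.println("valid=" + split.valid.size()
                + " rejected=" + split.rejected.size());
    }
}
```

In a real client, only the valid list would be converted to SolrInputDocuments and sent, and the rejected list would be logged or returned to the caller. This does not answer whether the server should behave differently; it only keeps one bad value from affecting the valid documents that follow it in the collection.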