lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Updating error while add doc to Solrcloud
Date Tue, 17 May 2016 15:45:29 GMT
I _think_ you are using "schemaless" mode, and the
issue is that Solr guesses the type of each field based
on the first doc it encounters. Thereafter, if any
incoming doc has a value of a different type in that field
(say the "guess" is an int type and a later doc has something
that's not an int in that field), that doc is rejected. This is somewhat
borne out by the fact that when you blow away the collection,
the problem doc indexes.
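
To make the failure mode concrete, here is a toy sketch in Python. It is not Solr's actual guessing code; the rules in `guess_type`, the helper names, and the `issue` field are all made up for illustration. It only shows how a type locked in by the first doc can cause a later doc to be rejected, and why the same doc indexes fine into an empty collection.

```python
# Toy model of schemaless field guessing; NOT Solr's real implementation.
# All function names and the "issue" field are hypothetical.

def guess_type(value):
    """Guess a field type from a raw string value (simplified rules)."""
    try:
        int(value)
        return "int"
    except ValueError:
        return "string"


def accepts(field_type, value):
    """Check whether a value fits a field type that was already locked in."""
    if field_type == "int":
        try:
            int(value)
            return True
        except ValueError:
            return False
    return True  # a string field takes any value


def index_docs(docs):
    """Index docs in order; return (schema, ids of rejected docs)."""
    schema, rejected = {}, []
    for doc in docs:
        # A doc is rejected if any of its values conflicts with a known type.
        conflict = any(
            field in schema and not accepts(schema[field], value)
            for field, value in doc.items()
        )
        if conflict:
            rejected.append(doc["id"])
            continue
        # Unknown fields get their type from this first value.
        for field, value in doc.items():
            schema.setdefault(field, guess_type(value))
    return schema, rejected


docs = [{"id": "a1", "issue": "10"}, {"id": "a2", "issue": "10-A"}]
schema, rejected = index_docs(docs)                        # "a2" is rejected
schema_rev, rejected_rev = index_docs(list(reversed(docs)))  # nothing rejected
```

With the numeric doc first, `issue` is locked in as an int and the second doc is rejected; with the order reversed, `issue` becomes a string and both docs are accepted.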

By default, modern Solr uses "managed schema", which is
different from, but related to, "schemaless". What I'd do:

Go to managed schema or even "classic" schema and define
your fields explicitly. Here's a place to get you started:
https://cwiki.apache.org/confluence/display/solr/Schema+Factory+Definition+in+SolrConfig
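
As a rough sketch of what defining a field explicitly can look like with a managed schema: the Schema API accepts an `add-field` command POSTed as JSON to the collection's `/schema` endpoint. The Python below only builds such a request and does not send it. The field name `issue` and the type `string` are assumptions for illustration (the type must exist in your schema), and the schema has to be mutable for the call to succeed.

```python
import json
import urllib.request


def build_add_field_request(base_url, collection, name, field_type):
    """Build (but do not send) a Schema API request that adds one field."""
    payload = {"add-field": {"name": name, "type": field_type, "stored": True}}
    return urllib.request.Request(
        url=f"{base_url}/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "issue" and "string" are hypothetical; substitute your own field and type.
req = build_add_field_request(
    "http://localhost:8983/solr", "cugna", "issue", "string"
)
# To actually send it against a running Solr: urllib.request.urlopen(req)
```

Declaring the field up front means its type no longer depends on which doc happens to arrive first.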

Best,
Erick

On Mon, May 16, 2016 at 11:36 PM, scott.chu <scott.chu@udngroup.com> wrote:
>
> I cleared the cugna collection data (by renaming the 'data' folder to 'xdata') and restarted SolrCloud.
> I added the XML doc that previously seemed to cause the error, and it succeeded. So I'm sure the doc data has no problem. Is
> it because the index file size is too large? If the ZK nodes fail while adding docs, could
> that cause this update error, since I do see some "can't find leader" errors in the Solr log?
>
> scott.chu,scott.chu@udngroup.com
> 2016/5/17 (Tue)
> ----- Original Message -----
> From: scott (self)
> To: solr-user
> CC:
> Date: 2016/5/17 (Tue) 14:29
> Subject: Updating error while add doc to Solrcloud
>
>
>
> I built a SolrCloud with 2 nodes, 1 shard, and 2 replicas. I added docs in XML format using post.jar,
> up to 2.85M+ docs and a 10 GB index size. When I add more docs, solr.log shows:
>
> --------------------------------------
>     2016-05-17 14:01:09,024 WARN (main) [ ] o.e.j.s.h.RequestLogHandler !RequestLog
>     2016-05-17 14:01:09,275 WARN (main) [ ] o.e.j.s.SecurityHandler ServletContext@o.e.j.w.WebAppContext@57fffcd7{/solr,file:/D:/portable_sw/solr-5.4.1/server/solr-webapp/webapp/,STARTING}{D:\portable_sw\solr-5.4.1\server/solr-webapp/webapp} has uncovered http methods for path: /
>     2016-05-17 14:01:09,346 WARN (main) [ ] o.a.s.c.CoreContainer Couldn't add files from D:\portable_sw\solr-5.4.1\mynodes\cloud\node1\lib to classpath: D:\portable_sw\solr-5.4.1\mynodes\cloud\node1\lib
>     2016-05-17 14:01:11,419 WARN (coreLoadExecutor-7-thread-1-processing-n:10.18.59.179:8983_solr) [c:cugna s:shard1 r:core_node2 x:cugna_shard1_replica2] o.a.s.u.UpdateLog Exception reverse reading log
> java.io.EOFException
> ...
> --------------------------------------
>
> Later I stopped everything and deleted write.lock (I usually did this in Solr 3 when adding docs failed)
> and added the docs again, but SolrCloud reported that it couldn't find write.lock. So I restored write.lock and ran
> post.jar again. The output shows:
>
> --------------------------------------
>     "SimplePostTool version 5.0.0
>     Posting files to [base] url http://localhost:8983/solr/cugna/update using content-type application/xml...
>     POSTing file NMLBOym_a_UN2000_10_20160511_1014.xml to [base]
>     SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/cugna/update
>     SimplePostTool: WARNING: Response: <?xml version="1.0" encoding="UTF-8"?>
>     <response>
>     <lst name="responseHeader"><int name="status">400</int><int name="QTime">3</int></lst><lst name="error"><str name="msg">Exception writing document id un_555917 to the index; possible analysis error.</str><int name="code">400</int></lst>
>       </response>
>     SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/cugna/update
>     1 files indexed.
>     COMMITting Solr index changes to http://localhost:8983/solr/cugna/update...
>     Time spent: 0:00:00.259"
> -------------------------------------
>
> I thought the problem was doc un_555917, so I commented it out in the XML and tried again,
> but the same error keeps showing for every single doc. I assume there's something wrong with SolrCloud.
> Has anyone experienced this before? What could be the problem? Can I recover from it, or do I have
> to add all the docs again?
>
>
> scott.chu,scott.chu@udngroup.com
> 2016/5/17 (Tue)
> ----- Original Message -----
> From: scott (self)
> To: solr-user
> CC:
> Date: 2016/5/17 (Tue) 09:39
> Subject: Re(2): [scottchu] Cab I migrate solrcloud by just copying whole package folder?
>
>
>
> OK! Thanks for the reminder. I'll stick to the convention.
>
> scott.chu,scott.chu@udngroup.com
> 2016/5/17 (Tue)
> ----- Original Message -----
> From: Chris Hostetter
> To: solr-user ; scott (self)
> CC:
> Date: 2016/5/17 (Tue) 02:43
> Subject: Re: [scottchu] Cab I migrate solrcloud by just copying whole package folder?
>
>
>
> : Message-Id: <7FD5FD02628B55831193271F9B39AEB5@udngroup.com>
> : Subject: [scottchu] Cab I migrate solrcloud by just copying whole package
> : folder?
> : References:
> : <DC53648862EE91109783B40F8B92A3F3@udngroup.com><C162E1B1F0311FC46A64E2E2AA
> : 729138@udngroup.com>,
> : <856da447-e7b7-49a4-af87-f161b0fe50ae@elyograg.org>
> : In-Reply-To: <856da447-e7b7-49a4-af87-f161b0fe50ae@elyograg.org>
>
> https://people.apache.org/~hossman/#threadhijack
> Thread Hijacking on Mailing Lists
>
> When starting a new discussion on a mailing list, please do not reply to
> an existing message, instead start a fresh email. Even if you change the
> subject line of your email, other mail headers still track which thread
> you replied to and your question is "hidden" in that thread and gets less
> attention. It makes following discussions in the mailing list archives
> particularly difficult.
>
>
>
>
> -Hoss
> http://www.lucidworks.com/
>
>
