lucene-solr-user mailing list archives

From Jerven Bolleman <jerven.bolle...@isb-sib.ch>
Subject Re: multilanguage prototype
Date Wed, 28 Jan 2009 16:08:14 GMT
Hi,

Your problem seems to be at a lower level than the Solr code. You are sending
an XML request that contains a character that is illegal under the XML spec
(here a control character, code 1). You should strip these characters out of
the data that you send. Or turn the XML validation off (not recommended
because of all kinds of risks).

See
http://www.w3.org/International/questions/qa-controls#handling
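
A rough sketch of the stripping step, in case it helps. It is in Python only
for illustration; the helper names strip_illegal_xml_chars and build_add_xml
are made up for this example, and the id/content fields are the two from the
schema quoted below. It removes everything outside the XML 1.0 Char range
before the update message is built, so adapt it to whatever client you
actually use.

```python
import re
from xml.sax.saxutils import escape

# XML 1.0 allows only: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
# | [#x10000-#x10FFFF]. Everything else (e.g. the CTRL-CHAR code 1 from the
# stack trace) is matched by this negated character class.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text: str) -> str:
    """Return text with all characters that XML 1.0 forbids removed."""
    return _ILLEGAL_XML_CHARS.sub("", text)


def build_add_xml(doc_id: str, content: str) -> str:
    """Build a Solr <add> message for one document (hypothetical helper).

    Values are stripped of illegal characters first, then escaped so that
    &, < and > cannot break the markup.
    """
    clean_id = escape(strip_illegal_xml_chars(doc_id))
    clean_content = escape(strip_illegal_xml_chars(content))
    return (
        "<add><doc>"
        f'<field name="id">{clean_id}</field>'
        f'<field name="content">{clean_content}</field>'
        "</doc></add>"
    )
```

Stripping happens before escaping because escaping alone does not help: a
character such as code 1 is not allowed in XML 1.0 even when written as a
character reference.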

Hope this helps,
Jerven

com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character
((CTRL-CHAR, code 1)) 

On Wed, 2009-01-28 at 12:27 +0530, revathy arun wrote:
> Hi,
> 
> I am getting this error in the tomcat log file on passing Chinese text to
> the content field.
> The content field uses the CJK tokenizer
> and is defined as
> 
> 
> <fieldType name="text_cjk" class="solr.TextField"
> positionIncrementGap="100">
> 
> <analyzer type="index">
> 
> <tokenizer class="solr.CJKTokenizerFactory"/>
> 
> </analyzer>
> 
> <analyzer type="query">
> 
> <tokenizer class="solr.CJKTokenizerFactory"/>
> 
> </analyzer>
> 
> </fieldType>
> 
> 
> 
> INFO: [] webapp=/lang_prototype path=/update params={} status=0 QTime=69
>
> Jan 28, 2009 12:17:03 PM org.apache.solr.common.SolrException log
> SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 1))
>  at [row,col {unknown-source}]: [2,76]
>     at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
>     at com.ctc.wstx.sr.BasicStreamReader.readTextPrimary(BasicStreamReader.java:4556)
>     at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2888)
>     at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1019)
>     at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:321)
>     at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:195)
>     at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:123)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
>     at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:874)
>     at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
>     at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
>     at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
>     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
>     at java.lang.Thread.run(Thread.java:619)
>
> regards
> 
> On 1/28/09, revathy arun <revas.34@gmail.com> wrote:
> >
> > Hi,
> >
> >
> > This is the only info in the tomcat log at indexing
> >
> > Jan 27, 2009 3:46:15 PM org.apache.solr.core.SolrCore execute
> > INFO: [] webapp=/lang_prototype path=/update params={} status=0 QTime=191
> > I don't see any other errors in the logs.
> >
> > When I use curl to update I get a success message.
> >
> > The commit count in the Solr admin page is positive, but no index files
> > are created in the index directory.
> >
> > regards
> > sujatha
> >
> >
> >  On 1/27/09, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> >>
> >> errors: 11
> >>
> >> What were those?
> >>
> >> My hunch is your indexer had issues.  What did Solr output into the
> >> console or log during indexing?
> >>
> >>        Erik
> >>
> >> On Jan 27, 2009, at 6:56 AM, revathy arun wrote:
> >>
> >> Hi Shalin,
> >>>
> >>> The admin page stats are as follows
> >>> searcherName : Searcher@1d4c3d5 main
> >>> caching : true
> >>> numDocs : 0
> >>> maxDoc : 0
> >>>
> >>> *name: * /update  *class: *
> >>> org.apache.solr.handler.XmlUpdateRequestHandler
> >>> *version: * $Revision: 690026 $  *description: * Add documents with XML
> >>>  *
> >>> stats: *handlerStart : 1232692774389
> >>> requests : 22
> >>> errors : 11
> >>> timeouts : 0
> >>> totalTime : 1181
> >>> avgTimePerRequest : 53.68182
> >>> avgRequestsPerSecond : 6.0431463E-5
> >>>
> >>> *stats: *commits : 9
> >>> autocommits : 0
> >>> optimizes : 2
> >>> docsPending : 0
> >>> adds : 0
> >>> deletesById : 0
> >>> deletesByQuery : 0
> >>> errors : 0
> >>> cumulative_adds : 0
> >>> cumulative_deletesById : 0
> >>> cumulative_deletesByQuery : 0
> >>> cumulative_errors : 0
> >>>
> >>> In solrconfig.xml I have commented out this section:
> >>>
> >>>
> >>> <!-- Used to specify an alternate directory to hold all index data
> >>>
> >>> other than the default ./data under the Solr home.
> >>>
> >>> If replication is in use, this should match the replication
> >>> configuration.
> >>>
> >>> <dataDir>${solr.data.dir:./solr/data}</dataDir>
> >>>
> >>> -->
> >>>
> >>> so the index will be created in the default data folder under solr home,
> >>>
> >>>
> >>>
> >>> Thanks for ur time
> >>>
> >>> regards
> >>>
> >>> sujatha
> >>> On 1/27/09, Shalin Shekhar Mangar <shalinmangar@gmail.com> wrote:
> >>>
> >>>>
> >>>> Are you looking for it in the right place? It is very unlikely that a
> >>>> commit happens and the index is not created.
> >>>>
> >>>> The index is usually created inside the data directory as configured in
> >>>> your solrconfig.xml
> >>>>
> >>>> Can you search for *:* from the solr admin page and see if documents are
> >>>> returned?
> >>>>
> >>>> On Tue, Jan 27, 2009 at 5:01 PM, revathy arun <revas.34@gmail.com>
> >>>> wrote:
> >>>>
> >>>> this is the stats of my updatehandler
> >>>>> but i still dont see any index created
> >>>>> *stats: *commits : 7
> >>>>> autocommits : 0
> >>>>> optimizes : 2
> >>>>> docsPending : 0
> >>>>> adds : 0
> >>>>> deletesById : 0
> >>>>> deletesByQuery : 0
> >>>>> errors : 0
> >>>>> cumulative_adds : 0
> >>>>> cumulative_deletesById : 0
> >>>>> cumulative_deletesByQuery : 0
> >>>>> cumulative_errors : 0
> >>>>>
> >>>>> regards
> >>>>>
> >>>>> On 1/27/09, revathy arun <revas.34@gmail.com> wrote:
> >>>>>
> >>>>>>
> >>>>>> Hi
> >>>>>>
> >>>>>> I have committed. The admin page does not show any docs pending or
> >>>>>> committed or any errors.
> >>>>>>
> >>>>>> Regards
> >>>>>> Sujatha
> >>>>>>
> >>>>>>
> >>>>>> On 1/27/09, Shalin Shekhar Mangar <shalinmangar@gmail.com> wrote:
> >>>>>>
> >>>>>>>
> >>>>>>> Did you commit after the updates?
> >>>>>>>
> >>>>>>> 2009/1/27 revathy arun <revas.34@gmail.com>
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I have downloaded solr1.3.0.
> >>>>>>>>
> >>>>>>>> I need to index chinese content; for this I have defined a new field
> >>>>>>>> in the schema as
> >>>>>>>>
> >>>>>>>> <fieldType name="text_cjk" class="solr.TextField"
> >>>>>>>> positionIncrementGap="100">
> >>>>>>>>
> >>>>>>>> <analyzer type="index">
> >>>>>>>>
> >>>>>>>> <tokenizer class="solr.CJKTokenizerFactory"/>
> >>>>>>>>
> >>>>>>>> </analyzer>
> >>>>>>>>
> >>>>>>>> <analyzer type="query">
> >>>>>>>>
> >>>>>>>> <tokenizer class="solr.CJKTokenizerFactory"/>
> >>>>>>>>
> >>>>>>>> </analyzer>
> >>>>>>>>
> >>>>>>>> </fieldType>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I believe solr1.3 already has the cjkanalyzer by default.
> >>>>>>>>
> >>>>>>>> my schema in the testing stage has only 2 fields
> >>>>>>>>
> >>>>>>>> <field name="id" type="string" indexed="true" stored="true" required="true" />
> >>>>>>>>
> >>>>>>>> <field name="content" type="text_cjk" indexed="true" stored="false" />
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> However, when I index the chinese text into content, no index is being
> >>>>>>>> created. I don't see any errors in tomcat either.
> >>>>>>>>
> >>>>>>>> This is the only entry in tomcat on updating:
> >>>>>>>>
> >>>>>>>> Jan 27, 2009 3:46:15 PM org.apache.solr.core.SolrCore execute
> >>>>>>>> INFO: [] webapp=/lang_prototype path=/update params={} status=0 QTime=191
> >>>>>>>>
> >>>>>>>> I have attached the chinese text file for reference.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>>
> >>>>>>>> sujatha
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Regards,
> >>>>>>> Shalin Shekhar Mangar.
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Regards,
> >>>> Shalin Shekhar Mangar.
> >>>>
> >>>>
> >>
> >

