lucene-solr-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: [VOTE] Release Solr 1.3.0
Date Sat, 13 Sep 2008 20:01:46 GMT

: : Now here is another really weird thing... post any garbage to the
: : update URL, and you still get a success!  It successfully fails on

Skimming the code in XmlUpdateRequestHandler, and testing out various 
inputs, this seems like a bug in 
com.caucho.xml.stream.XMLStreamReaderImpl.

Using curl as yonik described...

curl -i http://localhost:8080/solr/update --data-binary 'crap' -H 'Content-type:text/xml; charset=utf-8'

...resin-3.1.6 (on Linux) incorrectly returns a success, but the request
handler doesn't log any action taken. If we alter the payload ('crap')
above, we can see some different behaviors...

1) 'crap<add><doc><field name="id">hoss</field></doc></add>'

Solr adds the doc, ignorant of the crap before the add command

2) 'crap<add><doc></doc></add>'

Solr correctly complains about the missing id field (example 
configs require it)

3) 'crap<add>'

Solr returns success even though it's not legal XML

4) 'crap<add'

Get the following exception...

javax.xml.stream.XMLStreamException: :1:7 Expected > at 0xffffffff
        at com.caucho.xml.stream.XMLStreamReaderImpl.error(XMLStreamReaderImpl.java:1268)
        at com.caucho.xml.stream.XMLStreamReaderImpl.readElementBegin(XMLStreamReaderImpl.java:689)
        at com.caucho.xml.stream.XMLStreamReaderImpl.readNext(XMLStreamReaderImpl.java:653)
        at com.caucho.xml.stream.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
        at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)

5) '<add><doc>'

This appears to hang ... the connection seems to be left open as if it's 
waiting for more data.

...
None of these 5 things happen when testing with Jetty.

I'm not really very familiar with this StAX stuff -- but I suspect what's
happening here is that on "wacky" input Caucho's
XMLStreamReaderImpl.next() is returning values we're not expecting instead
of throwing exceptions ... and depending on the input, this is either
causing the XmlUpdateRequestHandler.processUpdate loop/switch to ignore
the garbage data, or get stuck in an infinite loop (when there is no
END_DOCUMENT).
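
To illustrate the suspicion above, here is a minimal sketch of a StAX loop
that neither skips stray text nor spins when END_DOCUMENT never arrives.
StrictUpdateCheck and check() are hypothetical names made up for this
example; this is not the actual XmlUpdateRequestHandler code. It only uses
standard javax.xml.stream calls, and it assumes the parser's hasNext()
eventually returns false.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, for illustration only (not Solr code).
class StrictUpdateCheck {

    /** Returns null if the payload looks like one well-formed document, else a short reason. */
    static String check(String payload) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(payload));
            int depth = 0;            // number of currently open elements
            boolean sawRoot = false;  // have we seen a root element yet?
            boolean sawEnd = false;   // did the parser report END_DOCUMENT?

            // Bound the loop with hasNext() rather than waiting for END_DOCUMENT.
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if (depth == 0 && sawRoot) return "second root element";
                        sawRoot = true;
                        depth++;
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        depth--;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // Text such as 'crap' before <add> must not be silently ignored.
                        if (depth == 0 && !reader.isWhiteSpace()) return "text outside root element";
                        break;
                    case XMLStreamConstants.CDATA:
                        if (depth == 0) return "CDATA outside root element";
                        break;
                    case XMLStreamConstants.END_DOCUMENT:
                        sawEnd = true;
                        break;
                    default:
                        break; // comments, processing instructions, etc.
                }
            }
            if (!sawRoot) return "no root element";
            if (depth != 0) return "unclosed element";
            if (!sawEnd) return "no END_DOCUMENT";
            return null;
        } catch (XMLStreamException e) {
            // A parser that rejects malformed input ends up here.
            return "parser error: " + e.getMessage();
        }
    }
}
```

With a parser that throws on malformed input, the bad payloads above should
all come back as a parser error; the extra checks only matter if the parser
hands back events instead of throwing.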

The question is: Are we doing the right thing, and 
com.caucho.xml.stream.XMLStreamReaderImpl is broken; or is 
XMLStreamReaderImpl producing a legal sequence of parse events for those 
bad inputs and we're not dealing with it properly?

FWIW: adding the following line to our web.xml seems to make everything
"work" (by which I mean: "fail") as expected...

  <system-property javax.xml.stream.XMLInputFactory="com.ctc.wstx.stax.WstxInputFactory"/>

...do we want to commit this?

(It wouldn't be the first time we've had to put in settings to force Resin 
to use the XML Library we want because something doesn't work with 
theirs.)
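
When checking whether a setting like this takes effect, it may help to log
which StAX implementation the container actually hands out. A minimal
sketch, assuming a hypothetical StaxImplProbe class (not part of Solr); it
relies only on XMLInputFactory.newInstance(), which consults the
javax.xml.stream.XMLInputFactory system property as part of its lookup.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical diagnostic helper, for illustration only.
class StaxImplProbe {

    /** Describes the configured property and the factory/reader classes actually in use. */
    static String describe() {
        // null here means the property is not set and the default lookup applies.
        String configured = System.getProperty("javax.xml.stream.XMLInputFactory");
        XMLInputFactory factory = XMLInputFactory.newInstance();
        String readerName;
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader("<add/>"));
            readerName = reader.getClass().getName();
            reader.close();
        } catch (XMLStreamException e) {
            readerName = "<failed: " + e.getMessage() + ">";
        }
        return "property=" + configured
                + " factory=" + factory.getClass().getName()
                + " reader=" + readerName;
    }
}
```

Logging StaxImplProbe.describe() once at startup would show whether the
reader class is still the com.caucho one or the one named in the property.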


-Hoss

