lucene-dev mailing list archives

From Jan Høydahl (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SOLR-772) malformed XML updates w/Resin's Stax parser doesn't trigger errors
Date Thu, 28 Feb 2013 13:25:12 GMT

    [ https://issues.apache.org/jira/browse/SOLR-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589530#comment-13589530 ]

Jan Høydahl commented on SOLR-772:
----------------------------------

Is anyone running Solr in Resin who can do a quick test and (un)confirm this ancient bug by
running one of the curl commands from the issue description (quoted below)?
                
> malformed XML updates w/Resin's Stax parser doesn't trigger errors
> ------------------------------------------------------------------
>
>                 Key: SOLR-772
>                 URL: https://issues.apache.org/jira/browse/SOLR-772
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Hoss Man
>
> Originally noted by yonik on the mailing list...
> {quote}
> Then I tried Resin 3.1.1 and 3.1.6....
> Things *seem* to mostly work... until you get to updating:
>    ...
> Now here is another really weird thing... post any garbage to the
> update URL, and you still get a success!  It successfully fails on
> jetty.  Mangled query requests correctly fail.  This perhaps initially
> points to something specific to the XML config in jetty?
> {quote}
> Followup from Hoss...
> {quote}
> Skimming the code in XmlUpdateRequestHandler, and testing out various inputs, this seems
> like a bug in com.caucho.xml.stream.XMLStreamReaderImpl.
> Using curl as yonik described...
> curl -i http://localhost:8080/solr/update --data-binary 'crap' -H 'Content-type:text/xml; charset=utf-8'
> ...resin-3.1.6 (on Linux) returns a success (incorrectly) but the request
> handler doesn't log any action taken. If we alter the payload ('crap')
> above we can see some different behaviors...
> 1) 'crap<add><doc><field name="id">hoss</field></doc></add>'
> Solr adds the doc, ignorant of the crap before the add command
> 2) 'crap<add><doc></doc></add>'
> Solr correctly complains about the missing id field (example configs require it)
> 3) 'crap<add>'
> Solr returns success even though it's not legal XML
> 4) 'crap<add'
> Get the following exception...
> {noformat}
> javax.xml.stream.XMLStreamException: :1:7 Expected > at 0xffffffff
>         at com.caucho.xml.stream.XMLStreamReaderImpl.error(XMLStreamReaderImpl.java:1268)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.readElementBegin(XMLStreamReaderImpl.java:689)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.readNext(XMLStreamReaderImpl.java:653)
>         at com.caucho.xml.stream.XMLStreamReaderImpl.next(XMLStreamReaderImpl.java:594)
>         at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)
> {noformat}
> 5) '<add><doc>'
> This appears to hang ... the connection seems to be left open as if it's waiting for
> more data.
> ...
> None of these 5 things happen when testing with Jetty.
> I'm not really very familiar with this StAX stuff -- but I suspect what's happening here
> is that on "wacky" input Caucho's XMLStreamReaderImpl.next() is returning values we're not
> expecting instead of throwing exceptions ... and depending on the input, this is either
> causing the XmlUpdateRequestHandler.processUpdate loop/switch to ignore the garbage data,
> or get stuck in an infinite loop (when there is no END_DOCUMENT).
> The question is: Are we doing the right thing, and com.caucho.xml.stream.XMLStreamReaderImpl
> is broken; or is XMLStreamReaderImpl producing a legal sequence of parse events for those
> bad inputs and we're not dealing with it properly?
> FWIW: adding the following line to our web.xml seems to make everything "work" (by which
> I mean: "fail") as expected...
> <system-property javax.xml.stream.XMLInputFactory="com.ctc.wstx.stax.WstxInputFactory"/>
> ...do we want to commit this?
> (It wouldn't be the first time we've had to put in settings to force Resin to use the
> XML Library we want because something doesn't work with theirs.)
> {quote}
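
The open question in the quoted follow-up (whether the update loop should cope with a parser that returns odd events instead of throwing) can be illustrated with a defensive StAX loop. What follows is only a minimal sketch, assuming whatever javax.xml.stream implementation the JVM provides by default; `StaxStrictCheck` and `isCompleteDocument` are hypothetical names for illustration and are not taken from XmlUpdateRequestHandler. The loop rejects input when the parser throws, when non-whitespace text appears outside the root element, or when the events stop before a balanced END_DOCUMENT, so it does not depend on the parser alone to flag garbage.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; this is not Solr code.
class StaxStrictCheck {

    /** Returns true only if the input yields one complete, balanced XML document. */
    static boolean isCompleteDocument(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int depth = 0;                  // number of currently open elements
            boolean sawRoot = false;        // at least one START_ELEMENT was seen
            boolean sawEndDocument = false; // parser reported END_DOCUMENT
            try {
                // Stop at END_DOCUMENT explicitly instead of trusting hasNext() alone.
                while (!sawEndDocument && reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            depth++;
                            sawRoot = true;
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            depth--;
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            // Text outside the root element (e.g. a leading 'crap') is rejected
                            // even if the parser itself lets it through.
                            if (depth == 0 && !reader.isWhiteSpace()) {
                                return false;
                            }
                            break;
                        case XMLStreamConstants.END_DOCUMENT:
                            sawEndDocument = true;
                            break;
                        default:
                            break; // comments, processing instructions, etc. are ignored
                    }
                }
            } finally {
                reader.close();
            }
            // Running out of events without a balanced END_DOCUMENT counts as a failure.
            return sawRoot && sawEndDocument && depth == 0;
        } catch (XMLStreamException e) {
            // A conforming parser is expected to end up here for malformed input.
            return false;
        }
    }

    public static void main(String[] args) {
        // Payloads taken from the issue description.
        String[] payloads = {
            "crap",
            "crap<add><doc><field name=\"id\">hoss</field></doc></add>",
            "crap<add>",
            "crap<add",
            "<add><doc>",
            "<add><doc><field name=\"id\">hoss</field></doc></add>"
        };
        for (String payload : payloads) {
            System.out.println(isCompleteDocument(payload) + "  " + payload);
        }
    }
}
```

A check like this would not cure the hang reported for '<add><doc>' if the parser blocks while waiting for more bytes on the connection, so it complements forcing a known-good XMLInputFactory and does not replace it.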

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

