lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] [Updated] (SOLR-4283) Improve URL decoding (followup of SOLR-4265)
Date Tue, 08 Jan 2013 00:30:14 GMT


Uwe Schindler updated SOLR-4283:

    Attachment: SOLR-4283.patch
> Improve URL decoding (followup of SOLR-4265)
> --------------------------------------------
>                 Key: SOLR-4283
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 4.1, 5.0
>         Attachments: SOLR-4283.patch
> Followup of SOLR-4265:
> SOLR-4265 has two problems:
> - It reads the whole InputStream into a String, which can be big. This wastes memory,
> especially when the query string from the POSTed form data is near the 2 megabyte limit.
> The String is then split and packed into a big Map.
> - It does not report corrupt UTF-8.
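To illustrate the second problem, here is a minimal sketch of how a lenient decode hides corrupt UTF-8. It is not taken from the SOLR-4265 code; the class and method names are hypothetical, and it only shows the standard behavior of `new String(byte[], Charset)`, which replaces malformed input with U+FFFD instead of raising an error.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Solr.
final class LenientDecodeDemo {

    /** Decodes bytes leniently: malformed input becomes U+FFFD and no exception is thrown. */
    static String lenientUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xC3 opens a two-byte UTF-8 sequence, but '(' (0x28) is not a valid continuation byte.
        byte[] corrupt = { 'q', '=', (byte) 0xC3, 0x28 };
        String decoded = lenientUtf8(corrupt);
        // The corruption is silently turned into a replacement character.
        System.out.println(decoded.indexOf('\uFFFD') >= 0);
    }
}
```

A query decoded this way still runs, it just searches for a term containing U+FFFD, which is one plausible way to end up with an unexplained empty result.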
> The attached patch does two things:
> - The POSTed form data is decoded directly on the ServletInputStream, parsing
> bytes (not chars). Key/value pairs are extracted and %-decoded to byte[] on the fly.
> URL parameters from getQueryString() are parsed with the same code, using a
> ByteArrayInputStream over the original String encoded as UTF-8 (this is a hack, because
> the Servlet API does not give back the original bytes of the HTTP request). To conform
> to the standards, the query String should be interpreted as US-ASCII, but with the UTF-8
> approach, UTF-8 in the HTTP request that is not fully escaped survives.
> - The byte[] key/value pairs are converted to Strings using a CharsetDecoder.
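The two steps above can be sketched roughly as follows. This is not the code from SOLR-4283.patch: the class `FormDataSketch` and its methods `parse`, `strict`, `store`, and `hex` are hypothetical names, and the sketch assumes `application/x-www-form-urlencoded` input and leaves out size limits and other details the real patch may handle. It reads the stream byte by byte, %-decodes each key and value into a small byte buffer, and converts the buffers with a `CharsetDecoder` configured to report errors.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual Solr implementation.
final class FormDataSketch {

    /** Parses form-encoded bytes from a stream without building one big String first. */
    static Map<String, List<String>> parse(InputStream in) throws IOException {
        Map<String, List<String>> params = new LinkedHashMap<>();
        ByteArrayOutputStream key = new ByteArrayOutputStream();
        ByteArrayOutputStream value = new ByteArrayOutputStream();
        ByteArrayOutputStream current = key;
        int b;
        while ((b = in.read()) != -1) {
            switch (b) {
                case '&': // end of one key/value pair
                    store(params, key, value);
                    current = key;
                    break;
                case '=': // only the first '=' separates key from value
                    if (current == key) { current = value; } else { current.write(b); }
                    break;
                case '+': // form encoding uses '+' for a space
                    current.write(' ');
                    break;
                case '%': // two hex digits follow; decode them to one raw byte
                    current.write((hex(in.read()) << 4) | hex(in.read()));
                    break;
                default:
                    current.write(b);
            }
        }
        store(params, key, value); // the last pair has no trailing '&'
        return params;
    }

    /** Converts raw bytes to a String, failing on malformed UTF-8 instead of replacing it. */
    static String strict(byte[] raw) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(raw)).toString();
    }

    private static void store(Map<String, List<String>> params,
                              ByteArrayOutputStream key, ByteArrayOutputStream value)
            throws CharacterCodingException {
        if (key.size() > 0 || value.size() > 0) {
            params.computeIfAbsent(strict(key.toByteArray()), k -> new ArrayList<>())
                  .add(strict(value.toByteArray()));
        }
        key.reset();
        value.reset();
    }

    private static int hex(int c) throws IOException {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        throw new IOException("invalid %-escape in form data"); // also covers end of stream
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "q=caf%C3%A9&fq=a+b".getBytes(StandardCharsets.US_ASCII);
        System.out.println(parse(new ByteArrayInputStream(body)));
        try {
            // %C3 must be followed by a continuation byte; %28 is '(' and is not one.
            parse(new ByteArrayInputStream("q=%C3%28".getBytes(StandardCharsets.US_ASCII)));
        } catch (CharacterCodingException e) {
            System.out.println("rejected corrupt UTF-8: " + e);
        }
    }
}
```

Under the same assumptions, a query string could go through the same method by wrapping it as `new ByteArrayInputStream(queryString.getBytes(StandardCharsets.UTF_8))`, which mirrors the UTF-8 hack described above. Only one key and one value are buffered at a time, so peak memory is bounded by the largest single parameter plus the resulting Map rather than by the full request body.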
> This is memory efficient and reports incorrectly escaped form data, so people
> will no longer complain when searches return no results or similar.
