lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-4283) Improve URL decoding (followup of SOLR-4265)
Date Tue, 08 Jan 2013 00:22:13 GMT

     [ https://issues.apache.org/jira/browse/SOLR-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated SOLR-4283:
--------------------------------

    Attachment: SOLR-4283.patch
    
> Improve URL decoding (followup of SOLR-4265)
> --------------------------------------------
>
>                 Key: SOLR-4283
>                 URL: https://issues.apache.org/jira/browse/SOLR-4283
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 4.1, 5.0
>
>         Attachments: SOLR-4283.patch
>
>
> Followup of SOLR-4265:
> SOLR-4265 has 2 problems:
> - It reads the whole InputStream into a String, which can be big. This wastes memory, especially when the query string from the POSTed form data is near the 2 megabyte limit. The String is then split and stored in a big Map.
> - It does not report corrupt UTF-8.
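
To illustrate the second problem, here is a minimal sketch (not taken from SOLR-4265 or from the attached patch) of how lenient decoding can hide corrupt UTF-8 while a CharsetDecoder configured with CodingErrorAction.REPORT rejects it. The class name CorruptUtf8Demo and its methods are hypothetical and exist only for this example; it assumes Java 7 or later for StandardCharsets.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Solr.
class CorruptUtf8Demo {

  // Lenient decoding: malformed sequences are silently replaced with U+FFFD.
  static String lenient(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Strict decoding: malformed sequences raise a CharacterCodingException.
  static String strict(byte[] bytes) throws CharacterCodingException {
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    return decoder.decode(ByteBuffer.wrap(bytes)).toString();
  }

  public static void main(String[] args) {
    // 0xC3 starts a two-byte sequence, but 'b' is not a continuation byte.
    byte[] corrupt = {'a', (byte) 0xC3, 'b'};
    System.out.println("lenient result contains U+FFFD: "
        + lenient(corrupt).contains("\uFFFD"));
    try {
      strict(corrupt);
      System.out.println("strict decoder accepted the input");
    } catch (CharacterCodingException e) {
      System.out.println("strict decoder rejected the input: " + e);
    }
  }
}
```

With lenient decoding the query still runs, just with a replacement character in place of the broken bytes, which is one plausible way a search ends up matching nothing without any error being shown.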
> The attached patch does 2 things:
> - It decodes the POSTed form data directly on the ServletInputStream, parsing the bytes (not chars). Key/value pairs are extracted and %-decoded to byte[] on the fly. URL parameters from getQueryString() are parsed with the same code, using a ByteArrayInputStream over the original String encoded as UTF-8 (this is a hack, because the Servlet API does not give back the original bytes from the HTTP request). To conform to the standard, the query String should be interpreted as US-ASCII, but with the UTF-8 approach, UTF-8 in the HTTP request that is not fully escaped survives.
> - It converts the byte[] key/value pairs to Strings using CharsetDecoder.
> This is memory efficient and reports incorrectly escaped form data, so people will no longer complain when searches hit no results or similar.
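
The approach described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in SOLR-4283.patch: the class FormDataSketch and its methods are hypothetical, a plain InputStream stands in for the ServletInputStream, and details such as the handling of pairs without '=' or of empty pairs are assumptions made for the example. It assumes Java 8 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch class, not part of Solr.
class FormDataSketch {

  // Reads form-urlencoded data byte by byte; only the current key and value are buffered.
  static Map<String, List<String>> parse(InputStream in) throws IOException {
    Map<String, List<String>> params = new LinkedHashMap<>();
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    ByteArrayOutputStream key = new ByteArrayOutputStream();
    ByteArrayOutputStream value = new ByteArrayOutputStream();
    ByteArrayOutputStream current = key;
    int b;
    while ((b = in.read()) != -1) {
      switch (b) {
        case '&': // end of a key/value pair
          store(params, decoder, key, value);
          key.reset();
          value.reset();
          current = key;
          break;
        case '=': // the first '=' switches from key to value; later ones are data
          if (current == key) {
            current = value;
          } else {
            current.write(b);
          }
          break;
        case '+': // '+' encodes a space in form data
          current.write(' ');
          break;
        case '%': // %XX escape: two hex digits form one raw byte
          current.write((hexDigit(in.read()) << 4) | hexDigit(in.read()));
          break;
        default:
          current.write(b);
      }
    }
    store(params, decoder, key, value); // last pair has no trailing '&'
    return params;
  }

  private static int hexDigit(int b) {
    int digit = (b == -1) ? -1 : Character.digit((char) b, 16);
    if (digit < 0) {
      throw new IllegalArgumentException("invalid %-escape in form data");
    }
    return digit;
  }

  // Converts the raw bytes to Strings; corrupt UTF-8 raises CharacterCodingException.
  private static void store(Map<String, List<String>> params, CharsetDecoder decoder,
                            ByteArrayOutputStream key, ByteArrayOutputStream value)
      throws CharacterCodingException {
    if (key.size() == 0 && value.size() == 0) {
      return; // assumption: skip empty pairs such as "&&"
    }
    String k = decoder.decode(ByteBuffer.wrap(key.toByteArray())).toString();
    String v = decoder.decode(ByteBuffer.wrap(value.toByteArray())).toString();
    params.computeIfAbsent(k, unused -> new ArrayList<>()).add(v);
  }

  public static void main(String[] args) throws IOException {
    byte[] body = "q=caf%C3%A9+au+lait&fq=a%3Db&fq=c".getBytes(StandardCharsets.US_ASCII);
    System.out.println(parse(new ByteArrayInputStream(body)));
  }
}
```

For a query string obtained from getQueryString(), the same method could be fed a ByteArrayInputStream over the String's UTF-8 bytes, mirroring the hack described above. Because the %-decoding happens on bytes before any charset conversion, an escape such as %C3 followed by a non-continuation byte surfaces as a CharacterCodingException (an IOException subclass) instead of a silently altered parameter.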

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

