incubator-wink-dev mailing list archives

From "Baram, Eliezer" <>
Subject FW: Tolerance to malformed media types in Wink client
Date Mon, 13 Sep 2010 08:18:22 GMT
And here is the mail he tried to post:

---------- Forwarded message ----------
From: Steve Miller <<>>
Date: Mon, Sep 12, 2010 at 2:15 AM
Subject: Tolerance to malformed media types in Wink client

I created a crawler using the Apache Wink client, but I found that the Wink client does not
tolerate malformed media types, even when the malformed part is only a media type parameter.
Unfortunately, there are a lot of those on the internet.
When Wink receives such a media type, it throws an exception with the message: 'java.lang.IllegalArgumentException
... Verify that the format is like "type/subtype".'
I think it would be good if Wink could be more tolerant of such media types, especially since
they are common. It would surely ease my work :-)

Here are examples of the media types that cause the problem and their source. This is a sample;
the list of sites is longer, but the media type patterns repeat themselves.

URL:   (and all aol sites around the globe)
Media Type: text/html;;charset=utf-8

Media Type: text/html; charset: UTF-8

Media Type: text/html; charset=

Media Type: text/html; $str_charset; charset=ISO-8859-1

Media Type: text/html; UTF-8;charset=ISO-8859-1

Media Type: text/html; utf-8

Media Type: text/html; UTF-8;charset=UTF-8
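
One way a client could tolerate the values above is to clean the Content-Type
header before handing it to a strict parser. The following is only a minimal
sketch under stated assumptions, not Wink's behavior or API: the class name
LenientMediaType and the method sanitize are hypothetical, and the policy of
silently dropping any parameter that is not a non-empty name=value pair is an
assumption. It also does not handle quoted parameter values that contain ';'.

```java
public class LenientMediaType {

    // Hypothetical helper (not part of Wink): keeps "type/subtype" and only
    // those parameters that look like name=value with a non-empty name and value.
    public static String sanitize(String header) {
        String[] parts = header.split(";");
        StringBuilder result = new StringBuilder(parts[0].trim());
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            // Skip empty segments (";;"), bare tokens ("utf-8"),
            // colon-separated pairs ("charset: UTF-8"), and "charset=".
            if (eq <= 0 || eq == param.length() - 1) {
                continue;
            }
            String name = param.substring(0, eq).trim();
            String value = param.substring(eq + 1).trim();
            if (name.isEmpty() || value.isEmpty()) {
                continue;
            }
            result.append(';').append(name).append('=').append(value);
        }
        return result.toString();
    }
}
```

The cleaned string could then be passed to a strict parser such as
javax.ws.rs.core.MediaType.valueOf. Dropping the bad parameters rather than
guessing their meaning keeps the type and subtype usable without inventing a
charset the server never declared.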

