hc-dev mailing list archives

From "Roland Weber (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HTTPCLIENT-655) User-Agent string violates RFC
Date Wed, 06 Jun 2007 16:52:26 GMT

    [ https://issues.apache.org/jira/browse/HTTPCLIENT-655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501982 ]

Roland Weber commented on HTTPCLIENT-655:

Hi Odi,

a) I don't think we should make significant changes to the User-Agent header in the 3.1 code
base, like dropping Jakarta from it. People may have set up filter rules that rely on the
name. That is also the reason why I'm not sure about changing anything but the version indicator
at all. Since it's an RFC violation, we might change the space character to a dash. Btw, section
3.8 of RFC 2616 also mentions:
  "successive versions of the same product SHOULD only differ in the product-version
  portion of the product value"
What is the lesser evil here?

b) Dropping Jakarta for the 4.0 code base is fine. What I don't like are calls to System.getProperty()
to collect a user agent string, at least not in the default User-Agent interceptor. We can
have a selection of them of course. Like one that says Apache-HttpCore/J-4.0-a5 in core and
another one that says Apache-HttpClient/J-4.0-a1 in client. And another one that collects
values from system properties.
(I'd like to see the version number being updated by the build process, but I don't have the
time nor inclination to learn Maven2...)
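
A minimal sketch of the two flavours described in b), assuming nothing about the eventual interceptor API: one supplier returns a fixed string, the other collects values from system properties. The class and method names (UserAgents, fixed, withSystemProperties) are made up for illustration and are not part of HttpCore or HttpClient.

```java
import java.util.function.Supplier;

/** Illustration only: hypothetical names, not the HttpCore/HttpClient API. */
final class UserAgents {

    private UserAgents() {}

    /** Fixed value such as "Apache-HttpCore/J-4.0-a5"; no system property lookups. */
    static Supplier<String> fixed(String product, String version) {
        final String value = product + "/" + version;
        return () -> value;
    }

    /** Opt-in variant that appends OS and JVM details read from system properties. */
    static Supplier<String> withSystemProperties(String product, String version) {
        return () -> product + "/" + version
                + " (" + prop("os.name") + ";" + prop("os.arch") + ")"
                + " Java/" + prop("java.vm.version")
                + " (" + prop("java.vm.vendor") + ")";
    }

    private static String prop(String key) {
        // Falls back to "unknown" when the property is not set.
        return System.getProperty(key, "unknown");
    }

    public static void main(String[] args) {
        System.out.println(fixed("Apache-HttpCore", "J-4.0-a5").get());
        System.out.println(withSystemProperties("Apache-HttpClient", "J-4.0-a1").get());
    }
}
```

A default interceptor could take the fixed supplier, and applications that want the extra detail could plug in the other one explicitly.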

c) Your suggestion also generates space characters in "(Windows XP 5.1;x86)" and "(Sun Microsystems
Inc.)" ;-)

d) A request interceptor that checks a header for compliance is a _really_ good idea. I am
in favor of enabling such verification interceptors by default. People will never learn to
comply with specifications unless exceptions are thrown into their faces. Misbehaviour must
be punished, immediately and without mercy (Dubious API Dictator Roland ;-)
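
A rough sketch of the check such an interceptor might run, assuming the RFC 2616 grammar User-Agent = 1*( product | comment ) with product = token [ "/" product-version ]. The class name UserAgentValidator and its methods are hypothetical. The sketch only checks syntax, and it treats spaces inside a comment as ordinary comment text.

```java
/** Illustration only: hypothetical class, syntax check against RFC 2616 sections 2.2, 3.8, 14.43. */
final class UserAgentValidator {

    // RFC 2616 section 2.2 separators, including SP and HT.
    private static final String SEPARATORS = "()<>@,;:\\\"/[]?={} \t";

    private UserAgentValidator() {}

    /** Throws IllegalArgumentException unless value is 1*( product | comment ). */
    static void check(String value) {
        int pos = skipSpace(value, 0);
        int elements = 0;
        while (pos < value.length()) {
            pos = value.charAt(pos) == '(' ? comment(value, pos) : product(value, pos);
            elements++;
            pos = skipSpace(value, pos);
        }
        if (elements == 0) {
            throw new IllegalArgumentException("empty User-Agent");
        }
    }

    static boolean isValid(String value) {
        try {
            check(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // product = token [ "/" product-version ]; product-version = token
    private static int product(String s, int pos) {
        pos = token(s, pos);
        if (pos < s.length() && s.charAt(pos) == '/') {
            pos = token(s, pos + 1);
        }
        return pos;
    }

    private static int token(String s, int pos) {
        int start = pos;
        while (pos < s.length() && isTokenChar(s.charAt(pos))) {
            pos++;
        }
        if (pos == start) {
            throw new IllegalArgumentException("token expected at index " + start + ": " + s);
        }
        return pos;
    }

    // comment = "(" *( ctext | quoted-pair | comment ) ")"
    private static int comment(String s, int pos) {
        pos++; // skip the opening parenthesis
        while (pos < s.length()) {
            char c = s.charAt(pos);
            if (c == ')') {
                return pos + 1;
            } else if (c == '(') {
                pos = comment(s, pos); // nested comment
            } else if (c == '\\' && pos + 1 < s.length()) {
                pos += 2; // quoted-pair
            } else if ((c < 32 && c != '\t') || c == 127) {
                throw new IllegalArgumentException("control character in comment: " + s);
            } else {
                pos++;
            }
        }
        throw new IllegalArgumentException("unterminated comment: " + s);
    }

    private static boolean isTokenChar(char c) {
        return c > 31 && c < 127 && SEPARATORS.indexOf(c) < 0;
    }

    private static int skipSpace(String s, int pos) {
        while (pos < s.length() && (s.charAt(pos) == ' ' || s.charAt(pos) == '\t')) {
            pos++;
        }
        return pos;
    }
}
```

A purely syntactic check like this rejects the URL-style value from Odi's logs, but it lets "Jakarta Commons-HttpClient/3.1-rc1" through, because that string parses as two products. Catching that case would need a rule beyond the grammar, for example requiring a version on every product.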


> User-Agent string violates RFC
> ------------------------------
>                 Key: HTTPCLIENT-655
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-655
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient
>    Affects Versions: 3.1 RC1
>            Reporter: Ortwin Glück
>            Priority: Minor
> Our User-Agent says "Jakarta Commons-HttpClient/3.1-rc1". But space is a reserved character
> to separate individual *products* and comments according to RFC 2616, section 14.43. Jakarta
> is not a product. At the same time we may want to drop the Jakarta name altogether.
> We should change this to something more standard like:
> "Apache-HttpClient/3.1-rc1 ("+ System.getProperty("os.name") +";"+ System.getProperty("os.arch") +") "+
> "Java/"+ System.getProperty("java.vm.version") +" ("+ System.getProperty("java.vm.vendor") +")"
> which renders:
> "Apache-HttpClient/3.1-rc1 (Windows XP 5.1;x86) Java/1.5.0_08 (Sun Microsystems Inc.)"
> Sun's internal Http client uses something like "Java/1.5.0_08".
> I am completely ignoring the fact that real-world user agents use almost arbitrary strings.
> Some fine examples of misbehaviour from my private logs:
> "Jakmpqes dihurxf wfyiupsc" -- apparently somebody has to hide something...
> "Missigua Locator 1.9"
> "Poodle predictor 1.0"
> "shelob v1.0"
> "ISC Systems iRc Search 2.1"
> "ping.blogug.ch aggregator 1.0"
> "http://www.uni-koblenz.de/~flocke/robot-info.txt"  -- ...sigh
> I am very tempted to write a User-Agent string validator that prevents misuse of this
> field in HttpClient.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

To unsubscribe, e-mail: httpcomponents-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: httpcomponents-dev-help@jakarta.apache.org
