manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <>
Subject [jira] [Commented] (CONNECTORS-1155) Web connector should not be sending the port number in request header field Host
Date Tue, 03 Mar 2015 07:19:05 GMT


Karl Wright commented on CONNECTORS-1155:

There's another related ticket, which insists that the port number MUST be present if it isn't
the default: HTTPCLIENT-85.
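
To make the rule concrete, here is a minimal sketch of how a Host header value could be derived from a URL so that the port is omitted when it is absent or equal to the scheme's default, and kept otherwise. It is only an illustration: the class and method names (HostHeader, defaultPort, hostHeaderValue) are hypothetical and are not part of ManifoldCF or HttpClient, and the sketch assumes only http and https URLs that have a host component.

```java
import java.net.URI;

public class HostHeader {
    // Hypothetical helper: default port for the schemes this sketch handles,
    // or -1 when the scheme is not one of them.
    static int defaultPort(String scheme) {
        if ("http".equalsIgnoreCase(scheme)) return 80;
        if ("https".equalsIgnoreCase(scheme)) return 443;
        return -1;
    }

    // Hypothetical helper: builds the Host header value for a URI.
    // The port is left out when the URI has none or when it equals the
    // scheme's default; a non-default port is always included.
    static String hostHeaderValue(URI uri) {
        String host = uri.getHost();
        int port = uri.getPort(); // -1 when the URI carries no explicit port
        if (port == -1 || port == defaultPort(uri.getScheme())) {
            return host;
        }
        return host + ":" + port;
    }

    public static void main(String[] args) {
        // Default port given explicitly: omitted from the header value.
        System.out.println(hostHeaderValue(URI.create("http://example.com:80/index.html")));   // example.com
        // Non-default port: must be kept.
        System.out.println(hostHeaderValue(URI.create("http://example.com:8080/index.html"))); // example.com:8080
        // No port in the URL: nothing to add.
        System.out.println(hostHeaderValue(URI.create("https://example.com/")));               // example.com
    }
}
```

Under these assumptions, a server-side redirect rule that matches on the bare host name would still match for default-port requests, while requests to a non-default port would keep the port that both the RFC and HTTPCLIENT-85 ask for.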

> Web connector should not be sending the port number in request header field Host
> --------------------------------------------------------------------------------
>                 Key: CONNECTORS-1155
>                 URL:
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Web connector
>    Affects Versions: ManifoldCF 1.7.2
>            Reporter: Denis Beck
>            Assignee: Karl Wright
>             Fix For: ManifoldCF 1.8.2, ManifoldCF 2.0.2
> The web connector sends the port number in the request header field Host (e.g. Host: ...). This causes redirect rules for the host name to fail. The port number
should not be part of the Host header.
> On the other hand, RFC 2616 section 14.23 says
“The Host request-header field specifies the Internet host and port number of the resource
being requested [...]”.
> I encountered this issue while trying to crawl a customer’s website. The very first
call to the seed URL caused a redirect which contained a link to the original URL itself and
the job ended without fetching anything. The Simple History showed Status 301, that's it.
Maybe the web connector does not follow the link in the redirect correctly?
> The redirect couldn't be triggered otherwise: I tried a browser and cURL. ManifoldCF's
web connector was the only one sending the port number with the Host header and wasn't able
to crawl the website due to this behavior.
> This issue could be worked around by collaborating with the contractor who hosted the
customer's website. He added an exception for these requests. But in general, I think this
should be fixed, as such collaboration is not always possible.

This message was sent by Atlassian JIRA
