httpd-users mailing list archives

From Mark Brodis <mabro...@colorado4x4.net>
Subject Re: [users@httpd] Need help with reverse proxying and image loading
Date Mon, 27 Jan 2014 17:57:37 GMT
I am a n00b with Apache also but I'll take a stab at this.

What you want is actually two things: a fully functional (for at least
one website) forward HTTP proxy, and also a domain name change.  In my
opinion you will never get a functional webpage (at least not one as
complex and interconnected as a CNN site) with static mappings.  A static
mapping such as mysite.com being translated to cnn.com could work, but as
you pointed out, what about the rest of the items on the CNN page?  There
will be images from Facebook, Twitter, 4space, Yahoo, Google, and every one
of those could have 50 different hosts the images could come from.  The
hostnames that you pull content from will also vary throughout the day and
by region.
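
To make the static-mapping problem concrete, here is a rough, untested
sketch of what one extra hand-written mapping might look like.  It assumes
httpd 2.4 with mod_proxy, mod_proxy_http, mod_proxy_html, mod_xml2enc and
mod_headers loaded (on RHEL 6's httpd 2.2, mod_proxy_html is a separate
third-party module), and the /cdn-i2/ prefix is a name I made up purely for
illustration:

```apache
# Sketch only, not a tested configuration.
<VirtualHost *:80>
    ServerName testproxy.myproxy.com
    ProxyRequests Off

    # One static mapping per upstream host; more specific paths go first.
    # "/cdn-i2/" is a hypothetical local prefix for the image host.
    ProxyPass        /cdn-i2/ http://i2.cdn.turner.com/
    ProxyPassReverse /cdn-i2/ http://i2.cdn.turner.com/
    ProxyPass        /        http://www.cnn.com/
    ProxyPassReverse /        http://www.cnn.com/

    # Ask the upstream for uncompressed pages so the HTML filter can parse them.
    RequestHeader unset Accept-Encoding

    # Rewrite absolute links inside the returned HTML to point at the proxy.
    ProxyHTMLEnable On
    ProxyHTMLURLMap http://i2.cdn.turner.com/ /cdn-i2/
    ProxyHTMLURLMap http://www.cnn.com/       /
</VirtualHost>
```

Every additional host the page pulls from would need its own ProxyPass and
ProxyHTMLURLMap pair, and URLs that JavaScript builds at runtime would not
be caught at all, which is why I don't think this scales to a site like CNN.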

So, for that to work you are going to need a real outbound forward HTTP
proxy which your workstation/browser knows how to use (read up on forward
proxies versus reverse proxies; the same software can be used in very
different ways).  Using that method you could in theory still try to
change the domain names of the site, though I'm not exactly sure how you
would do that and I don't think it would work right.  Here's why: when a
browser requests an item from a server, it sends the hostname in the HTTP
Host header.  This usually seems redundant, as the CNN servers know they
are CNN, so why send "cnn" in the header?  It is because a server can serve
up different content based on the header value (look up virtual hosts; this
is not virtual machine stuff).  So while some web servers will serve up the
same content whether you request it by IP or by hostname, others will serve
up something different.  There is also the issue of SSL certificates.  The
SSL cert has to match, by name, the site that the browser is going to.
SSL certs are normally issued for hostnames rather than IP addresses, and
if you try to pass an SSL cert through a domain-name-changing proxy
service, the name the browser has for the site will not match the name
(the CN, common name, or subjectAltName value) in the cert itself, and
thus the browser will throw its arms up, complain, warn, etc.
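
To illustrate the virtual-host point, here is a minimal sketch of name-based
virtual hosting, assuming httpd 2.4 syntax.  The hostnames and paths are
placeholders I made up, not anything CNN actually runs:

```apache
# Two sites on the same IP address and port; httpd picks one by
# comparing the request's Host header against ServerName.
<VirtualHost *:80>
    ServerName www.example.com
    DocumentRoot "/var/www/example"
</VirtualHost>

<VirtualHost *:80>
    ServerName blog.example.com
    DocumentRoot "/var/www/blog"
</VirtualHost>
```

For what it's worth, when mod_proxy acts as a reverse proxy it sends the
hostname from the ProxyPass target as the Host header by default
(ProxyPreserveHost is Off unless you turn it on), so the backend should
still see the name it expects.  The links inside the returned pages and the
cert names are the harder part.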

A normal forward HTTP proxy can handle SSL sites correctly (the browser
tunnels through the proxy with the CONNECT method and still sees the real
site's cert), but that works only because no domain-name changing happens
in the process.
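
A minimal forward-proxy sketch, again untested, assuming httpd 2.4 with
mod_proxy, mod_proxy_http and mod_proxy_connect loaded.  Port 8080 and the
192.168.1.0/24 client range are placeholders I picked:

```apache
# Sketch only, not a tested configuration.
Listen 8080
<VirtualHost *:8080>
    # Act as a forward proxy for browsers explicitly configured to use it.
    ProxyRequests On
    ProxyVia On

    # Never leave a forward proxy open to the world: restrict who may use it.
    <Proxy "*">
        Require ip 192.168.1.0/24
    </Proxy>

    # HTTPS requests are tunnelled with CONNECT; allow only port 443.
    AllowCONNECT 443
</VirtualHost>
```

The browser is then pointed at this host and port in its proxy settings,
and every site keeps its own name and its own cert.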

So, I'm not sure that what you're trying to do will work for a site as
complex as CNN.  Could you do a domain-name change on a buddy's site with
very little interconnecting?  Sure, but it would still be a very statically
defined setup.

Good Luck...
-Mark


On Sun, Jan 26, 2014 at 8:08 PM, MM KP <sbctwc@gmail.com> wrote:

> Hello all
>
> I am new to Apache and the Apache mailing list, so PLEASE forgive me for my
> long message:
>
>
> I am trying to configure a nice reverse proxy using Apache. Basically this
> is what I want: I want to be able to browse to something like
> testproxy.myproxy.com and proxy to www.cnn.com. I want to be able to see
> images, and I want JavaScript and CSS and all that good stuff loaded as
> well. I already created a DNS record for testproxy.myproxy.com and this
> is the configuration I'm using for the virtual host:
>
>
> <VirtualHost [::]:80>
>    ServerName testproxy.myproxy.com
>    ProxyRequests off
>    ProxyPass / http://www.cnn.com/
>    ProxyPassReverse / http://www.cnn.com/
> </VirtualHost>
>
>
> Now when I restart the httpd service (by the way, I am using RHEL 6.5), I
> can browse to testproxy.myproxy.com but all that appears in the browser
> is text and links. No images are loaded, nor any CSS/JavaScript. What am I
> missing in my virtual host configuration that's preventing me from loading
> images? I've noticed that some of the images on cnn.com are hosted on a
> different site, such as:
>
> http://i2.cdn.turner.com/cnn/dam/assets/
>
> I'm guessing that since the images are hosted in the /cnn/dam/assets/
> folder on i2.cdn.turner.com, and the virtual host/reverse proxy is only
> set up to proxy pass to www.cnn.com, it is not loading images and
> scripts that are hosted on http://i2.cdn.turner.com/cnn/dam/assets/. I
> don't know if I am even close to being accurate with my assumptions. Apache
> is a very new thing to me.
>
>
> My question is: how do I go about configuring my virtual hosts properly so
> that every image and script that is on www.cnn.com will be URL-rewritten
> as testproxy.myproxy.com/ blah blah blah as opposed to
> i2.cdn.turner.com/etcetcetcetc? For example, one of the images on CNN's
> homepage is:
>
>
> http://i2.cdn.turner.com/cnn/dam/assets/140123154723-07-super-bowl-prep-bin-tease.jpg
>
> I want to be able to go to a browser, type in testproxy.myproxy.com in
> the address bar, proxy to www.cnn.com, and when I right-click on the
> image, I want the URL of the image to be something like
> http://testproxy.myproxy.com/images/super-bowl-prep-bin-tease.jpg.
> Basically I want all URLs to be rewritten as
> http://testproxy.myproxy.com/.......etc etc etc.
>
> All help is GREATLY appreciated because, well, I am totally lost here lol.
> I've done research on using mod_proxy_html and whatnot, but I'm still
> confused as to how I go about doing this in my situation.
>
> Please assist me!
>
>
> Thanks!!
>
> SBC
>
>
