httpd-users mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: [users@httpd] apache fails to show jpg and not find files
Date Wed, 08 Apr 2009 21:57:16 GMT
deh wrote:
> 
> awarnier wrote:
>> Hi.
>>
>> Maybe the very first thing you need to do, if you are going to use your 
>> email program to post to lists such as this one, is to turn off all 
>> these nice features like "view as html" and "send as html".  View and 
>> compose and send as plain text, or you are going to confuse yourself and 
>> others no end, especially if what you want to include is html in the 
>> first place.
>>
>> The second thing is to just decide once and for all how you download the 
>> pages from the original server, and then stick to one way for now.  In 
>> other words, you have downloaded the pages one way or another, and they 
>> are now as they are, and we will try to understand what is going on.  If 
>> you keep on changing the contents of the pages as we are trying to help, 
>> you are going to get everyone confused again, including yourself.
>>
>> Next, what you show in your log below and seem to consider as a problem 
>> (accesses to a directory instead of a file), is actually normal.  When 
>> the browser asks Apache for a document at "/a/b/c/d/index.html", Apache 
>> will explore this whole path, element by element. So it will first look 
>> for "/a", and if it doesn't even find that, it will log an error in the 
>> log for "/a" and not go any further.
>> Similarly, if it finds "/a" and then "/a/b", and then "/a/b/c", but then 
>> not "/a/b/c/d", it will log an error for that, and never even look for 
>> "/a/b/c/d/index.html".
>>
>> Next, Apache itself will do fine with links as long as you want, as long 
>> as it can actually find what the browser is telling it to find under the 
>> DocumentRoot. The user-id under which Apache is running also needs to be 
>> able at least to read all these directories and files.  So verify this, 
>> so that we are not chasing the wrong issue (we don't know which lines 
>> you are /not/ showing us from your logs).
>>
>> And finally, the important part is to figure out what the browser is 
>> actually asking for.
>> You can figure that from your Apache access logs.
>> Check first if the access log actually shows accesses to files that 
>> really exist on your disk, where the browser is asking for them.
>> Either the browser is asking for the wrong thing (due to incorrect links 
>> in the pages), or else the browser is asking for the right thing, but 
>> the asked-for document really isn't there.
>> What is it ?
>>
>> Suggestion :
>> - stop Apache
>> - delete all the logs
>> - start Apache
>> - in your browser, start with the very top document, then step by step 
>> check your access log, verifying that what you think the browser should 
>> be asking for at each browser click, is really what it is asking for.
>>
>> At the first discrepancy, stop and post the relevant access log lines.
>>
>>
>> deh wrote:
>>> Krist van Besien wrote:
>>>> On Wed, Apr 8, 2009 at 5:32 AM, deh <dhaselwood@verizon.net> wrote:
>>>>> I wanted to setup my web pages that are on verizon.net on my local
>>>>> network
>>>>> with a machine running Suse 11.1/apache2.  I downloaded the web pages
>>>>> with
>>>>> 'wget -a -k' into a directory and set the directory/root and directory
>>>>> for
>>>>> the apache .conf file to the directory holding 'index.html' in the
>>>>> downloaded directories and set the permissions.  When the server is
>>>>> accessed, the web page text presents, but there are only boxes for the
>>>>> .jpg,
>>>>> .jpeg files and the one case where there is a file to be downloaded it
>>>>> shows
>>>>> "Object not Found, Error 404".
>>>>>
>>>>> If I access 'index.html' from the browser (Konqueror) on the server
>>>>> machine
>>>>> everything is correct, so it looks like the paths to the files are
>>>>> being
>>>>> handled differently with apache than the browser.
>>>>>
>>>>> I'm new at this and need some direction as to where to look.
>>>> In your case it is probably the -k option to wget that is the problem.
>>>> This option tells wget to convert all hyperlinks so that they are
>>>> suitable for local viewing. This is why you can see your site in
>>>> konqueror.
>>>> In order to mirror your site locally just get all the html files using
>>>> a file transfer client, so that you get exactly the same as on your
>>>> server.
>>>>
>>>> If you are still experiencing 404 errors afterwards the way to start
>>>> solving these is to look in the error log. If you don't understand
>>>> what you see there, you can come back here and ask us :-)
>>>>
>>>> Krist
>>>>
>>>> -- 
>>>> krist.vanbesien@gmail.com
>>>> krist@vanbesien.org
>>>> Bremgarten b. Bern, Switzerland
>>>> --
>>>> A: It reverses the normal flow of conversation.
>>>> Q: What's wrong with top-posting?
>>>> A: Top-posting.
>>>> Q: What's the biggest scourge on plain text email discussions?
>>>>
>>>> ---------------------------------------------------------------------
>>>> The official User-To-User support forum of the Apache HTTP Server
>>>> Project.
>>>> See <URL:http://httpd.apache.org/userslist.html> for more info.
>>>> To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
>>>>    "   from the digest: users-digest-unsubscribe@httpd.apache.org
>>>> For additional commands, e-mail: users-help@httpd.apache.org
>>>>
>>>>
>>> Krist,
>>>
>>> Thanks for the response.  
>>>
>>> Dropping the '-k' option didn't fix the problem.  Here is a snip of
>>> error_log output as well as a look at the html file.  At the moment it
>>> looks
>>> like apache2 is truncating the path/file name.
>>> error_log
>>> Using webpage downloaded with wget -r
>>> [Wed Apr 08 12:32:23 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer:
>>> http://10.143.15.1:41574/
>>> [Wed Apr 08 12:32:25 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer:
>>> http://10.143.15.1:41574/
>>> [Wed Apr 08 12:32:25 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent,
>>> referer: http://10.143.15.1:41574/
>>> [Wed Apr 08 12:32:26 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/favicon.ico,
>>> referer:
>>> http://10.143.15.1:41574/
>>>
>>> error_log using webpage downloaded with wget -r -k
>>> [Wed Apr 08 13:25:24 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer:
>>> http://10.143.15.1:41574/
>>> [Wed Apr 08 13:25:31 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib, referer:
>>> http://10.143.15.1:41574/
>>> [Wed Apr 08 13:25:31 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent,
>>> referer: http://10.143.15.1:41574/
>>> [Wed Apr 08 13:25:39 2009] [error] [client 10.143.15.6] File does not
>>> exist:
>>> /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/sitebuildercontent,
>>> referer: http://10.143.15.1:41574/
>>>
>>> The problem is that these only have a partial path, and no file name.
>>>
>>> Line from file--
>>> ~/webpage/mysite.verizon.net/res7yvp2/w4dh22/index.html
>>> in the web page downloaded with--
>>> wget -r
>>> [I changed "<" to "#" since with "<" Preview Message didn't show the
>>> path/file]
>>> <td width="5">#img src="/imagelib/sitebuilder/layout/spacer.gif"
>>> width="1"
>>> height="1" alt=""><br></td>
>>>
>>> Same line as foregoing from web page
>>> downloaded with wget -r -k
>>> <td width="5">#img src="../../imagelib/sitebuilder/layout/spacer.gif"
>>> width="1" height="1" alt=""><br></td>
>>>
>>> The latter path/file is correct, as the imagelib is up two levels from
>>> the
>>> directory holding index.html
>>> For example, this is what it should be--
>>> /home/deh/webpage/mysite.verizon.net/imagelib/sitebuilder/layout/spacer.gif
>>>
>>> Conclusion:
>>> 1) The '-k' option in 'wget' stores the correct path with respect to the
>>> index.html (DirectoryRoot) path.
>>> 2) Neither works correctly with apache
>>> 3) In the error_log, the file name is missing and the path is incomplete
>>>
>>> Could it be that the path is simply too long?
>>>
>>> Don
>>>
>>>
>>
> 
> awarnier,
> 
> Here is another go at this--
> 
> Here is the 1st error from error_log--
> [Wed Apr 08 15:27:56 2009] [error] [client 10.143.15.10] File does not
> exist: /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib,
> referer: http://10.143.15.1:41574/
> 
> Here is the access_log--
> 10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET / HTTP/1.1" 304 - "-"
> "Opera/9.64 (X11; Linux i686; U; en) Presto/2.1.1"
> 10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET /index.html HTTP/1.1" 304
> - "http://10.143.15.1:41574/" "Opera/9.64 (X11; Linux i686; U; en)
> Presto/2.1.1"
> 10.143.15.10 - - [08/Apr/2009:15:27:56 -0400] "GET
> /imagelib/sitebuilder/layout/spacer.gif HTTP/1.1" 404 1145
> "http://10.143.15.1:41574/" "Opera/9.64 (X11; Linux i686; U; en)
> Presto/2.1.1"
> 
> Here is where the file that was not found resides--
> deh@PIII:~/webpage/mysite.verizon.net/imagelib/sitebuilder/layout> ls -l
> total 16
> -rwxr-xr-x 1 www users 43 2007-06-06 13:05 blank.gif
> -rwxr-xr-x 1 www users 67 2007-06-06 13:05 spacer.gif
> 
> Here is where index.html resides and <DirectoryRoot> is set--
> deh@PIII:~/webpage/mysite.verizon.net/res7yvp2/w4dh22> ls -l

Here is a first problem :
- I presume <DirectoryRoot> is a typo, and you really mean that you have, 
in your httpd.conf,
   DocumentRoot /home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22
right ?
- assuming yes, the above directory is thus the DocumentRoot of your 
server.  Any GET request URI is thus going to be interpreted by Apache 
as relative to that directory.
The request for the URI "/imagelib/sitebuilder/layout/spacer.gif" is thus 
going to be interpreted by Apache as a request for
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib/sitebuilder/layout/spacer.gif
(one line)
which is obviously not the place where the file resides.
So Apache responds 404, and rightly so.
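
To make that mapping concrete, here is a small Python sketch. It is not 
Apache code, only an illustration of "DocumentRoot + URI path"; the 
function name map_uri_to_path is made up for this example, and real 
Apache does a lot more (aliases, access checks, and so on).

```python
import posixpath

# Hypothetical helper, for illustration only: it mimics how the path part
# of a request URI is appended to the DocumentRoot.
def map_uri_to_path(document_root: str, uri_path: str) -> str:
    # Strip the leading "/" so that join() appends instead of replacing.
    return posixpath.join(document_root, uri_path.lstrip("/"))

# Values taken from the logs and directory listings in this thread.
DOCUMENT_ROOT = "/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22"
requested = "/imagelib/sitebuilder/layout/spacer.gif"
actual = "/home/deh/webpage/mysite.verizon.net/imagelib/sitebuilder/layout/spacer.gif"

looked_up = map_uri_to_path(DOCUMENT_ROOT, requested)
print(looked_up)
print(looked_up == actual)  # the two paths differ, hence the 404
```

The path that gets looked up sits under res7yvp2/w4dh22, while the real 
file is two levels higher.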


> total 80
> -rwxr-xr-x 1 www users 21461 2009-04-08 14:47 id2.html
> -rwxr-xr-x 1 www users 30840 2009-04-08 14:47 id3.html
> -rwxr-xr-x 1 www users 22716 2009-04-08 14:47 index.html
> 
> Here is the line from the html file--
> <td width="257" bgcolor="#FFFFFF" colspan="2" rowspan="3" valign="top">
> ../../imagelib/sitebuilder/layout/spacer.gif <br></td>
> [BTW, the html box is not checked, but if following line with the "<" ">"
> included in above, it shows up correctly in the message entry box, but in
> the Preview Message there is only a faint icon of a page.  Is there
> someplace else where html gets turned on besides the where the message is
> posted?]
> img src="../../imagelib/sitebuilder/layout/spacer.gif" alt=""
> 
> This path/file is correct when starting from the directory with index.html.
> The 'GET' in the access file would be correct if the "../../" was prepended
> to the path/file.
> 
That's where I think you have a slight misunderstanding.
The path
src="../../imagelib/sitebuilder/layout/spacer.gif"
is something that the /browser/ will interpret, but it will never send 
this as a URI to the server.

How does the browser "think" ?
Say it sends a first request to the server for "/", and it gets back a 
page.
Now for the browser, "/" is the "base location" where it got this 
current page which it is busy displaying.
Next, the browser, in that page that it is trying to display, finds a 
reference to another element, which it needs to display the page.  That 
is the image tag
<img src="../../imagelib/sitebuilder/layout/spacer.gif">
The browser is going to try, using the "base location" and the (in this 
case relative) address of the image, to build an absolute URI requesting 
this element.
But it knows that it cannot get any item that would be above the Root of 
the document tree, which is "/". So in this case it will just strip off 
the "../.." part of that relative URI, and send a request for
/imagelib/sitebuilder/layout/spacer.gif
which is what you see in your access log, and which the server is going 
to interpret as a request relative to your DocumentRoot, thus for
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/imagelib/sitebuilder/layout/spacer.gif
which does not exist, hence the 404.
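
You can reproduce what the browser does with a few lines of Python. This 
is only a sketch: it assumes Python 3.5 or later, where urljoin follows 
the RFC 3986 resolution rules, and it uses the base address from your 
logs. The variable names are mine.

```python
from urllib.parse import urljoin, urlsplit

# Base location: the page was obtained from a request for "/".
base = "http://10.143.15.1:41574/"
# Relative link as written by "wget -r -k" into index.html.
link = "../../imagelib/sitebuilder/layout/spacer.gif"

# Resolve the relative link against the base, as a browser would.
absolute = urljoin(base, link)
# Keep only the path part, which is what appears in the GET line.
request_path = urlsplit(absolute).path
print(request_path)
```

The ".." segments that would climb above "/" are simply dropped, so the 
resulting path is the same one that shows up in the access log.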

It would be different if, for example, the browser had obtained the 
current page from a request to
/dir1/dir2/dir3/index.html
Then the above would be its "base location", and if in that page it 
found a reference to
../../imagelib/sitebuilder/layout/spacer.gif, then it would do as follows :
- remove the last part of the base location (index.html), leaving 
"/dir1/dir2/dir3/"
- add to that the relative link found, giving
/dir1/dir2/dir3/../../imagelib/sitebuilder/layout/spacer.gif
- eliminate the embedded ..'s (browsers normally do this themselves, 
before sending the request), giving
/dir1/imagelib/sitebuilder/layout/spacer.gif
- request that URI from the server.
- the server would then interpret that URI relative to your DocumentRoot
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22
which would in the end give a path of
/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22/dir1/imagelib/sitebuilder/layout/spacer.gif
which hopefully would be correct.

Have I lost you yet ?

Now the beginning of a solution to your problem, if you want to avoid 
correcting your DocumentRoot (and maybe creating other issues), or 
modifying all the links in the pages.
Assuming that the directory
/home/deh/webpage/mysite.verizon.net/imagelib/
is the real base of all the image links in your pages, then add
the following to your configuration :

Alias /imagelib/ /home/deh/webpage/mysite.verizon.net/imagelib/
<Directory /home/deh/webpage/mysite.verizon.net/imagelib/>
   Order Allow,Deny
   Allow from all
</Directory>

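You need this Directory section, because that directory is in fact 
outside of your DocumentRoot hierarchy, and Apache will not normally 
allow you to get documents from there.
(The Order/Allow lines are the Apache 2.2 syntax.  On Apache 2.4 and 
later, the equivalent is "Require all granted".)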
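
As a rough picture of what the Alias changes, here is a Python sketch of 
the lookup order: alias prefixes first, DocumentRoot otherwise. It is a 
simplification and not how mod_alias is actually implemented; the 
function name translate and the ALIASES table are made up for this 
example.

```python
import posixpath

DOCUMENT_ROOT = "/home/deh/webpage/mysite.verizon.net/res7yvp2/w4dh22"
# URL prefix -> directory on disk, as in the Alias line above.
ALIASES = {
    "/imagelib/": "/home/deh/webpage/mysite.verizon.net/imagelib/",
}

# Hypothetical helper: translate a request path into a filesystem path.
def translate(uri_path: str) -> str:
    # An alias whose prefix matches takes over the mapping ...
    for prefix, target in ALIASES.items():
        if uri_path.startswith(prefix):
            return target + uri_path[len(prefix):]
    # ... anything else is looked up under the DocumentRoot.
    return posixpath.join(DOCUMENT_ROOT, uri_path.lstrip("/"))

print(translate("/imagelib/sitebuilder/layout/spacer.gif"))
print(translate("/index.html"))
```

With the alias in place, the image request lands in the directory where 
spacer.gif really is, while /index.html is still served from the 
DocumentRoot.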

Stop Apache, clear your logs, start Apache and try again.

The real solution would be to have your DocumentRoot set to
/home/deh/webpage/mysite.verizon.net/
and make sure your top index page is there.
Then you could remove the Alias and <Directory> section above.
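
If you want to convince yourself before touching the configuration, the 
following Python sketch checks both forms of the link against that 
DocumentRoot. It assumes Python 3.5 or later, and it assumes that the 
page is then requested as /res7yvp2/w4dh22/index.html; the variable and 
function names are mine.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

NEW_DOCUMENT_ROOT = "/home/deh/webpage/mysite.verizon.net"
# Assumed address of the page once the DocumentRoot has been moved up.
page_url = "http://10.143.15.1:41574/res7yvp2/w4dh22/index.html"

# Hypothetical helper: resolve a link as the browser would, then map the
# resulting request path under the new DocumentRoot.
def file_for_link(link: str) -> str:
    request_path = urlsplit(urljoin(page_url, link)).path
    return posixpath.join(NEW_DOCUMENT_ROOT, request_path.lstrip("/"))

# The link as stored by "wget -r -k" and as stored by "wget -r".
relative_link = "../../imagelib/sitebuilder/layout/spacer.gif"
absolute_link = "/imagelib/sitebuilder/layout/spacer.gif"

print(file_for_link(relative_link))
print(file_for_link(absolute_link))
```

Both forms of the link end up at the same file under 
mysite.verizon.net/imagelib, which is where your listing shows 
spacer.gif.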

