httpd-users mailing list archives

From Jonas Eckerman <jonas_li...@frukt.org>
Subject RE: [users@httpd] Serving remote files from Apache...
Date Fri, 02 Jan 2004 06:18:28 GMT
On Thu, 1 Jan 2004 11:42:29 -0800, bruce wrote:

First a short thought:

After reading your mail, I feel mod_proxy should be suitable. If you decide you don't want
to use HTTP or FTP when the "client" apaches fetch data from the central system, it might
still be a good idea to check out mod_proxy, as it is possible to write modules that make
mod_proxy support other protocols as well.

I also feel that it should be worthwhile to check out existing methods external to Apache
for updating the "client" apaches' configs. One such method could be to have the master system
push the config files to the clients with rsync.

>  1.1: How often does the configuration change?
>  The configuration for the Apache clients should be able to be
>  updated in a dynamic fashion. It is critical that the modified
>  Apache clients be able to access this information from the Central
>  system.

This doesn't answer the question of how often it will change.

If it may change more than once a minute, we can for example rule out everything that has
to do with cron jobs, since cron can't run a job more often than once a minute.


>  configured... Another way might be to have the config information
>  periodically downloaded to the client machine, and then have the
>  Apache client "load" it. This would require the config information
>  be in a different format than the straight text file...

Why? If the client machine fetches it with ftp, rsync, http or whatever with a very small
shell script called by cron, the file they fetch can be a standard apache httpd.conf. If
you use rsync or wget you should be able to check the file's timestamp and then have the shell
script tell apache to reload the config if the file has been changed since the last time it
was reloaded.
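
A rough sketch of that timestamp check, in Python rather than shell just to keep it short. The function names and the stamp file are made up for illustration, and I'm assuming "apachectl graceful" is how you want to make apache re-read its config:

```python
import os
import subprocess


def default_reload():
    # Assumption: apachectl is on PATH; "graceful" asks Apache to re-read its config.
    subprocess.run(["apachectl", "graceful"], check=True)


def reload_if_changed(config_path, stamp_path, reload=default_reload):
    """Reload Apache if config_path was modified since the last recorded reload.

    Returns True if a reload was triggered, False otherwise.
    """
    current = os.stat(config_path).st_mtime_ns
    try:
        with open(stamp_path) as f:
            last = int(f.read().strip())
    except (FileNotFoundError, ValueError):
        last = None  # no usable record yet, so treat the config as changed

    if last == current:
        return False

    reload()
    # Record the timestamp only after a successful reload, so a failed
    # reload is retried on the next run.
    with open(stamp_path, "w") as f:
        f.write(str(current))
    return True
```

Cron would run something like this right after the fetch step, with the path of the fetched httpd.conf and a stamp file somewhere only the script can write to.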

What do you see that makes this type of solution unfeasible?

> There would also be security considerations...

Of course. The whole project is full of them. :-)

But: Does the multitude of apaches have fixed IP-addresses? If so, simply restricting access
to the config based on IP should be a good start. If you use rsync, you can also use passwords.
Or you can tunnel through ssh (or use scp) and use certificates *and* passwords *and* IP-addresses.

>  For ease of use.. it seems the "easiest" approach would be to
>  "suck" the config information into the Apache client...

I think it would be far easier to just use one of the multitude of existing ways to fetch
a file regularly from one system to another, rather than inventing and implementing a way
for Apache to do this work.

>  1.3: How critical is it that *all* the Apaches *always* contain
>  the latest configuration data?
>  Each Apache client may in fact contain a different "config". But
>  it's important that they be updated with the latest as soon as
>  possible.

What does "as soon as possible" mean? No more than 10 seconds? No more than 10 minutes? An
hour?

> This of course will require some central app/process
>  that essentially monitors when "config" information is changed, as
>  well as the configs of each Apache client, and then which Apache
>  clients get what configs....

So push it from the server to the client using rsync. That ought to work.

And then the client can either have a cron job checking for changes, or if a gap of 1 minute
is unacceptable it can have a small shell script that checks for changes every 5 seconds or
so. The easiest way might be to simply use rsync to send two files, the apache config and
a flag file. The shell script can simply check for the existence of a flag file and delete
it after it has reloaded the config. Use timestamps in the flag files' names if changes can
sometimes happen very close to each other.
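
A minimal sketch of such a flag file watcher, again in Python. The "reload.<timestamp>.flag" naming and the function names are just assumptions for the example, and reload is whatever you use to make apache re-read its config:

```python
import glob
import os
import time


def process_flags(flag_dir, reload):
    """Reload once if any flag file exists, then delete the flags that were seen.

    Returns the number of flag files handled.
    """
    flags = sorted(glob.glob(os.path.join(flag_dir, "reload.*.flag")))
    if not flags:
        return 0
    reload()
    # Only delete the flags listed before the reload. A flag pushed while
    # the reload was running has a different name and is picked up next time.
    for path in flags:
        os.remove(path)
    return len(flags)


def poll(flag_dir, reload, interval=5.0):
    """Check for flag files every `interval` seconds, forever."""
    while True:
        process_flags(flag_dir, reload)
        time.sleep(interval)
```

This is why the timestamps in the names matter: with a single fixed flag name, a flag that arrives during a reload would be deleted together with the old one and that change would be missed.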

If you do decide to push the config from the server to the client, you could actually have
the config stored any way you want to. In one text file per client, in one main text file
and a number of smaller text files for the stuff that differs between clients, in a SQL database,
or whatever. The script that pushes the config to the clients can also create the actual file
to push from whatever the data is stored in.
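
To illustrate the "one main text file plus smaller per-client files" variant, here is a small sketch of the step that creates the files to push. The function name, the file naming and the dict of per-client snippets are all invented for the example; the data could just as well come from a SQL database:

```python
import os


def build_configs(base_text, client_overrides, out_dir):
    """Write one complete config file per client into out_dir.

    base_text is the shared part, client_overrides maps a client name to
    the directives that differ for that client. Returns {client: path}.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for client, extra in sorted(client_overrides.items()):
        text = (
            base_text.rstrip("\n")
            + "\n\n# Settings specific to " + client + "\n"
            + extra.rstrip("\n") + "\n"
        )
        path = os.path.join(out_dir, client + ".httpd.conf")
        with open(path, "w") as f:
            f.write(text)
        written[client] = path
    return written
```

The push script would then send each generated file to its client, for example with rsync.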

>  1.4.2: Will the OS installations for all the Apache's be identical?
>  We are considering Linux/Windows variant OSes for the Apache client
>  apps

I would absolutely recommend Linux rather than Windows for this, for a few main reasons:

1: There are a lot more applications made for this kind of stuff easily available for Linux
(rsync for example is easy to use on Linux, but on Windows it is just frustrating).
2: You don't have to do anything special to get a good command shell with good scripting support
on Linux, and interpreted languages suitable for automagic are easy to install (perl comes
to mind).
3: A stable Windows based Apache system requires Windows 2K/XP with fixpacks. Windows 2K/XP
requires a lot more computer than a Linux system without X.
4: It's easy to restrict access to files and directories on a Linux system. If no users can become
root, they can't read/change the cache or the config. While this is possible on Windows as
well, it's not as easy, and if the machine is used for ordinary apps as well there can be problems,
as some ordinary apps simply won't work correctly unless the user has administrator rights.

I think this kind of stuff is simply a lot easier to do on Linux (and other Unix-like systems)
than on Windows.

>  2.1: How much data are we talking about?
>  You had to ask...!!! For a given site, we believe that they will
>  average ~10-20 pages of text... We'd guess ~5-10K.. We don't
>  really know, but we do know we're not talking large quantities of
>  data for the sites...

Ok. So a cache on the "client" apaches can be used without requiring large hard disks. That's
good.

>  2.2: Is this dynamic data or static data? (The usefulness of
>  mod_cache depends on this).
>  The data in the sites/pages will be static...

This is good. It means you can use mod_proxy.

>  (at least initially) want to cache any page/site content on the
>  users PC anymore than we have to..

Ok. This might lead to scaling problems if the central servers can't keep up with all the
"client" apaches.

>  We have considered using temp files/directories as needed...

This is a comment I don't understand. Using for what? The only thing I can think of in this
setup that requires temp files is a cache.

>  We are also open to using/considering the use of caching methods,
>  provided we also incorporate some form of encryption for the
>  content.

That should be possible, depending on the OS. If you use a file system with support for user
based encryption and an OS that also supports it, you should be able to place the cache (and
config if desired) in an encrypted directory only accessible by the apache user.

If you decide to use mod_cache, there's another way to go as well.

Just as mod_proxy uses pluggable modules for transport, mod_cache uses pluggable modules for
storage. As far as I know there are currently two storage modules available for mod_cache,
one that stores on disk and one that stores in RAM. It should be possible to modify the disk
cache module and create a new module that lets mod_cache store data in signed encrypted form.

This still leaves the config data though.

>  2.4: What kind of hardware (CPU numbers and speed, disk size,
>  amount of RAM) will the Apache's be installed on?
>  This will range. For design issues, assume a minimal machine...
>  500MHz, 10GHd, 256Meg Ram.

Considering you won't execute dynamic web code or need big caches on the "client" apaches,
that should be enough for an ordinary apache with mod_proxy (and mod_cache if desired).

>  2.5: What kind of connections will the Apache's have to the remote
>  machines? The modified Apache clients will have to access the
>  Master/Central app/system through a secure process. Any
>  data/communication between the Central system and the client
>  Apache apps will have to go through a verification/validation
>  process...

And what does this mean? A "secure process"? Are you going to utilize IPC stuff (named pipes)
or?

Are we talking about IP networks? LAN, WAN, the Internet?

>  We're still not comfortable with this approach given that it would
>  require the Master App/System to be running an Apache Based app,
>  that will be required to have its own config file.

No, it will require the master system to run one of:

1: an HTTP server (does not have to be apache).
2: an FTP server.
3: something else for which you find or make a transport module that mod_proxy can use.

>  If you're interested in having a further discussion, or in perhaps
>  joining what we're thinking about doing, we would be interested in
>  talking with you.

I don't feel I have the time to actually join any projects, but I do have the time to check
this list every now and then. :-)

Regards
/Jonas

-- 
Jonas Eckerman, jonas_lists@frukt.org
http://www.fsdb.org/



