The Apache/Perl Integration Project
Introduction
Apache has been a growing project for many years. It amazes me
how powerful, flexible, and easy to configure Apache is for
a programmer. There is so much documentation about how to configure and do
things in Apache that commercial webservers just can't compare. Apache
makes webserving fun for programmers. There are a lot of things you can test
and tinker with.
Along with the growth come new ways of doing things, without getting rid
of the old ways. I had a problem with my computer at
gnujobs.com.
Basically, I needed to forward every request for 'http://www.tcu-inc.com/mark/articles'
to 'http://www.gnujobs.com/Articles'. I tried the Apache
Redirect directive, but it didn't work. So I had to figure out why it didn't work
and see if there was any way around it.
The Problem with Redirect
The problem was that 'Redirect' didn't work for me when I tried to forward
'http://www.tcu-inc.com/mark/articles' to 'http://www.gnujobs.com/Articles'
with both websites running on the same webserver. However, I found out
later, and it was obvious once I thought about it twice, that
I was using Redirect incorrectly. Nevertheless, it set me on a trip to
get reacquainted with mod_rewrite. The solutions I had were:
- Use a Perl script with Virtual Host
- Use mod_rewrite with Virtual Host
- Correctly use Redirect with Virtual Host
Using a Perl script with Virtual Host
This was my first solution after I couldn't get Redirect to work right.
The only reason I used it was that it was quick and dirty. It is
actually fairly complex, because you have to understand Perl programming,
what "Location" does, and how to get a Perl script to execute with your
Apache webserver. Even so, it was simple for me to do because I have
done similar things a million times over.
The nice thing about using a Perl script is that you don't have to
recompile the Apache server. You do have to change the configuration slightly.
You don't have to install mod_perl, but if you do, the configuration
can be slightly different if you want to cache the Perl script.
Also, this can be done in any programming language, not just Perl.
I had to change the httpd.conf file slightly:
<VirtualHost 206.21.120.103>
ServerAdmin info@gnujobs.com
ServerName www.tcu-inc.com
DocumentRoot /www/htdocs/
ScriptAlias /mark/articles "/www2/TCU.pl"
</VirtualHost>
The ScriptAlias is the key part here.
It hands every request under /mark/articles
to the TCU.pl Perl script.
The Perl script looked like this:
#!/usr/bin/perl
# The trailing blank line (\n\n) marks the end of the CGI headers.
print "Location: http://www.gnujobs.com$ENV{'REQUEST_URI'}\n\n";
$ENV{'REQUEST_URI'} is the key part here. It is an environment
variable that holds the path (and any query string) requested
on the www.tcu-inc.com webserver. The Perl script takes that path and
redirects the browser to the same path on the new website. Note that the
path still begins with /mark/articles; the script does not change it to /Articles.
Also, I did a "chmod 755 TCU.pl" on the perl script to make sure it was
executable.
Using the mod_rewrite module for Apache
Aaron Bush from COLUG is responsible for
getting me to do it in mod_rewrite. I didn't like my Perl solution, and I knew
I could use mod_rewrite instead, but I was being lazy.
Then Aaron asked me, "Why don't you use mod_rewrite?"
I was unhappy with my laziness, for I knew that issues like this
can be what separates the men from the boys, so I said to myself, "screw it,
I haven't used mod_rewrite in a long time, let me do it the right way".
You have to have mod_rewrite compiled into Apache (or loaded as a module) in order to
use this option. Mod_rewrite is a nice module that
"provides a rule-based rewriting engine to rewrite requested URLs on the fly".
It is a very powerful module, and is essential to learn if you want to become a true
webmaster or web programmer. It is not essential that you use it, but that
you know what it can do so that you can present the options to your boss
when you want to do weird things with your webserver.
I compiled mod_rewrite into Apache and applied this configuration option
to httpd.conf.
<VirtualHost 206.21.120.103>
ServerAdmin info@gnujobs.com
ServerName www.tcu-inc.com
DocumentRoot /www/htdocs/
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.tcu\-inc\.com$
RewriteRule ^(/mark/articles)(.*) http://www.gnujobs.com/Articles$2 [R]
</VirtualHost>
RewriteCond sets the condition under which the rule applies: here, only
when www.tcu-inc.com is the host name in the request. RewriteRule then
says that if the requested path starts with "/mark/articles", redirect it to
www.gnujobs.com. The "$2" corresponds to "(.*)", which is the rest
of the requested path after "/mark/articles". The [R] flag means "take this
match and send the browser a redirect" (a temporary 302 redirect by default).
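If the move is meant to be permanent, a variation on the rule might look like the sketch below. It assumes the old path is /mark/articles, as stated in the introduction, and it adds two standard mod_rewrite flags that are not in the configuration above: R=301 sends a permanent redirect instead of the default temporary one, and L stops any further rewrite rules from being applied once this one matches.

```apache
# Sketch only: a permanent-redirect variant of the rule, meant to sit inside
# the <VirtualHost> block for www.tcu-inc.com.
RewriteEngine on
# $1 is whatever follows /mark/articles in the requested path.
# R=301 sends "301 Moved Permanently"; L makes this the last rule applied.
RewriteRule ^/mark/articles(.*) http://www.gnujobs.com/Articles$1 [R=301,L]
```

With this rule, a request for /mark/articles/index.html (a made-up file name) would be answered with a redirect to http://www.gnujobs.com/Articles/index.html.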
Using Redirect with Virtual Host
After I figured out how to redirect www.tcu-inc.com to gnujobs.com with
the other two methods, I realized
I was making a mistake with the standard "Redirect". I wasn't putting the
Redirect command into the VirtualHost section. Once I did that, it worked fine.
<VirtualHost 206.21.120.103>
ServerAdmin info@gnujobs.com
ServerName www.tcu-inc.com
DocumentRoot /www/htdocs/
Redirect /mark/articles http://www.gnujobs.com/Articles
</VirtualHost>
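Redirect matches on a path prefix, so anything after /mark/articles in the request is appended to the target URL. By default it sends a temporary (302) redirect. If the old location is gone for good, the optional status argument can say so. The sketch below assumes that is what you want; otherwise it is the same line as above.

```apache
# Sketch only: the same redirect, marked permanent (HTTP 301).
# A request for /mark/articles/foo.html (a made-up file name) is sent to
# http://www.gnujobs.com/Articles/foo.html
Redirect permanent /mark/articles http://www.gnujobs.com/Articles
```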
Conclusion
I did a lot of work for nothing, but I made this nice article explaining different
options for other people to learn from. Here are my suggestions about which of these
3 methods you should use:
- If you don't foresee ever having programmers work on your website, use
the standard Redirect command.
- If you are a programmer or a true webmaster, then please use mod_rewrite.
If you standardize on using mod_rewrite, you might find other uses for it as well
which you might not realize unless you have had previous practice with mod_rewrite.
- I don't know why you would want to use a Perl script to do this, unless you
want to record certain stats (which are possible in other ways) or do something
else that is weird. Using a Perl script would just add more overhead. If you
used it with mod_perl, you might end up using more memory because of caching.
I wouldn't use a Perl script to do this, although that was my first working
solution. Bad solutions are good only in the fact that you at least have
something working.
Mark works as an independent consultant donating time to causes like
GNUJobs.com, writing articles, and writing
free software.
Copyright © 2000, Mark Nielsen.
Copying license http://www.linuxgazette.com/copying.html
Published in Issue 61 of Linux Gazette, January 2001