Web Servers and Web Technology
Part 2
Mission: To provide a web server for the client.
Criteria:
Successfully create a server that supports the following:
PHP scripting
SSL encryption
Description:
Create a server that supports PHP scripting and SSL encryption.
Apache
A server was successfully set up to accommodate the client. A Unix machine running Apache 2.0 was chosen. Apache is a free, open source web server. Its history dates back to the NCSA HTTP daemon (httpd), which was developed by Rob McCool at the National Center for Supercomputing Applications (NCSA) in the US. After he left NCSA, development of httpd stalled. Many webmasters got together and produced patches to improve it. This group effectively kept patching the software, and the software became known as Apache. Apache runs on roughly 33 million servers, or 67% of web servers (Netcraft: Web Server Survey Archives).
PHP
PHP stands for "PHP: Hypertext Preprocessor". PHP is also a free, open source software product. The purpose of running PHP is to create dynamic pages. It too was created many years ago and is written in C. PHP version 4 was successfully installed on this server. PHP is the most popular server-side scripting language (PHP programming language - Wikipedia, the free encyclopedia).
SSL
SSL stands for Secure Sockets Layer. SSL is used to transmit pages over the web in encrypted form. Many websites around the world use SSL, especially when dealing with information that should be accessible only to certain people. OpenSSL version 0.9.7d was successfully installed on the Apache server.
Three separate hosts must be created.
Three unsecured (non-encrypted) virtual hosts and one secured host were created on the server.
http://www2.wswt.net:58730 This is the main address.
http://student2.wswt.net:58730 This is the address set up for students to access.
http://teacher2.wswt.net:58730 This is the address set up for teachers to access.
https://teacher2.wswt.net:48730 This is the secured address set up for teachers to access tests and information.
Creation of directories within the 3 hosts.
The following directory structure was set up:
www.wswt.net (two subdirectories)
student.wswt.net (four subdirectories)
teacher.wswt.net (three subdirectories)
SSL was implemented on the teacher site. Using openssl, a certificate and key were created so that the site can be viewed securely. The certificate details are as follows:
Issued to: Neil Moreton
Issued by: Neil Moreton
Valid from 15/05/2004 to 15/05/2005 (A certificate can be valid for any length of time. For the prototype, 1 year was chosen.)
Version: V3
Serial Number: 00
Signature algorithm: md5RSA
Issuer:
(email address)
CN= Neil Moreton
OU= Web Servers Unit (Our department)
O=RMIT (Our organization)
L=Melbourne
S=Victoria
C=AU
Public Key (512 Bits), followed by the key data
The certificate can also be viewed by going to https://teacher2.wswt.net:48730 and clicking the "View Certificate" button when a popup comes up.
When implemented on a live server, the certificate will be signed by a Certificate Authority (CA). One of the largest and most widely trusted CAs is VeriSign. A CA is a third party that vouches for a site's identity, so that a browser can tell you whether to trust the site. For the prototype, the certificate has not been signed by VeriSign because of the cost.
The certificate and key are located in /students/nmoreton/apache2/certs
The certificate is named new.cert.crt. The key is named new.cert.key
Implement a username/password.
Basic authentication was implemented on the student and teacher virtual hosts. Basic authentication allows a person with the correct credentials (username & password) to access a site (or directories in a site).
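As a minimal sketch of what the browser sends once the username and password have been entered: with Basic authentication the credentials travel in an Authorization header, encoded with Base64 but not encrypted. The function names below are illustrative and are not part of the server setup.

```python
import base64
from typing import Tuple


def build_basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value a browser sends for Basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth_header(header_value: str) -> Tuple[str, str]:
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    header = build_basic_auth_header("ben", "cdef3456")
    print("Authorization:", header)
    # Anyone who sees the header can decode it again:
    print(parse_basic_auth_header(header))
```

Because the header can be decoded by anyone who intercepts it, Basic authentication is best combined with SSL, as on the secured teacher host.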
For student.wswt.net:
Only the teacher and student groups can access the forums, tests, and info directories.
When any of these directories is accessed, a popup asks you to enter your username and password.
For teacher.wswt.net:
Only the teacher group can access the tests and info directory.
For example: to enable authentication for the student forum directory, the following was done to the httpd.conf:
AuthType Basic
AuthName Students
AuthUserFile /students/nmoreton/apache2/users/students
AuthGroupFile /students/nmoreton/apache2/users/groups
require valid-user
A groups file was created using a plain text editor.
The "groups" file contains the following:
students: ben
teachers: anne
All users in these groups can view student2.wswt.net:58730/forums, student2.wswt.net:58730/tests, and student2.wswt.net:58730/info.
To test, use the username: ben password: cdef3456
The "teachersgroups" file contains the following:
teachers: dave anne
All users in this group can view the student.wswt.net site.
To test, use the username: dave password: rave
The groups file determines which group and user combinations are able to access the site.
The students file contains the usernames and passwords. Both of these files are in /students/nmoreton/apache2/users/, which is not a web-accessible folder.
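The group file format shown above (a group name, a colon, then a space-separated list of users) can be read with a few lines of code. The following is only a sketch of how membership could be checked. The helper names are hypothetical and this is not how Apache itself implements the lookup.

```python
from typing import Dict, Iterable, Set


def parse_group_file(text: str) -> Dict[str, Set[str]]:
    """Parse lines of the form 'group: user1 user2' into a dict of sets."""
    groups: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        name, separator, members = line.partition(":")
        if not separator or not name.strip():
            continue  # skip blank or malformed lines
        groups.setdefault(name.strip(), set()).update(members.split())
    return groups


def user_in_groups(groups: Dict[str, Set[str]], user: str,
                   allowed_groups: Iterable[str]) -> bool:
    """Return True if the user belongs to at least one allowed group."""
    return any(user in groups.get(group, set()) for group in allowed_groups)


if __name__ == "__main__":
    # Contents of the "groups" file described in the text.
    groups = parse_group_file("students: ben\nteachers: anne\n")
    print(user_in_groups(groups, "ben", ["students", "teachers"]))
    print(user_in_groups(groups, "dave", ["students", "teachers"]))
```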
Ensure access is by authorized persons.
For some sites, it is necessary to give access to everyone. For other sites, it might be necessary to restrict access to users from certain places. For this server, access to the student and teacher hosts was restricted to computers on the RMIT network. In other words, if I were at an internet cafe and tried to browse to student2.wswt.net:58730, I would get a "forbidden to access" error. If I were on a computer at RMIT and tried to access these sites, I would be able to reach the pages.
To restrict access to users from RMIT, the following was added to the httpd.conf and ssl.conf:
order deny,allow
allow from 131.170
deny from all
All RMIT hosts have IP addresses of the form 131.170.xxx.xxx. Thus, by allowing from 131.170 and denying everyone else, only computers at RMIT can access the sites.
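The effect of these three directives can be sketched in a few lines. The partial address 131.170 covers every address whose first two octets are 131.170, which is written here as the network 131.170.0.0/16. The function name and the sample RMIT address 131.170.40.5 are assumptions made for illustration. The second address is taken from the access log.

```python
import ipaddress

# "allow from 131.170" matches any address starting with 131.170.
ALLOWED_NETWORK = ipaddress.ip_network("131.170.0.0/16")


def is_allowed(client_ip: str) -> bool:
    """Model 'order deny,allow' with 'deny from all' and 'allow from 131.170'."""
    address = ipaddress.ip_address(client_ip)
    denied = True  # "deny from all" matches every client
    allowed = address in ALLOWED_NETWORK
    # With "order deny,allow" the deny rules are checked first and a
    # matching allow rule then overrides them.
    if allowed:
        return True
    return not denied


if __name__ == "__main__":
    print(is_allowed("131.170.40.5"))   # hypothetical RMIT machine
    print(is_allowed("203.25.172.97"))  # dialup client from the access log
```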
Log all requests/errors to servers.
Log files are a good way to record how many people are accessing your site and where they are coming from. Additionally, error logs tell you what problems the web server is having, which helps in diagnosing and fixing them. The following is part of the access log that was set up for the student2.wswt.net host.
blowfly2.cs.rmit.edu.au - - [17/May/2004:20:00:59 +1000] "GET /info.php HTTP/1.1" 200 38103 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
blowfly2.cs.rmit.edu.au - - [17/May/2004:20:00:59 +1000] "GET /info.php?=PHPE9568F35-D428-11d2-A769-00AA001ACF42 HTTP/1.1" 200 4440 "http://iweb2.cs.rmit.edu.au:58730/info.php" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
203.25.172.97 - - [17/May/2004:21:05:36 +1000] "GET / HTTP/1.1" 304 - "http://www.neilmoreton.com/blah.htm" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
138.194.235.233 - - [18/May/2004:14:09:09 +1000] "GET /_vti_inf.html HTTP/1.1" 404 333 "-" "Mozilla/2.0 (compatible; MS FrontPage 5.0)"
An explanation of the access logs:
- IP address or hostname of the remote client
- blowfly2.cs.rmit.edu.au - A client accessing the site from an RMIT computer. The Apache server determined the IP address and looked up the name of the host.
- 203.25.172.97 - Another client who accessed the site. A hostname could not be resolved, which happens quite often. This particular IP address belongs to someone accessing the site over a dialup connection. We know this because it was us.
- The remote logname (identd) field. It is not used on this server because the lookup is disabled, so it can be ignored.
- -
- Userid of the person who is requesting the server document
- -
- anne - Used when authentication is implemented. So for the teacher's test site, a log entry would show the name of the user in this field.
- The time and date that the server finished processing the client's request
- [17/May/2004:20:00:59 +1000] - Day/Month/Year:Hour:Minute:Second +Time zone. The time is determined by the clock on the server computer, not the client's.
- Request line that the client sends
- "GET /info.php HTTP/1.1" - GET is the HTTP method a client uses to ask the web server for a file. /info.php is the file name. HTTP/1.1 is the protocol version the client is using.
- Status code sent by the web server (Status Codes)
- 200 - Success: everything went well and the request was fulfilled.
- 304 - Not modified: the client sent a request for the page, but the copy in the client's cache is the same as the one on the server. In other words, the web server did not have to serve the page because the client already had the same copy on his or her computer. This helps to reduce the load on the server.
- 404 - Not found: the address or file that the client requested is not located on the server. This can be useful because it may show that another page on your site is pointing to a file that does not exist.
- Size of the file being served (bytes)
- 38103
- 4440
- - (a dash) - This appears in the entry with the 304 code. No page was served, so the size is logged as a dash, meaning 0 bytes.
- The referrer: the page that links to the page or file being accessed. A "-" means that no referrer was sent.
- The User-Agent: the type of browser, its compatibility, and the operating system
- "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" - A Mozilla-compatible browser, Microsoft Internet Explorer version 6.0, running on Windows NT 5.0 (Windows 2000). This is useful because if you are developing a site that uses dynamic HTML and most of your visitors use old browsers that are not compatible with it, the site becomes useless to them.
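The fields explained above can be pulled apart with a regular expression. The following is a sketch that assumes the combined log format used in the sample entries. The pattern and function names are illustrative, and the sample line is the first entry from the access log shown earlier.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# One named group per field described in the list above.
COMBINED_LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_access_log_line(line: str) -> Optional[Dict[str, Any]]:
    """Split one combined-format log line into its fields, or return None."""
    match = COMBINED_LOG_PATTERN.match(line)
    if match is None:
        return None
    entry: Dict[str, Any] = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash in the size field means that no body was sent (0 bytes).
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


if __name__ == "__main__":
    sample = ('blowfly2.cs.rmit.edu.au - - [17/May/2004:20:00:59 +1000] '
              '"GET /info.php HTTP/1.1" 200 38103 "-" '
              '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"')
    entry = parse_access_log_line(sample)
    print(entry["host"], entry["status"], entry["size"])
```

A script along these lines could, for example, count the 404 entries to find broken links.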
The following is a part of the error log that was set up for the student2.wswt.net host.
[Tue May 18 22:01:49 2004] [info] Init: Initializing (virtual) servers for SSL
[Tue May 18 22:01:49 2004] [info] Server: Apache/2.0.48, Interface: mod_ssl/2.0.48, Library: OpenSSL/0.9.7d
[Tue May 18 22:01:49 2004] [notice] Apache/2.0.48 (Unix) mod_ssl/2.0.48 OpenSSL/0.9.7d PHP/4.3.4 configured -- resuming normal operations
[Tue May 18 22:01:49 2004] [info] Server built: May 15 2004 11:16:26
[Tue May 18 22:01:49 2004] [debug] worker.c(1733): AcceptMutex: pthread (defau
In this log, each entry records the date of the message, then the error message.
The first part of the error message is [info], which is the level of the log entry. Info is a low-severity level.
Init: Initializing (virtual) servers for SSL means that the SSL virtual server (https://teacher2.wswt.net:48730) was being started.
Many other messages can appear, and they are best read alongside the access logs.
To get the most information out of the error logs, the log level was set to debug. The lower the level, the more information the error log records. The highest level is emergency (emerg), which reports only very serious errors that might cause the server to crash. The lowest level, which is the one this server uses, is debug. At debug level, the server logs everything, such as "opening httpd.conf".
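The relationship between the configured level and the messages that get written can be sketched as follows. The level names are Apache's LogLevel values in order from most to least severe. The function names and the parsing pattern are illustrative assumptions, not part of Apache.

```python
import re
from typing import Dict, Optional

# Apache LogLevel values, from most severe to least severe.
LOG_LEVELS = ["emerg", "alert", "crit", "error", "warn", "notice", "info", "debug"]

ERROR_LOG_PATTERN = re.compile(
    r'\[(?P<date>[^\]]+)\] \[(?P<level>\w+)\] (?P<message>.*)'
)


def is_logged(message_level: str, configured_level: str) -> bool:
    """A message is written if it is at least as severe as the configured level."""
    return LOG_LEVELS.index(message_level) <= LOG_LEVELS.index(configured_level)


def parse_error_log_line(line: str) -> Optional[Dict[str, str]]:
    """Split an error log entry into date, level, and message."""
    match = ERROR_LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    entry = parse_error_log_line(
        "[Tue May 18 22:01:49 2004] [info] Init: Initializing (virtual) servers for SSL"
    )
    print(entry["level"], "-", entry["message"])
    # With the level set to debug every message is kept; with warn, info is dropped.
    print(is_logged("info", "debug"), is_logged("info", "warn"))
```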
Access and error logs can be read by opening the following files with a text viewer. They are located in:
/students/nmoreton/apache2/logs
access_log - all requests to iweb2.cs.rmit.edu.au:58730
error_log - all errors to iweb2.cs.rmit.edu.au:58730
student_access_log - all requests to student2.wswt.net:58730
student_access_log_comb - all requests to student2.wswt.net:58730, with all the fields mentioned above
student_error_log - all errors to student2.wswt.net:58730
teacher_access_log - all requests to teacher2.wswt.net:58730 and https://teacher2.wswt.net:48730
teacher_error_log - all errors to teacher2.wswt.net:58730 and https://teacher2.wswt.net:48730
wswt_access_log - all requests to www2.wswt.net:58730
wswt_error_log - all errors to www2.wswt.net:58730
Allow only certain types of files on the server.
This is useful because if a file is accidentally placed in a web-accessible folder, people cannot access it. For example, suppose that an internal memo (written in Microsoft Word) is accidentally placed into the web folder. A client would not be able to view that file (.doc) because the web server is set up to serve only .php, .jpg, .png, and .html files.
The following was added to the httpd.conf file:
<FilesMatch ".*">
Order deny,allow
Deny from all
Allow from none
</FilesMatch>
<FilesMatch "\.(html|jpg|png|php)$">
Order allow,deny
Allow from all
Deny from none
</FilesMatch>
FilesMatch does exactly what it sounds like: it matches file names against a regular expression.
- <FilesMatch ".*"> - Matches all file names.
- Order deny,allow, Deny from all, Allow from none - Blocks access to all files.
- <FilesMatch "\.(html|jpg|png|php)$"> - Matches only file names that end in .html, .jpg, .png, or .php.
- Order allow,deny, Allow from all, Deny from none - Allows the matched files to be served. Because this section comes after the first one, it overrides the block for these four extensions only.
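The pattern in the second FilesMatch section can be tried out on its own. The following sketch applies the same regular expression to a few file names. The function name and the sample file names are made up for illustration.

```python
import re

# The same pattern as in the second FilesMatch section.
ALLOWED_FILE_PATTERN = re.compile(r"\.(html|jpg|png|php)$")


def is_servable(filename: str) -> bool:
    """Return True if the file name ends in one of the allowed extensions."""
    return ALLOWED_FILE_PATTERN.search(filename) is not None


if __name__ == "__main__":
    for name in ["index.html", "info.php", "memo.doc", "notes.html.bak", "PHOTO.JPG"]:
        print(name, is_servable(name))
```

Note that the pattern is case-sensitive and anchored to the end of the name, so in this sketch a file such as PHOTO.JPG or notes.html.bak is not matched.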
References:
The best place to find out about the Apache server is its official site.
Apache: http://www.apache.org
Graph taken from Netcraft
Netcraft: http://www.netcraft.com
PHP information
PHP: http://www.php.net
SSL information
OpenSSL: http://www.openssl.org