The HTTP Protocol
HTTP (Hypertext Transfer Protocol) is the most widely used application protocol on the web. It uses a small set of simple, text-based commands to transfer files such as web pages between a web server such as www.google.com and a client such as a browser. All web browsers follow HTTP to communicate with a web server and retrieve web pages. You can consult many web resources (e.g., http://www.w3.org/Protocols/ or https://www.jmarshall.com/easy/http/) for the details of the protocol. In this exercise, you are asked to experiment with the protocol by sending text-based HTTP commands to a simple web server and observing how the protocol behaves. Then you are asked to augment the server program so that it can serve the HEAD request.
The two most frequently used commands in HTTP are GET and POST. The HTTP specification defines command (method) names as case-sensitive, so write them in uppercase exactly as shown. The GET command requests a file from the web server, and the POST command sends information to the server for processing, e.g., submitting a form.
The basic flow of work is described as follows.
- An HTTP server is up and running.
- On the client side, a socket connection is established.
- The client can then request a web page from the server by sending, at a minimum, the following line-based text commands (HTTP supports many other commands).
GET <path> HTTP/1.1\r\n
Host: <host name where the server is running>\r\n
\r\n
where <path> is the path of the web page to be retrieved and the Host line gives the name of the computer where the server is running. The end of the client request is signaled by an empty line, that is, \r\n by itself. For example, the command
GET /index.html HTTP/1.1\r\n
Host: dana213-lnx-1\r\n
\r\n
asks for the web page index.html from the server that is running on the computer dana213-lnx-1.
- The client can also send data to the server and ask for processing by sending the following line-based text commands,
POST <path-to-action> HTTP/1.1\r\n
Host: <host name where the server is running>\r\n
name1=value1&name2=value2\r\n
\r\n
where <path-to-action> specifies the path to a pre-defined action on the server, and name1=value1&name2=value2 are the name/value pairs submitted by the client for processing. The client may submit as many pairs as needed. Upon receiving the request, the server extracts the data and takes the proper action based on the requested <path-to-action>. For example, the command
POST /form HTTP/1.1\r\n
Host: dana213-lnx-1\r\n
Name=XMeng&Major=CS&Status=Graduated\r\n
\r\n
asks the server to take action (see the following lab exercises for the meaning of actions here) on the three submitted pairs of data.
The Exercise
Your exercise is to read a given pair of programs, an HTTP server and an HTTP client, that are written in Python. Experiment with the programs so that you understand how the protocol works. In the following description, we use the terms HTTP client/server and web client/server interchangeably. A web (HTTP) client can be an independent program or a web browser.
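Before reading the provided programs, it may help to see the request formats and the basic flow from the previous section written out in code. The following is a minimal sketch, not the code in httpclient.py: the function names (build_request, build_post_request, send_request, split_response) are hypothetical, and the POST layout follows this handout's simplified format, in which the data line comes before the empty line. A general-purpose HTTP server would normally expect the data after the empty line together with a Content-Length header.

```python
import socket
from urllib.parse import urlencode


def build_request(method, path, host):
    """Build the minimal request: request line, Host line, then an empty line."""
    return f"{method} {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post_request(path, host, pairs):
    """Build a POST in the handout's simplified layout (an assumption about
    what the lab server accepts): one data line, then the empty line."""
    data = urlencode(pairs)  # e.g. {"a": "1", "b": "2"} -> "a=1&b=2"
    return f"POST {path} HTTP/1.1\r\nHost: {host}\r\n{data}\r\n\r\n"


def split_response(raw):
    """Split raw response bytes into (list of header lines, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    return head.decode("iso-8859-1").split("\r\n"), body


def send_request(host, port, request, timeout=5.0):
    """Open a socket, send the request text, and read until the server
    closes the connection or the timeout expires."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        try:
            while True:
                data = sock.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # server kept the connection open; keep what arrived
    return b"".join(chunks)
```

With the lab server running, a call such as `send_request("localhost", 9000, build_request("GET", "/index-home.html", "localhost"))` should return the raw response, which `split_response` separates into header lines and body. Passing "HEAD" instead of "GET" asks for the header lines only, provided the server supports that command.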
First download the entire set of programs from this directory. The files include the following.
- httpserver.py: The HTTP server.
- httpclient.py: The HTTP client.
- index-home.html: The main page or home page served on the server side. Any web client can request this page from the server.
- JLH.jpg: An image file served on the server side. Any web client can request this image from the server.
- form.html: An HTML form served on the server side. A web client uses this form to post information to the server, such as a query.
Once you save these files in your directory, examine and run the programs to observe their behavior. Study the programs to make sure you understand the interaction between a web client and a web server. In particular, do the following and answer the given questions.
- Run the server program in your terminal window with
python httpserver.py
- Then open a browser, and set the URL to be
http://localhost:9000/index-home.html
You should see a plain, simple web page displayed in your browser window. Now examine the content of the file index-home.html using a text editor in your local terminal window to find out what's in there.
- Change the content of index-home.html to anything you like and try the above step again.
- Try to load the image JLH.jpg from the server with
http://localhost:9000/JLH.jpg
What is displayed in your browser? You can put any image on the server side and have the client request it.
- Now that you have seen some of the basics of HTTP and HTML, revise your index-home.html file to contain both text and an image, with some nice formatting. Go to https://www.w3schools.com/html/ and play around with some HTML code (if you haven't seen it before) and make your index page look nice.
- In the above case, we used your httpserver.py as a web server program and a web browser as a web client program. Now use the program httpclient.py in your directory as a web client to request web content from your own web server. Do the following.
python httpclient.py localhost 9000 /index-home.html
and
python httpclient.py localhost 9000 /JLH.jpg
Observe what happens. Read the program to make sure you understand the interaction between a web client and a web server.
- The default action of our web client is to request a page and print it on the terminal if it is a plain text file (which part of the client program does this?), but the client program can also request just the header information. Edit the client program so that it requests the header information only (the function is already written for you; you just have to find out how to use it). Then run the following command.
python httpclient.py localhost 9000 /JLH.jpg
- You can also use our client to request information from an actual web server. Try the following.
python httpclient.py www.eg.bucknell.edu 80 /~csci305/index.html
python httpclient.py www.eg.bucknell.edu 80 /~csci305/S18/lectures/lecture29-client-server.html
python httpclient.py www.bucknell.edu 80 /
Each time, you request and receive a real web page from a real web server. The information returned should be the text of the corresponding web page.
- A web client can be used to send queries to a search engine such as Google. When a query is submitted, an HTML form is used. Keep your server running and try the following URL in your web browser.
http://localhost:9000/search
You can then fill in the two text boxes and hit the Submit button. Your web server, after receiving the request, will parse the submitted text and respond accordingly. Study the code in the server and client and find out how this works.
- Your final task is to revise the necessary files to make your client and server interact as if the client were submitting a database query and the server were responding with some text, imagining that it is actually searching through a database and returning the result.
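As a starting point for thinking about the final task, the server-side steps (extract the name/value pairs, look something up, send text back) might be sketched as follows. This is only an illustration under stated assumptions, not the code in httpserver.py: RECORDS is a hypothetical in-memory stand-in for a database, the function names are invented for this example, and the query is assumed to arrive as a single name1=value1&name2=value2 line as described earlier.

```python
from urllib.parse import parse_qsl

# Hypothetical stand-in for a database; the lab server has no such table.
RECORDS = {
    "XMeng": "Major: CS, Status: Graduated",
}


def parse_pairs(line):
    """Turn 'name1=value1&name2=value2' into a dictionary of strings."""
    return dict(parse_qsl(line.strip()))


def run_query(pairs):
    """Pretend to search the database using the 'Name' field of the query."""
    name = pairs.get("Name", "")
    found = RECORDS.get(name)
    if found is None:
        return f"No record found for {name!r}"
    return f"{name}: {found}"


def build_response(text):
    """Wrap the result text in a minimal plain-text HTTP response."""
    body = text.encode("utf-8")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```

For example, `build_response(run_query(parse_pairs("Name=XMeng")))` produces the bytes a server could send back over the socket. Content-Length counts the bytes of the encoded body rather than the characters of the text, which matters once the result contains non-ASCII characters.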