Lab 4

Interprocess Communication: TCP sockets

Goals

  • Learn to work with TCP sockets. In this lab you will work with a pair of programs which implement the client/server paradigm. The server will be a type of program known as a daemon, which performs the following tasks in an infinite loop: wait for a request to arrive, process the request, and send back a response. The client will be a program that crafts a request, sends it to the server, receives and processes the response, and then terminates.

Credits

The material for this lab was developed by Prof. L. Felipe Perrone. Permission to reuse this material in parts or in its entirety is granted provided that this “credits” note is not removed. Additional student files associated with this lab, as well as any existing solutions, can be provided upon request by e-mail to: perrone[at]bucknell[dot]edu


TCP Sockets

You have learned that the Unix pipe is a construct for interconnecting two processes that execute on the same machine. Unix pipes follow the byte stream service model, meaning that you work with them by pushing bytes in on the write end and pulling bytes out from the read end. Since access to pipes is provided via Unix file descriptors, the programmer can use the same “file” read and write system calls to operate on them.
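
As a reminder of the byte stream model, here is a minimal sketch that pushes a short message into a pipe and pulls it back out with the ordinary write and read calls. The function name pipe_roundtrip is made up for this illustration, and the sketch assumes the message is short enough to fit in the pipe's buffer.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write msg into a pipe, then read it back into out.
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg);
    ssize_t n;

    if (outlen == 0 || pipe(fd) == -1)
        return -1;

    /* Push bytes in on the write end. */
    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                   /* no more data will be written */

    /* Pull bytes out from the read end. */
    n = read(fd[0], out, outlen - 1);
    close(fd[0]);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

The same read and write calls reappear unchanged once a TCP connection is established.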

The concept of a TCP socket is very similar to that of a pipe. The most fundamental difference is that TCP sockets serve to interconnect two processes that execute on arbitrary machines. Whether the two processes execute on the same host or on networked hosts across the world from each other, the setup and the operations on the sockets are the same.

You should think of a socket as a communication endpoint. If a socket interconnects processes on arbitrary hosts on the Internet, the first thing that should occur to you is that sockets must be related to Internet addresses. When we say Internet address, it might occur to you that we’re somehow referring to IP addresses, which we use to pinpoint hosts on the Internet. An IP address, however, can only identify a host, not an application process within that particular host. If you need to pinpoint a specific application process within a host, you need to extend this concept of address to the pair <IP address, port number>, where port number serves to identify an application within a host.
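
In the sockets API, the pair <IP address, port number> for IPv4 is carried in a struct sockaddr_in. The sketch below fills one in from a dotted-decimal address and a port; the helper name make_endpoint is an assumption made for this example, not part of the lab's starter code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: fill addr with the endpoint <ip, port>.
 * Returns 0 on success, -1 if ip is not a valid dotted-decimal IPv4 address. */
int make_endpoint(const char *ip, unsigned short port, struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;       /* IPv4 */
    addr->sin_port = htons(port);     /* port number in network byte order */

    /* inet_pton returns 1 only when the text was parsed successfully. */
    return inet_pton(AF_INET, ip, &addr->sin_addr) == 1 ? 0 : -1;
}
```

Note that both the address and the port are stored in network byte order, which is why the port goes through htons.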

This mapping of application to port number doesn’t happen by magic, of course. An application must bind to a port number within a given host, and it must choose a port number that is not used by the system for a standard service. Take a look at /etc/services to find a large number of well-defined ports that are used by standard applications. The port numbers you use should never conflict with these. In fact, you should be using port numbers in “user space”; pick something at random in the range 8,000 to 10,000 to experiment with your programs.

In this lab you will work with a pair of programs which implement the client/server paradigm. The server (echod) is a type of program known as a daemon, which performs the following tasks in an infinite loop: wait for a request to arrive, process the request, and send back a response. The client (echoreq) is a program that crafts a request, sends it to the server, receives and processes the response, and then terminates.

The basic design pattern for client/server applications based on TCP sockets is illustrated in the figure below.

The figure shows the sequence of calls to functions in the socket library that are appropriate for the client process and for the server process. TCP sockets implement a high-level abstraction that gives the programmer a byte-stream communication channel across networked hosts that is reliable and order-preserving. After the connection setup, you have something that works much like a pipe: you read and write bytes through a file descriptor.
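
The usual call sequence is socket, bind, listen, accept on the server side and socket, connect on the client side. The sketch below runs both sides inside a single process over the loopback interface, only to show the order of the calls; in the lab the two sides live in separate programs. The function name loopback_echo is made up, and the sketch asks the kernel for a free port (port 0) rather than a port in the 8,000 to 10,000 range so that it cannot collide with anything else on the machine. It also assumes a short message that arrives in a single read, which a real server must not rely on.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: send msg from a client socket to a server socket in the
 * same process and echo it back. Returns bytes echoed, or -1 on error. */
ssize_t loopback_echo(const char *msg, char *out, size_t outlen)
{
    char buf[256];
    size_t len = strlen(msg);
    ssize_t got, n = -1;
    int lfd = -1, cfd = -1, sfd = -1;
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;

    if (len == 0 || len >= sizeof buf || len >= outlen)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);        /* assumption: let the kernel pick a port */

    /* Server side: socket -> bind -> listen. */
    if ((lfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto done;
    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) == -1) goto done;
    if (listen(lfd, 1) == -1) goto done;
    /* Find out which port the kernel actually assigned. */
    if (getsockname(lfd, (struct sockaddr *)&addr, &alen) == -1) goto done;

    /* Client side: socket -> connect. */
    if ((cfd = socket(AF_INET, SOCK_STREAM, 0)) == -1) goto done;
    if (connect(cfd, (struct sockaddr *)&addr, sizeof addr) == -1) goto done;

    /* Server side: accept yields a new descriptor for this connection. */
    if ((sfd = accept(lfd, NULL, NULL)) == -1) goto done;

    /* From here on, both ends are used like pipe descriptors. */
    if (write(cfd, msg, len) != (ssize_t)len) goto done;       /* request  */
    if ((got = read(sfd, buf, sizeof buf)) <= 0) goto done;
    if (write(sfd, buf, (size_t)got) != got) goto done;        /* response */
    n = read(cfd, out, outlen - 1);
    if (n >= 0)
        out[n] = '\0';

done:
    if (sfd != -1) close(sfd);
    if (cfd != -1) close(cfd);
    if (lfd != -1) close(lfd);
    return n;
}
```

The listening descriptor is only used to accept connections; the data flows through the descriptor returned by accept.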

Note: You will need to turn in a Makefile that generates all the objects and the executables for this lab (and pre-lab) assignment. This includes a rule for building wrappers.o. Failing to include this Makefile will cost you points (see item 1 of the Rubric below).

Problem 1 (10 points)

Go through the code you have written for previous labs and find all the wrapper functions you wrote to substitute for system and library calls. Create two files with all the wrappers you have written:

  • wrappers.h – this file will contain the function prototypes (and only the prototypes) of your wrapper functions. It will be included by any programs you write in this and in future labs, which use the corresponding system and library calls.
  • wrappers.c – this file will contain the complete implementations of your wrapper functions. It will be separately compiled into object code by a rule in your Makefile. The resulting object file will be linked with the programs in this and in future lab assignments.

Your files should include wrappers for the following functions: fork(2), pipe(2), wait(2), waitpid(2), open(2), close(2), write(2), read(2), connect(2), bind(2), listen(2), accept(2), and any others you use in this lab which set the “errno” variable when encountering an error condition.
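
To illustrate the shape of a wrapper, here is a minimal sketch for three of the calls. Two things in it are assumptions rather than requirements of this lab: the capitalized names (Pipe, Read, Write) and the policy of printing a message with perror and terminating on any error. Follow whatever naming and error policy you used in previous labs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* --- what wrappers.h would hold: prototypes only --- */
int Pipe(int fd[2]);
ssize_t Read(int fd, void *buf, size_t count);
ssize_t Write(int fd, const void *buf, size_t count);

/* --- what wrappers.c would hold: the implementations --- */
int Pipe(int fd[2])
{
    int rc = pipe(fd);
    if (rc == -1) {
        perror("pipe");             /* reports the reason stored in errno */
        exit(EXIT_FAILURE);
    }
    return rc;
}

ssize_t Read(int fd, void *buf, size_t count)
{
    ssize_t n = read(fd, buf, count);
    if (n == -1) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    return n;                       /* 0 still means end of file */
}

ssize_t Write(int fd, const void *buf, size_t count)
{
    ssize_t n = write(fd, buf, count);
    if (n == -1) {
        perror("write");
        exit(EXIT_FAILURE);
    }
    return n;                       /* may be fewer bytes than requested */
}
```

Each wrapper keeps the signature of the call it replaces, so a caller only changes the name and drops its own error check.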

Problem 2 (25 points)

First of all, you should implement the communication between the client and the server, that is, you will augment the two files given to you so that echoreq sends a string to echod, which receives the string and sends it back to echoreq without any changes.

Once that functionality works, you will add new functionality to echod. It’s a very familiar problem and you can reuse the code for tokenizing a string and eliminating extraneous spaces, which you wrote for Lab 2.

When the server receives a string with an arbitrary number of spaces between words, it will “clean it up” before returning it to echoreq. By “clean up” you should understand that the string returned will have exactly one space between any pair of consecutive words.

For example, if the server receives a string such as:

this is       a       test          of    the      emergency   broadcast           system

It returns to the client the following string:

this is a test of the emergency broadcast system

It should go without saying that your code for the solution to this and the next two problems should not use system calls directly. Instead, make sure to use the wrappers you defined. They will make your code more readable and let it handle errors.
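
The “clean up” step in this problem can be sketched without a tokenizer, as a single pass that copies characters and emits one space between words. This is not the Lab 2 approach, just one possible alternative; the name squeeze_spaces is made up, only the space character is treated as a separator, and dropping leading and trailing spaces is an assumption the lab text does not spell out.

```c
#include <stddef.h>

/* Hypothetical helper: copy in to out, collapsing each run of spaces into a
 * single space and dropping leading and trailing spaces. The output is
 * truncated at a word boundary or mid-word if it does not fit in outlen.
 * Returns the length of the string written to out. */
size_t squeeze_spaces(const char *in, char *out, size_t outlen)
{
    size_t n = 0;
    int pending_space = 0;          /* a separator is owed before next word */

    if (outlen == 0)
        return 0;

    for (; *in != '\0'; in++) {
        size_t need;

        if (*in == ' ') {
            if (n > 0)              /* ignore spaces before the first word */
                pending_space = 1;
            continue;
        }

        /* Room is needed for this character, plus the owed space if any,
         * plus the terminating NUL. */
        need = pending_space ? 2 : 1;
        if (n + need >= outlen)
            break;

        if (pending_space) {
            out[n++] = ' ';
            pending_space = 0;
        }
        out[n++] = *in;
    }
    out[n] = '\0';
    return n;
}
```

Because a space is only written when another word follows, trailing spaces in the request never reach the response.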

For the sake of debugging your code, you can put both your client and your server on the same host: your own workstation. However, to verify that everything is working to specifications, make sure to put your client and your server each on a different host.

Problem 3 (20 points)

Create a file called answers.txt to write answers to the following questions. Please write as concisely and clearly as possible.

  1. If no calls were made to fork in either of the programs you have, why is it that we’re claiming that TCP sockets are a mechanism for interprocess communication?
  2. Is the socket functionality provided by the kernel or by an external library? Present an argument to justify your answer from the perspective of the actual implementation on Linux and from the perspective of the design decisions made for the construction of the operating system (think about the implications of these decisions on the performance of a modern networked computer). You should think about what makes sense, do a little research to verify whether you are on the right track, reason about your findings, and only then write your conclusions.
  3. Only when you consider a program’s communication needs and the operational constraints around a program can you choose which of the two IPC mechanisms is most appropriate. Describe what drives your decision to use either pipes or sockets to interconnect two processes.
  4. echoreq makes a call to gethostbyname(3). Explain what this library call does for your program and how you use its API.
  5. gethostbyname(3) is not the most up-to-date function for its kind of task (it is deprecated). Discover what function might eventually replace gethostbyname(3) and explain how differently it might work.
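
As a starting point for question 4, the sketch below shows a typical call pattern for gethostbyname(3): pass a host name, get back a pointer to a struct hostent, and copy the first address out of h_addr_list. The helper name resolve_ipv4 is made up, and this is not necessarily how the echoreq.c you were given uses the call. Keep in mind that gethostbyname reports failures through h_errno rather than errno, and that the returned structure may be overwritten by later calls.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: resolve host to its first IPv4 address.
 * Returns 0 on success, -1 if the name cannot be resolved. */
int resolve_ipv4(const char *host, struct in_addr *out)
{
    struct hostent *he = gethostbyname(host);

    /* NULL means the lookup failed; the reason is in h_errno, not errno. */
    if (he == NULL || he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;

    /* The address is already in network byte order; copy it out because
     * the hostent storage may be reused by the next lookup. */
    memcpy(out, he->h_addr_list[0], sizeof *out);
    return 0;
}
```

The resulting struct in_addr can be assigned to the sin_addr field of the struct sockaddr_in that is later passed to connect.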

Problem 4 (15 points)

Copy the echoreq.c file to a new file called echoreq2.c. Once you have discovered a modern alternative to gethostbyname(3), modify echoreq2.c so that it uses this new function. Otherwise, the behavior of echoreq2.c should be identical to what you see in your solution to Problem 2.

Submission

When you are done with everything, you need to:

  • cd ~/csci315/Labs/Lab4
  • git add Makefile
  • git add answers.txt
  • git add echod.c
  • git add echoreq.c
  • git add echoreq2.c
  • git add wrappers.h
  • git add wrappers.c
  • git commit -m "lab 4 completed"
  • git push

Hand In

Before turning in your work for grading, create a text file in your Lab 4 directory called submission.txt. In this file, provide a list to indicate to the grader, problem by problem, if you completed the problem and whether it works to specification. Wrap everything up by turning in this file:

  • git add ~/csci315/Labs/Lab4/submission.txt
  • git commit -m "Lab 4 completed"
  • git push

Rubric

  1. [up to -10 points] An incorrect or incomplete Makefile to build all programs in the lab assignment.
  2. [10 points] Problem 1: 6 pts for correct wrappers for “new” system calls (connect, bind, listen, accept); 4 pts for wrappers for old functions.
  3. [25 points] Problem 2: 5 pts for using system call wrappers; 15 pts for correct communication between echod and echoreq; 5 pts for correct code to clean up blank spaces.
  4. [20 points] Problem 3: 4 pts for each correct answer.
  5. [15 points] Problem 4: 10 pts for correct name resolution using a function other than gethostbyname; 5 pts for correct behavior of echoreq2.

A note about your use of git for source version control:

Yes, we have been using git primarily as a means for you to place your work in a remote repository that the graders can access. HOWEVER, we can’t forget that the primary benefit of a source version control system is to help out the developer, that is: YOU!

Consider committing your code (even if only to your local repository) as soon as you have determined that you got something working. Heck, you can also commit partially debugged versions of your code, if you want to create a checkpoint. This can help you immensely if you screw something up by accident and want to recover your files from a previous checkpoint. To recover a previous state recorded in your repository, the commands below are useful:

  • git log -> will let you see the history of commits to your repository and find out the name of a revision, which is needed if you want to recover that revision. Note that unlike svn, revision names in git are long, unwieldy hashes (written as hexadecimal numbers). To know what you are looking for, you will have to rely on the comment that you entered when you committed the revision. Do you see now that it is important to use a meaningful message with your commits?
  • git reset -> read about this in your favorite git tutorial or man page.
  • git checkout -> read about this in your favorite git tutorial or man page.
