Lab 2 | CSCI 206 Computer Organization & Programming

Lab 2

The C build process

The goal of this lab is to explore how source code is turned into an executable program. All of your work should be done in your ~/csci206/Labs/Lab02 folder and added to your gitlab repository.

Goals

  • Learn how to compile C programs from the terminal using: make, gcc, as, and ld.
  • Learn how to debug common compile-time errors.
  • Compare the performance of C vs. Python.

Exercise 0

Watch these two short videos on the build process in the Unix environment: Part 1 and Part 2.

Now let’s briefly introduce make, a general-purpose utility that is very popular for building programs that rely on lots of files, libraries, compilers, etc. Let’s start with a simple example. Open a file called Makefile and place a rule in it:

The rule has a target of sayWhatBob and executes the command echo "hello Bob" as if you had typed it in the terminal yourself. Note that the command must be indented by a tab, not spaces. When you type make sayWhatBob in the terminal, you see:

The $ represents the shell prompt. Do not type the $ in this example or in the examples below.

Next, let’s add a macro to our Makefile. A macro is just a text substitution for later use:

In a larger and/or more useful Makefile, we might use such a macro to define which compiler or flags to use as we make different targets.

Let’s look at one additional example rule. You can add this as an additional rule in your Makefile:

This rule shows an example of a dependency. Here, the target foo depends on the file foo.dat. Recall that touch creates the file named by its argument if it does not exist, or updates its timestamp if it does. Assuming you don’t already have the files foo and foo.dat in the local directory, try the following after saving your Makefile:

make looks for the target foo in the local directory.  If foo doesn’t exist or is older than its dependencies, then the command associated with the target will run.  If foo is newer than its dependencies, then it is “up to date” and nothing is done.  This is how you would keep your executable updated with respect to your source files (see below).

Read GNU Make in Detail for Beginners – Open Source For You as you do the remaining exercises.

 

Exercise 1 – make

The GNU make utility is the highest-level build tool we will use in this class. It automates building programs from multiple source files. The complete GNU Make Manual is available online. Don’t bother reading it yet, though; the link is only here as a reference if you need it later. This lab is just an introduction to the basics. We’ll learn more about make later this semester!

Let’s start with an example. Copy and paste this sample program (partially from Zyante) into the file salary.c using your selected text editor. (vim users might want to read this first.) Notice there is some useful information at the beginning of the file. This is very important for grading. You should always add these lines to the top of your source files. Later, when we have multiple source files, you can shorten the header of supporting files to just your name, as long as the main program has a complete header.

To compile this program using make, simply type make salary into your terminal. You should get the output: cc salary.c -o salary . What happened is that make looked in your local directory for source code that could build a target called salary. It found the salary.c file and knows (from an implicit rule) that a .c file is a C language source file. This rule also tells make that to compile a .c file it should use the program cc, which on our systems is the GNU C compiler, with the command-line arguments salary.c -o salary .

In other words, we didn’t need our own Makefile because make already knows how to do this. If we had made our own, it might look like this:

Notice the rule for target  salary has the dependency  salary.c  after the colon. If the target is newer than its dependencies, then it is “up to date.”  If you set the dependencies correctly, make will help you ensure that your executable is compiled only when it needs to be (when it is out of date). This can be important for very large programs with many parts.

 

To run the salary program after compiling this way, type:

The $ represents the shell prompt. Do not type the $ in the examples below.

 

To see all of the implicit make rules, run make -p . Go ahead and run this command at your terminal (it will generate a lot of output). The relevant rule used by make in this example is shown below.

Multiple rules can be listed in a special file, often called makefile or Makefile, that the make program searches for when compiling programs. We will go over makefiles in more detail later, but suffice it to say that in the rules listed above $(…) accesses a named variable, and % is the wildcard character. The first line defines the variable LINK.c. The second line is the pattern: a target named (wildcard) matches when a file named (wildcard).c exists in the local directory. The third, indented line is the command to execute when the pattern in the line before it matches. In this case it calls the C compiler/linker (defined in the first line). In our example, we haven’t defined values for CFLAGS, CPPFLAGS, etc., so those expand to nothing. The automatic variables $^ and $@ in the third line stand for the prerequisites (the names on the right side of the colon ‘:’, here the source file) and the target (the name on the left side of the colon ‘:’, here the output file), respectively. So, the executed command boils down to cc salary.c -o salary after the variables are translated into values.

This is a very nice way to compile single file C programs.

Copy salary.c to the file isalary.c. This new program will be an interactive salary calculator. You must modify it to operate similarly to salary.c, but instead of defining the value of hourlyWage in the code, it will prompt the user to enter an integer and use that value for the calculation. Use make to compile isalary. When you run your program, the output should look something like:

Make sure you add the file isalary.c to your git repository!

Exercise 2 – gcc

In Exercise 1 we already saw the GNU C compiler in action. On our systems cc and gcc are the same thing. Both names exist for historical reasons: cc is the traditional Unix name for the C compiler, and gcc is the more modern name for the GNU C and C++ compiler. The two are functionally the same here, and cc is provided for backwards compatibility. make uses cc, so it will still work correctly even on very old systems.

Take a look at the man page for gcc. It has one of the longest man pages at over 12k lines so don’t try to read this all in one sitting.

To compile isalary.c with gcc, just type gcc isalary.c . By default gcc will compile and link to an executable file called a.out. The name is short for “assembler output” and dates back to the earliest Unix systems; gcc has kept up this tradition. If you want your output to have a better name (and you should), use the -o option, as in: gcc isalary.c -o isalary .

To explore some of gcc’s output messages, we’ll start with the buggy code below. Copy and paste this into the file nogood.c.

When you compile it, it will fail with the error message:

When there is an error, gcc outputs the location of the error in the form filename.c:line:column, plus a useful error message. Read the error message carefully; gcc uses a short, factual, and precise description of the error. One error may cause another, so you should fix the first reported error first. Later errors could be a result of the earlier ones.

Fix all of the errors in nogood.c and run the program. The correct output is:  n = 5, n squared = 25, n cubed = 125 .

The compiler can also check for valid yet suspicious-looking code (like using a variable before it is assigned a value). These checks produce warnings. Each warning can be individually enabled, but it is common to enable the usual set of warnings at once by adding the -Wall option when compiling the program, as in gcc nogood.c -o nogood -Wall . If you recompile your fixed nogood.c with -Wall, you should get at least 2 warnings (unless you already fixed them). If there are warnings, go back and fix them as well. From this point on, your code must compile without producing any warnings with the -Wall option turned on. Not correcting compiler warnings is just lazy and may cost you more time and energy later.

gcc is actually performing four steps behind the scenes when compiling a source program into the executable. These steps are:

  1. preprocess – process any preprocessor directives (like #include and #define) and expand macros.
  2. compile – translate C (or other high-level language) into assembly language appropriate for the platform.
  3. assemble – translate assembly language into object code.
  4. link – link assembled object code with system (and other) libraries, resulting in an executable file.

Sometimes it is useful to stop after each step. This way you can examine the intermediate output to see what the compiler is up to. Search the gcc man page for the term “Stop after” and note the two options that make gcc stop after preprocessing and after compiling. To prove you have figured this out, save the preprocessed nogood.c to the file nogood.pre (use the -o option to specify the output filename).

Take a look at your nogood.pre file and you will notice the #include <stdio.h> line has been replaced with a lot of ugly-looking code. Lines that begin with # in the source file (in our case, #include <stdio.h> in nogood.c) are preprocessor commands. The #include command inserts the contents of the given file into the source code before compiling. This is useful to make your code modular and easier to read. You might also see #include "file.h" where the angle brackets are replaced by quotes. This instructs the preprocessor to look in the current directory for the given file first before checking the normal system directories. The version with angle brackets causes the preprocessor to look only in the system directories for the file. So, now you can use angle brackets and quotes appropriately. Also notice that preprocessor commands are not terminated by a semicolon!

Now, if you preprocess and compile nogood.c (but don’t assemble or link) you will be left with the assembly language output. Save the compiled assembly to the file nogood.s (.s is the traditional GNU assembly extension). Take a look at this file. It might seem to be in some strange language. This is your first look into assembly language (feel free to stop to marvel at its beauty)! We will be learning assembly language soon. It is the same program broken down into commands that the CPU can execute.

To practice more C coding, copy isalary.c to fsalary.c and modify the new program to accept a floating point value for the wage and an integer value for the number of weeks worked (still assume 40 hours per week). Output the final salary with a leading $ and include two digits to the right of the decimal point. Make sure to compile your program using GCC with all warnings enabled and fix all errors or warnings. The output of your program should look like:

NOTE. You can also pass extra flags to make to enable warnings. If you remember back to Exercise 1, there are several variables that make uses. These can be set on the make command line. So if you type make fsalary CFLAGS=-Wall you can compile and enable all warnings using make. If you want to make this the default, just create a text file called makefile and insert the line CFLAGS=-Wall. Then when you type make fsalary the make program will have CFLAGS=-Wall defined by default.

Make sure you add all of the source files from this exercise to your git repository! (You should not add binary/compiled files, backups, etc to git.)

Exercise 3 – as and ld

Now you should know that by default GCC will perform all 4 compilation steps (preprocess, compile, assemble, and link) and in the previous exercise we saw how to stop after the preprocess and compile steps. Now let’s continue where we left off and learn how to assemble and link your nogood.s file!

First, the easy part: assemble your nogood.s into nogood.o.

There is not much to see here; now you have the assembly code translated into machine code. It isn’t executable yet because it hasn’t been linked with the system libraries. If you try the most logical command to link the object code into an executable, you will run into the problem shown below.

There are two problems: something about an entry symbol _start, and the undefined printf. The second error is easier to fix. printf is provided by the C standard library, so we just have to link to this library to fix it. This is done by adding the -lc option.

The _start symbol defines the entry point to your program from the operating system’s perspective. In C, your program begins at the main function but the operating system doesn’t know about main. The OS expects your program to define a _start location. C performs some initialization before your main program executes and some shutdown functions when it’s done. This is part of the C runtime (crt) library and is provided in a few separate object files. We need to link them as shown below. Note the order is important!

This time, it links without error. When we try to run it, we get an error because the file /lib/ld64.so.1 doesn’t exist on our system. On Linux, ld only performs the first half of linking, and the second half is done at run-time. This allows system libraries to be shared in memory. For example, just about every program calls printf. If you’re running 10 processes that use printf, we don’t need all of them to have their own copy of printf; we want them to share a system-wide copy! Dynamic linking at run-time allows this to work. We just have to specify the correct dynamic linker as shown below.

Note that there is an ‘-lc’ linking library flag at the end of the command. This process is quite complex and this quick walk-through is just to prove to you that it can be done. In practice it’s almost always best to leave the complexities of linking up to gcc.

Add your nogood.s and nogood.o files to git (commit and push). Normally you wouldn’t add binary files such as nogood.o to a source code repository, but for grading purposes in this lab, this particular set of files is required.

Exercise 4 – Comparing the performance of C vs. Python

One of the most important features of C is that a C program usually runs much faster than an equivalent program written in most other programming languages, especially interpreted ones such as Python. There are many reasons for this, most of which relate to how close C is to assembly language.

For this exercise, we consider the naïve algorithm shown below, which computes all the prime factors of a given number, and we compare the running time of this algorithm written in Python with that of its equivalent written in C. Save this program to the file primefact.py and run it.

To run primefact.py on the terminal (no magic IDLE play button here), type python primefact.py

You can use the time program to measure the execution time of the program as shown below (read the man page for the command ‘time’!). It should be about 5 seconds but this will depend on which machine you are using. Your task is to write primefact.c. This program should do exactly the same thing as the python program. Use time to measure the execution time. In the comment header to primefact.c (with your name, date, section, etc) add a section describing the results of your experiment. Note that because of the settings of program search paths, you may have to put the current path ‘./’ in front of an executable such as in ‘./primefact’ (without quotes.)

Now add your primefact.py/.c files to git (and commit and push to gitlab).

Exercise 5 – Change case

In this exercise you are asked to write a program that reads lines of input (using the function fgets) from the keyboard. The program iterates over each input line and converts lowercase letters into uppercase, one letter at a time. Then the uppercase string is printed. This process should continue until end of file (EOF) is detected. You can generate an EOF from the keyboard by pressing Ctrl-D. Read the manual pages for the functions toupper and strlen, as you will need them to complete your work.

A skeleton is provided below in addition to sample output. Your source code should be in the file switchcase.c.

In the above scenario, the user (you!) typed two lines, “This is a test string #1!” and “How to count numbers 1234?”, and your program correctly converted the two strings into upper case, printing “THIS IS A TEST STRING #1!” and “HOW TO COUNT NUMBERS 1234?”, respectively. Digits, spaces, and punctuation are left unchanged. The last input “^D” is a common notation for typing Ctrl-D from the keyboard, which signals the end of the keyboard input.

Now add your switchcase.c file to git (and commit and push to gitlab).

Grading rubrics

  • [25 points] Prelab reading / activities (zybook), at 50% completion.
  • [10 points] ex1 salary and isalary correct and have all header information.
  • [20 points] ex2 nogood and fsalary correct and compile without warning with -Wall.
  • [10 points] ex3 nogood.s and nogood.o exist.
  • [25 points] ex4 primefact.py and primefact.c exist, compile without warning, and speedup is reasonable and documented as comments in program.
  • [10 points] ex5 switchcase.c compiles without warning and operates correctly.