Files and File Commands

Data on a computer is stored in files. A file can contain a program (a list of instructions for the computer to follow), data, or the text of a paper. A file is like a piece of paper on which you can put lots of characters, in any order. One of the primary functions of an operating system is to maintain user files, making sure that each user has access to their own files.

Files can be created in several ways. You can create a file yourself, for example with an editor, or a program can create one. For example, the C++ compiler creates files.

File Names

So that you can tell your files apart, you give a name to each one you create. When you create files, you should give them names that are meaningful to you. File names can contain most printable characters, but to avoid pitfalls, you should use only letters, digits, dashes, periods and underscores. The following are legal file names: lab1, LAB3, Lab1, Alongname, read.me, program1.cc, program1.java

Remember that capital letters are different from lower case ones on the Suns. Therefore, Lab1 is a different file name from lab1.

What are Directories?

With so many people using the Sun workstations, there are a huge number of files on the disk drive. One needs a way to organize all of them and to keep people's files separate. UNIX does this with directories and sub-directories (directories inside directories).

When you are given a UNIX account, your login is associated with a directory. When you log in, you are placed in this directory, which is called your home directory. Your home directory is a file that contains a list of all of your files (if any) and a list of all your sub-directories (if any). When you are in a directory, say your home directory, and you create a file, it is placed in that directory.

So that the UNIX system knows which files are yours, it associates your computer account with your home directory. The files in your home directory and sub-directories are yours; by default, no one except you (or someone who has superuser privilege, like the System Administrator) can create, change or remove them.

Sub-directories are used to organize your files much like file folders are used to organize a heap of documents. For example, you might place all the files for your labs in a sub-directory called ‘labs’ and your programs in another one called ‘progs’.

File Suffixes

Some commands in UNIX require an identifying part of the file name called a suffix. This suffix is part of the file name and contains a period followed by a single character or several characters. For example, the C++ compiler requires the .cc suffix in the file name. Therefore, prog1.cc is a file name that would contain a C++ program. Some popular suffixes follow:

.c for a C program
.cc for a C++ program
.java for a Java program
.h for a header file
.o for an object file (created by the compiler)
.tex for a TeX document
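
As a rough illustration of how a suffix identifies the kind of file, the sketch below matches the suffixes listed above with a case statement. The name describe_suffix is made up for this example, and the script assumes a Bourne-style shell (sh) rather than the shell behind the prompts shown in this section.

```shell
#!/bin/sh
# describe_suffix is a hypothetical helper, not a standard UNIX command.
# It reports what kind of file a name suggests, judging only by its suffix.
describe_suffix() {
    case "$1" in
        *.cc)   echo "C++ program" ;;
        *.c)    echo "C program" ;;
        *.java) echo "Java program" ;;
        *.h)    echo "header file" ;;
        *.o)    echo "object file" ;;
        *.tex)  echo "TeX document" ;;
        *)      echo "no recognized suffix" ;;
    esac
}

describe_suffix prog1.cc
describe_suffix lab1
```

The suffix is only a naming convention: the sketch looks at the name and never opens the file.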

Using Wildcard Characters in Filenames

A wildcard character is a symbol that you can use with many UNIX commands to specify more than one file. The most commonly used wildcard character in UNIX is the ‘*’, which matches zero or more characters at that position in the filename argument. Therefore, b*.java matches all files starting with b and ending with the suffix .java. You could list all the Java program files in a directory starting with b by typing:

hostname{~}% ls b*.java

You can use more than one wildcard in a filename argument as shown below:

hostname{~}% ls do*at*

If the directory contained them, that would list the following files: doat, doat21, doormat, dogcatcher and documentation.
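
To try the pattern without touching your own files, a sketch along these lines may help. It assumes a Bourne-style shell (sh) and that the mktemp command is available. The names door and data are extra files invented here to show what the pattern does not match.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Create empty files: the five names from the text plus two
# made-up names (door, data) that should NOT match.
touch doat doat21 doormat dogcatcher documentation door data

# The shell expands do*at* into the matching names before ls runs.
matches=$(ls do*at*)
echo "$matches"

# Clean up the scratch directory.
cd / && rm -r "$scratch"
```

Only names that begin with do and contain at somewhere later are listed, so door and data are left out.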

Copying Files

One way to create a file is to copy an existing file. The UNIX command to copy a file is cp. After the command you type the file to be copied and a new file name. Copy two files into your directory by doing the following:

hostname{~}% cp ~csci203/examples/humpty myhumpty
hostname{~}% cp ~csci203/examples/jack_theory .

In the first line we copied a file called humpty, in the sub-directory examples under csci203’s home directory, into a newly created file called myhumpty (~username stands for a user's home directory, and csci203 is a username). The second line is the same except that we were lazy and typed a period as the destination. The period stands for the current directory, so UNIX places the copy there under the same name, jack_theory. You made no changes to csci203’s existing files.

The long string of directory/sub-directory/filename is an example of a pathname, which is how you specify any file on the system. Many commands take pathnames as arguments (see Section 2.4.5 for more on Pathnames).

The command cp may use a wildcard character in the filename argument. You could have copied all of the files in the examples directory by doing the following:

hostname{~}% cp ~csci203/examples/* .

Here the period is important: it names the current directory as the destination, so each copy keeps the name of its original.
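
The two forms of cp can be tried safely with a sketch like the one below. It assumes a Bourne-style shell (sh) and the mktemp command. The directories src and dest are throwaway stand-ins for ~csci203/examples and your own directory, and the file contents are placeholders.

```shell
#!/bin/sh
# src and dest are throwaway directories standing in for
# ~csci203/examples and your own directory.
src=$(mktemp -d)
dest=$(mktemp -d)

# Placeholder contents; the real files will differ.
echo "placeholder text for humpty" > "$src/humpty"
echo "placeholder text for jack_theory" > "$src/jack_theory"

cd "$dest" || exit 1

# One file, new name: the copy is called myhumpty.
cp "$src/humpty" myhumpty

# Many files, destination "." (the current directory):
# each copy keeps the name of its original.
cp "$src"/* .

copied=$(ls)
echo "$copied"

# The originals are still in place.
originals=$(ls "$src")

# Clean up both directories.
cd / && rm -r "$src" "$dest"
```

When the wildcard matches several files, the last argument has to be a directory, which is why the period cannot be left off.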

Listing Files in Your Directory

Now list your directory with the command ls; the two files should be there.

hostname{~}% ls
jack_theory   myhumpty

Try ls with the -a option.

hostname{~}% ls -a

You should see several file names that begin with a period, for example a .cshrc file. These dot files control how your system looks when you log in and were placed there by the System Administrator. If you remove or alter these files you may experience many problems and may find you can't even log in! If this should ever happen, see your instructor immediately. Later, when we are more advanced, we may alter some of these files.

Try ls with the -l option to list the files in extended form.

hostname{~}% ls -l

You can combine the two options by typing -al.

hostname{~}% ls -al

You can also give ls a filename, or a filename argument containing a wildcard. For example, you could list all the files starting with ‘lab’ by:

hostname{~}% ls -l lab*
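
The difference between these forms of ls can be seen in a scratch directory with a sketch like this one. It assumes a Bourne-style shell (sh) and the mktemp command. The file names, including the dot file .settings, are invented for the example.

```shell
#!/bin/sh
# Work in a throwaway directory (mktemp -d is assumed to be available).
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# .settings is a made-up dot file; the others are ordinary files.
touch .settings lab1 lab2 notes

plain=$(ls)       # ordinary listing: dot files are left out
all=$(ls -a)      # -a includes dot files, plus the entries . and ..
labs=$(ls lab*)   # only the names that start with "lab"

echo "ls:"
echo "$plain"
echo "ls -a:"
echo "$all"
echo "ls lab*:"
echo "$labs"

# Clean up.
cd / && rm -r "$scratch"
```

The entries . and .. shown by ls -a stand for the directory itself and the directory above it.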

Displaying the Information in a File

The command more followed by a filename displays the contents of that file one window at a time. Try more on myhumpty and jack_theory. Hit the spacebar to display the next window full. While using more on jack_theory, press h and read the on-line help for the commands available in more. Hit the spacebar to get out of the help display. To go back a page, type b. To quit, type q. The command cat also displays a file, but it scrolls through the entire document without pausing at each page, so it is not desirable for displaying long documents.

hostname{~}% more jack_theory
hostname{~}% cat jack_theory

Like most commands that take a filename argument, more allows a wildcard character (the ‘*’). Therefore, we could be lazy and type:

hostname{~}% more j*
hostname{~}% cat j*

That will run more (or cat) on all files in the current directory beginning with “j”. In this case, there is only one: jack_theory.

Removing Files

Many times we need to clean up our directories by removing some files. The command rm removes (deletes) the files named as its arguments.

Copy your file myhumpty to two new file names, say junk and baloney. Then remove junk and baloney from your directory. Using ls, check that they were removed.

hostname{~}% rm junk baloney
hostname{~}% ls

To avoid inadvertently deleting the wrong file, type carefully when you use rm. The command rm allows a wildcard, but a pattern can match more files than you intended, so be very careful when you use one or you will remove files you wanted to keep.
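
One cautious habit is to give the same pattern to ls first and look at what it matches before handing it to rm. The sketch below shows the idea in a scratch directory. It assumes a Bourne-style shell (sh) and the mktemp command, and the contents of myhumpty are a placeholder.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real can be removed.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
echo "placeholder text" > myhumpty

# Make two throwaway copies, as in the exercise above.
cp myhumpty junk
cp myhumpty baloney

# Preview: ls shows exactly which names the pattern expands to.
ls j*

# Only after checking that list, remove those files.
rm j*

# See what is left in the directory.
remaining=$(ls)
echo "$remaining"

# Clean up.
cd / && rm -r "$scratch"
```

Here j* matches only junk, so baloney and myhumpty survive. A broader pattern such as a lone * would have matched all three files.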