- In this prelab you will set up the Intel Pin tool to run on your system and use it to collect memory access traces from several programs.
- Open a shell, go into your local git repo, and do a “git pull” to make sure you have the latest feedback from the TAs.
- Create the folder Labs/Lab14 in your regular lab folder.
Getting started with Intel Pin
In this lab, we will be using the Intel Pin tool. Pin is a powerful tool for dynamic program analysis. We will be using it to instrument a running executable so that we can record each and every memory access the program makes. Pin works similarly to how a Virtual Machine might use just-in-time compilation to generate machine code at execution time. A bunch of really smart folks applied the same concept to essentially re-compile any binary executable at runtime to add instrumentation instructions.
Pin is already installed on the lab computers (do not try to use the mips machine for this lab!), so there is no need to download the executable files. To make Pin work, we will be using two environment variables. If you use bash, the commands below will set up the environment for your current running shell:
export PIN_ROOT=~cs206/bin/pin
export PIN_TOOL=~cs206/bin/pin/source/tools/ManualExamples/obj-intel64/dinero.so
You would have to type these lines manually every time you open another shell to work with Pin. If you prefer to have them run automatically in every new shell, there is a simple alternative: just this one time, run the lines below to append them to your .bashrc file (optional):
echo "export PIN_ROOT=~cs206/bin/pin
export PIN_TOOL=~cs206/bin/pin/source/tools/ManualExamples/obj-intel64/dinero.so" >> ~/.bashrc
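Running the command above more than once appends duplicate lines to .bashrc. A minimal sketch of one way to guard against that, assuming a hypothetical helper named append_once (it is not part of Pin or the lab setup) that adds a line only if the file does not already contain it:

```shell
# append_once FILE LINE
# Hypothetical helper: append LINE to FILE only if FILE does not already
# contain that exact line.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

With the helper defined in your current shell, the two lines could be added like this (the single quotes keep the text literal, so the same text as above ends up in the file):
append_once ~/.bashrc 'export PIN_ROOT=~cs206/bin/pin'
append_once ~/.bashrc 'export PIN_TOOL=~cs206/bin/pin/source/tools/ManualExamples/obj-intel64/dinero.so'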
After this, you can use Pin to instrument the ls command with the line:
$PIN_ROOT/pin.sh -t $PIN_TOOL -o mytrace.out -- ls
Here’s how to read this line:
- $PIN_ROOT/pin.sh is the path to the executable bash script which actually runs pin. This script is necessary to select the proper tools and libraries for your environment. You can even cat the .sh to see what it’s doing with your library paths.
- Bash scripting is itself a whole other thing: anything you can type into the command line (in a bash shell/terminal) can be placed into a shell script (like pin.sh) for repeated execution later. It’s basically functional abstraction for anything you do manually on the command line. Read about it all here.
- The pin executable handles instrumenting the target program. The instrumentation is tied to a “tool” that extracts the meaningful information. We specify which Pin tool to use with the -t $PIN_TOOL part. In our case, we set $PIN_TOOL earlier to point to our special dinero Pin tool.
- The -o mytrace.out part tells the dinero Pin tool the name of a file in which to store its output. When you don’t specify anything, it uses the default file name dinerotrace.out; here, we are telling it to use a file called mytrace.out. Later in the lab you can use this option to save different traces under unique filenames.
- Finally, after the -- (double hyphen) comes the program to instrument and run. In this case, we ask Pin to analyze the ls program. This will produce a directory listing, as ls always does, but it will take longer than usual: behind the scenes, Pin instruments the executable with instructions that collect every address reference made and save them all to the output file you named. After ls exits, you will see that the trace file you specified has been created.
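Since the command depends on both environment variables, it can help to confirm they are set before running Pin. The following is only a sketch: check_pin_env is a hypothetical helper name, not something Pin provides, and it checks nothing beyond whether the two paths exist.

```shell
# check_pin_env
# Hypothetical preflight check: succeeds only if PIN_ROOT and PIN_TOOL are
# set and point at an existing pin.sh script and tool file.
check_pin_env() {
    if [ -z "${PIN_ROOT:-}" ] || [ -z "${PIN_TOOL:-}" ]; then
        echo "PIN_ROOT and PIN_TOOL must both be set" >&2
        return 1
    fi
    if [ ! -f "$PIN_ROOT/pin.sh" ]; then
        echo "cannot find $PIN_ROOT/pin.sh" >&2
        return 1
    fi
    if [ ! -f "$PIN_TOOL" ]; then
        echo "cannot find Pin tool $PIN_TOOL" >&2
        return 1
    fi
    return 0
}
```

With the helper defined, the trace command runs only when the check passes:
check_pin_env && $PIN_ROOT/pin.sh -t $PIN_TOOL -o mytrace.out -- ls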
Take a look at the trace file now:
$ head mytrace.out
W 0x7fffc0b83868 8
W 0x7fffc0b83860 8
W 0x7fffc0b83858 8
W 0x7fffc0b83850 8
W 0x7fffc0b83848 8
W 0x7fffc0b83840 8
W 0x7fffc0b83838 8
R 0x3cf5a1fb80 8
W 0x3cf5a1fd48 8
R 0x3cf5a1ffc8 8
The trace format is a simple ASCII text file with one memory access per line. The first letter is either R or W, for read or write. Next comes the memory address (can you identify which segments are being used?). The last number is the number of bytes read or written. Your file may not match the output above exactly because some of the libraries are dynamically linked and may have different addresses in memory.
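Because each line has three whitespace-separated fields, standard tools such as awk can summarize a trace. A minimal sketch, assuming every line follows the R/W, address, size layout shown above (summarize_trace is a hypothetical helper name):

```shell
# summarize_trace FILE
# Hypothetical helper: count read and write records in a dinero-style trace
# and total the bytes accessed. Each line is expected to look like
#   R 0x3cf5a1fb80 8
summarize_trace() {
    awk '
        $1 == "R" { reads++;  bytes += $3 }
        $1 == "W" { writes++; bytes += $3 }
        END { printf "reads=%d writes=%d bytes=%d\n", reads, writes, bytes }
    ' "$1"
}
```

Running summarize_trace on a file holding just the ten lines shown above would print reads=2 writes=8 bytes=80.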
You can now use Pin to generate memory access traces for just about every program on the system. The readme file lists a few restrictions; most notably, Google Chrome will not work properly. Also note that the memory trace for our simple ls command was 3.2 MB. If you trace a program that generates a lot of memory accesses or runs for a substantial amount of time, the trace file will be very large and will very likely exceed your disk quota.
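One way to keep an eye on trace sizes is to check each file right after generating it. The sketch below assumes a hypothetical helper, trace_size_ok, and a byte limit that you choose yourself; it does not read your actual quota.

```shell
# trace_size_ok FILE LIMIT_BYTES
# Hypothetical helper: print the size of FILE in bytes and succeed only if
# the size is at most LIMIT_BYTES.
trace_size_ok() {
    # The arithmetic expansion strips any padding wc adds around the number.
    size=$(( $(wc -c < "$1") ))
    echo "$1: $size bytes"
    [ "$size" -le "$2" ]
}
```

For example, with an arbitrary 10 MB limit:
trace_size_ok mytrace.out 10000000 || echo "mytrace.out is larger than 10 MB"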
Use Pin to generate traces for ls, pwd, and wc alice.txt (link to alice.txt; use wget to retrieve it from the web through the terminal). Name the output files ls.out, pwd.out, and wc.out. Add each file to your git repo (although these files are large, git stores data in compressed form).
- 25 points – ls.out, pwd.out and wc.out added to git repo (25/3 points each).
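The three traces all follow the same pattern as the ls example, so the command can be wrapped in a small function. This is a sketch only: make_trace is a hypothetical helper name, and it assumes PIN_ROOT and PIN_TOOL are set as described earlier and that alice.txt has already been downloaded into the current directory.

```shell
# make_trace NAME COMMAND [ARGS...]
# Hypothetical helper: run COMMAND under Pin and save the trace to NAME.out.
# Assumes PIN_ROOT and PIN_TOOL are already set in the environment.
make_trace() {
    name=$1
    shift
    # Everything after -- is the program to instrument, with its arguments.
    "$PIN_ROOT/pin.sh" -t "$PIN_TOOL" -o "$name.out" -- "$@"
}
```

With the function defined, the traces and the git step might look like this:
make_trace ls ls
make_trace pwd pwd
make_trace wc wc alice.txt
git add ls.out pwd.out wc.out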