Using strace to track system calls in Linux

Being a system administrator is all about understanding what's going on under the hood of the machines you maintain. What really happens when a service runs or a command is executed? One tool that helps system administrators keep tabs on how programs interact with the operating system beneath them is strace.

To learn more about strace, we got in touch with Alex Juarez, who will be giving a talk at SCaLE 14x: Your system calls and you: A brief exploration of using strace.

Alex is a Principal Engineer at Rackspace, where he serves as a trainer and mentor to the staff who support the administrator needs of their customers. As an RHCA and RHCI, Alex is a big fan of finding the right tool for every job, and helping others to do the same. In this interview, he explains what strace is, how administrators use it, and where to learn more.

Interview

In a nutshell, what is strace? Why is it a valuable open source tool for Linux administrators?

Strace is a tool used to intercept system calls from your application to the Linux kernel. I find strace is invaluable for system administrators for two main reasons.

First off, we do not always have the source code of an application available, but we may still need to know what an application is doing. This can be anything from which files are opened to how much memory is being allocated, or even why an application is crashing repeatedly.
Secondly, even if we do have the code, being a system administrator doesn't imply being a developer. We may not know how to follow the code. I find that looking at system calls, as opposed to lines of code, is a bit more descriptive.

What's an example of how an administrator might use this tool in the real world?

Real-world examples of when you could use strace present themselves fairly often. It really is one of those versatile tools you can use whenever you are trying to troubleshoot an issue. I have a few examples in my talk, but one of my favorites is using strace to show the Apache web server's behavior when .htaccess files are enabled on a server.

Once you figure out what system calls a process is making, what's the best way to figure out what those mean? In other words, how do you interpret strace's output?

Interpreting the output from strace is one of the things we cover in the talk so I don’t want to give away too much. However, in short, there are two ways somebody could learn a bit more about how to interpret the output. The first way is that system calls have their own man pages which are full of information. The second way is to learn a bit more through writing very small C programs and running strace on them. You can write out small C programs to read, write or open a file and see what types of system calls those functions are translated to. Coding small programs and using strace on them will take you on a learning adventure.

Aside from attending your talk, what's the best resource for someone who wants to learn more?

I think the best way to get more experience and learn a bit more is to strace everything. Go through an exercise of stracing commands you use every day and break down what files they are opening and what files they are writing to. Some programs will modify a file directly while others may edit a temporary file before copying the data into the original file.

Taking that a step further, write some code and strace it. Build the correlation between the functions you use and the system calls they make.

Finally, what else are you particularly excited about at SCaLE this year?

There is so much I'm excited for this year at SCaLE. I think I am particularly excited about the variety of topics. I have a close eye on pretty much all of the SysAdmin, Security and Kernel tracks. Thursday I’ll be spending some time at UbuCon and on Sunday I’ll be spending some time in the education track.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.