5 ops hacks for sysadmins

Five tools to help you find the source of your users' IT problems when you don't know where to start.

Opensource.com

As a sysadmin, every day I am faced with problems I need to solve quickly because there are users and managers who expect things to run smoothly. In a large environment like the one I manage, it's nearly impossible to know all of the systems and products from end to end, so I have to use creative techniques to find the source of the problems and (hopefully) come up with solutions.

This has been my daily experience for well over 20 years, and I love it! Coming to work each day, I never quite know what will happen. So, I have a few quick and dirty tricks that I default to when a problem lands in my lap and I don't know where to start.

BUT WAIT! Before you jump straight onto the command line, spend some time talking to your users. Yes, it can be tedious, but they will have some good information for you. Keep in mind that users probably don't have as much experience as you have, and you will need to do some interpreting of whatever they say. Try to get a clear picture of what is happening and what should be happening, then describe the fault to yourself in technical language. Be aware that most users don't read what is on the screen in front of them; it's sad but true. Make sure you and the user are reading all of the text to gather as much information as possible. Once you have that together, jump onto the command line with these five tools.

Telnet

I am starting with a classic. Telnet was the predecessor to SSH; in the olden days, it was used on Unix systems to connect to a remote terminal just as SSH does, but without encryption. Telnet has a very neat and invaluable trick for diagnosing network connectivity issues: you can Telnet into TCP ports other than its own (port 23). To do so, use Telnet as you normally would, but add the TCP port to the end; for instance, telnet localhost 80 connects to a web server. This lets you check whether a service is running on a server or whether a firewall is blocking it. So, without having the application client or even a login for the application, you can check whether the TCP port is responding. If you know the protocol, you can sometimes elicit a response from the server by typing commands manually at the Telnet prompt. Web servers and mail servers are two examples where you can do this.

Getting a response from a webserver with Telnet
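When the same check has to run from a script, or Telnet is not installed, the port test can be sketched with bash's /dev/tcp redirection. This is a swapped-in technique rather than Telnet itself. The function name check_port and the three-second limit are my own choices, and the sketch assumes bash and the coreutils timeout command are available.

```shell
# check_port HOST PORT
# Exits 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Assumption: bash (for /dev/tcp) and coreutils timeout are installed.
check_port() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example: report on a local web server.
if check_port localhost 80; then
    echo "port 80 is answering"
else
    echo "port 80 is closed, filtered, or timed out"
fi
```

Like the Telnet trick, this only proves that something accepts connections on the port. It says nothing about whether the application behind it is healthy.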

Tcpdump

The tcpdump tool lets you inspect what data is being transmitted on the network. Most network protocols are fairly simple and, if you combine tcpdump with a tool like Wireshark, you will have a nice, easy way to browse the traffic that you have captured. In the example below, I am inspecting packets in the bottom window and connecting to TCP port 3260 in the top.

Inspecting packets in real time with tcpdump
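To get a capture into Wireshark, tcpdump can write raw packets to a file instead of printing them. The wrapper below is a minimal sketch: capture_port, the DRY_RUN variable, the interface name any (Linux-specific), and the output path are all assumptions made for illustration. Only the tcpdump options themselves (-i, -nn, -w, and the tcp port filter) are standard.

```shell
# capture_port IFACE PORT OUTFILE
# Saves raw packets for one TCP port to OUTFILE so Wireshark can open it.
# Set DRY_RUN=1 to print the tcpdump command instead of running it.
capture_port() {
    # -nn: skip host and port name lookups; -w: write raw packets to a file
    set -- tcpdump -i "$1" -nn -w "$3" tcp port "$2"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"    # usually needs root; stop the capture with Ctrl-C
    fi
}

# Print the command for an iSCSI capture (TCP port 3260, as in the text).
DRY_RUN=1 capture_port any 3260 /tmp/iscsi.pcap
```

Once the capture has been stopped, the resulting file can be opened in Wireshark on any machine, which is handy when the server itself has no graphical desktop.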

This screenshot shows a real-world use of Wireshark to look at the iSCSI protocol; in this case, I was able to identify that there was a problem with the way our QNAP network-attached storage was configured.

Using Wireshark to inspect a TCP session

find

The find command is simply the best tool if you don't know where to start. In its simplest form, you can use it to "find" files. For example, if I wanted to do a recursive search through all directories and get a list of the conf files, I could enter:

find . -name '*.conf'

find command output

But one of find's hidden gems is that you can use it to execute a command against each item it finds. For example, if I wanted to get a long list of each file, I could enter:

find . -name '*.conf' -exec ls -las {} \;

find command output

Once you know this technique, you can use it in all sorts of creative ways to find, search, and execute programs in specific ways.
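As one example of that kind of combination, here is a sketch that pairs find with grep to list the config files that mention a particular setting. The function name conf_files_with and the search term in the usage comment are hypothetical.

```shell
# conf_files_with DIR PATTERN
# Prints the path of every *.conf file under DIR that contains PATTERN.
conf_files_with() {
    # "{} +" hands many files to each grep run; "\;" would run grep once per file
    find "$1" -type f -name '*.conf' -exec grep -l -- "$2" {} +
}

# Example: which config files under /etc mention "Listen"?
#   conf_files_with /etc Listen
```

Ending the -exec clause with + instead of \; makes little difference on a handful of files, but it is noticeably faster on large directory trees.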

strace

I was introduced to the concept of strace on Solaris, where the equivalent tool is called truss. It is still as useful today as it was all those years ago. strace lets you inspect what a process is doing, in real time, by printing the system calls it makes. Using it is simple: run ps -ef and find the ID of the process you are interested in, then attach with strace -p <pid>. This starts printing a whole lot of output, which at first looks like junk. But if you look closer, you will see text you recognize, such as system call names like open and close, along with filenames. This can lead you in the right direction if you are trying to figure out why a program is not working.
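The two steps above (find the PID, then attach) can be sketched as a pair of small functions. The names pids_named and trace_files are my own, and limiting the output to file-related system calls is just one way to cut down the noise. Attaching to a process usually requires root or ownership of that process.

```shell
# pids_named NAME
# Prints the PID of every process whose command name is exactly NAME.
pids_named() {
    ps -e -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}

# trace_files NAME
# Attaches strace to the first process called NAME and shows only
# file-related system calls (open, openat, stat, and so on).
trace_files() {
    pid=$(pids_named "$1" | head -n 1)
    if [ -z "$pid" ]; then
        echo "no process named $1" >&2
        return 1
    fi
    # -f: follow child processes; -tt: timestamp each line
    # %file needs strace 4.17 or later; older releases spell it "file"
    strace -f -tt -e trace=%file -p "$pid"
}

# Example (hypothetical process name):
#   trace_files sshd
```

Filtering on file-related calls tends to surface the "No such file or directory" and "Permission denied" errors that explain many misbehaving programs.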

grep

Leaving the best for last: grep. This tool is so useful and powerful that I have trouble coming up with a succinct way to describe it. Put simply, it's a search tool, but the way it searches is what makes it so powerful. In problem analysis, I typically grep over a bunch of logs to search for something. A companion command called zgrep does the same thing with gzip-compressed files. In the following example, I used zgrep bancroft /var/log/* (the pattern comes first, then the files) to grep across all the log files to see what I have been up to on the system. I used zgrep because there are compressed files in the directory.

grep command output
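A log search like this is easy to wrap in a function. This is a minimal sketch: the name search_logs is hypothetical, and discarding error messages is a convenience that also hides real problems such as unreadable files.

```shell
# search_logs PATTERN DIR
# Searches every file in DIR for PATTERN, ignoring case. zgrep reads
# gzip-compressed files (rotated logs) and plain text files alike.
search_logs() {
    # the pattern comes first, then the files; 2>/dev/null hides
    # "Is a directory" and "Permission denied" noise
    zgrep -i "$1" "$2"/* 2>/dev/null
}

# Example:
#   search_logs bancroft /var/log
```

Each matching line is prefixed with the name of the file it came from, so you can tell which log, current or rotated, it was found in.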

Another great way to use grep is to pipe the output of other tools into it so that it acts as a filter. In the following example, I listed the auth file and grepped for my login to see what I have been doing by using cat auth.log | grep bancroft. This can also be written as grep bancroft auth.log, but I used the pipe (|) to demonstrate the point.

grep command output
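The same filter idea works on the output of any command. Here is a small sketch; the function name problems_only and the list of keywords are my own assumptions, so adjust the keywords to whatever your applications actually log.

```shell
# problems_only
# Reads text on standard input and keeps only the lines that mention
# an error, a failure, or a denial, ignoring case.
problems_only() {
    grep -iE 'error|fail|denied'
}

# Examples:
#   dmesg | problems_only
#   journalctl -b | problems_only
```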

Other tools to consider

You can do a lot more with these tools, but I hope this brief introduction gives you a window into how to use them to solve the nasty problems that come your way. Another tool worth your attention is Nmap, which I did not include because it is so comprehensive that it needs an entire article (or more) to explain it. Finally, I recommend learning some white hat and hacking techniques; they can be very beneficial when trying to get to the bottom of a problem because they can help you collect information that can be crucial in decision making.

I am a professional systems administrator specialising in Unix and Linux systems. Since the early '90s I have worked with enterprise networking, Unix systems, network and systems security, and web technologies. I've even learnt a thing or two about Windows.

8 Comments

Great article. I've used all those tools and they're great. I hope you'll consider writing about nmap. I'd like to learn how to use it better.

Awesome, thanks.

Will sysadmin jobs move completely to the DevOps culture, or will they continue to follow the same work process they do right now?

I can see that everybody is moving to the DevOps culture, and sysadmin posts are shrinking in most companies. One way or another, sysadmin positions are being transformed into Junior DevOps Engineer roles (at least in Indian companies).

Will there be a sysadmin position in the future?

I agree. I definitely do more DevOps these days than in the past. But I still have users to look after (desktop and application users), and we still have some hardware of our own, so I think as long as that is the case there will always be a need for pure sysadmin work.


Two of the most useful utilities I have used: strace and strings.

Trace an application and all other called programs and push output to a file for examination
$ strace -f -o output.trace command

Sometimes programs don't match their documentation. They have either fewer options than they say they do, or more. Maybe they write their output to a specific directory that is hard-coded into the program. Maybe the programmer made a typo in the code and it's trying to read a different config file (in a different location) than the documentation says. Maybe your 'sshd' or 'ssh' has been replaced by a hacker, and you can scan through the strings embedded in the binary to figure out where the nefarious program is sending all of the captured data. Many times, running 'strings' on an executable will give you a treasure trove of information. Or, at least, it will tell you that you need to start looking elsewhere for the issue.
$ strings /path/to/executable | less

Good article, and all good tools.

These aren't hacks though any more than a screwdriver is a hack for driving screws. Well, maybe using telnet to check a port is a bit of a hack but mostly this is describing using the tools for what they were intended.

This work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.