Thursday, June 30, 2011

Generate a Random File in Linux

For testing purposes, it is sometimes useful to generate a lot of random data, e.g. to test the effect of file compression on a system. Generating random file data is simple using the kernel's random number generator, which is exposed as the device file /dev/urandom. Reading from /dev/urandom returns an endless stream of random bytes. Running the following command will dump those bytes to the display as raw, mostly unprintable characters (press Ctrl+C to stop):

cat /dev/urandom
To generate and write random data to a file, use the following command:

cat /dev/urandom > randfile
To limit the output, Ctrl+C the process, or use a sleep followed by kill. For example:

(cat /dev/urandom > randfile &); sleep 5; killall cat
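The command sequence above will write random data to randfile for five seconds. The actual amount of data written will vary from run to run. Note that killall cat terminates every cat process you own, not just the one started here.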
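
When a test needs a file of an exact size rather than "whatever five seconds produces", the same idea can be sketched in Python. This is only an illustration: the function name write_random_file and the 64 KB chunk size are assumptions made for the example, and os.urandom draws from the same kernel source as /dev/urandom on Linux.

```python
import os


def write_random_file(path, size_bytes, chunk_size=64 * 1024):
    """Write exactly size_bytes of random data to path; return bytes written."""
    written = 0
    with open(path, "wb") as out:
        while written < size_bytes:
            # Never request more than what is still missing.
            n = min(chunk_size, size_bytes - written)
            out.write(os.urandom(n))
            written += n
    return written
```

For example, write_random_file("randfile", 10 * 1024 * 1024) would produce a 10 MB file. Writing in chunks keeps memory use flat even for very large files.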

Tuesday, June 28, 2011

Useful .vimrc Settings

Customizing vi can improve your productivity. By editing ~/.vimrc (the startup file read by Vim, the vi implementation found on most Linux systems), you can define a set of commands that run each time the editor starts. The following commands are especially useful.

:set showmatch     highlights the matching brace or parenthesis in your code

:syntax on     enables basic syntax highlighting

:set expandtab     mixing tabs and spaces in source files is bad news
    enable expandtab to insert spaces whenever you press Tab (tabs already in the file are left alone; use :retab to convert them)

:set ts=4     sets the number of spaces used to display a tab

:set shiftwidth=4     sets the number of spaces by which to indent as a result of '>>' for example

:set ruler     displays the current line and column number
    useful for restricting your code to 80 characters (if you're into that)

At the very least, syntax highlighting provides a major boost to readability.

vi - find and replace

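The vi editor is a powerful text editor available on most Unix systems. To find and replace text in vi, run a substitute command of the following form (the strings "original" and "replacement" are placeholders):

:%s/original/replacement/g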

If you'd like to be a little more careful about which instances of the original string are replaced, add the 'c' option to the end so that vi asks for confirmation before each replacement:

:%s/original/replacement/gc

Replacing text around a string can be accomplished with a regex:

:%s/abc\(.*\);/def\1;/g

The command above replaces lines of the form abcblah; with defblah; preserving the blah, whatever it may be. The escaped parentheses form a regex group and can be referenced later using \1, \2, \3, etc. (the number corresponding to the group).
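
The same group-and-backreference idea can be tried outside the editor. The sketch below uses Python's re module; the pattern and the function name replace_prefix are illustrative choices, not part of vi. Note that Python writes groups with plain parentheses, whereas vi's default syntax needs them escaped.

```python
import re

# Group 1 captures whatever sits between "abc" and the closing ";".
PATTERN = re.compile(r"abc(.*);")


def replace_prefix(line):
    """Rewrite abc<anything>; as def<anything>; keeping <anything> intact."""
    return PATTERN.sub(r"def\1;", line)


print(replace_prefix("abcblah;"))  # prints defblah;
```

Lines that do not match the pattern are returned unchanged, just as vi leaves non-matching lines alone.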

Monday, June 20, 2011

Monitor Process Memory in Linux

Tracking memory on a per-process basis can be useful for detecting memory leaks. The top utility takes memory usage information from the /proc directory and displays it in a human-readable format. To measure process memory over time, use top's batch mode (-b), which writes plain-text updates to standard output instead of redrawing the screen:

top -b | grep myproc > mem_usage_file
The command above will print information for "myproc" periodically to the file "mem_usage_file". A line in the file might take the following form:

29439 matt 19 0 139m 87m 10m S 9 0.5 5:39.09 myproc
In this line, the virtual memory (VIRT column) for myproc is 139 MB, the resident memory (RES) is 87 MB, and the shared memory (SHR) is 10 MB. These values could be extracted and plotted using any scripting language.
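
As a rough sketch of that extraction step, the Python below pulls the three memory values out of one line of the file. It assumes the default column order of procps top (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and that a bare number is in KiB while the suffixes m and g mean MiB and GiB. The function names are made up for the example.

```python
def to_megabytes(field):
    """Convert a top memory field such as '139m' to megabytes.

    Assumption: no suffix means KiB; 'm' and 'g' mean MiB and GiB.
    """
    factors = {"k": 1.0 / 1024, "m": 1.0, "g": 1024.0}
    suffix = field[-1].lower()
    if suffix in factors:
        return float(field[:-1]) * factors[suffix]
    return float(field) / 1024  # bare number: kibibytes


def parse_top_line(line):
    """Return (virt, res, shr) in megabytes from one batch-mode top line."""
    fields = line.split()
    # Assumed columns: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    virt, res, shr = fields[4:7]
    return to_megabytes(virt), to_megabytes(res), to_megabytes(shr)


sample = "29439 matt 19 0 139m 87m 10m S 9 0.5 5:39.09 myproc"
print(parse_top_line(sample))  # prints (139.0, 87.0, 10.0)
```

Applying parse_top_line to every line of mem_usage_file gives a series that can be plotted to spot steady growth. If your top is configured with different columns, the slice indices need adjusting.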

Wednesday, June 15, 2011

Simple Python HTTP Web Server

Installing and loading up Apache is usually overkill for hosting a simple web server. Python provides a built-in HTTP server with limited functionality. In its simplest form, the following Python script can be run from any directory to give access to the directory contents via HTTP:

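import SimpleHTTPServer, SocketServer
handler = SimpleHTTPServer.SimpleHTTPRequestHandler
# Binding to port 80 usually requires root privileges
server = SocketServer.TCPServer(("", 80), handler)
server.serve_forever()  # handle requests until interrupted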

Running this server will serve up the contents of the process's working directory to any requesting web browser or HTTP client. Customizing this server to provide dynamic web pages is as easy as subclassing SimpleHTTPRequestHandler. For example:

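import SimpleHTTPServer, SocketServer
class MyHandler(SimpleHTTPServer.SimpleHTTPRequestHandler):
    def do_GET(self):
        # Request paths always start with a slash
        if self.path == "/mypage.html":
            self.send_response(200)
            self.send_header("Content-type", "text/html")
            self.end_headers()
            self.wfile.write("<html><body>")
            self.wfile.write("Hello World!")
            self.wfile.write("</body></html>")
        else:
            # Any other path gets the default file-serving behavior
            SimpleHTTPServer.SimpleHTTPRequestHandler.do_GET(self)
handler = MyHandler
server = SocketServer.TCPServer(("", 80), handler)
server.serve_forever()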

The do_GET method overrides the definition in SimpleHTTPRequestHandler to implement a custom action. The script above will return a page with "Hello World!" in its body if a GET is received for "mypage.html". Any other path will result in a call to the original implementation of do_GET().
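
The scripts above use the Python 2 module names. In Python 3 the same classes live in http.server and socketserver, and the response body must be bytes. The sketch below shows one possible Python 3 version of the custom handler, together with a small self-check that starts the server on a free local port, fetches the page once, and shuts down. The helper fetch_demo_page and the choice of port 0 are assumptions made for the demonstration only.

```python
import http.client
import http.server
import socketserver
import threading


class MyHandler(http.server.SimpleHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/mypage.html":
            body = b"<html><body>Hello World!</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Any other path falls back to the default file serving.
            super().do_GET()

    def log_message(self, format, *args):
        pass  # silence per-request logging for the demo


def fetch_demo_page():
    """Serve on a free local port, fetch /mypage.html once, then stop."""
    # Port 0 asks the OS for any free port (demo assumption).
    with socketserver.TCPServer(("127.0.0.1", 0), MyHandler) as server:
        port = server.server_address[1]
        thread = threading.Thread(target=server.serve_forever)
        thread.daemon = True
        thread.start()
        try:
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/mypage.html")
            body = conn.getresponse().read().decode("utf-8")
            conn.close()
            return body
        finally:
            server.shutdown()  # stops the serve_forever loop


print(fetch_demo_page())
```

For a long-running server, replace the self-check with a plain call to server.serve_forever() on the port of your choice.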

Read from stdin in Python

Many Unix command line utilities take input from standard input and write to standard output. This convention allows multiple commands to be piped together. Reading from standard input in Python provides an easy way to build new utilities:

import sys
for line in sys.stdin:
    # Echo each line unchanged; replace this with your own processing
    sys.stdout.write(line)

The code above could be modified to accomplish a number of simple tasks, such as splitting each line by white space and only printing the first element (as an alternative to awk).
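
A minimal sketch of that first-field filter might look like the following. The names first_fields and main are invented for the example, and skipping blank lines is a design choice made here (awk would print an empty line instead).

```python
import sys


def first_fields(lines):
    """Yield the first whitespace-separated field of each non-blank line."""
    for line in lines:
        parts = line.split()
        if parts:  # blank lines have no first field, so skip them
            yield parts[0]


def main(stream=None):
    """Print the first field of every line read from stream (default: stdin)."""
    for field in first_fields(stream if stream is not None else sys.stdin):
        print(field)
```

Calling main() at the bottom of a script turns it into a filter that can sit in a pipeline, for example after ls -l or ps.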

Number of lines in a file

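Counting the number of lines in a given file is generally useful, especially in conjunction with grep. The first command below counts the lines containing "abc" in all files under the current directory; the second counts the lines in myfile.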

grep -r "abc" * | wc -l
wc -l myfile
The wc utility can also count the words or characters in a file, using the -w and -m options respectively.
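
To make the three counts concrete, here is a small Python sketch that computes them for a string. The function name count_text is hypothetical, and it approximates wc rather than reproducing it exactly: lines are counted as newline characters and words as whitespace-separated runs.

```python
def count_text(text):
    """Return (lines, words, chars) for a string, in the spirit of wc."""
    lines = text.count("\n")   # like wc -l: number of newline characters
    words = len(text.split())  # like wc -w: whitespace-separated words
    chars = len(text)          # like wc -m: number of characters
    return lines, words, chars


print(count_text("one two\nthree\n"))  # prints (2, 3, 14)
```

One consequence of counting newlines is that a final line without a trailing newline is not included in the line count, which matches how wc -l behaves.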