Tip 8: Data Manipulation in Unix
In most projects, you ultimately have some data in rows and columns in text files. How do you get it there? I assume that you are using "grep" to extract it from a file. How do you manipulate it? That is what I will show now.
There are a couple of VERY useful Unix commands for this: paste and cut.
paste allows you to append files horizontally, line-by-line. Suppose you have file1:
1
2
3
4
and file2:
2
4
6
8
10
12
14
and you run
paste file1 file2
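It will output the two files side by side, separated by a tab. Once file1 runs out of lines, its column is simply left empty:
1	2
2	4
3	6
4	8
	10
	12
	14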
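To try this yourself, here is a minimal sketch that recreates the two example files with printf and runs paste on them. The scratch directory and the name joined.csv are assumptions made for illustration; -d is the standard paste option for choosing a separator other than tab.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two example files, one number per line.
printf '%s\n' 1 2 3 4 > file1
printf '%s\n' 2 4 6 8 10 12 14 > file2

# Default behaviour: the columns are joined with a tab.
paste file1 file2

# -d selects a different separator, here a comma.
paste -d ',' file1 file2 > joined.csv
cat joined.csv
```

Because file1 is shorter than file2, the last three lines of joined.csv start with the separator: the first column is empty there.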
If one of the files is specified as "-", it will use stdin. This means you can gather data like this:
myprog | grep somekeyword > file.dat
myprog -option2 | grep somekeyword | paste - file.dat > file2.dat
Note, however, that you must redirect the output to a separate file (file2.dat cannot be the same as file.dat). The shell empties the output file before paste gets a chance to read it, so writing back to file.dat would destroy the data you are trying to keep!
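If you do want the result to end up under the original name, one common approach is to write to a temporary file and rename it afterwards. The sketch below assumes made-up numbers in place of the output of "myprog | grep somekeyword", and the name file.dat.tmp is just an illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the first run: myprog | grep somekeyword > file.dat
printf '%s\n' 10 20 30 > file.dat

# Stand-in for the second run, piped into paste through "-" (stdin).
# Write to a temporary file first, and only replace file.dat if paste succeeded.
printf '%s\n' 11 21 31 | paste - file.dat > file.dat.tmp && mv file.dat.tmp file.dat

cat file.dat
```

The new column ends up on the left because "-" comes first on the paste command line; swap the arguments to append it on the right instead.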
cut is the opposite of paste. It allows you to extract columns of data based on delimiters. For example, if you have a file like this:
x = 1
y = 2
z = 3
and you run:
cut -d '=' -f 2
it will extract column 2, which is everything after the "=" on each line, and print it out. Note that the space after the "=" is kept:
 1
 2
 3
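As a runnable sketch of the above, the following creates the example file (the names vars.txt and values.txt are assumptions) and extracts the values. Since cut splits on the single "=" character only, the spaces around it stay in the fields; piping through tr -d ' ' is one simple way to remove them.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the example input, one assignment per line.
printf '%s\n' 'x = 1' 'y = 2' 'z = 3' > vars.txt

# Field 2 is everything after the "=", including the leading space.
cut -d '=' -f 2 vars.txt

# Strip the spaces to get bare numbers.
cut -d '=' -f 2 vars.txt | tr -d ' ' > values.txt
cat values.txt
```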
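You can also select by character position (-c) or byte position (-b) instead of by field (-f). When you use -f without -d, cut assumes the fields are separated by tabs; -d changes the delimiter to any single character.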
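A short sketch of these options on a small tab-separated file follows. The file table.tsv and its contents are invented for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A small tab-separated table: name, count, value.
printf 'alpha\t1\t2.5\nbeta\t2\t3.5\n' > table.tsv

# No -d needed: tab is the default field delimiter. Keep columns 1 and 3.
cut -f 1,3 table.tsv > cols.tsv
cat cols.tsv

# Select by character position instead: the first three characters of each line.
cut -c 1-3 table.tsv > prefix.txt
cat prefix.txt
```

Since the output of cut -f is itself tab-separated, it can be fed straight back into paste to reassemble columns in a different order.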