Tip 8: Data Manipulation in Unix
Revision as of 21:01, 16 July 2014
In most projects, you ultimately have some data in rows and columns in text files. How do you get it there? I assume that you are using "grep" to extract it from a file. How do you manipulate it? That is what I will show now.
Three VERY useful Unix commands for this are paste, cut, and sort.
paste
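paste allows you to append files horizontally, line-by-line. Suppose you have file1:
    1
    2
    3
    4
and file2:
    2
    4
    6
    8
    10
    12
    14
and you run
    paste file1 file2
It will output the two files side by side, separated by a tab. Once file1 runs out of lines, its column is simply left empty:
    1       2
    2       4
    3       6
    4       8
            10
            12
            14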
If one of the files is specified as "-", it will use stdin. This means you can gather data like this:
    myprog | grep somekeyword > file.dat
    myprog -option2 | grep somekeyword | paste - file.dat > file2.dat
Note, however, that you must redirect the output to a separate file (file2.dat cannot be the same as file.dat). The shell empties the output file before paste gets a chance to read it, so redirecting back into file.dat would lose its contents.
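The gathering pattern above can be tried without a real program. The following is a minimal sketch, assuming printf as a stand-in for "myprog | grep somekeyword"; the numbers and the scratch directory are made up for illustration. It also shows one way to keep the original file name: write to a second file first, then rename it.

```shell
# Scratch directory so the example does not touch real files (assumption).
tmpdir=$(mktemp -d)

# Hypothetical stand-in for: myprog | grep somekeyword > file.dat
printf '1\n2\n3\n' > "$tmpdir/file.dat"

# Hypothetical stand-in for: myprog -option2 | grep somekeyword
# The new column arrives on stdin ("-") and lands to the left of file.dat.
printf '10\n20\n30\n' | paste - "$tmpdir/file.dat" > "$tmpdir/file2.dat"

# To keep the original name, rename only after paste has finished.
mv "$tmpdir/file2.dat" "$tmpdir/file.dat"

cat "$tmpdir/file.dat"
```

Each further run can repeat the paste-then-rename step, so file.dat grows by one column per run.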
cut
cut is the opposite of paste. It allows you to extract columns of data based on delimiters. For example, if you have a file like this:
    x = 1
    y = 2
    z = 3
and you pipe it through:
    cut -d '=' -f 2
it will extract column 2 and print it out (each value keeps the space that followed the "="):
     1
     2
     3
You can also select by character position (-c) or byte position (-b) instead of by field. If you use -f without -d, cut assumes the fields are separated by tabs.
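As a small sketch of the field and character modes, the commands below run cut on the same three-line sample. The variable names are only for illustration, and the data is fed through printf rather than a file.

```shell
# Sample input in the same "name = value" form as the example above.
data=$(printf 'x = 1\ny = 2\nz = 3\n')

# Field 2 with "=" as the delimiter; each value keeps its leading space.
values=$(printf '%s\n' "$data" | cut -d '=' -f 2)

# With a space as the delimiter the value is field 3, with no leading space.
clean=$(printf '%s\n' "$data" | cut -d ' ' -f 3)

# Character mode: keep only the first character of every line.
names=$(printf '%s\n' "$data" | cut -c 1)

printf '%s\n' "$values" "$clean" "$names"
```

Splitting on the space works here only because every line has exactly one space on each side of the "=".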
sort
Sort will, you guessed it, sort data. The only trick is that it uses alphabetical sorting by default, so "10" sorts before "9". If you want numeric sorting, you must specify "-n". You can also specify "-r" for a reverse sort. The "-k" option lets you specify the key (which column to sort on); for example, "-k 2,2" sorts on the second column only.
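The difference between the options is easiest to see on a few numbers. This is a minimal sketch with made-up data; the variable names are only for illustration.

```shell
# A single column of numbers (made-up sample data).
nums=$(printf '10\n9\n100\n')

# Default alphabetical order compares character by character.
alpha=$(printf '%s\n' "$nums" | sort)

# -n compares the values as numbers.
numeric=$(printf '%s\n' "$nums" | sort -n)

# -n -r gives numeric order, largest first.
reversed=$(printf '%s\n' "$nums" | sort -n -r)

# Two columns: sort numerically on column 2 only.
table=$(printf 'a 10\nb 9\nc 100\n')
bykey=$(printf '%s\n' "$table" | sort -n -k 2,2)

printf '%s\n' "$alpha" "$numeric" "$reversed" "$bykey"
```

Writing the key as "-k 2,2" rather than "-k 2" stops the key at the end of column 2; with "-k 2" alone the key runs from column 2 to the end of the line.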