Difference between revisions of "Tip 8: Data Manipulation in Unix"

From Vlsiwiki
 

Revision as of 23:34, 6 April 2010

In most projects, you ultimately have some data in rows and columns in text files. How do you get it there? I assume that you are using "grep" to extract it from a file. How do you manipulate it? That is what I will show now.

There are a couple of VERY useful Unix commands: paste and cut.

paste allows you to append files horizontally, line-by-line. Suppose you have file1:

1
2
3
4
5

and file2:

2
4
6
8
10
12
14

and you run

paste file1 file2

It will output:

1	2
2	4
3	6
4	8
5	10 
	12
	14
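To try this yourself, here is a small sketch that builds two sample files in a scratch directory and pastes them together. The directory variable d and the file contents are only stand-ins for illustration. It also shows the -d option, which replaces the default tab delimiter with another character.

```shell
# Scratch directory so nothing existing gets overwritten (hypothetical setup).
d=$(mktemp -d)

# Sample data: a 5-line file and a 7-line file.
printf '%s\n' 1 2 3 4 5 > "$d/file1"
printf '%s\n' 2 4 6 8 10 12 14 > "$d/file2"

# Default: columns are joined with a tab.
paste "$d/file1" "$d/file2"

# Same thing, but joined with a comma instead.
paste -d ',' "$d/file1" "$d/file2"
```

When one file runs out of lines, paste keeps going and leaves that column empty, which is why the last rows start with the delimiter.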

If one of the files is specified as "-", paste reads standard input in its place. This means you can gather data like this:

myprog | grep somekeyword > file.dat
myprog -option2 | grep somekeyword | paste - file.dat > file2.dat

Note, however, that you must redirect the output to a separate file (file2.dat cannot be the same as file.dat). The shell truncates the output file before paste starts reading it, so the contents of file.dat would be lost!
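If you would rather keep accumulating columns in a single file, one common pattern is to write to a temporary file and then rename it over the original. This is only a sketch: the printf commands stand in for "myprog | grep somekeyword", and the name file.tmp is made up for the example.

```shell
# Scratch directory for the example (hypothetical setup).
d=$(mktemp -d)

# Stand-in for: myprog | grep somekeyword > file.dat
printf '%s\n' 10 20 30 > "$d/file.dat"

# Stand-in for: myprog -option2 | grep somekeyword | paste - file.dat
# Write to a temporary file first, then replace the original only if
# the pipeline succeeded.
printf '%s\n' 11 21 31 | paste - "$d/file.dat" > "$d/file.tmp" &&
    mv "$d/file.tmp" "$d/file.dat"

cat "$d/file.dat"
```

The new column ends up on the left because "-" (stdin) is listed before file.dat on the paste command line.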

cut is the opposite of paste. It allows you to extract columns of data based on delimiters. For example, if you have a file like this:

x = 1
y = 2
z = 3

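and you run this on it (cut reads stdin when no file name is given):

cut -d '=' -f 2

it will extract column 2 and print it out. Because the delimiter is only the '=' character, each value keeps the space that followed it:

 1
 2
 3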
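Since cut splits on exactly one delimiter character, the field after "=" in this example still carries the space that followed it. Here is a sketch of two ways to get the bare values, with the sample lines fed in through printf rather than a real file. The variable names are just for illustration.

```shell
# Option 1: split on '=' and then delete the leftover spaces with tr.
vals=$(printf '%s\n' 'x = 1' 'y = 2' 'z = 3' | cut -d '=' -f 2 | tr -d ' ')
echo "$vals"

# Option 2: split on the space instead; the value is then field 3
# (fields are: name, '=', value).
vals2=$(printf '%s\n' 'x = 1' 'y = 2' 'z = 3' | cut -d ' ' -f 3)
echo "$vals2"
```

Option 2 only works when every line has exactly one space on each side of the "=", so option 1 is the more forgiving of the two.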

You can also select fixed character positions (-c) or byte positions (-b). When you use -f without -d, fields are assumed to be separated by tabs.
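A short sketch of these other selection modes, using made-up input strings. Note that -f splits on tabs unless -d names a different delimiter.

```shell
# Characters 1 through 3 of each line.
chars=$(printf '%s\n' 'abcdef' | cut -c 1-3)
echo "$chars"

# Fields 1 and 3 of tab-separated input (tab is the default for -f).
fields=$(printf 'a\tb\tc\n' | cut -f 1,3)
echo "$fields"
```

The selected fields are printed joined by the same delimiter that was used to split them, so the second command prints a and c with a tab in between.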