Difference between revisions of "Tip 8: Data Manipulation in Unix"

Latest revision as of 22:24, 12 March 2015

In most projects, you ultimately end up with data in rows and columns in text files. How do you get it there? I assume that you are using "grep" to extract it from a file. How do you manipulate it once it is there? That is what this tip shows.

There are a few VERY useful Unix commands: paste, cut, sort, and uniq.

Sort will, you guessed it, sort the lines of your data. The only trick is that it sorts alphabetically by default, so "10" comes before "9". If you want numeric sorting, you must specify "-n". You can also specify "-r" for a reverse sort.
The "-k" option lets you specify the key (the column to sort on), for example "-k2" for the second column.
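As a rough illustration of these options, here is a minimal sketch. The file name sample.txt and its contents are made up for the example; the second column holds the numbers to sort on.

```shell
# Hypothetical sample data: a label in column 1, a number in column 2.
printf '%s\n' 'b 10' 'a 9' 'c 100' > sample.txt

# Without -n, column 2 is compared as text, so 10 and 100 sort before 9.
sort -k2 sample.txt

# -n compares numerically; -k2,2 restricts the key to column 2 only.
sort -n -k2,2 sample.txt

# Adding -r reverses the order, so the largest value comes first.
sort -n -r -k2,2 sample.txt
```

Writing the key as "-k2,2" rather than "-k2" is a deliberate choice here: "-k2" alone means "from column 2 to the end of the line", which only matters once the file has more than two columns.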
=== uniq ===
This command collapses adjacent repeated lines into one, so if the input is sorted you are left with only the unique values. If you give it the option "-c", it prefixes each line with the number of times it occurred, so that you can easily make a histogram like this:
  sort -n mydata.txt | uniq -c > newdata.txt
Then something like this in gnuplot (the count is in column 1 and the value in column 2, hence "using 2:1"):
  gnuplot> set boxwidth 1
  gnuplot> set style fill solid
  gnuplot> plot 'newdata.txt' using 2:1 with boxes
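To see what the histogram file looks like before plotting it, here is a small sketch on made-up data. The file names sample_values.txt, sample_hist.txt, and sample_hist_swapped.txt are hypothetical, and the awk step is an optional extra, not part of the recipe above.

```shell
# Hypothetical sample data: one integer per line, unsorted.
printf '%s\n' 3 1 2 3 1 3 > sample_values.txt

# Sort numerically so equal values are adjacent, then count each run.
sort -n sample_values.txt | uniq -c > sample_hist.txt

# Each line is "<count> <value>"; uniq pads the count with leading spaces.
cat sample_hist.txt

# Optional: swap the columns to "<value> <count>" with awk.
awk '{ print $2, $1 }' sample_hist.txt > sample_hist_swapped.txt
cat sample_hist_swapped.txt
```

If you plot the swapped file instead, the gnuplot column spec would be "using 1:2" rather than "using 2:1".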


