Data cleaning and spreadsheet software

Tuesday, 18 July 2017

Today we're going to look at one common way of manipulating CSV and other flat data files. We'll look at a few more command line tools to help us, and review how paste works once again.
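As a quick refresher, paste joins files side by side, line by line, and separates the columns with a tab unless told otherwise with -d. The sketch below is only an illustration: the file names and sample values are made up for this example and are not the class data.

```shell
# Two hypothetical one-column files (made-up sample data).
printf 'Ada\nGrace\n' > names.txt
printf '1815\n1906\n' > years.txt

# Default behaviour: columns are joined with a tab, giving a TSV file.
paste names.txt years.txt > people.tsv

# With -d we choose the delimiter ourselves, here a comma.
paste -d ',' names.txt years.txt > people.csv

cat people.csv
```

The first call produces tab-separated output and the second produces comma-separated output from the same two input files.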

We'll look at a few different ways of converting a TSV file into a CSV file.

Then we will open the compiled CSV file in a GUI environment (spreadsheet software), so we can better understand what we're doing on the command line.

Here are some exercises:

Translate, Edit, and Text-Processing

Use tr, sed and awk to change all the tabs in your TSV file to another separator character. (See "ANSWER: How do I convert a tab-separated values (TSV) file to a comma-separated values (CSV) file in BASH?," StackOverflow, last updated 15 March 2017, https://stackoverflow.com/questions/22419979/how-do-i-convert-a-tab-separated-values-tsv-file-to-a-comma-separated-values/22421445#22421445)

Figure out what each of these commands does and explain it to the class.
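To get started, here is a minimal sketch of one way to do the replacement with each of the three tools. It is not taken from the StackOverflow answer; the file names and sample rows are hypothetical, and the comma is just one possible separator.

```shell
# Build a tiny sample TSV file (made-up data, for illustration only).
printf 'name\tcity\tyear\nAda\tLondon\t1843\n' > sample.tsv

# tr: translate every tab character into a comma.
tr '\t' ',' < sample.tsv > tr.csv

# sed: substitute every tab on every line with a comma.
# The tab is produced with printf because '\t' inside a sed
# pattern is not understood by every version of sed.
tab=$(printf '\t')
sed "s/${tab}/,/g" sample.tsv > sed.csv

# awk: split input on tabs (FS) and join output with commas (OFS).
# Assigning $1 to itself makes awk rebuild the line using OFS.
awk 'BEGIN { FS = "\t"; OFS = "," } { $1 = $1; print }' sample.tsv > awk.csv

cat tr.csv
```

All three commands should give the same result on simple data like this. Note that none of them add quoting, so a field that already contains a comma or a quotation mark would produce a broken CSV file.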

Editing on the command line

Use vi to open your file, then find and replace all the tab separators. (Make sure to make a backup copy of your original file first.)
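A rough sketch of the steps, assuming a hypothetical file called sample.tsv. Only the backup is run as a command here; the editing itself is interactive, so the vi keystrokes are shown as comments. The \t pattern is an assumption that works in Vim; other versions of vi may need a literal tab typed with Ctrl-V followed by Tab.

```shell
# A small hypothetical TSV file to practise on (made-up data).
printf 'name\tcity\nAda\tLondon\n' > sample.tsv

# Make the backup copy before touching the original.
cp sample.tsv sample.tsv.bak

# Then open the file interactively:
#   vi sample.tsv
#
# Inside vi, type the following and press Enter after each line:
#   :%s/\t/,/g     replace every tab on every line with a comma
#   :wq            write the file and quit
```

If something goes wrong, the untouched original is still available as sample.tsv.bak.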

For Next Time

