Wednesday, June 4, 2014

Unix Tip: Working with CSV file in Unix


One of the unix problem I worked recently, thought its worth sharing.
The problem required to query over csv files in unix, like we query database using sql.
So, these commands work quite like SQL. :)

Requirement:

To get the first record for where 2nd column is null, that is, emp_name has been left blank for the below input file content.

Input File Name:

test_data.csv [1]

Input File Content:

employee_id,emp_name,emp_age
234,Ram,27
235,Shyam,21
236,Mohan,40
237,,25

Command:

sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$2 ~/NULL/'  | sed 's/,NULL,/,,/g;' | head -n 1

Description:


- sed 's/,,/,NULL,/g;' test_data.csv : replaces all empty values with the string "NULL"
- awk -F, '$2 ~/NULL/' : finds all the values where 2nd column is NULL, -F option is used for defining field separator
- sed 's/,NULL,/,,/g;' : reverts back all the NULL to empty values
- head -n 1 : prints only the first record, if you remove this you will get all the records. [2]



Other examples:

1) Get all employees age greater then 20

sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$3 > 20' | sed 's/,NULL,/,,/g;'

2) Get all employees whose employee_id is equal of 234

sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$1 == 234' | sed 's/,NULL,/,,/g;'

3) Get all employees whose employee_id is less than or equal to 235

sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$1 <= 235' | sed 's/,NULL,/,,/g;'


--------------------------------------------------------------------------------------------
[1] CSV: A CSV file contains comma separated values and can be easily created/edited with Microsoft Excel, almost like normal excel files.
[2] In case we don’t need entire row, we can also use cut command here to get only one field or a few fields

No comments:

Post a Comment

Prototype

Prototype is another creation pattern. Intent:  - Intent is to create objects by cloning existing instance  - Specify the kinds of obj...