One of the unix problem I worked recently, thought its worth sharing.
The problem required to query over csv files in unix, like we query database using sql.
So, these commands work quite like SQL. :)
Requirement:
To get the first record for where 2nd column is null, that is, emp_name has been left blank for the below input file content.
Input File Name:
test_data.csv [1]
Input File Content:
employee_id,emp_name,emp_age 234,Ram,27 235,Shyam,21 236,Mohan,40 237,,25
Command:
sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$2 ~/NULL/' | sed 's/,NULL,/,,/g;' | head -n 1
Description:
- sed 's/,,/,NULL,/g;' test_data.csv : replaces all empty values with the string "NULL"
- awk -F, '$2 ~/NULL/' : finds all the values where 2nd column is NULL, -F option is used for defining field separator
- sed 's/,NULL,/,,/g;' : reverts back all the NULL to empty values
- head -n 1 : prints only the first record, if you remove this you will get all the records. [2]
Other examples:
1) Get all employees age greater then 20
sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$3 > 20' | sed 's/,NULL,/,,/g;'
2) Get all employees whose employee_id is equal of 234
sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$1 == 234' | sed 's/,NULL,/,,/g;'
3) Get all employees whose employee_id is less than or equal to 235
sed 's/,,/,NULL,/g;' test_data.csv | awk -F, '$1 <= 235' | sed 's/,NULL,/,,/g;'
--------------------------------------------------------------------------------------------
[1] CSV: A CSV file contains comma separated values and can be easily created/edited with Microsoft Excel, almost like normal excel files.
[2] In case we don’t need entire row, we can also use cut command here to get only one field or a few fields
No comments:
Post a Comment