Unix: Frequently used Unix commands.

Posted: March 24, 2017 in UNIX

 File Cleanup Commands

My file is a tab delimiter. Move the line which don’t have 11 columns to bad file.
— awk -F”\t” ‘NF-1 != 11 {print $0} ‘

Display a range of columns using Awk

Using the below command one can display a range of columns for example column 5th to 10th etc.,
— awk -F”\t” ‘BEGIN { OFS=”\t” ; ORS=”\t” } { for (i=1; i<=14;i++) printf(“%s%s”, $i,(i==14) ? “\n” : OFS) }’

Check how many tabs present in each line

— awk -F”\t” ‘{print NF}’ FINL_Contracts_Data.csv | sort | uniq

count the tabs in each line and aggregate the value

— awk -F”\t” ‘{print NF}’ FINL_Contracts_Data.csv | sort | uniq -c

Replace newline character to null between double quotes.
— awk -v RS='”[^\t]*”‘ -v ORS= ‘{gsub(/\r/, ” “, RT); print $0  RT}’

Replace invisible “NUL” charter between alphabets.
— sed -i ‘s/\x0//g’

Remove all the non ASCII characters from the source file.
— perl -pi -e ‘s/[^[:ascii:]]//g’

Remove Blank line what might have white space or newline.
— sed -i ‘/^[[:space:]]*$/d;s/[[:space:]]*$//’

Remove the lines in a file that has just quotes (blank line just has Quotes)

— sed -i ‘/^\”$/d’   (More robust command would be sed -i ‘/^.$/d’)

To grep and see such line in the file use — grep “^\”$”

Question:- AWK comma as field separator and text inside double quotes as a field

Example:-
Column1,             Column2,                           Column13
aaa,          “hell world, test text”,bbb,ccc,”  test text”

Find the length of the second column in the above example.

Solution:- awk -F\” ‘{for(i=1;i<=NF;i+=2) {gsub(“,”, “;”, $i)}}1’ OFS= File_nm.txt | awk -F\; ‘{ if (length($2) > max) max = length($2) } END { print max }’

Question:- Find Max Length of a column using AWK.

— awk ‘max==”” {max=length($1)} END{ print max}’ FS=”,”

Question:- Get the length for all the column in the file.

awk -F”,” ‘BEGIN {OFS=”,”} NR==1 {for (b=1;b<=NF;b++) {hdr[b]=$b} } NR > 1 {for (i=1;i<=NF;i++) {if(length($i) > max[i]) max[i] = length($i)} } END {for (i=1;i <= NF;i++) print hdr[i],max[i]+0}’ Csv_file_nm.csv

Note:- If you have “|” pipe delimiter just change -F”,” to -F”|” and (optional) OFS=”,” to OFS=”|”

Question:- Merge lines based on a pattern
for more information on the scenario please refer the below link

http://stackoverflow.com/questions/43018458/tab-separated-file-has-data-issue-in-unix-the-data-is-a-mess

— perl -0pe ‘s/\n(?!([0-9]{6}|$))//g’

Question:- Get min value from a column

— awk ‘min==”” || $1 < min {min=$1} END{ print min}’ FS=”\t”

 

Question:- Lookup in Unix

awk ‘FNR==NR{a[$1]=$0;next}; $1 in a {print a[$1],”|”,$0}’ Actual_File.txt Clean_File.txt

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s