Unix: Convert XML file to ASCII/UTF-8 encoding

Posted: February 17, 2016 in UNIX

The UNIX command for converting a file's encoding is “iconv”.

Example: the command below converts a UTF-8 encoded XML file to ASCII.

iconv -f UTF-8 -t ASCII//TRANSLIT "$Inputfile" > "$Out_File"

What does //IGNORE do? It drops any character that cannot be represented in the target encoding.

What does //TRANSLIT do? When a character cannot be represented in the target encoding, iconv attempts to substitute a similar character from the target set.
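The difference between the two suffixes is easiest to see on a short string. The sketch below assumes an iconv that accepts both suffixes (GNU iconv does). The sample text and variable names are made up for illustration, and the exact substitution chosen by //TRANSLIT depends on the iconv implementation and the current locale.

```shell
# Sample text with one non-ASCII character: "é" encoded in UTF-8 as bytes C3 A9.
sample=$(printf 'caf\303\251')

# //TRANSLIT: iconv tries to approximate "é". The replacement it picks
# depends on the implementation and locale (it may be "e" or "?").
translit=$(printf '%s' "$sample" | iconv -f UTF-8 -t ASCII//TRANSLIT)

# //IGNORE: "é" is dropped. Some iconv versions still exit non-zero to signal
# that characters were discarded, so the status is ignored here with "|| true".
ignored=$(printf '%s' "$sample" | iconv -f UTF-8 -t ASCII//IGNORE || true)

printf 'TRANSLIT: %s\n' "$translit"
printf 'IGNORE:   %s\n' "$ignored"
```

Because //IGNORE silently loses data, //TRANSLIT is usually the safer choice when the output still has to be readable.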

For more information on this command, use “iconv --help”.

To list the supported encodings, use “iconv -l”.
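The list printed by “iconv -l” is long, so it can help to filter it. A small sketch follows. The helper name has_encoding is hypothetical, and the check is only a loose, case-insensitive substring match, because the layout of the list differs between iconv implementations.

```shell
# has_encoding NAME (hypothetical helper): succeeds if NAME appears anywhere
# in the output of "iconv -l". Case-insensitive, fixed-string substring match.
has_encoding() {
    iconv -l | grep -iF -- "$1" > /dev/null
}

if has_encoding "UTF-8"; then
    echo "UTF-8 is supported"
else
    echo "UTF-8 is not listed by this iconv" >&2
fi
```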

Full Script

curr_date="$(date +'%Y%m%d')"

Source_File_Dir=$1
Log_Dir=$2

# Both arguments are required; check them before writing to the log directory.
if [ -z "$1" ] || [ -z "$2" ]; then
    echo "Usage: $0 <source_dir> <log_dir>" >&2
    exit 1
fi

echo "*****The XMLencode script started for current date: ${curr_date}" > "$Log_Dir/XMLencode.log"

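echo "Source Directory: $Source_File_Dir" >> "$Log_Dir/XMLencode.log"

# Pass absolute paths: after this cd, a relative log directory would no longer resolve.
cd "$Source_File_Dir" || exit 1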

# Process every .xml file except those already converted (names ending in "_ASCII.xml").
for i in *.xml
do
    [ -e "$i" ] || continue                      # no .xml files matched the pattern
    case "$i" in *_ASCII.xml) continue ;; esac   # skip converted output

    echo "Files in source directory: $i" >> "$Log_Dir/XMLencode.log"

    # Derive the output name: data.xml -> data_ASCII.xml
    j="${i%.xml}_ASCII.xml"

    if [ ! -f "$j" ]; then

        echo "input file name is ${i}"
        echo "output file name is ${j}"

        echo "encoding conversion starts.."

        # Convert the UTF-8 XML to ASCII. The script has already changed into
        # $Source_File_Dir, so the bare file names are used.
        if ! iconv -f UTF-8 -t ASCII//TRANSLIT "$i" > "$j"; then

            echo "Unable to encode the file $Source_File_Dir/$i" >> "$Log_Dir/XMLencode.log"
            rm -f "$j"
            exit 1

        else

            echo "Encoding is successful for the file: $i" >> "$Log_Dir/XMLencode.log"

        fi

    fi
done
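
The script redirects iconv straight into the final output file, so a failed conversion has to be cleaned up afterwards. One possible variation, sketched below, writes to a temporary file first and renames it only on success. The function name convert_to_ascii and the ".tmp.$$" suffix are assumptions made for illustration and are not part of the script above.

```shell
# convert_to_ascii SRC DEST (hypothetical helper)
# Converts SRC from UTF-8 to ASCII. DEST is created only if iconv succeeds,
# so a failed run never leaves a partial output file behind.
convert_to_ascii() {
    src=$1
    dest=$2
    tmp="${dest}.tmp.$$"    # temporary name in the same directory as DEST

    if iconv -f UTF-8 -t ASCII//TRANSLIT "$src" > "$tmp"; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Inside the loop it could be called as: convert_to_ascii "$i" "$j" || exit 1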
