HOWTO: Bash script for converting all files in a directory to a different charset on a Linux box



This is a simple bash script that uses the iconv binary to convert all the files in a given directory. The most common use is converting a website from a local encoding to utf-8, for example from windows-1251 to utf-8. It saves you from typing the same command for every single file.
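
For reference, the per-file command that would otherwise be typed by hand looks roughly like the sketch below. The function name convert_one and the .converted suffix are made up for illustration and are not part of the script described in this article.

```shell
#!/bin/bash
# convert_one is a hypothetical helper name used only for this illustration.
# It converts a single file and writes the result next to it as FILE.converted.
convert_one() {
    local src=$1 from=$2 to=$3
    # Write to a different file: redirecting into "$src" itself would
    # truncate it before iconv gets a chance to read it.
    iconv -f "$from" -t "$to" "$src" > "$src.converted"
}

# Example call (file name is an assumption):
#   convert_one index.html cp1251 utf8
```

Because the output has to go to a different file than the input, any batch version needs some rename step around the iconv call.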

Installation

Usage

/path/to/dir_iconv.sh ~/txt1 cp1251 utf8 - converts all files in the directory ~/txt1 from cp1251 (windows-1251) to utf8.

To make it easier, put the file in /bin, /usr/bin or any other directory listed in the $PATH variable. The example above can then be run like this:

dir_iconv.sh ~/txt1 cp1251 utf8

The original files are kept with an .old extension.
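
If a conversion turns out wrong (for example, the source charset was guessed incorrectly), the .old copies can be moved back. The following is only a sketch under that assumption; restore_old is a hypothetical helper name and is not part of dir_iconv.sh.

```shell
#!/bin/bash
# restore_old is a hypothetical helper, not part of the original script.
# It moves every NAME.old in the given directory back to NAME,
# overwriting the converted copy.
restore_old() {
    local dir=$1 f
    for f in "$dir"/*.old
    do
        # When nothing matches, the pattern stays unexpanded; skip it
        [ -f "$f" ] || continue
        mv -- "$f" "${f%.old}"
    done
}

# Example call:
#   restore_old ~/txt1
```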

The code

Here's the code:

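#!/bin/bash

ICONVBIN='/usr/bin/iconv' # path to iconv binary

# Require all three arguments: directory, source charset, target charset
if [ $# -lt 3 ]
then
    echo "Usage: $0 dir from_charset to_charset"
    exit 1
fi

for f in "$1"/*
do
    if test -f "$f"
    then
        echo -e "\nConverting $f"
        # Keep the original as $f.old, then write the converted text to $f
        /bin/mv "$f" "$f.old"
        "$ICONVBIN" -f "$2" -t "$3" "$f.old" > "$f"
    else
        echo -e "\nSkipping $f - not a regular file"
    fi
done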
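
One of the comments below asks whether the script can be made recursive. A possible variant, sketched here with find, could look like the following. This is not the original author's code: the function name convert_tree is hypothetical, iconv is assumed to be on $PATH, and skipping *.old files and restoring the original on failure are additions of this sketch.

```shell
#!/bin/bash
# convert_tree is a hypothetical name; a sketch, not the original script.
convert_tree() {
    local dir=$1 from=$2 to=$3 f
    local files=()

    # Collect the file list first, so that renaming files during the
    # conversion cannot confuse the directory walk. Backups from an
    # earlier run (*.old) are skipped.
    while IFS= read -r -d '' f
    do
        files+=("$f")
    done < <(find "$dir" -type f ! -name '*.old' -print0)

    for f in "${files[@]}"
    do
        echo "Converting $f"
        mv -- "$f" "$f.old" || continue
        if ! iconv -f "$from" -t "$to" "$f.old" > "$f"
        then
            # Conversion failed: put the original back
            echo "Failed on $f, restoring the original" >&2
            mv -- "$f.old" "$f"
        fi
    done
}

# Example call:
#   convert_tree ~/txt1 cp1251 utf8
```

The NUL-separated list (-print0 with read -d '') is there so that file names containing spaces or newlines are handled; it requires bash rather than a plain POSIX shell.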

 

Comments:

wojmichal@...pl (08-01-2008 15:30) :
You can add an extra line after
$ICONVBIN -f $2 -t $3 $f.old > $f
to remove the temporary files:
rm -f $f.old

Agustincl (14-04-2009 17:55) :
Perfect code.
Is it possible to make it recursive?

Acl.


This page was last modified on 2010-02-09 01:25:18