Linux command for converting encoding

I need to use the command “file -i” to identify the file encoding and then convert to another encoding, such as ascii to utf-8.

My plan is to run file -i >> logfile.
Next is to check logfile for the encoding type and, if it is ASCII, convert to UTF-8.

There should only be a single line of output from file -i.

So, depending on the encoding type, I would execute iconv to convert from one encoding to the other. Can this be done using scripting?

Hi,

Thank you for using linuxconfig.org forums.

What about this simple shell script:

#!/bin/bash

# Quote "$1" so file names containing spaces are handled correctly
file -i "$1" | grep -qi ascii

if [ $? -eq 0 ]; then
    echo "ASCII FILE"
    echo "Executing some command"
else
    echo "NO ASCII FILE"
    echo "Converting to UTF-8 command"
fi

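We first check whether the file is ASCII. grep -q prints nothing and only sets its exit status, so $? is 0 when "ascii" was found in the output of file -i and non-zero otherwise. The bash conditional then acts on that value.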
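If it helps, here is a minimal sketch of the same idea with the placeholder echo lines replaced by an actual iconv call. It tests the pipeline directly in the if-statement instead of reading $? afterwards. The function name to_utf8 and the ".utf8" output suffix are my own assumptions, not something from the original script, and it assumes the GNU file command found on most Linux systems.

```shell
#!/bin/bash
# Sketch: test the pipeline directly in the if-statement and run iconv.
# to_utf8 and the ".utf8" suffix are hypothetical names for this example.

to_utf8() {
    local src=$1
    # -b omits the file name, so a name containing "ascii" cannot match.
    if file -bi "$src" | grep -qi 'charset=us-ascii'; then
        # ASCII is a subset of UTF-8, so the copy has identical bytes.
        iconv -f ASCII -t UTF-8 "$src" > "$src.utf8"
        echo "converted: $src.utf8"
    else
        echo "not ASCII: $src"
    fi
}

# Run only when a file name was given on the command line.
if [ "$#" -gt 0 ]; then
    to_utf8 "$1"
fi
```

Saved as a script, it would be called the same way as convert.sh, with the file name as the only argument.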

Copy the above code into a file called, e.g., convert.sh and make it executable:

$ chmod +x convert.sh

Once ready, execute the following:

$ echo TEST > file.txt
$ file file.txt
file.txt: ASCII text
$ ./convert.sh file.txt
ASCII FILE
Executing some command

Visit the following page for more information on how to create and execute bash scripts.

Not sure if this is exactly what you need, but it should give you a good start.

Hope this helps.

Lubos

ChatGPT

Yes, this can be done with a script, for example in Bash.

Here’s an example script that performs the steps you mentioned:
#!/bin/bash

# Run file -i and store the output in a logfile
file -i <file_name> >> logfile

# Read the encoding type from the logfile
encoding=$(awk -F'charset=' '{print $2}' logfile)

# Remove trailing newline characters from the encoding
encoding=$(echo "$encoding" | tr -d '\n')

# Check if the encoding is ASCII
if [[ $encoding == "us-ascii" ]]; then
    # Convert ASCII to UTF-8 using iconv
    iconv -f ASCII -t UTF-8 <file_name> > output_file
    echo "File converted from ASCII to UTF-8."
else
    echo "File is not in ASCII encoding."
fi

# Remove the logfile
rm logfile

Here’s an explanation of the script:

The script runs the file -i command on the input file and appends the output to a logfile.
It then extracts the encoding type from the logfile using awk and stores it in the encoding variable.
The script checks if the encoding is ASCII (us-ascii).
If the encoding is ASCII, it uses iconv to convert the file from ASCII to UTF-8 and saves the output to a separate file named output_file.
If the encoding is not ASCII, it displays a message indicating that the file is not in ASCII encoding.
Finally, the script removes the logfile.

Make sure to replace <file_name> with the actual name or path of the file you want to convert.
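As an alternative to editing the placeholder, here is a rough sketch that takes the file name as its first argument and skips the logfile by capturing the output of file directly. It also tries to cover the "depending on the encoding type" part of the question by handing whatever encoding file reports to iconv. The function name convert_to_utf8 and the ".utf8" suffix are assumptions for illustration. Keep in mind that file only guesses the encoding from the content, so the result is worth checking by eye.

```shell
#!/bin/bash
# Sketch: no logfile, file name passed as an argument.
# convert_to_utf8 and the ".utf8" suffix are hypothetical names.

convert_to_utf8() {
    local src=$1 encoding
    # -b drops the file name; --mime-encoding prints only the charset.
    encoding=$(file -b --mime-encoding "$src")

    case $encoding in
        us-ascii|utf-8)
            # ASCII is already valid UTF-8, so nothing needs converting.
            echo "$src: $encoding, already valid UTF-8"
            ;;
        binary|unknown-8bit)
            echo "$src: cannot determine a text encoding ($encoding)" >&2
            return 1
            ;;
        *)
            # Pass the detected name straight to iconv; this fails if
            # iconv does not know the name that file reported.
            iconv -f "$encoding" -t UTF-8 "$src" > "$src.utf8" &&
                echo "$src: converted from $encoding to $src.utf8"
            ;;
    esac
}

# Run only when a file name was given on the command line.
if [ "$#" -gt 0 ]; then
    convert_to_utf8 "$1"
fi
```

With this variant the original file is never overwritten; the converted copy lands next to it with the extra suffix.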

You can save the script in a file (e.g., convert_encoding.sh), make it executable (chmod +x convert_encoding.sh), and then run it using ./convert_encoding.sh in the terminal.

Please note that this script assumes that the file and iconv commands are available in your system.
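That assumption can also be checked by the script itself. Below is a small sketch using the shell builtin command -v; the helper name require_commands is made up for this example.

```shell
#!/bin/bash
# Sketch: stop early when a required command is missing.
# require_commands is a hypothetical helper name.

require_commands() {
    local cmd missing=0
    for cmd in "$@"; do
        # command -v prints nothing and returns non-zero when not found.
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example guard for the top of a conversion script:
# require_commands file iconv || exit 1
```

Uncommenting the last line at the top of a conversion script would make it exit with status 1 and name the missing tool before any conversion is attempted.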