This program removes invalid UTF-8 multibyte sequences from a file, working line by line. It also collapses each run of whitespace to a single space and replaces some invalid XHTML entities.
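The per-line cleanup described above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the script's actual code: the helper name `clean_line` is hypothetical, and the XHTML entity replacement is left out because the affected entities are not listed here.

```python
import re

# Hypothetical helper for illustration; not taken from removeInvalidUTF8.py.
_WHITESPACE_RUN = re.compile(r"\s+")


def clean_line(raw: bytes) -> str:
    """Decode one line, dropping invalid UTF-8 bytes, and collapse whitespace."""
    # errors="ignore" silently drops bytes that do not form valid UTF-8.
    text = raw.decode("utf-8", errors="ignore")
    # Collapse every run of whitespace characters into a single space.
    return _WHITESPACE_RUN.sub(" ", text)
```

Note that a trailing newline also counts as whitespace in this sketch, so a caller that wants to keep line endings would strip and re-append them around the call.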
Usage: $ ./removeInvalidUTF8.py [options]
Command line options:
  -i <file> | --input=<file>     File with invalid UTF-8 control characters
  -o <file> | --output=<file>    Output file
  -h | --help                    This text
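For readers who want to reproduce this interface, the options could be parsed along these lines. This is a sketch under assumptions: it uses Python 3's `argparse`, which may differ from what the script itself uses, and the function name `parse_args` is hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse -i/--input and -o/--output; argparse adds -h/--help itself."""
    parser = argparse.ArgumentParser(
        prog="removeInvalidUTF8.py",
        description="Remove invalid UTF-8 multibyte sequences from a file.",
    )
    # Both options are left optional here; the original page does not say
    # whether the script requires them.
    parser.add_argument("-i", "--input", metavar="<file>",
                        help="File with invalid UTF-8 control characters")
    parser.add_argument("-o", "--output", metavar="<file>",
                        help="Output file")
    return parser.parse_args(argv)
```

Both the short form (`-i in.txt`) and the long form (`--input=in.txt`) are accepted by this parser.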
Author: Johannes Schwenk
Copyright: 2010, Johannes Schwenk
Version: 2.0
Date: 2010-09-15
This function opens the input and the output file, telling the codec to replace the multibyte sequence with
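As an illustration of opening files with a replacing codec, here is a minimal Python 3 sketch. It assumes the built-in `open` with `errors="replace"`, which substitutes U+FFFD (the Unicode replacement character) for bytes that cannot be decoded; the replacement used by the original function is not stated on this page and may differ. The function name `copy_with_replacement` is hypothetical.

```python
def copy_with_replacement(input_path, output_path):
    """Copy a file line by line, replacing undecodable UTF-8 bytes."""
    # errors="replace" makes the decoder emit U+FFFD instead of raising
    # UnicodeDecodeError; newline="" keeps line endings untranslated.
    with open(input_path, "r", encoding="utf-8",
              errors="replace", newline="") as src, \
         open(output_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

Reading line by line keeps memory use bounded for large inputs, which matches the line-oriented processing the module description mentions.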
Generated by Epydoc 3.0.1 on Thu Sep 16 13:42:03 2010 | http://epydoc.sourceforge.net |