unicode - Dreaded Python encoding errors, how to stop them?


Why is this so frustrating? It seems that I cannot control my console's encoding, even though my browser and word processor handle these characters fine. I do not have a master list of all the potential characters that are choking it, so what is the best way to get rid of these errors without modifying my data?

  'charmap' codec can't encode character u'\xca'

You need to know the encoding of your console (which depends on the system, OS, etc.). Unfortunately, 'charmap' is a somewhat ambiguous identification for a codec. As the Python documentation explains:

The second group of encodings (the so-called charmap encodings) choose different subsets of all Unicode code points and define how these code points are mapped to the bytes 0x0-0xff. To see how this is done, simply open encodings/cp1252.py (an encoding that is used primarily on Windows). It contains a string constant with 256 characters that shows you which character is mapped to which byte value.

All of these encodings can only encode 256 of the 65536 (or 1114111) code points defined in Unicode.

That is, 'charmap' identifies a set of possible codecs, not a specific one.
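To see the point about charmap encodings for yourself, here is a minimal sketch. It assumes Python 3 (the error message in the question looks like Python 2, where the details differ slightly), and the helper names `can_encode` and `error_codec_name` are made up for illustration:

```python
import encodings.cp1252

# The decoding table is a single string: the index is the byte value,
# the character at that index is the code point the byte stands for.
table = encodings.cp1252.decoding_table


def can_encode(text, codec):
    """Return True if every character of `text` exists in `codec`."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def error_codec_name(text, codec):
    """Return the codec name reported by a failed encode, or None."""
    try:
        text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc.encoding
    return None


print(len(table))                            # number of byte values covered
print(ascii(table[0x80]))                    # character mapped to byte 0x80
print(can_encode("\u20ac", "cp1252"))        # euro sign
print(can_encode("\u4e2d", "cp1252"))        # a CJK character
print(error_codec_name("\u4e2d", "cp1252"))  # name the error reports
```

Note that the failure reports the generic charmap machinery rather than cp1252 itself, which is why the error message alone does not tell you which codec your console is using.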

Once you know that your console uses a codec called 'foobar', change your statement from

  print(someunicode)

to

  print(someunicode.encode('foobar'))
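If you do not want to hard-code the codec name, something along these lines may help. It is only a sketch under a few assumptions: Python 3, `sys.stdout.encoding` reflecting what the console really uses, and hypothetical helper names `console_encoding` and `printable`. Passing an `errors` handler is an addition to the advice above: it swaps unencodable characters for a placeholder in the printed copy only, so the original string is left untouched.

```python
import sys


def console_encoding(default="ascii"):
    """Best guess at the console codec; falls back to `default`."""
    return getattr(sys.stdout, "encoding", None) or default


def printable(text, codec=None, errors="replace"):
    """Return a copy of `text` that `codec` can represent.

    Characters missing from the codec are handled by `errors`
    ("replace" gives "?", "backslashreplace" gives an escape).
    """
    codec = codec or console_encoding()
    return text.encode(codec, errors).decode(codec)


someunicode = "caf\xe9 \xca"

# Safe to print on whatever console this runs on.
print(printable(someunicode))

# Simulating a cp437 console, which has no capital E with circumflex.
print(ascii(printable(someunicode, "cp437")))
print(printable(someunicode, "ascii", "backslashreplace"))
```

Whether "?" or an escape sequence is the better placeholder depends on whether you need to recover the original character from the console output later.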
