3.1.python
https://stackoverflow.com/questions/5552555/unicodedecodeerror-invalid-continuation-byte
Because UTF-8 is a multibyte encoding, and no character corresponds to \xe9 followed by a space: \xe9 opens a multibyte sequence, so the next byte must be a continuation byte, and a space is not one.
Why should it succeed in both utf-8 and latin-1?
Here is how the same sentence should look in UTF-8:
>>> o.decode('latin-1').encode("utf-8")
'a test of \xc3\xa9 char'
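The snippet above uses Python 2 string methods. As a rough Python 3 sketch of the same point (the sample bytes are an assumption modelled on the quoted output, not taken from the original question), decoding Latin-1 bytes as UTF-8 fails, while decoding them as Latin-1 and re-encoding as UTF-8 yields the two-byte form:

```python
# Assumed sample: the sentence encoded as Latin-1, where e-acute is the single byte 0xE9.
raw = b"a test of \xe9 char"

# 0xE9 opens a multibyte UTF-8 sequence, but the next byte is a space,
# which is not a continuation byte, so UTF-8 decoding fails.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("latin-1")

# Re-encoding as UTF-8 turns 0xE9 into the two bytes 0xC3 0xA9.
utf8_bytes = text.encode("utf-8")

print(utf8_error)
print(utf8_bytes)
```

Latin-1 maps every byte value to a character, so decoding with it never raises, even when the result is not the text the author intended.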
https://stackoverflow.com/questions/3942888/unicodeencodeerror-latin-1-codec-cant-encode-character
https://stackoverflow.com/questions/878972/windows-cmd-encoding-change-causes-python-crash
https://www.ptt.cc/bbs/Python/M.1303532664.A.3D6.html
https://www.v2ex.com/t/104648
Indicate a vertex component is detached or not
=================Check 'charmap' codec can't decode byte 0x8f in position 17: character maps to <undefined>
=================Check 'charmap' codec can't decode byte 0x90 in position 4559: character maps to <undefined>
In Python 3, files are opened in text mode by default and decoded to Unicode for you, so you don't need to tell BeautifulSoup which codec to decode from.
If decoding the data fails, it is because the open() call was not told which codec to use when reading the file and fell back to the platform default; pass the correct codec through the encoding argument, for example open(path, encoding="utf-8").
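A minimal sketch of passing an explicit codec to open(). The temporary file, its GBK content, and the variable names are assumptions made so the example is self-contained; in practice the path and codec would be those of the real input file:

```python
import os
import tempfile

# Hypothetical sample text ("中文"), written to disk as GBK bytes.
sample = "\u4e2d\u6587"
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(sample.encode("gbk"))
    path = tmp.name

# Name the codec explicitly instead of relying on the platform default
# (a 'charmap' code page on many Windows setups).
with open(path, encoding="gbk") as fh:
    content = fh.read()

os.remove(path)
print(content == sample)
```

The decoded string can then be handed to a parser as ordinary text, with no further codec hint needed.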
=================Check 'utf-8' codec can't decode byte 0xc7 in position 17: invalid continuation byte
=======================building line
[2017-09-11 18:47:56]Export 'D:\DB_FILE\125098_75810_2017911_184748\line.txt' begins...
[2017-09-11 18:47:56]Total Line count: 175
[2017-09-11 18:47:56]FormDBFormat end, Time spend: 0:00:00.080714
'gbk' codec can't encode character '\xf4' in position 262: illegal multibyte sequence
=======================building line
[2017-09-11 19:06:30]Export 'D:\DB_FILE\125098_75812_2017911_19621\line.txt' begins...
[2017-09-11 19:06:30]Total Line count: 175
[2017-09-11 19:06:30]FormDBFormat end, Time spend: 0:00:00.091773
'latin-1' codec can't encode characters in position 17-19: ordinal not in range(256)
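The two encode failures in these logs can be reproduced in isolation. The sketch below is illustrative only: the helper name encode_or_reason and the sample strings are assumptions, not part of the export tool that produced the logs. It shows that a codec able to represent every character (UTF-8 here) avoids the error, and that errors="replace" is a lossy fallback:

```python
def encode_or_reason(text, codec):
    """Return the encoded bytes, or the error reason if the codec cannot represent the text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as exc:
        return exc.reason

# '\xf4' (o with circumflex) has no GBK representation, as in the first log.
gbk_result = encode_or_reason("\xf4", "gbk")

# Characters above U+00FF cannot be encoded as Latin-1, as in the second log.
latin1_result = encode_or_reason("\u4e2d\u6587", "latin-1")

# UTF-8 can encode every Unicode character, so both strings succeed.
utf8_result = encode_or_reason("\xf4 \u4e2d\u6587", "utf-8")

# Lossy fallback: unencodable characters become '?'.
lossy = "\xf4".encode("gbk", errors="replace")

print(gbk_result)
print(latin1_result)
print(lossy)
```

When writing a file, the same choice applies to open(path, "w", encoding=...): pick a codec that covers all characters in the data, or decide explicitly how to handle the ones it cannot represent.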