UnicodeDecodeError in Python with codecs module

Developer https://www.devze.com 2023-03-17 10:12 Source: Internet
I have a text file containing Unicode strings such as "aBiyukÙwa", "varcasÙva", etc. When I decode them in the Python interpreter using the following code, it works fine and yields u'aBiyuk\xd9wa':

"aBiyukÙwa".decode("utf-8")

But when I read the same text from a file in a Python program, using the codecs module as in the following code, it throws a UnicodeDecodeError.

file = codecs.open('/home/abehl/TokenOutput.wx', 'r', 'utf-8')
for row in file:

Following is the error message:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 8: invalid continuation byte

Any ideas what is causing this strange behavior?


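Your file is not encoded in UTF-8. The byte 0xd9 in the error message is how "Ù" is stored in single-byte encodings such as Latin-1 (ISO-8859-1); in UTF-8 the same character is the two bytes 0xc3 0x99. Find out what the file is actually encoded in, and then pass that encoding to codecs.open instead.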
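As a rough illustration of the point above, the sketch below writes a small temporary file as Latin-1, shows that reading it as UTF-8 fails, and then reads it back with the matching encoding. Latin-1 is only an assumption here, as the question does not say what the real file is encoded in. The names `sample`, `read_rows` and `demo` are made up for this example, and it uses `io.open` rather than `codecs.open` so that it runs on both Python 2.6+ and Python 3; both accept an encoding in the same way.

```python
import io
import os
import tempfile

# Hypothetical sample data: one of the words from the question.
# Encoded as Latin-1, the "Ù" (U+00D9) becomes the single byte 0xd9.
sample = u"aBiyuk\xd9wa\n"


def read_rows(path, encoding):
    """Return all lines of the file, decoded with the given encoding."""
    with io.open(path, "r", encoding=encoding) as handle:
        return [row for row in handle]


def demo():
    """Write a Latin-1 file, try UTF-8 (fails), then read it as Latin-1."""
    fd, path = tempfile.mkstemp(suffix=".wx")
    try:
        with os.fdopen(fd, "wb") as raw:
            raw.write(sample.encode("latin-1"))

        try:
            read_rows(path, "utf-8")
            utf8_failed = False
        except UnicodeDecodeError:
            # 0xd9 starts a two-byte UTF-8 sequence, but the next byte
            # ("w") is not a valid continuation byte.
            utf8_failed = True

        rows = read_rows(path, "latin-1")
    finally:
        os.remove(path)
    return utf8_failed, rows


if __name__ == "__main__":
    failed, rows = demo()
    print(failed)
    print(repr(rows))
```

Note that Latin-1 maps every possible byte to some character, so decoding with it never raises an error. A clean read therefore does not prove the guess is right; check that the decoded text looks as expected before settling on an encoding.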
