Ignore Unicode Error

Developer https://www.devze.com 2023-04-09 18:29 Source: Internet

When I loop over a bunch of URLs to find all links (in certain divs) on those pages, I get this error:

Traceback (most recent call last):
  File "file_location", line 38, in <module>
    out.writerow(tag['href'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 0: ordinal not in range(128)

The code I have written related to this error is:

out = csv.writer(open("file_location", "ab"), delimiter=";")
for tag in soup_3.findAll('a', href=True):
    out.writerow(tag['href'])

Is there a way to solve this, possibly using an if statement to ignore any URLs that have Unicode errors?

Thanks in advance for your help.

You can wrap the writerow method call in a try and catch the exception to ignore it:

for tag in soup_3.findAll('a', href=True):
    try:
        out.writerow(tag['href'])
    except UnicodeEncodeError:
        pass

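However, you almost certainly want to pick an encoding other than ASCII for your CSV file (UTF-8 unless you have a very good reason to use something else), and open it with codecs.open() instead of the built-in open(). Keep in mind that the Python 2 csv module does not support Unicode input, so you may also need to encode each value yourself, e.g. tag['href'].encode('utf-8'), before passing it to writerow().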
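
For comparison, here is a minimal sketch of the "write UTF-8 instead of ASCII" approach. It assumes Python 3, where the built-in open() accepts an encoding argument (the traceback in the question comes from Python 2, so this is not a drop-in fix for that script). The helper name write_links and the plain list of hrefs are hypothetical, used only to keep the example self-contained without BeautifulSoup.

```python
import csv


def write_links(hrefs, path):
    """Append each href as a one-column row to a UTF-8 encoded CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", encoding="utf-8", newline="") as handle:
        out = csv.writer(handle, delimiter=";")
        for href in hrefs:
            # Wrap the value in a list: writerow() treats a bare string
            # as a sequence of one-character fields.
            out.writerow([href])
```

With a soup object like the one in the question, this could presumably be called as write_links([tag['href'] for tag in soup_3.findAll('a', href=True)], "file_location"). Because the file is written as UTF-8, characters such as u'\u2026' are stored rather than skipped, so no URLs are lost.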
