I don't know exactly what causes this error or how to fix it. I get it when I run my script:
Traceback (most recent call last):
  File "t1.py", line 86, in <module>
    write_results(results)
  File "t1.py", line 34, in write_results
    dw.writerows(results)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 154, in writerows
    return self.writer.writerows(rows)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Any explanation would be much appreciated!
I changed the code, and now I get this error:
  File "t1.py", line 88, in <module>
    write_results(results)
  File "t1.py", line 35, in write_results
    dw.writerows(results)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/csv.py", line 154, in writerows
    return self.writer.writerows(rows)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
Here's the change:
with codecs.open('results.csv', 'wb', 'utf-8') as f:
    dw = csv.DictWriter(f, fieldnames=fields, delimiter='|')
    dw.writer.writerow(dw.fieldnames)
    dw.writerows(results)
The error is raised by this part of the code:
with open('results.csv', 'w') as f:
    dw = csv.DictWriter(f, fieldnames=fields, delimiter='|')
    dw.writer.writerow(dw.fieldnames)
    dw.writerows(results)
You're opening an ASCII file and then trying to write non-ASCII data to it. I guess that whoever wrote that script never happened to encounter a non-ASCII character during testing, so they never ran into this error.
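The failure can be reproduced in isolation. This is a minimal sketch, assuming the implicit conversion inside the writer behaves like an explicit ASCII encode; the sample string is made up for illustration and is not taken from the scraped data:

```python
# Illustrative value only: two non-ASCII characters, standing in for a
# scraped title. It is not taken from the original results.
title = u"\u4e2d\u6587"

try:
    # Encoding to ASCII fails for any code point above 127.
    title.encode("ascii")
    message = None
except UnicodeEncodeError as exc:
    message = str(exc)

# Encoding to UTF-8 works for any Unicode text.
encoded = title.encode("utf-8")

print(message)
print(repr(encoded))
```

The message should read like the last line of the traceback, and "position 0-1" means the first two characters of the offending value are the ones that cannot be represented in ASCII.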
But if you look at the docs for the csv module, you'll see that the module can't correctly handle Unicode strings (which is what Beautiful Soup returns), that CSV files always have to be opened in binary mode, and that only UTF-8 or ASCII are safe to write.
So you need to encode all the strings to UTF-8 before writing them. I first thought that it would suffice to encode the strings on writing, but the Python 2 csv module chokes on the Unicode strings anyway. So I guess there's no other way but to encode each string explicitly:
In parse_results(), change the line
results.append({'url': url, 'create_date': create_date, 'title': title})
to
results.append({'url': url, 'create_date': create_date, 'title': title.encode("utf-8")})
That might already be sufficient, since I don't expect URLs or dates to contain non-ASCII characters.
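If other fields turn out to contain non-ASCII text as well, the same idea can be applied to every value in a row. This is only a sketch, assuming results is a list of dicts as in the question; encode_row is a hypothetical helper name, not something provided by the csv module:

```python
# Hypothetical helper, not part of the csv module.
text_type = type(u"")  # `unicode` on Python 2, `str` on Python 3


def encode_row(row, encoding="utf-8"):
    """Return a copy of `row` with every text value encoded to bytes.

    Values that are not text (numbers, None, byte strings on Python 2)
    are passed through unchanged.
    """
    return dict(
        (key, value.encode(encoding) if isinstance(value, text_type) else value)
        for key, value in row.items()
    )


# Made-up sample row for illustration.
sample = {"url": u"http://example.com/", "title": u"caf\u00e9", "rank": 3}
encoded_row = encode_row(sample)
```

In write_results() you could then pass [encode_row(r) for r in results] to dw.writerows(), with the file opened in binary mode ('wb'). This applies to Python 2 only: on Python 3 the csv module accepts text directly, so opening the file with open('results.csv', 'w', newline='', encoding='utf-8') is enough and no per-string encoding is needed.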