Question


A Python encoding problem: a small example that is very confusing


# -*- coding: utf-8 -*-
'''
Created on 2015-10-08
'''

def main():
    s = u"你好"
    d = {'id':001, 'text':s}
    s1 = "你好"
    d1 = {'id':002, 'text':s1}
    print d
    print s
    print "------------"
    print d1
    print s1

if __name__ == "__main__": main()

Output:


{'text': u'\u4f60\u597d', 'id': 1}
你好
------------
{'text': '\xe4\xbd\xa0\xe5\xa5\xbd', 'id': 2}
你好

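Why do the strings show up as normal Chinese characters when printed directly, while inside the dictionary they appear as \uxxxx or \x.. escapes?
Could someone explain this?
PS: I ran into the same dictionary behavior when storing Chinese text with sqlite3 and when scraping Chinese data with scrapy. It is quite a headache.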

python2.7 python-crawler character-encoding

10 years, 8 months ago

迷离的大葱


Answer 1


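What you are seeing is only the escaped representation of the strings, so there is nothing to worry about.
In Python, printing an object that is not a str implicitly calls the object's __str__ method (in effect, it converts the object to a string).
For dict (and for list, tuple, and many other built-in Python objects), __str__ builds its result from the repr() of each element, and in Python 2 the repr() of a string escapes every non-ASCII character (\uXXXX for unicode, \xNN for the bytes of a str), so the output is pure ASCII.

If you print d['text'] or print d1['text'], you will see the result you expect.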
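If the goal is only to display a whole dictionary with readable characters, one possible approach is to serialize it with the standard json module instead of relying on the dict's own string conversion. This is a minimal sketch, not something the answer itself proposes; the variable name readable is made up for illustration, and the output is JSON syntax (double quotes) rather than Python dict syntax.

```python
# -*- coding: utf-8 -*-
import json

# Same shape as the dict in the question; the text is a unicode string.
d = {'id': 1, 'text': u'你好'}

# Indexing returns the string itself, so print shows the characters.
print(d['text'])

# ensure_ascii=False tells json.dumps not to escape non-ASCII characters;
# sort_keys=True just makes the key order predictable.
readable = json.dumps(d, ensure_ascii=False, sort_keys=True)
print(readable)  # {"id": 1, "text": "你好"}
```

The data stored in the dict is the same either way; only the way it is displayed changes.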

Here is an example:


class A:
    def __str__(self):
        return "hello"

a = A()
print a

The program above prints hello.
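To connect this back to the dictionary case, the sketch below extends the class with a __repr__ method (an addition for illustration, not part of the example above) and prints the object both directly and inside a list. Printing the object directly uses __str__, while the list formats its elements with repr(), which is the same mechanism that produces the escaped strings inside a dict.

```python
class A(object):
    def __str__(self):
        # Used when the object itself is printed.
        return "hello"

    def __repr__(self):
        # Used by containers such as list and dict for their elements.
        return "A()"

a = A()
print(a)    # hello
print([a])  # [A()]
```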

answered 10 years, 8 months ago

一般会社員
