Python UnicodeEncodeError and TypeError: can only concatenate str (not "bytes") to str

python unicode

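I have a problem: I'm trying to search with the Google Custom Search API for Python, but when I search for something stored in a variable instead of typing it in by hand, I get UnicodeEncodeError: 'ascii' codec can't encode character '\xa2' in position 104: ordinal not in range(128). When I work around it with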

    .encode('ascii', 'ignore').decode('ascii')

it shows a different error instead:

    TypeError: can only concatenate str (not "bytes") to str.

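I have also tried things like str() and .decode.

Edit: the input stored in the variable comes from Pytesseract, which reads the text of an image. I store that text in a variable and then try to search for it with the Google Custom Search API. When the Unicode error appeared, I looked for a solution and found that I could re-encode the variable so the error no longer occurs. That did fix it, but now there is another problem: TypeError: can only concatenate str (not "bytes") to str. So I can't use .decode, because it leads to another error. What can I do?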

Edit 2.0

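    text_photo = pytesseract.image_to_string(img2)  # read the text from the image into a variable
    text_photo = text_photo.replace('\r', '').replace('\n', '')  # remove the \r and \n characters

    rawData = urllib.request.urlopen(url_google_1 + text_photo1 + '+' + text_photo2 + url_google_2).read()

url_google_1 contains the first part of the link for the Google search (the API key, ...), and url_google_2 contains what I want to get back from Google. In between I add the variables, because they hold what I want to search for. If I write hello it works perfectly; the problem is that the text Tesseract produces is in an incompatible format. I tried str(text_photo) and .decode, but neither works.

    json_data = json.loads(rawData)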

I can't follow all the details of your specific problem, but I'm fairly sure the root cause is the following:

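Python 3 distinguishes between two string types, str and bytes, which are similar but incompatible.

Once you understand what that means, what each of them can and cannot do, and how to get from one to the other, I'm confident you can work out how to construct the URL for the API call correctly.

Different types, not compatible: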

>>> type('abc'), type(b'abc')
(<class 'str'>, <class 'bytes'>)

>>> 'abc' + b'abc'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes

>>> b'abc' + 'abc'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat str to bytes
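
To make such a concatenation work, both operands have to be the same type first. A minimal sketch, assuming the bytes hold UTF-8 encoded text (the variable names are illustrative only):

```python
text = "abc"   # str: a sequence of Unicode characters
raw = b"abc"   # bytes: a sequence of octets

# Option 1: decode the bytes, then concatenate two str objects.
as_text = text + raw.decode("utf-8")

# Option 2: encode the str, then concatenate two bytes objects.
as_bytes = text.encode("utf-8") + raw

print(type(as_text).__name__, as_text)
print(type(as_bytes).__name__, as_bytes)
```

Which option is appropriate depends on what the receiving function expects; for text that ends up in a URL string, the first is usually the one you want.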

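The str.encode and bytes.decode methods take an optional encoding= parameter, which defaults to UTF-8. This parameter defines the mapping between the characters in a str and the octets in a bytes object. If a character cannot be mapped to bytes with the given encoding, you get a UnicodeEncodeError. This happens when you use characters that are not defined in the given mapping: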

>>> '5 £'.encode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\xa3' in position 2: ordinal not in range(128)

You can avoid the exception with the errors="ignore" strategy, but you lose information that way:

>>> '5 £'.encode('ascii', errors='ignore')
b'5 '
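
For comparison, here is a short sketch of some other ways to handle the same string. It is not meant as a recommendation for the question's exact case, only as an illustration that encoding to UTF-8 keeps the character, while the different errors= strategies for ASCII each lose or rewrite it:

```python
amount = "5 £"

# UTF-8 can represent every Unicode character, so the round trip is lossless.
utf8_bytes = amount.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# With ASCII, the errors= strategy decides what happens to "£".
ignored = amount.encode("ascii", errors="ignore")            # character is dropped
replaced = amount.encode("ascii", errors="replace")          # character becomes "?"
escaped = amount.encode("ascii", errors="backslashreplace")  # character becomes "\xa3"

print(utf8_bytes, ignored, replaced, escaped)
```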

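In general, if you are working with text, use str everywhere. You also should not need to call .encode/.decode directly very often; file handlers and similar objects usually accept str and convert it to bytes behind the scenes.

In your case, you need to find out where and why str and bytes are being mixed, and then make sure everything has the same type before concatenating.
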
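
Decoding has the mirror-image problem: bytes that are not valid in the given encoding raise a UnicodeDecodeError:

>>> b = '5 £'.encode('utf8')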
>>> b
b'5 \xc2\xa3'
>>> b.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 2: ordinal not in range(128)
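
One way to act on the advice of giving everything the same type is a small normalizing helper. The following is only a sketch: to_text is a hypothetical name, and it assumes that any bytes it receives are UTF-8 encoded text.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    if isinstance(value, str):
        return value
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")


# A mixed list of str and bytes pieces, normalized before joining.
parts = ["5 ", "£".encode("utf-8"), " per item"]
sentence = "".join(to_text(p) for p in parts)
print(sentence)
```

Raising on unexpected types, rather than calling str() on them, keeps a stray b'...' representation from silently ending up in the result.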
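
For the URL in the question, a likely culprit is that non-ASCII characters from the OCR output end up unescaped in the URL. Rather than dropping them with errors='ignore', the query can be percent-encoded with urllib.parse.urlencode, which keeps everything as str. This is a sketch under assumptions: the endpoint, the parameter names key, cx and q, and the placeholder credentials are not taken from the question and stand in for whatever url_google_1 and url_google_2 contain.

```python
from urllib.parse import urlencode

# Assumed endpoint and placeholder credentials; substitute your own values.
BASE_URL = "https://www.googleapis.com/customsearch/v1"
API_KEY = "YOUR_API_KEY"
ENGINE_ID = "YOUR_SEARCH_ENGINE_ID"


def build_search_url(query):
    """Build a search URL in which the query is percent-encoded as UTF-8."""
    params = {"key": API_KEY, "cx": ENGINE_ID, "q": query}
    return BASE_URL + "?" + urlencode(params)


# Stand-in for the OCR output; "\xa2" is the cent sign from the error message.
text_photo = "price 5 \xa2 hello"
url = build_search_url(text_photo)
print(url)
```

The result is a plain ASCII str that can be passed to urllib.request.urlopen as before, with no .encode or .decode calls on the query itself.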