Unicode decode and encode errors when calling the Microsoft Translator API from Python
I get the following error messages:
Internal Server Error: /Translator/
Traceback (most recent call last):
  File "D:\Python27\lib\site-packages\django\core\handlers\base.py", line 115, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "D:\Project\Reservation\Translator\views.py", line 72, in getParams
    content = request.POST['content'].decode('utf-8').encode('utf-8')
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
[05/Oct/2013 23:32:51] "POST /Translator/ HTTP/1.1" 500 65244

Internal Server Error: /Translator/
Traceback (most recent call last):
  File "D:\Python27\lib\site-packages\django\core\handlers\base.py", line 115, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "D:\Project\Reservation\Translator\views.py", line 72, in getParams
    content = request.POST['content'].decode('utf-8').encode('utf-8')
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 2: ordinal not in range(128)
My code is as follows:
import json
import urllib

import requests
from django.shortcuts import render_to_response

# client_id, client_secret, the ACCESS_TOKEN_*, DETECT_URL and TRANSLATE_URL
# constants, and the Junk form class are defined elsewhere in the project.

def get_access_token():
    # Request an OAuth access token for the translator service.
    post_data = urllib.urlencode({'client_id': client_id, 'client_secret': client_secret, 'scope': ACCESS_TOKEN_SCOPE, 'grant_type': ACCESS_TOKEN_GRANT_TYPE})
    token_data = json.loads(requests.post(ACCESS_TOKEN_URL, data=post_data).content)
    access_token = token_data["access_token"]
    return access_token

def detect(access_token, detect_text):
    # Ask the service which language detect_text is written in.
    headers = {'Authorization': 'bearer' + ' ' + access_token}
    detect_url_all = DETECT_URL + "?" + urllib.urlencode({'text': detect_text})
    # [3:] drops the first three bytes of the response body.
    detect_language = requests.get(detect_url_all, headers=headers).content[3:]
    return detect_language

def Translator(text, orignal, access_token):
    # Translate text from the language code in orignal into Chinese.
    headers = {'Authorization': 'bearer' + ' ' + access_token}
    translation_ars = {
        'text': text,
        'to': 'zh',
        'from': orignal
    }
    transate_url_all = TRANSLATE_URL + "?" + urllib.urlencode(translation_ars)
    result = requests.get(transate_url_all, headers=headers).content
    return result

def getParams(request):
    if request.method == 'POST':
        form = Junk(request.POST)
        if form.is_valid():
            # This is the line that raises UnicodeEncodeError (views.py line 72 in the traceback).
            content = request.POST['content'].decode('utf-8').encode('utf-8')
            country = detect(get_access_token(), content)
            result = Translator(content, country, get_access_token())
            return render_to_response('Translator/translate.html', {'result': result})
    else:
        form = Junk()
    return render_to_response('Translator/index.html', {'form': form})
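First of all, I want to detect the language of the text. My program has no way of knowing which encoding the text is in, so I cannot decode or encode it.

I think the problem is simple: you are using the wrong method. Use the result.encode method (rather than result.decode) to encode the text as UTF-8.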
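The advice above (encode, do not decode) can be sketched as a small helper that is applied just before the query string is built. This is only an illustration, not the poster's code: the names to_utf8_bytes and build_query are hypothetical, and the import fallback is there so the same snippet runs on Python 2 and Python 3.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3


def to_utf8_bytes(value):
    """Return value as UTF-8 bytes (hypothetical helper).

    Byte strings are passed through untouched; Unicode text is encoded.
    Nothing is ever decoded, so no implicit ASCII conversion can happen.
    """
    if isinstance(value, bytes):
        return value
    return value.encode("utf-8")


def build_query(text):
    """Build a query string for the 'text' parameter (hypothetical helper)."""
    return urlencode({"text": to_utf8_bytes(text)})


query = build_query(u"ol\xe1")  # Portuguese "olá" as Unicode text
print(query)
```

Passing UTF-8 bytes to urlencode means the non-ASCII characters end up percent-encoded in the URL, whichever language the text is written in.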
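Sorry for the late answer. The error message shows that request.POST['content'] is already a Unicode string. .decode('utf-8') expects a byte string, so Python 2 first applies an implicit .encode('ascii') to turn the Unicode string into a byte string, and that implicit step is what fails on non-ASCII characters. Since the value is already a Unicode string and you seem to want a UTF-8 byte string, all you need is: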
content = request.POST['content'].encode('utf-8')
request must be being passed to getParams as a Unicode string, but that part of your code is not shown.
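To make the implicit step visible, here is a minimal sketch that spells out what Python 2 effectively does when .decode('utf-8') is called on a Unicode string. The function py2_style_decode is a hypothetical emulation written for illustration; it is not part of Python or Django.

```python
def py2_style_decode(unicode_text):
    """Emulate Python 2's unicode_text.decode('utf-8') (illustration only)."""
    # Implicit step: the Unicode string is first turned into bytes with the
    # default ASCII codec. This raises for any non-ASCII character.
    as_bytes = unicode_text.encode("ascii")
    # Only then is the requested UTF-8 decode applied.
    return as_bytes.decode("utf-8")


text = u"\xe1rvore"  # Unicode text that starts with a non-ASCII character

try:
    py2_style_decode(text)
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print("decode failed: %s" % exc)

# The fix from the answer: encode the Unicode string directly.
utf8_bytes = text.encode("utf-8")
print(repr(utf8_bytes))
```

Pure ASCII input passes through the emulated decode without error, which would explain why the original view appears to work for English text only.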
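If you still have problems, this example may help.

So your problem is that you cannot retrieve the encoding of the text?

Thanks for your answer! Yes, I don't know how to handle the encoding of text in different languages, for example Russian, Portuguese, Chinese, and so on.

Isn't the encoding always Unicode? What do you need to convert it to?

Yes, so I have to convert them to utf-8.

@alKaamili, thanks for your answer. My code was probably wrong, and I have modified it. I just want to turn the text I get from the form into 'utf-8'. Right now I can only translate English.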
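On the question raised in the comments, a short sketch may help: once the text is a Unicode string, the same .encode('utf-8') call works for every language, so the program does not need to know the language before encoding. The sample strings below are made up for illustration and are written with escapes so the snippet does not depend on the source file's encoding.

```python
# Illustrative samples only; any Unicode text behaves the same way.
samples = {
    "Russian": u"\u041f\u0440\u0438\u0432\u0435\u0442",
    "Portuguese": u"ol\xe1",
    "Chinese": u"\u4f60\u597d",
}

encoded = {}
for language, text in samples.items():
    # One call, independent of the language of the text.
    encoded[language] = text.encode("utf-8")

# Decoding the UTF-8 bytes gives back exactly the original text.
round_trip_ok = all(
    encoded[language].decode("utf-8") == text
    for language, text in samples.items()
)
print(round_trip_ok)
```

Language detection can therefore be left entirely to the translation service, with the view only responsible for producing UTF-8 bytes.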