Python UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 16: ordinal not in range(128)


Here is my code:

#!/usr/bin/python

# Import modules
import pandas as pd
import requests
import numpy as np

# Set ipython's max row display
pd.set_option('display.max_row', 1000)

# Insert your CrisisNET API key
api_key = 'xxxxxxxxxxxxxxx'  # placeholder; the real key was removed from the question

# Insert your CrisisNET request API
api_url = 'http://api.crisis.net/item?sources=twitter&tags=weather'

# Create the request header
headers = {'Authorization': 'Bearer ' + api_key}

# Define how many data points you want
total = 10000

# Create a dataframe where the request data will go
df = pd.DataFrame()

# Define a function called get data,
def get_data(offset=0, limit=100, df=None):
    # create a variable called url, which has the request info,
    url = api_url + '&offset=' + str(offset) + '&limit=' + str(limit)
    # a variable called r, with the request data,
    r = requests.get(url, headers=headers)
    # convert the request data into a dataframe,
    x = pd.DataFrame(r.json())



    # expand the dataframe
    x = x['data'].apply(pd.Series)
    # add the dataframe's rows to the main dataframe, df, we defined outside the function
    df = df.append(x, ignore_index=True)

    # then, if the total is larger than the request limit plus offset,
    if total > offset + limit:
        # run the function another time
        return get_data(offset + limit, limit, df)
    # but if not, end the function
    return df

# Run the function
df = get_data(df=df)

# Check the number of data points retrieved
len(df)

# Check for duplicate data points
df['id'].duplicated().value_counts()

# Drop rows in which every value is missing (this does not remove duplicates)
df = df.dropna(how='all')

df.to_csv('TwitterWeather.csv')

# View the first 5 data points (the default for head())
print df.head()

I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf3' in position 16: ordinal not in range(128)

How do I fix it?

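As others have suggested, you have a unicode string. Writing that string to a file through the default 'ascii' codec raises this error. It looks like the error occurs when the DataFrame is saved to the CSV file.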
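To see why writing to a file is the likely trigger, here is a minimal sketch that uses only the standard library. The sample row and the file name are made up for illustration, and it is only an assumption that the questioner's failure happens in the same way inside to_csv; the full traceback would confirm it.

```python
import io
import os
import tempfile

# Hypothetical sample data containing the same character (u'\xf3') as the error
row = u'id,text\n1,Evacuaci\xf3n\n'

path = os.path.join(tempfile.mkdtemp(), 'sample.csv')

# A file handle restricted to ASCII rejects the non-ASCII character...
try:
    with io.open(path, 'w', encoding='ascii') as handle:
        handle.write(row)
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# ...while a UTF-8 handle can represent it.
with io.open(path, 'w', encoding='utf-8') as handle:
    handle.write(row)

# Reading the file back with the same encoding returns the original text.
with io.open(path, 'r', encoding='utf-8') as handle:
    round_trip = handle.read()
```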

To fix this, first encode the unicode string as UTF-8. You can write a function like this:

def change_text(text):
    return text.encode('utf-8')  # assuming the encoding is UTF-8

Then you can apply it to the column that contains unicode characters, like this:

df['<column_name>'] = df['<column_name>'].apply(change_text)
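One caveat: a column built from API data may also hold non-text values such as NaN or numbers, and those have no encode method. A slightly more defensive variant could look like the sketch below. The helper name to_utf8 is hypothetical, and the assumption that the column mixes text and non-text values is mine, not something the question confirms.

```python
# Pick the text type for the running interpreter: unicode on Python 2, str on Python 3.
try:
    text_type = unicode
except NameError:
    text_type = str


def to_utf8(value):
    """Encode text values as UTF-8 bytes; return any other value unchanged."""
    if isinstance(value, text_type):
        return value.encode('utf-8')
    return value
```

It would be applied per column in the same way, for example df['<column_name>'] = df['<column_name>'].apply(to_utf8). Alternatively, DataFrame.to_csv accepts an encoding argument, so df.to_csv('TwitterWeather.csv', encoding='utf-8') may make the manual encoding step unnecessary.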


Somewhere you have a unicode object that you are either trying to decode even though it is already decoded, or trying to use as if it were a str (for example, adding it to a str, or passing it to a function that expects a str), in which case you need to encode it. Show us the whole traceback, which will tell you which expression is causing this, instead of making us guess.

Can you post the full traceback and a link to your data?

It looks like you may want to "decode" your text, that is, convert it to unicode. To do that, however, you need to know the character encoding of the data you are working with. Do you know the character encoding (e.g. "utf-8")?

@Mona Jalal, I removed your api key from your question.
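The encode/decode distinction raised in these comments can be illustrated in isolation. This is only a sketch with a made-up sample value; it does not reproduce the questioner's data, and the traceback is still needed to find the failing expression.

```python
# Already-decoded text (hypothetical sample value); u'\xf3' is the character from the error
text = u'Evacuaci\xf3n'

# Encoding with the ASCII codec fails at the first character outside range(128).
try:
    text.encode('ascii')
    bad_index = None
except UnicodeEncodeError as exc:
    bad_index = exc.start  # index of the offending character

# Encoding turns text into bytes; UTF-8 can represent every character.
raw = text.encode('utf-8')

# Decoding turns bytes back into text, and only works if the encoding is known.
restored = raw.decode('utf-8')
```

Here bad_index plays the same role as "position 16" in the reported error message: it is the offset of the character that the ASCII codec could not represent.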