Python TypeError: coercing to Unicode: need string or buffer, float found

python unicode

Here is my code:

import numpy as np
import pandas as pd
from tqdm import tqdm
import re
import time
import os

print u'read data ...'
train_data = pd.read_csv('Train.csv', index_col='SentenceId', delimiter='\t', encoding='utf-8')
test_data = pd.read_csv('Test.csv', index_col='SentenceId', delimiter='\t', encoding='utf-8')
train_label = pd.read_csv('Label.csv', index_col='SentenceId', delimiter='\t', encoding='utf-8')
addition_data = pd.read_csv('addition_data.csv', header=None, encoding='utf-8')[0]
train_data.dropna(inplace=True) # drop some empty sentences
...
def findall(sub_string, string):
    start = 0
    idxs = []
    while True:
        idx = string[start:].find(sub_string)
        if idx == -1:
            return idxs
        else:
            idxs.append(start + idx)
            start += idx + len(sub_string)

tags = {'pos':1, 'neu':2, 'neg':3}

def label2tag(i):
    s = train_data.loc[i]['Content']
    r = np.array([0]*len(s))
    try:
        l = train_label.loc[[i]].as_matrix()
    except:
        return r
    for i in l:
        for j in findall(i[0], s):
            r[j:j+len(i[0])] = tags[i[1]]
    return r

print u'translating target into tags ...'
train_data['label'] = map(label2tag, tqdm(iter(train_data.index)))

Here is the traceback of the error I get:

Traceback (most recent call last):
  File "shibie.py", line 88, in <module>
    train_data['label'] = map(label2tag, tqdm(iter(train_data.index)))
  File "shibie.py", line 83, in label2tag
    for j in findall(i[0], s):
  File "shibie.py", line 66, in findall
    idx = string[start:].find(sub_string)
TypeError: coercing to Unicode: need string or buffer, float found

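The code above runs on my own computer, but on my school's Ubuntu machine it produces a lot of errors. I wondered whether blanks in my files were the cause, but as far as I can tell the files contain no blanks.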
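
A note on what the traceback suggests: str.find raises this TypeError when its argument is a float rather than a string, and pandas represents an empty CSV cell as the float NaN. The script calls dropna on train_data but not on train_label, so one possible cause (an assumption, not something confirmed here) is an empty cell in Label.csv on the Ubuntu machine. The sketch below uses only the standard library and made-up rows to show how such values can be spotted before they reach findall. It is written for Python 3, where str is the text type; the names is_missing, non_text_rows and rows are hypothetical.

```python
import math


def is_missing(value):
    """Return True for None or a float NaN (pandas uses NaN for empty cells)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def non_text_rows(rows):
    """Return (position, row) pairs in which at least one field is not a string."""
    return [
        (position, row)
        for position, row in enumerate(rows)
        if not all(isinstance(field, str) for field in row)
    ]


# Hypothetical label rows; the second one mimics an empty cell read as NaN.
rows = [("good screen", "pos"), (float("nan"), "neg"), ("battery", "neu")]

bad_rows = non_text_rows(rows)
for position, row in bad_rows:
    print("row %d contains a non-string field: %r" % (position, row))
```

If a check like this flags rows in the real data, dropping or filling the missing values in train_label (in the same way the script already does for train_data) would be the next thing to try.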

Please post the source code. In the findall function, before the line

idx = string[start:].find(sub_string)

insert

print(sub_string)

and make sure the printed value is what you expect. Also, what behavior do you want from this call:

findall("baabaa", "baabaabaabaa")

[0, 6] or

[0, 3, 6]

?
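
To make the two possible behaviors concrete, here is a minimal sketch of a findall variant with an explicit switch between non-overlapping matches (what the posted loop does) and overlapping ones. The overlapping parameter and the type check are additions for illustration and are not part of the original script. The sketch assumes Python 3; on Python 2 the check would have to be against basestring instead of str.

```python
def findall(sub_string, string, overlapping=False):
    """Return the start indexes of sub_string in string.

    With overlapping=False the search resumes after the end of each match;
    with overlapping=True it resumes one character after each match start.
    """
    # Fail early with a clear message instead of deep inside str.find.
    if not isinstance(sub_string, str) or not isinstance(string, str):
        raise TypeError(
            "findall expects two strings, got %r and %r" % (sub_string, string)
        )
    if not sub_string:
        return []  # an empty pattern would otherwise match at every position

    step = 1 if overlapping else len(sub_string)
    idxs = []
    start = string.find(sub_string)
    while start != -1:
        idxs.append(start)
        start = string.find(sub_string, start + step)
    return idxs


print(findall("baabaa", "baabaabaabaa"))
print(findall("baabaa", "baabaabaabaa", overlapping=True))
```

The first call reports only the matches at 0 and 6, while the second also reports the match at 3 that overlaps both of them. Passing a float such as NaN now raises a TypeError that names the offending value.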