Python: Converting a DataFrame column to a Series creates NaNs

python python-2.7 pandas

I have loaded a DataFrame and am trying to create a pd.Series from it:

import pandas as pd
data = pd.read_csv(filepath_or_buffer = "train.csv", index_col = 0)
data.columns

Index([u'qid1',u'qid2',u'question1',u'question2'], dtype = 'object')

These are the columns of the DataFrame: qid1 is the ID of question1, and qid2 is the ID of question2. The DataFrame also contains no NaN values:

data.question1.isnull().sum()
0

I want to create a pandas.Series from the first question, using qid1 as the index:

question1 = pd.Series(data.question1, index = data.qid1)
question1.isnull().sum()
68416

Now my Series contains 68416 null values. Where is my mistake?

Pass the raw values (an array with no index attached) so that the Series constructor does not try to align them:

question1 = pd.Series(data.question1.values, index = data.qid1)
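The same result can probably also be reached by setting qid1 as the index before selecting the column. The sketch below compares both routes on a small made-up frame that stands in for train.csv; the sample values are assumptions, not data from the question.

```python
import pandas as pd

# Hypothetical stand-in for train.csv; the values are made up for illustration.
data = pd.DataFrame(
    {"qid1": [11, 13, 15], "question1": ["q a", "q b", "q c"]},
    index=[0, 1, 2],
)

# Route 1: strip the index with .values so there is nothing to align.
by_values = pd.Series(data.question1.values, index=data.qid1)

# Route 2: make qid1 the index first, then select the column.
by_set_index = data.set_index("qid1")["question1"]

# Neither route introduces missing values.
print(by_values.isnull().sum())
print(by_set_index.isnull().sum())
```

One difference worth noting: the set_index route keeps the column name on the resulting Series, while the .values route produces an unnamed Series.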

The problem here is that the question1 column has its own index, so the constructor tries to use that index during construction.

For example:

In [12]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.arange(5), 'b':list('abcde')})
df

Out[12]:
   a  b
0  0  a
1  1  b
2  2  c
3  3  d
4  4  e

In [13]:
s = pd.Series(df['a'], index = df['b'])
s

Out[13]:
b
a   NaN
b   NaN
c   NaN
d   NaN
e   NaN
Name: a, dtype: float64

In [14]:
s = pd.Series(df['a'].values, index = df['b'])
s

Out[14]:
b
a    0
b    1
c    2
d    3
e    4
dtype: int32

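In effect, what happens here is that you are reindexing the existing column with the new index you passed in. Because none of the new labels match the column's existing labels, you get NaN.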
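A minimal sketch of that reindexing behaviour, reusing the small example frame from the answer. The third Series, with a partially overlapping index, is an added illustration; it suggests why the question may have seen 68416 nulls rather than a fully empty Series, if some qid1 values happened to coincide with existing row labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(5), "b": list("abcde")})

# Passing a Series together with an index behaves like reindexing that Series.
constructed = pd.Series(df["a"], index=df["b"])
reindexed = df["a"].reindex(df["b"])

# No label in 'b' exists in the original 0..4 index, so every value is NaN.
print(constructed.isnull().all())
print(reindexed.isnull().all())

# Labels present in both indexes keep their values; only unmatched ones are NaN.
partial = pd.Series(df["a"], index=[3, 4, 5])
print(partial)
```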