Python: column dtypes change when a row is added to a DataFrame
I added a row to a DataFrame, and now when I call describe() it no longer outputs the numeric summary values.
import pandas as pd
import numpy as np
myDataset = {"Movie Title": ['Avengers: Endgame', 'Avatar', 'Titanic', 'The Lion King'],
"Gross": [ 2797800564, 2790439000, 2194439542, 1656943394],
"Rotten Tomatos Rating": [ 94, 82, 89, 53],
"Year": [ 2019, 2009, 1997, 2019],
"Length": [ 181, 162, 194, 118]
}
print ("Original Data...")
df = pd.DataFrame(myDataset)
print(df.describe())
# Add a column
df['Seen It'] = [False, False, True, False]
print ("After adding the column...")
print(df.describe())
# Add a row
newRow = pd.Series(np.array(["Hoosiers", 29000000, 89 , 1986, 116, True]),
index=["Movie Title", "Gross", "Rotten Tomatos Rating", "Year", "Length", "Seen It" ])
row_df = pd.DataFrame([newRow])
newDF = df.append(row_df, ignore_index=True, sort=False)
print ("After adding a row...")
print(newDF.describe())
Output:
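Instead of building the new row from a Series (which produces a long-format object), you can change the format of the new DataFrame you are appending: pass the values as a plain list and specify the column names through the columns=["Movie Title", …] parameter.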
newRow = pd.DataFrame([["Hoosiers", 29000000, 89 , 1986, 116, True]],
columns=["Movie Title", "Gross", "Rotten Tomatos Rating", "Year", "Length", "Seen It" ])
newDF = df.append(newRow, ignore_index=True, sort=False)
Output of describe():
              Gross  Rotten Tomatos Rating         Year      Length
count  5.000000e+00               5.000000     5.000000    5.000000
mean   1.893724e+09              81.400000  2006.000000  154.200000
std    1.145114e+09              16.440803    14.387495   35.821781
min    2.900000e+07              53.000000  1986.000000  116.000000
25%    1.656943e+09              82.000000  1997.000000  118.000000
50%    2.194440e+09              89.000000  2009.000000  162.000000
75%    2.790439e+09              89.000000  2019.000000  181.000000
max    2.797801e+09              94.000000  2019.000000  194.000000
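DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above fails on current versions. The following is a minimal sketch of the same approach using pd.concat instead. The variable names new_row and new_df are illustrative and not from the original answer, and the 'Seen It' column is included directly in the initial data for brevity.

```python
import pandas as pd

# Same data as in the question, with the 'Seen It' column already added.
df = pd.DataFrame({
    "Movie Title": ["Avengers: Endgame", "Avatar", "Titanic", "The Lion King"],
    "Gross": [2797800564, 2790439000, 2194439542, 1656943394],
    "Rotten Tomatos Rating": [94, 82, 89, 53],
    "Year": [2019, 2009, 1997, 2019],
    "Length": [181, 162, 194, 118],
    "Seen It": [False, False, True, False],
})

# Build the new row from a plain Python list so that pandas infers
# a dtype per column instead of receiving pre-converted strings.
new_row = pd.DataFrame(
    [["Hoosiers", 29000000, 89, 1986, 116, True]],
    columns=["Movie Title", "Gross", "Rotten Tomatos Rating",
             "Year", "Length", "Seen It"],
)

# pd.concat stands in for the removed DataFrame.append.
new_df = pd.concat([df, new_row], ignore_index=True, sort=False)

print(new_df.dtypes)
print(new_df.describe())
```

Because both frames carry numeric dtypes for the numeric columns, the concatenated result should keep them, and describe() reports the numeric summary again.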
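Check the dtypes of row_df (row_df.dtypes) to see the root of the problem: every column in it is object (string), so append converts all of the columns in the result to object as well.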
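The check described above can be sketched as follows. It rebuilds row_df the same way the question does and inspects what NumPy and pandas did with the values. The intermediate name values is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# np.array needs a single dtype for all elements, so the mixed
# str/int/bool values are all coerced to strings.
values = np.array(["Hoosiers", 29000000, 89, 1986, 116, True])
print(values.dtype)      # a fixed-width Unicode string dtype
print(repr(values[1]))   # the gross is now a string, not an int

new_row = pd.Series(
    values,
    index=["Movie Title", "Gross", "Rotten Tomatos Rating",
           "Year", "Length", "Seen It"],
)
row_df = pd.DataFrame([new_row])

# None of the columns is numeric any more.
print(row_df.dtypes)
```

The conversion therefore happens inside np.array, before pandas ever sees the data. The later append only propagates the non-numeric dtype to the combined frame.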
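The problem is that by adding a row built from an np.array of mixed values (NumPy coerces them all to strings, which pandas stores as object), you are converting the internal dtype of the numeric columns in the DataFrame from numpy.int64 to object.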
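I'm not appending an np.array, I'm appending a DataFrame.
Yes, but your DataFrame is built from an np.array with mixed values, which is why it changes the underlying dtype of the DataFrame it is appended to. You can verify this by calling dtypes on the DataFrame before and after the append: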
Out[43]:
Movie Title              object
Gross                    object
Rotten Tomatos Rating    object
Year                     object
Length                   object
Seen It                  object
dtype: object
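A reduced sketch of that before-and-after comparison is shown below. It assumes a two-column frame and a hypothetical row whose values are already strings, mimicking what np.array produced in the question, and it uses pd.concat because DataFrame.append is not available in pandas 2.0 and later. The final pd.to_numeric step is not part of the original answers. It is one possible way to repair a frame that has already been converted.

```python
import pandas as pd

df = pd.DataFrame({"Gross": [2797800564, 2790439000], "Year": [2019, 2009]})
dtypes_before = df.dtypes

# Hypothetical row holding strings, as in the question after np.array.
row_df = pd.DataFrame([{"Gross": "29000000", "Year": "1986"}])

combined = pd.concat([df, row_df], ignore_index=True)
dtypes_after = combined.dtypes

# Side-by-side view of the dtypes before and after adding the row.
print(pd.DataFrame({"before": dtypes_before, "after": dtypes_after}))

# Possible repair (an assumption, not from the original answers):
# parse every column back into numbers.
repaired = combined.apply(pd.to_numeric)
print(repaired.dtypes)
print(repaired.describe())
```

Building the row with correct types in the first place, as in the accepted approach, avoids the need for the repair step.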