Python: column dtypes change when a row is appended to a DataFrame

I added a row to a DataFrame, and now when I call describe() it no longer outputs the numeric summary values.

import pandas as pd
import numpy as np
myDataset = {"Movie Title":          ['Avengers: Endgame',    'Avatar',    'Titanic',  'The Lion King'],
            "Gross":                 [         2797800564,  2790439000,   2194439542,       1656943394],
            "Rotten Tomatos Rating": [                 94,          82,           89,               53],
            "Year":                  [               2019,        2009,         1997,             2019],
            "Length":                [                181,         162,          194,              118]                        
            }
print ("Original Data...")
df = pd.DataFrame(myDataset)
print(df.describe())

# Add a column
df['Seen It'] = [False, False, True, False]
print ("After adding the column...")
print(df.describe())

# Add a row
newRow = pd.Series(np.array(["Hoosiers", 29000000, 89 , 1986, 116, True]), 
                   index=["Movie Title", "Gross", "Rotten Tomatos Rating", "Year", "Length", "Seen It" ])
row_df = pd.DataFrame([newRow])

newDF = df.append(row_df, ignore_index=True, sort=False)
print ("After adding a row...")
print(newDF.describe())
Output:

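You can change how the new DataFrame you are appending is built: pass the row as a list of values and specify the column names through the columns=["Movie Title", …] parameter, rather than creating it in long format from a Series.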

newRow = pd.DataFrame([["Hoosiers", 29000000, 89 , 1986, 116, True]],
                      columns=["Movie Title", "Gross", "Rotten Tomatos Rating", "Year", "Length", "Seen It" ])
newDF = df.append(newRow, ignore_index=True, sort=False)
Output of describe():

              Gross  Rotten Tomatos Rating         Year      Length
count  5.000000e+00               5.000000     5.000000    5.000000
mean   1.893724e+09              81.400000  2006.000000  154.200000
std    1.145114e+09              16.440803    14.387495   35.821781
min    2.900000e+07              53.000000  1986.000000  116.000000
25%    1.656943e+09              82.000000  1997.000000  118.000000
50%    2.194440e+09              89.000000  2009.000000  162.000000
75%    2.790439e+09              89.000000  2019.000000  181.000000
max    2.797801e+09              94.000000  2019.000000  194.000000
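
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the snippet above only runs on older versions. The following is a minimal sketch of the same idea using pd.concat, assuming a recent pandas; the DataFrame is rebuilt here only so that the example is self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    "Movie Title": ["Avengers: Endgame", "Avatar", "Titanic", "The Lion King"],
    "Gross": [2797800564, 2790439000, 2194439542, 1656943394],
    "Rotten Tomatos Rating": [94, 82, 89, 53],
    "Year": [2019, 2009, 1997, 2019],
    "Length": [181, 162, 194, 118],
    "Seen It": [False, False, True, False],
})

# Build the new row from a plain Python list so each column keeps its own type.
newRow = pd.DataFrame(
    [["Hoosiers", 29000000, 89, 1986, 116, True]],
    columns=list(df.columns),
)

# pd.concat stands in for df.append, which no longer exists in pandas 2.0+.
newDF = pd.concat([df, newRow], ignore_index=True, sort=False)

print(newDF.dtypes)      # the numeric columns should still be integer dtypes
print(newDF.describe())  # numeric summary over all five rows
```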
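Check the dtypes of row_df (row_df.dtypes) to see the root of the problem: every column in it is object (strings), so append converts all columns of the result to object.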
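
As a quick way to see this, the sketch below rebuilds row_df exactly as in the question and inspects it. The exact dtype label printed may differ between pandas versions, so the comments only claim that the columns are not numeric.

```python
import numpy as np
import pandas as pd

columns = ["Movie Title", "Gross", "Rotten Tomatos Rating", "Year", "Length", "Seen It"]

# Same construction as in the question: the values pass through np.array first.
newRow = pd.Series(np.array(["Hoosiers", 29000000, 89, 1986, 116, True]), index=columns)
row_df = pd.DataFrame([newRow])

print(row_df.dtypes)                  # no column has a numeric dtype
print(repr(row_df["Gross"].iloc[0]))  # the gross is held as text, not as an integer
```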


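The problem is that the row you append is built from an np.array with mixed data types. A NumPy array holds a single dtype, so the mixed values are all coerced to strings, which pandas stores as object. Appending that row converts the internal dtype of the numeric columns in the DataFrame from numpy.int64 to object.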
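
The coercion happens in NumPy, before pandas sees the data. A small sketch, using only the values from the question:

```python
import numpy as np

values = ["Hoosiers", 29000000, 89, 1986, 116, True]

# Without an explicit dtype, NumPy picks one dtype that can hold every value,
# which for a mix of text and numbers is a unicode string dtype.
arr = np.array(values)
print(arr.dtype)
print(repr(arr[1]))  # the gross has become text

# With dtype=object the original Python objects are kept as they are.
obj_arr = np.array(values, dtype=object)
print(type(obj_arr[1]))
```

Either way, skipping np.array and passing a plain list or dict to pandas avoids the conversion entirely.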

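I'm not appending an np.array; I'm appending a DataFrame.
Yes, but your DataFrame was built from an np.array with mixed values, which is why it changes the underlying dtypes of the DataFrame it is appended to. You can verify this by calling dtypes on df before and after the append: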
Out[43]: 
Movie Title              object
Gross                    object
Rotten Tomatos Rating    object
Year                     object
Length                   object
Seen It                  object
dtype: object
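
The before-and-after check can be sketched as follows, with pd.concat standing in for df.append (which was removed in pandas 2.0). The final loop is an optional extra that is not part of the original answers: it assumes you already have the mixed frame and want the numeric columns back, and `numeric_cols` and `fixed` are names chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Movie Title": ["Avengers: Endgame", "Avatar", "Titanic", "The Lion King"],
    "Gross": [2797800564, 2790439000, 2194439542, 1656943394],
    "Rotten Tomatos Rating": [94, 82, 89, 53],
    "Year": [2019, 2009, 1997, 2019],
    "Length": [181, 162, 194, 118],
    "Seen It": [False, False, True, False],
})

# The row from the question: np.array turns every value into a string.
newRow = pd.Series(np.array(["Hoosiers", 29000000, 89, 1986, 116, True]),
                   index=list(df.columns))
row_df = pd.DataFrame([newRow])

print(df.dtypes)     # before: the numeric columns are integer dtypes
newDF = pd.concat([df, row_df], ignore_index=True, sort=False)
print(newDF.dtypes)  # after: the numeric columns are no longer numeric

# Optional recovery (an assumption about what you want, not from the answers):
# convert the known numeric columns back with pd.to_numeric.
numeric_cols = ["Gross", "Rotten Tomatos Rating", "Year", "Length"]
fixed = newDF.copy()
for col in numeric_cols:
    fixed[col] = pd.to_numeric(fixed[col])
print(fixed.describe())
```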