Python pandas: select columns, using a default value for any that do not exist


Suppose I have the following DataFrame:

>>> df
     val1 val2 val3
key
  1     1    1    1
  2     2    2    2
  3     3    3    3

Now I want to select the columns val1, val2 and, here's the kicker, val4, which does not exist.

What I want is:

>>> df.something(something)
     val1 val2 val4
key
  1     1    1  NaN
  2     2    2  NaN
  3     3    3  NaN

IIUC, use reindex.
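
The reindex approach can be sketched roughly as follows. The code from the original answer is not shown on this page, so the construction of `df` and the name `result` are assumptions made for illustration; `DataFrame.reindex(columns=...)` itself is a standard pandas method.

```python
import pandas as pd

# Rebuild the frame from the question (index named "key", as in the printed output).
df = pd.DataFrame(
    {"val1": [1, 2, 3], "val2": [1, 2, 3], "val3": [1, 2, 3]},
    index=pd.Index([1, 2, 3], name="key"),
)

# reindex keeps the columns that exist and creates the missing ones filled with NaN.
result = df.reindex(columns=["val1", "val2", "val4"])
print(result)
```

Columns not listed (here val3) are dropped, and the missing val4 is created with NaN in every row.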

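Also, .loc can do this, but it raises a warning: "Passing list-likes to .loc or [] with any missing label will raise KeyError in the future, you can use .reindex() as an alternative." (Since pandas 1.0 this raises a KeyError instead of a warning.)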

df.loc[:,["val1", "val2", "val4"]]

Something like this should get you started:

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=['val1', 'val2', 'val3'])

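def check_columns(df, values):
    # Start from df's index so a missing column gets one NaN per row.
    temp = pd.DataFrame(index=df.index)
    for i in values:
        try:
            temp[i] = df[i]
        except KeyError:
            # Column is absent from df: fill it with NaN.
            temp[i] = np.nan
    return temp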

print(check_columns(df, ['val1', 'val2', 'val3']))
print(check_columns(df, ['val1', 'val2', 'val4']))

This gives:

   val1  val2  val3
0     1     1     1
1     2     2     2
2     3     3     3
   val1  val2  val4
0     1     1   NaN
1     2     2   NaN
2     3     3   NaN
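
If a default other than NaN is wanted, reindex also accepts a `fill_value` argument, which may make the helper function unnecessary. A minimal sketch, assuming the same frame as above and a default of 0 chosen purely for illustration:

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=["val1", "val2", "val3"])

# Missing columns are created and filled with fill_value instead of NaN.
with_default = df.reindex(columns=["val1", "val2", "val4"], fill_value=0)
print(with_default)
```

Existing columns are left untouched; only the newly created val4 receives the default.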