Python: fill a new DataFrame column based on an array condition

I have a DataFrame:

import numpy as np
import pandas as pd

arr = np.array([['a', 0, 1.2,12.5,3], ['a',1, 4,5.,6.885],
                ['a', 2, 2.3,3.133,4.3], ['a', 3, 5.678,6.,7.34556],
                ['a', 4, 6.5,7,8.1344], ['b',0, 10.7,11.4,12.1332],
                ['b',1, 14.,15,16.0155], ['b',2, 17.3,18.,9.11],
                ['b', 3, 22.2, 33.233, 1.2323], 
                ['c', 0, 1.1, 2.2, 3.3], 
                ['c', 1, 2.2, 3.43, 54.5],
                ['d', 0 , 2.2, 2.2, 3.],
                ['d',1, 3.4, 4., 5.6],
                ['d', 2, 3.3, 4, 5.]])

df = pd.DataFrame(arr, columns=['name', 'id', 'x', 'y', 'z'])

df['id'] = pd.to_numeric(df['id'])
df['x'] = pd.to_numeric(df['x'])
df['y'] = pd.to_numeric(df['y'])
df['z'] = pd.to_numeric(df['z'])

df
    name    id  x       y       z
0   a       0   1.2     12.5    3
1   a       1   4       5.0     6.885
2   a       2   2.3     3.133   4.3
3   a       3   5.678   6.0     7.34556
4   a       4   6.5     7       8.1344
5   b       0   10.7    11.4    12.1332
6   b       1   14.0    15      16.0155
7   b       2   17.3    18.0    9.11
8   b       3   22.2    33.233  1.2323
9   c       0   1.1     2.2     3.3
10  c       1   2.2     3.43    54.5
11  d       0   2.2     2.2     3.0
12  d       1   3.4     4.0     5.6
13  d       2   3.3     4       5.0
I also have an array with the same number of rows:

the_array = np.array([['a', 82.365],
                      ['a', 82.365],
                      ['a', 82.365],
                      ['a', 82.365],
                      ['b', 136.879],
                      ['b', 136.879],
                      ['b', 136.879],
                      ['b', 136.879],
                      [None, None],
                      [None, None],
                      [None, None],
                      [None, None],
                      [None, None],
                      [None, None]], dtype=object)
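Now I want to create a new column in df and fill it with the values from the_array, based on the name column.

For every row in df, if its name matches a name in the_array, the row should get the value that the_array holds for that name.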

Desired result:

    name    id  x         y       z         new_col
0   a       0   1.200   12.500  3.00000     82.365
1   a       1   4.000   5.000   6.88500     82.365
2   a       2   2.300   3.133   4.30000     82.365
3   a       3   5.678   6.000   7.34556     82.365
4   a       4   6.500   7.000   8.13440     82.365
5   b       0   10.700  11.400  12.13320    136.879
6   b       1   14.000  15.000  16.01550    136.879
7   b       2   17.300  18.000  9.11000     136.879
8   b       3   22.200  33.233  1.23230     136.879
9   c       0   1.100   2.200   3.30000     None
10  c       1   2.200   3.430   54.50000    None
11  d       0   2.200   2.200   3.00000     None
12  d       1   3.400   4.000   5.60000     None
13  d       2   3.300   4.000   5.00000     None
I tried:

df['new_col'] = np.where(df['name'] == the_array[:, 0], the_array[:, 1], the_array[:, 1])
But I got:

    name    id  x         y       z         new_col
0   a       0   1.200   12.500  3.00000     82.365
1   a       1   4.000   5.000   6.88500     82.365
2   a       2   2.300   3.133   4.30000     82.365
3   a       3   5.678   6.000   7.34556     82.365
4   a       4   6.500   7.000   8.13440     136.879
5   b       0   10.700  11.400  12.13320    136.879
6   b       1   14.000  15.000  16.01550    136.879
7   b       2   17.300  18.000  9.11000     136.879
8   b       3   22.200  33.233  1.23230     None
9   c       0   1.100   2.200   3.30000     None
10  c       1   2.200   3.430   54.50000    None
11  d       0   2.200   2.200   3.00000     None
12  d       1   3.400   4.000   5.60000     None
13  d       2   3.300   4.000   5.00000     None
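You can do it as follows.

the_array and df have the same size, but they are not aligned row by row. the_array appears to represent a name -> value mapping for a set of unique names, so it is better represented as a dict than as an array. The dict is easy to build with a dict comprehension that iterates over the array rows: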

the_map = {k: v for k, v in the_array if k}
df['new_col'] = df['name'].map(the_map)
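
One detail worth checking is what happens to names that are missing from the dict. The sketch below uses a reduced copy of the question's data (only the name column, which is an assumption made to keep it short) and shows that `Series.map` leaves those rows as NaN rather than None. The `with_none` variable is a hypothetical extra step, only needed if literal None values are wanted as in the desired output.

```python
import numpy as np
import pandas as pd

# Reduced copy of the question's data: only the name column is needed here.
df = pd.DataFrame({"name": ["a"] * 5 + ["b"] * 4 + ["c"] * 2 + ["d"] * 3})

the_array = np.array(
    [["a", 82.365]] * 4 + [["b", 136.879]] * 4 + [[None, None]] * 6,
    dtype=object,
)

# Rows whose name is None are skipped; repeated names collapse to one entry.
the_map = {k: v for k, v in the_array if k}

# Names that are not keys of the_map ('c' and 'd') come out as NaN.
df["new_col"] = df["name"].map(the_map)

# Optional, hypothetical step: store None instead of NaN in an object column.
with_none = pd.Series(
    [None if pd.isna(v) else v for v in df["new_col"]],
    index=df.index,
    dtype=object,
)
```

Keeping NaN is usually the more convenient choice, because the column stays numeric (float64) and works with methods such as `isna()` and `fillna()`.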
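Thinking about what the data means and how it is best represented is a good way to write elegant code and, in this case, to find the solution.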
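
The "not aligned" point can also be checked directly. The following is a minimal sketch, using hypothetical helper arrays (`names_in_df`, `names_in_array`, `values`) that mirror the name column of df and the two columns of the_array. It shows that an element-wise comparison only pairs row i with row i, and that `np.where` with the same array in both branches just returns that array by position, whatever the condition is.

```python
import numpy as np

# Hypothetical stand-ins for df['name'], the_array[:, 0] and the_array[:, 1].
names_in_df = np.array(["a"] * 5 + ["b"] * 4 + ["c"] * 2 + ["d"] * 3, dtype=object)
names_in_array = np.array(["a"] * 4 + ["b"] * 4 + [None] * 6, dtype=object)
values = np.array([82.365] * 4 + [136.879] * 4 + [None] * 6, dtype=object)

# Element-wise comparison: row i of one array against row i of the other.
same_position = names_in_df == names_in_array

# Positions where the two name columns disagree.
mismatched_rows = np.flatnonzero(~same_position)

# Both branches are identical, so the condition has no effect on the result.
result = np.where(same_position, values, values)
```

Row 4 is the fifth 'a' in df but the first 'b' in the array, which is why it received 136.879 in the question's output.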
