Pandas/Python: Set the value of a new column based on row values and another dataframe
Is it possible to add values to new columns when the province name in the second dataframe matches the province name in the first dataframe? I searched for an answer but could not find anything that fits my case.

This is the first dataframe:
         date           province  confirmed  released  deceased
0  2020-03-30              Daegu       6624      3837       111
1  2020-03-30   Gyeongsangbuk-do       1298       772        38
2  2020-03-30        Gyeonggi-do        463       160         5
3  2020-03-30              Seoul        426        92         0
4  2020-03-30  Chungcheongnam-do        127        83         0
...
This is the second dataframe:
    code           province   latitude   longitude
0  12000              Daegu  35.872150  128.601783
1  60000   Gyeongsangbuk-do  36.576032  128.505599
2  20000        Gyeonggi-do  37.275119  127.009466
3  10000              Seoul  37.566953  126.977977
4  41000  Chungcheongnam-do  36.658976  126.673318
...
I want to turn the first dataframe into this:
         date           province  confirmed  released  deceased   latitude   longitude
0  2020-03-30              Daegu       6624      3837       111  35.872150  128.601783
1  2020-03-30   Gyeongsangbuk-do       1298       772        38  36.576032  128.505599
2  2020-03-30        Gyeonggi-do        463       160         5  37.275119  127.009466
3  2020-03-30              Seoul        426        92         0  37.566953  126.977977
4  2020-03-30  Chungcheongnam-do        127        83         0  36.658976  126.673318
...
Thanks.

The merge method is the one you want to use here.
Using your example dataframes:
import pandas as pd

df1 = pd.DataFrame(dict(
    date = [
        '2020-03-30', '2020-03-30', '2020-03-30',
        '2020-03-30', '2020-03-30'],
    province = [
        'Daegu', 'Gyeongsangbuk-do', 'Gyeonggi-do',
        'Seoul', 'Chungcheongnam-do'],
    confirmed = [6624, 1298, 463, 426, 127],
    released = [3837, 772, 160, 92, 83],
    deceased = [111, 38, 5, 0, 0],
))

df2 = pd.DataFrame(dict(
    code = [12000, 60000, 20000, 10000, 41000],
    province = [
        'Daegu', 'Gyeongsangbuk-do', 'Gyeonggi-do',
        'Seoul', 'Chungcheongnam-do'],
    latitude = [
        35.872150, 36.576032, 37.275119,
        37.566953, 36.658976],
    longitude = [
        128.601783, 128.505599, 127.009466,
        126.977977, 126.673318],
))

df3 = df1.merge(
    df2[['province', 'latitude', 'longitude']],
    on = 'province',
)

pd.set_option('display.max_columns', 7)
print(df3)
Output:
         date           province  confirmed  released  deceased   latitude  \
0  2020-03-30              Daegu       6624      3837       111  35.872150
1  2020-03-30   Gyeongsangbuk-do       1298       772        38  36.576032
2  2020-03-30        Gyeonggi-do        463       160         5  37.275119
3  2020-03-30              Seoul        426        92         0  37.566953
4  2020-03-30  Chungcheongnam-do        127        83         0  36.658976

    longitude
0  128.601783
1  128.505599
2  127.009466
3  126.977977
4  126.673318
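
One thing to watch with the call above: merge defaults to an inner join, so a province that is missing from the second dataframe would drop out of the result. If every row of the first dataframe should be kept, a left join is one option. The sketch below uses cut-down stand-ins for the two dataframes; the names cases and coords, the "Jeju-do" row and its count are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Cut-down stand-ins for the two dataframes. "Jeju-do" and its count are
# hypothetical: a province with no entry in the coordinates table.
cases = pd.DataFrame({
    "province": ["Daegu", "Seoul", "Jeju-do"],
    "confirmed": [6624, 426, 9],
})
coords = pd.DataFrame({
    "province": ["Daegu", "Seoul"],
    "latitude": [35.872150, 37.566953],
    "longitude": [128.601783, 126.977977],
})

# how="left" keeps every row of `cases`.
# validate="many_to_one" raises if a province is duplicated in `coords`.
# indicator=True adds a "_merge" column saying where each row came from.
merged = cases.merge(
    coords,
    on="province",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Rows flagged "left_only" had no match, so their coordinates are NaN.
unmatched = merged.loc[merged["_merge"] == "left_only", "province"].tolist()
print(merged)
print("No coordinates for:", unmatched)
```

Checking the unmatched list is a cheap way to catch spelling differences between the two province columns before relying on the merged result.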
What you really want to do is merge the two dataframes on the province column.
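
If the goal is to add the columns to the first dataframe itself rather than build a third one, a lookup with Series.map is one alternative to merging. This is a minimal sketch with cut-down versions of the two dataframes; it assumes each province appears only once in the second dataframe, and the name lookup is introduced here for illustration.

```python
import pandas as pd

# Cut-down versions of the two dataframes, deliberately in a different
# row order to show that matching is done by name, not by position.
df1 = pd.DataFrame({
    "province": ["Seoul", "Daegu"],
    "confirmed": [426, 6624],
})
df2 = pd.DataFrame({
    "province": ["Daegu", "Seoul"],
    "latitude": [35.872150, 37.566953],
    "longitude": [128.601783, 126.977977],
})

# Index the second dataframe by province so each column becomes a
# province -> value lookup (assumes province names are unique in df2).
lookup = df2.set_index("province")

# For every row of df1, look up the value that belongs to its province.
for col in ["latitude", "longitude"]:
    df1[col] = df1["province"].map(lookup[col])

print(df1)
```

Provinces that are absent from the lookup end up as NaN in the new columns, which is the same behaviour as a left join.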
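Create a new dataframe with the columns you want.
First, loop over the first dataframe and add all of its values, leaving the columns it does not have empty.
Then loop over the second dataframe and add its values by comparing its province value with the province values already added to the new dataframe.
Here is an example.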
New dataframe:
date province confirmed released deceased latitude longitude
After adding the first dataframe:
         date           province  confirmed  released  deceased  latitude  longitude
0  2020-03-30              Daegu       6624      3837       111
1  2020-03-30   Gyeongsangbuk-do       1298       772        38
2  2020-03-30        Gyeonggi-do        463       160         5
3  2020-03-30              Seoul        426        92         0
4  2020-03-30  Chungcheongnam-do        127        83         0
After adding the second dataframe:
         date           province  confirmed  released  deceased   latitude   longitude
0  2020-03-30              Daegu       6624      3837       111  35.872150  128.601783
1  2020-03-30   Gyeongsangbuk-do       1298       772        38  36.576032  128.505599
2  2020-03-30        Gyeonggi-do        463       160         5  37.275119  127.009466
3  2020-03-30              Seoul        426        92         0  37.566953  126.977977
4  2020-03-30  Chungcheongnam-do        127        83         0  36.658976  126.673318
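
The two loops described above could be sketched roughly as follows. This is only one possible reading of those steps, using cut-down versions of the two dataframes; the names records and new_df are introduced here for illustration. On larger data a merge on province is usually the simpler and faster route, since this version compares every pair of rows.

```python
import pandas as pd

# Cut-down versions of the two dataframes from the question.
df1 = pd.DataFrame({
    "date": ["2020-03-30", "2020-03-30"],
    "province": ["Daegu", "Seoul"],
    "confirmed": [6624, 426],
})
df2 = pd.DataFrame({
    "code": [12000, 10000],
    "province": ["Daegu", "Seoul"],
    "latitude": [35.872150, 37.566953],
    "longitude": [128.601783, 126.977977],
})

# Step 1: copy every row of the first dataframe, leaving the columns it
# does not have (latitude, longitude) empty for now.
records = []
for row in df1.to_dict("records"):
    records.append({**row, "latitude": None, "longitude": None})

# Step 2: walk the second dataframe and fill in the coordinates of every
# already-added row whose province matches.
for src in df2.to_dict("records"):
    for rec in records:
        if rec["province"] == src["province"]:
            rec["latitude"] = src["latitude"]
            rec["longitude"] = src["longitude"]

new_df = pd.DataFrame(records)
print(new_df)
```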
I appreciate your help!