Python: Can we merge two DataFrames based on text matching?
Tags: python, python-3.x, pandas
I have a DataFrame that looks like this:
ID Rating Bin Price
0 864890 AA+ 4 97.14
1 691634 AA+ 4 14.21
2 792845 AA+ 3 101.25
3 506251 SP 3 100.31
4 689977 AA+ 3 97.37
... ... ... ...
249995 873393 AA+ 5 110.42
249996 495709 AA+ 7 105.47
249997 508123 AA+ 7 104.55
249998 650062 AA+ 8 105.37
249999 17658 AA+ 8 103.53
I have another DataFrame that looks like this:
Rating RatingScores
0 AAA 10
1 AA+ 9.5
2 AA 9
3 A+ 8.5
4 A 8
.. ... ...
20 CC- 0
21 D 0
22 NA 0
23 NR 0
24 SP 0
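I'd like to know whether there is a way to merge the second DataFrame into the first. The indexes won't match, but the Rating field is the same in both. Alternatively, is there a simple way to add a column to the first DataFrame that gives a numeric result (RatingScores) for the non-numeric label (Rating)?
So far I have tried: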
dataset['RatingScores'] = pd.merge(dataset, finalDF, on='Rating')
This now gives me:
dataset['RatingScores'] = pd.merge(dataset, finalDF, on='Rating')
Traceback (most recent call last):
File "<ipython-input-327-0afd66ad6da1>", line 1, in <module>
dataset['RatingScores'] = pd.merge(dataset, finalDF, on='Rating')
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 3472, in __setitem__
self._set_item(key, value)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 3550, in _set_item
NDFrame._set_item(self, key, value)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py", line 3381, in _set_item
self._data.set(key, value)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1072, in set
self.insert(len(self.items), item, value)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 1181, in insert
block = make_block(values=value, ndim=self.ndim, placement=slice(loc, loc + 1))
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 3267, in make_block
return klass(values, ndim=ndim, placement=placement)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 2775, in __init__
super().__init__(values, ndim=ndim, placement=placement)
File "C:\Users\rshuell\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 128, in __init__
"{mgr}".format(val=len(self.values), mgr=len(self.mgr_locs))
ValueError: Wrong number of items passed 12, placement implies 1
Also, here is the second DataFrame as a dict:
(finalDF.to_dict())
{'Rating': {0: 'AAA',
1: 'AA+',
2: 'AA',
3: 'A+',
4: 'A',
5: 'A-',
6: 'AA-',
7: 'BBB',
8: 'BB+',
9: 'BB',
10: 'B+',
11: 'B',
12: 'B-',
13: 'BB-',
14: 'CCC',
15: 'CC+',
16: 'CC',
17: 'C+',
18: 'C',
19: 'C-',
20: 'CC-',
21: 'D',
22: 'NA',
23: 'NR',
24: 'SP'},
'RatingScores': {0: 10.0,
1: 9.5,
2: 9.0,
3: 8.5,
4: 8.0,
5: 7.5,
6: 7.0,
7: 6.5,
8: 6.0,
9: 5.5,
10: 5.0,
11: 4.5,
12: 4.0,
13: 3.5,
14: 3.0,
15: 2.5,
16: 2.0,
17: 1.5,
18: 1.0,
19: 0.0,
20: 0.0,
21: 0.0,
22: 0.0,
23: 0.0,
24: 0.0}}
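Your merge call is already correct, but pd.merge() returns a new, joined DataFrame, and you are assigning it to a single column (which expects a pd.Series), hence the ValueError.
Instead, assign the result to a new variable that holds the merged DataFrame: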
merged_df = pd.merge(dataset, finalDF, on='Rating')
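A note on the join type, as a minimal sketch rather than part of the original answer: pd.merge() defaults to an inner join, so any row whose Rating has no match in finalDF would be dropped. Passing how="left" keeps every row of dataset. The small frames below are stand-ins built from a few values in the question, not the full data.

```python
import pandas as pd

# Small stand-ins for the question's frames (a few rows copied from the post).
dataset = pd.DataFrame({
    "ID": [864890, 691634, 792845, 506251],
    "Rating": ["AA+", "AA+", "AA+", "SP"],
    "Bin": [4, 4, 3, 3],
    "Price": [97.14, 14.21, 101.25, 100.31],
})
finalDF = pd.DataFrame({
    "Rating": ["AAA", "AA+", "AA", "SP"],
    "RatingScores": [10.0, 9.5, 9.0, 0.0],
})

# how="left" keeps all rows of dataset, in their original order,
# and adds the RatingScores column looked up by Rating.
merged_df = pd.merge(dataset, finalDF, on="Rating", how="left")
print(merged_df)
```

If dataset itself should carry the new column, the result can simply be assigned back: dataset = merged_df.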
I would consider isolating the columns you need from the second DataFrame and then using join to combine the two DataFrames:
dat1.join(dat2)
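One caveat, offered as a sketch under assumptions: DataFrame.join() aligns on the index by default, and the question says the indexes do not match. A plain dat1.join(dat2) would therefore pair rows by position rather than by Rating. Setting Rating as the index of the second frame and passing on="Rating" makes the join use the label instead. The names dat1 and dat2 follow the answer; the data are small stand-ins from the question.

```python
import pandas as pd

# Stand-ins for the two frames (a few rows copied from the question).
dat1 = pd.DataFrame({
    "ID": [864890, 506251, 689977],
    "Rating": ["AA+", "SP", "AA+"],
    "Price": [97.14, 100.31, 97.37],
})
dat2 = pd.DataFrame({
    "Rating": ["AAA", "AA+", "SP"],
    "RatingScores": [10.0, 9.5, 0.0],
})

# Isolate the column we need, keyed by Rating instead of the row number.
scores = dat2.set_index("Rating")[["RatingScores"]]

# join() is a left join by default; on="Rating" matches dat1's Rating
# column against the index of scores.
joined = dat1.join(scores, on="Rating")
print(joined)
```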
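Comments:
You can use pandas merge and set Rating as the key.
I tried that; it didn't work.
Well, I tried it and it worked fine. I didn't use your dict output, though. Luckily the clipboard method worked on the first output. I got no error.
@Darren Oh, I see. I thought I could do it in place. OK, got it. Thanks.
Glad it helped.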
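For the in-place column the asker had in mind, one possible approach (not proposed in the thread, so treat it as an assumption) is Series.map with a lookup Series built from the second frame. This adds RatingScores to dataset directly and leaves NaN for any rating missing from the lookup. The rating "ZZ" below is a made-up label used only to show that case.

```python
import pandas as pd

# Stand-ins for the question's frames; "ZZ" is a hypothetical unmatched label.
dataset = pd.DataFrame({
    "ID": [864890, 506251, 17658],
    "Rating": ["AA+", "SP", "ZZ"],
})
finalDF = pd.DataFrame({
    "Rating": ["AAA", "AA+", "SP"],
    "RatingScores": [10.0, 9.5, 0.0],
})

# Build a Rating -> RatingScores lookup (Rating values must be unique).
score_by_rating = finalDF.set_index("Rating")["RatingScores"]

# Map each label to its score; labels missing from the lookup become NaN.
dataset["RatingScores"] = dataset["Rating"].map(score_by_rating)
print(dataset)
```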