Python pandas: merge a DataFrame on a Series's index and the DataFrame's column
Tags: python, python-3.x, pandas, dataframe
I have the following Series:
>>>counts = pd.Series({'0.0':5, '1.0':6, '2.0':14, '3.0':98})
>>>counts
0.0 5
1.0 6
2.0 14
3.0 98
dtype: int64
and the following DataFrame:
>>>topic_keywords = [(0, 0.0, 'challenge, web, language, require, bot'),
(1, 3.0, 'time, huge, figure, image, run, develop'),
(2, 1.0, 'datum, user, access, speech, bandwidth'),
(3, 2.0, ' main, decide, audio, sensor, disabled, make'),
(4, 2.0, ' main, decide, audio, sensor, disabled, make'),
(5, 0.0, 'challenge, web, language, require, bot')]
>>> topicKeywordsDf = pd.DataFrame(topic_keywords, columns=['ID', 'Topic_Num', 'Topic_Keywords'])
>>> topicKeywordsDf = topicKeywordsDf.set_index('ID')
>>> topicKeywordsDf
Topic_Num Topic_Keywords
ID
0 0.0 challenge, web, language, require, bot
1 3.0 time, huge, figure, image, run, develop
2 1.0 datum, user, access, speech, bandwidth
3 2.0 main, decide, audio, sensor, disabled, make
4 2.0 main, decide, audio, sensor, disabled, make
5 0.0 challenge, web, language, require, bot
I want to merge the DataFrame with the Series so that the Series's index matches the DataFrame's Topic_Num column:
Topic_Num Count Topic_Keywords
0.0 5 challenge, web, language, require, bot
1.0 6 datum, user, access, speech, bandwidth
2.0 14 main, decide, audio, sensor, disabled, make
3.0 98 time, huge, figure, image, run, develop
Preferably, the final DataFrame should be sorted by Topic_Num. How can I merge these?
I tried:
counts_df = counts.to_frame()
merge = counts_df.merge(topicKeywordsDf, left_index=True, right_on="Topic_Num")
But I got this error:
ValueError: You are trying to merge on object and float64 columns. If you wish to proceed you should use pd.concat
You need to change a couple of things. First, your counts_df has no column names; building it like this gives a DataFrame with named columns:
counts_df=pd.DataFrame({'Topic_Num':counts.index, 'value':counts.values})
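Your merge now works. You should drop the columns you don't use, and consider whether you want duplicates. If your counts_df is sorted, the merged result will be sorted too: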
merge = counts_df.merge(topicKeywordsDf, left_index=True, right_on="Topic_Num").drop_duplicates()
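The error message points at a dtype mismatch: the index labels of counts are strings (dtype object), while Topic_Num is float64. A minimal sketch of one way to line the keys up, assuming every label in counts can be parsed as a float; the names counts_float and merged are only illustrative, and the column name Count is taken from the desired output in the question:

```python
import pandas as pd

counts = pd.Series({'0.0': 5, '1.0': 6, '2.0': 14, '3.0': 98})

topic_keywords = [
    (0, 0.0, 'challenge, web, language, require, bot'),
    (1, 3.0, 'time, huge, figure, image, run, develop'),
    (2, 1.0, 'datum, user, access, speech, bandwidth'),
    (3, 2.0, ' main, decide, audio, sensor, disabled, make'),
    (4, 2.0, ' main, decide, audio, sensor, disabled, make'),
    (5, 0.0, 'challenge, web, language, require, bot'),
]
topicKeywordsDf = pd.DataFrame(
    topic_keywords, columns=['ID', 'Topic_Num', 'Topic_Keywords']
).set_index('ID')

# The index labels are strings; convert them to float so they
# have the same dtype as the Topic_Num column.
counts_float = counts.copy()
counts_float.index = counts_float.index.astype(float)

# Name the values column so the merged frame gets a readable header.
counts_float = counts_float.rename('Count').to_frame()

# Match the Topic_Num column on the left against the index on the right.
merged = topicKeywordsDf.merge(
    counts_float, left_on='Topic_Num', right_index=True
)
print(merged)
```

This keeps one row per row of topicKeywordsDf, so topics that appear twice there still appear twice in merged.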
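Why are the index labels of counts strings that represent floats, rather than actual floats?
I tried it, but got the following error: ValueError: You are trying to merge on int64 and object columns. If you wish to proceed you should use pd.concat
It works for me, but just to check, can you try counts = pd.Series({0.0: 5, 1.0: 6, 2.0: 14, 3.0: 98})?
Now there is no error, but the result is not what I expected: it has 6 rows (like the second DataFrame), whereas it should have 4 rows, like the first one. merge = counts_df.merge(topicKeywordsDf, left_index=True, right_on="Topic_Num").drop_duplicates()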
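If the goal is one row per topic, one possible approach is to deduplicate on Topic_Num before merging and then sort. This is a sketch, assuming rows that share a Topic_Num also share the same keywords (as in the sample data); unique_topics and result are illustrative names:

```python
import pandas as pd

# Float keys, as suggested in the comment above, with a name for the values.
counts = pd.Series({0.0: 5, 1.0: 6, 2.0: 14, 3.0: 98}, name='Count')

topic_keywords = [
    (0, 0.0, 'challenge, web, language, require, bot'),
    (1, 3.0, 'time, huge, figure, image, run, develop'),
    (2, 1.0, 'datum, user, access, speech, bandwidth'),
    (3, 2.0, ' main, decide, audio, sensor, disabled, make'),
    (4, 2.0, ' main, decide, audio, sensor, disabled, make'),
    (5, 0.0, 'challenge, web, language, require, bot'),
]
topicKeywordsDf = pd.DataFrame(
    topic_keywords, columns=['ID', 'Topic_Num', 'Topic_Keywords']
).set_index('ID')

# Keep only the first row for each topic number.
unique_topics = topicKeywordsDf.drop_duplicates(subset='Topic_Num')

result = (
    unique_topics
    .merge(counts.to_frame(), left_on='Topic_Num', right_index=True)
    .sort_values('Topic_Num')
    .reset_index(drop=True)
    [['Topic_Num', 'Count', 'Topic_Keywords']]
)
print(result)
```

Deduplicating with subset='Topic_Num' does not depend on the ID index or on any other column differing between rows, which a bare drop_duplicates() call after the merge would.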