Pandas: merging and sorting two data frames that contain the same items in a different order, with different associated values

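I am new to pandas. I created two data frames:

df1

df2

The Genus-name column has the same contents in both, but the rows are in a different order. I would like to create a third data frame that combines the contents of both, sorted in descending order by Domain-hit-counts and then by Num-of-genomes. The output should look like this:

df3

How can I do this?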

After some tinkering I was able to get the expected output. My code is probably rather clumsy, so please excuse a noob's inefficiency.

# merge df1 and df2 by using the 'Genus-name' column
df3 = df1.merge(df2, on = "Genus-name")

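# sort rows by the columns in the declared order of priority, largest first
# (groupby only builds groups and leaves df3 unchanged, so sort_values is used)
df3 = df3.sort_values(by=['Domain-hit-counts', 'Num-of-genomes'], ascending=False)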

# reorder columns
cols = ['Genus-name', 'Domain-hit-counts', 'Num-of-genomes']
df3 = df3[cols]

# reset index
df3.reset_index(drop = True, inplace = True) 

# display data frame
df3

Please feel free to suggest any improvements to the code. :)
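For anyone who wants to try the merge-then-sort idea without the original data, here is a minimal sketch. The rows are borrowed from the excerpts on this page, and the layout of df2 (Genus-name plus Domain-hit-counts) is an assumption, since df2 itself is not shown.

```python
import pandas as pd

# Toy rows taken from the excerpts on this page; the real frames have about 210 rows.
df1 = pd.DataFrame({
    "Num-of-genomes": [221, 193, 40],
    "Genus-name": ["Mycobacterium", "Bacillus", "Paenibacillus"],
})
# Assumed layout for df2: one hit count per genus, in a different row order.
df2 = pd.DataFrame({
    "Genus-name": ["Bacillus", "Paenibacillus", "Mycobacterium"],
    "Domain-hit-counts": [2228, 467, 415],
})

cols = ["Genus-name", "Domain-hit-counts", "Num-of-genomes"]
df3 = (
    df1.merge(df2, on="Genus-name")  # match rows by genus, whatever their order
       .sort_values(by=["Domain-hit-counts", "Num-of-genomes"], ascending=False)
       .reset_index(drop=True)       # renumber the rows 0..n-1 after sorting
       .loc[:, cols]                 # put the columns in display order
)
print(df3)
```

Chaining the steps avoids inplace=True and keeps each intermediate result out of the namespace, but that is a matter of taste; the step-by-step version does the same thing.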

Use sort_values (see the three-line snippet at the end of this page).

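Comments:

TypeError: sort_values() got an unexpected keyword argument 'ignore_index'. Removing the ignore_index option gives the desired output. Strange.

Which version are you using? According to the docs, ignore_index was added in 1.0.0.

Anyway, I accepted your edit suggestion because it is easier to read, thanks.

I am using pandas version 0.25.1. Maybe that is the cause of the error.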
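The version issue discussed in the comments can be sidestepped. The sketch below, on made-up toy rows, compares a spelling that works on older releases (sort_values followed by reset_index) with the ignore_index=True keyword, which according to the comments above is only available from pandas 1.0.0 onward.

```python
import pandas as pd

# Toy frame; the values are illustrative only.
df = pd.DataFrame({
    "Genus-name": ["Mycobacterium", "Bacillus", "Paenibacillus"],
    "Domain-hit-counts": [415, 2228, 467],
    "Num-of-genomes": [221, 193, 40],
})
by = ["Domain-hit-counts", "Num-of-genomes"]

# Portable spelling: sort first, then renumber the index in a separate step.
portable = df.sort_values(by=by, ascending=False).reset_index(drop=True)

# Newer spelling: raises TypeError on releases that predate the keyword.
try:
    modern = df.sort_values(by=by, ascending=False, ignore_index=True)
except TypeError:
    modern = None  # older pandas, fall back to the portable result

print(pd.__version__)
print(portable)
```

On a release that supports the keyword, both results should be identical, so the choice only matters when the code has to run on an older installation.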

Input data frame with the genome count per genus:

     Num-of-genomes  Genus-name
0    221             Mycobacterium
1    193             Bacillus
2    70              Yersinia
...  ...
207  1               Actinomadura
208  1               Acidothermus
209  1               Acaryochloris

df3 (expected output):

Genus-name      Domain-hit-counts  Num-of-genomes
Bacillus        2228               193
Paenibacillus   467                40
Mycobacterium   415                221
...             ...
Microbulbifer   1                  1
Methylocella    1                  1
Oceanobacillus  1                  1

df3 = df1.merge(df2, on="Genus-name")
df3.sort_values(by=["Domain-hit-counts", "Num-of-genomes"], ascending=[False, False], inplace=True)
df3.reset_index(drop=True, inplace=True)
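One thing worth checking before relying on the merge: merge defaults to an inner join, so a genus that appears in only one frame is dropped silently. The question says both Genus-name columns hold the same names, but a quick test can confirm it. The sketch below uses deliberately mismatched toy rows (the variable names left and right are placeholders, not from the original post) and the indicator=True option of merge.

```python
import pandas as pd

# Toy frames that deliberately disagree on one genus each.
left = pd.DataFrame({
    "Genus-name": ["Bacillus", "Yersinia"],
    "Num-of-genomes": [193, 70],
})
right = pd.DataFrame({
    "Genus-name": ["Bacillus", "Paenibacillus"],
    "Domain-hit-counts": [2228, 467],
})

# An outer join keeps every row; the _merge column records where each row came from.
check = left.merge(right, on="Genus-name", how="outer", indicator=True)
unmatched = check.loc[check["_merge"] != "both", ["Genus-name", "_merge"]]
print(unmatched)
```

If unmatched comes back empty on the real data, the inner join loses nothing. Passing validate="one_to_one" to merge is a related safeguard that raises an error when a genus name is duplicated in either frame.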