Python/R: Removing duplicate rows, keeping unique author pairs

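This is a sample I extracted from a database. I am building a co-authorship visualization, so based on this sample I need to keep only one relationship between any two authors. For example, I have to remove either Brian Norton --- Maria Roo Ons or Maria Roo Ons --- Brian Norton so that each relationship is unique.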

-------------------------------------------------------------------------------------------------
|              article_title                                | author_name     |   coauthor_name |
-------------------------------------------------------------------------------------------------
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Brian Norton
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Sarah McCormack | S. Shynu
-------------------------------------------------------------------------------------------------
The desired final output is as follows:

-------------------------------------------------------------------------------------------------
|              article_title                                | author_name     |   coauthor_name |
-------------------------------------------------------------------------------------------------
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Maria Roo Ons
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Brian Norton    | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Max Ammann
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Maria Roo Ons   | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | S. Shynu
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | Max Ammann      | Sarah McCormack
A Metal Plate Solar Antenna for UMTS Pico-cell Base Station | S. Shynu        | Sarah McCormack
In this case I want to keep only one row per pair. How can I handle this in R or Python?
Thank you very much for your help.

I assume you have a separate database and are connecting to it from Python.

Possible approaches:

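1) You can add a row number based on the `article` column and then deduplicate. You can look at the answer to see how to implement this in SQL.

Then you can run the query through a Python DB connector.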


2) You can pull the records into a pandas DataFrame and do the analysis there. pandas is good at processing and manipulating data.
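Approach 1 could be sketched roughly as follows. This is only an illustration under several assumptions: it uses the standard-library `sqlite3` module with an in-memory database instead of your real database, the table name `coauthors` is hypothetical, and it relies on SQLite's window functions (SQLite 3.25 or newer) and its two-argument scalar `MIN`/`MAX`. Other databases typically offer `LEAST`/`GREATEST` for the same purpose. The row number is partitioned by article and by the alphabetically ordered pair of names, so both orderings of a pair land in the same partition and only the first row is kept.

```python
import sqlite3

title = "A Metal Plate Solar Antenna for UMTS Pico-cell Base Station"

# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE coauthors ("
    "article_title TEXT, author_name TEXT, coauthor_name TEXT)"
)
conn.executemany(
    "INSERT INTO coauthors VALUES (?, ?, ?)",
    [
        (title, "Brian Norton", "Maria Roo Ons"),
        (title, "Maria Roo Ons", "Brian Norton"),
        (title, "Brian Norton", "Max Ammann"),
        (title, "Max Ammann", "Brian Norton"),
    ],
)

# Number the rows inside each (article, unordered pair) group and keep row 1.
query = """
SELECT article_title, author_name, coauthor_name
FROM (
    SELECT article_title, author_name, coauthor_name,
           ROW_NUMBER() OVER (
               PARTITION BY article_title,
                            MIN(author_name, coauthor_name),
                            MAX(author_name, coauthor_name)
               ORDER BY author_name
           ) AS rn
    FROM coauthors
)
WHERE rn = 1
ORDER BY author_name, coauthor_name
"""
unique_pairs = conn.execute(query).fetchall()

for row in unique_pairs:
    print(row)
```

With the four sample rows above, two rows remain: one for each unordered pair.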

I assume your data frame looks like the one I show below, since you did not share what other cases might occur.

article author1 author2
A       a       b
A       b       a
A       a       a
A       b       b
In R, this is how you can get the rows you are looking for. I assume your data frame is `df1`.

# This will create a new dataframe df2 with only those rows where author1 and author2 are different

df2 <- df1[df1$author1 != df1$author2, ]

Let me know if this is what you need.
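For the pandas route from approach 2, here is a minimal sketch of order-independent deduplication. It is not the code from this answer: the column names are taken from the question's table, the sample rows are regenerated with `itertools.permutations` rather than read from a database, and the helper frame `keys` is a name introduced only for illustration. The idea is to build a key in which the two names always appear in alphabetical order, so that A --- B and B --- A produce the same key, and then keep the first row for each key.

```python
from itertools import permutations

import pandas as pd

title = "A Metal Plate Solar Antenna for UMTS Pico-cell Base Station"
authors = ["Brian Norton", "Maria Roo Ons", "Max Ammann",
           "S. Shynu", "Sarah McCormack"]

# Rebuild the 20 ordered (author, coauthor) rows from the question.
df = pd.DataFrame(
    [(title, a, b) for a, b in permutations(authors, 2)],
    columns=["article_title", "author_name", "coauthor_name"],
)

a, b = df["author_name"], df["coauthor_name"]
swap = a > b  # True where the pair is stored in reverse alphabetical order

# Helper frame (hypothetical name) holding each pair in a canonical order.
keys = pd.DataFrame({
    "article_title": df["article_title"],
    "first": b.where(swap, a),   # alphabetically smaller name
    "second": a.where(swap, b),  # alphabetically larger name
})

# Keep the first row seen for each (article, unordered pair).
unique_df = df[~keys.duplicated()].reset_index(drop=True)

print(unique_df)
```

On the question's sample this leaves 10 of the 20 rows, and the original `author_name`/`coauthor_name` values of the kept rows are not modified.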

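Have you tried anything so far? Are you using any libraries/packages? (NumPy/pandas for Python, dplyr or data.table for R)

Thanks for your answer, but my problem is not removing duplicate columns. The problem is how to remove duplicates when author1 and author2 have the same values within the same article but in a different order.

When you say the same values, do you mean `author1` and `author2` should both be `a`?

No, it means that within the same article I need only one record, such as: article A, author1 a, author2 b, or article A, author1 b, author2 a.

See my updated answer. Since you did not give me a larger dataset to test my code thoroughly, I am assuming `author1` and `author2` will only contain `a` and `b`. Let me know if this solves your problem.

Thank you very much for your help. Thanks, I will try it.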
article author1 author2
  A       a       b
  A       b       a