
Python 3.x: removing multiple comma-separated entries in a column


I have a table in Excel named sorted_list, as shown below:

+-------------------+--------------------------------+---+----------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+------+------+
| P33151            | partially reviewed             | 9 | other code                                                                                               | Homo sapiens (Human); Pan troglodytes (Chimpanzee)                              |  784 | 100% |
| B4DMA7            | unreviewed                     | 1 | B4DMA7                                                                                                   | Homo sapiens (Human)                                                            |  779 | 100% |
| A8K0L9            | unreviewed                     | 1 | A8K0L9                                                                                                   | Homo sapiens (Human)                                                            |  828 | 100% |
| B4DTP0            | unreviewed                     | 1 | B4DTP0                                                                                                   | Homo sapiens (Human)                                                            |  525 | 100% |
| D3DSM0            | unreviewed                     | 1 | D3DSM0                                                                                                   | Homo sapiens (Human)                                                            |  712 | 100% |
| A8K0L1            | unreviewed                     | 1 | A8K0L1                                                                                                   | Homo sapiens (Human)                                                            |  781 | 100% |
| P06756,L7RXH0     | partially reviewed and UniParc | 8 | P06756; L7RXH0; UPI0001BE65FF; UPI000DF0CE97; UPI0003E68261; UPI0002A11580; UPI0000112063; UPI0012318420 | Homo sapiens (Human); ?                                                         | 1048 | 100% |
| Q59EQ1            | unreviewed                     | 8 | A0A2J8RMA6; Q59EQ1; H3BR78; H3BPQ2; H3BSM4; H3BQH2; H3BP26; H3BQB5                                       | Pongo abelii (Sumatran orangutan) (Pongo pygmaeus abelii); Homo sapiens (Human) |  670 | 100% |
| A0A024R8K7        | partially reviewed and UniParc | 3 | A0A024R8K7; P16144-2; UPI0003EAE94B                                                                      | Homo sapiens (Human)                                                            | 1752 | 100% |
| P11279,A0A024RDY3 | partially reviewed             | 3 | P11279; A0A024RDY3; B3KRY3                                                                               | Homo sapiens (Human)                                                            |  417 | 100% |
| B4DFP0            | unreviewed                     | 1 | B4DFP0                                                                                                   | Homo sapiens (Human)                                                            |  382 | 100% |
| J3KRI5            | unreviewed                     | 2 | J3KRI5; H2QB90                                                                                           | Homo sapiens (Human); Pan troglodytes (Chimpanzee)                              |  744 | 100% |
| B2RCN5            | unreviewed                     | 1 | B2RCN5                                                                                                   | Homo sapiens (Human)                                                            |  916 | 100% |
| Q9NR97            | reviewed                       | 1 | Q9NR97                                                                                                   | Homo sapiens (Human)                                                            | 1041 | 100% |
| Q02846            | reviewed                       | 1 | Q02846                                                                                                   | Homo sapiens (Human)                                                            | 1103 | 100% |
| Q9NY15            | reviewed                       | 1 | Q9NY15                                                                                                   | Homo sapiens (Human)                                                            | 2570 | 100% |
+-------------------+--------------------------------+---+----------------------------------------------------------------------------------------------------------+---------------------------------------------------------------------------------+------+------+
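I am interested in matching the values in the first column against another table, but some rows in the first column contain multiple values. I want each row to hold a single value (dropping the part after the ",") and then match it against the prot1 column of the other table, preppi, i.e. preppi['prot1'].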

The code I have used so far is:

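import pandas as pd

# sorted_list is the DataFrame holding the Excel table shown above
col_one_list = sorted_list['id'].tolist()
print(col_one_list)
filepath = "/Users/saheeba/Downloads/preppi_final.csv"
preppi = pd.read_csv(filepath)
# Keep only the preppi rows whose prot1 value appears in the id column
df = preppi.loc[preppi['prot1'].isin(col_one_list)]
print(df.shape)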
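But this keeps the combined data in rows where the first column has two values, e.g. P06756,L7RXH0, so those rows never match a single ID.
Any suggestions on how to avoid this?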

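Try creating a new column by splitting the first column on the delimiter (here, a comma) and keeping the first element. For rows without the delimiter, you get the only element (the value itself); for the remaining rows, you get the first element. Once that column is created, apply the logic you already have to it.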
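The suggestion above can be sketched with the pandas string accessor. This is a minimal sketch, not the asker's actual data: the two small DataFrames are hypothetical stand-ins for sorted_list and preppi, the column name first_id is made up for illustration, and X00000 is a placeholder ID that is meant not to match.

```python
import pandas as pd

# Hypothetical stand-ins for the real tables; most IDs are copied from the question.
sorted_list = pd.DataFrame(
    {"id": ["P33151", "P06756,L7RXH0", "P11279,A0A024RDY3", "Q9NR97"]}
)
preppi = pd.DataFrame({"prot1": ["P06756", "Q9NR97", "L7RXH0", "X00000"]})

# Split each value on the comma and keep the first piece.
# Values without a comma come back unchanged.
sorted_list["first_id"] = sorted_list["id"].str.split(",").str[0].str.strip()

# Same filtering logic as in the question, applied to the new column.
df = preppi.loc[preppi["prot1"].isin(sorted_list["first_id"])]
print(df)
```

Keeping the original id column and adding a new one makes it easy to check afterwards which rows were shortened.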

This does not work for me, because in my case the number of items is not the same in each column.
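Splitting on the comma and taking element 0 should not depend on how many items a cell holds, since each cell is split on its own. The sketch below illustrates this under assumptions: the three-item value is assembled from IDs in the question's table purely for illustration, and the small preppi frame is hypothetical. It also shows Series.explode as an alternative if the goal is to match on any of the IDs rather than only the first.

```python
import pandas as pd

# Cells with one, two, and three comma-separated items (illustrative values).
ids = pd.Series(["P33151", "P06756,L7RXH0", "P11279,A0A024RDY3,B3KRY3"], name="id")

# Keep only the first item of each cell, whatever the item count.
first_only = ids.str.split(",").str[0].str.strip()

# Alternative: one row per item, so every ID can take part in the match.
all_ids = ids.str.split(",").explode().str.strip()

# Hypothetical lookup table containing only non-first IDs.
preppi = pd.DataFrame({"prot1": ["L7RXH0", "B3KRY3"]})
matches_first = preppi.loc[preppi["prot1"].isin(first_only)]
matches_any = preppi.loc[preppi["prot1"].isin(all_ids)]
print(len(matches_first), len(matches_any))
```

Which variant is appropriate depends on whether the secondary IDs in a cell should count as matches; the question asks only for the first one.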