Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/python/318.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python if else条件在dataframe中,并提取列值_Python_Pandas_Dataframe_If Statement - Fatal编程技术网

Python if else条件在dataframe中,并提取列值

Python if else条件在dataframe中,并提取列值,python,pandas,dataframe,if-statement,Python,Pandas,Dataframe,If Statement,我有一个数据帧(df),看起来像 +-----------------+-----------+----------------+---------------------+--------------+-------------+ | Gene | Gene name | Tissue | Cell type | Level | Reliability | +-----------------+-----------+--

我有一个数据帧(df),看起来像

+-----------------+-----------+----------------+---------------------+--------------+-------------+
|      Gene       | Gene name |     Tissue     |      Cell type      |    Level     | Reliability |
+-----------------+-----------+----------------+---------------------+--------------+-------------+
| ENSG00000001561 | ENPP4     | adipose tissue | adipocytes          | Low          | Approved    |
| ENSG00000001561 | ENPP4     | adrenal gland  | glandular cells     | High         | Approved    |
| ENSG00000001561 | ENPP4     | appendix       | glandular cells     | Medium       | Approved    |
| ENSG00000001561 | ENPP4     | appendix       | lymphoid tissue     | Low          | Approved    |
| ENSG00000001561 | ENPP4     | bone marrow    | hematopoietic cells | Medium       | Approved    |
| ENSG00000002586 | CD99      | adipose tissue | adipocytes          | Low          | Supported   |
| ENSG00000002586 | CD99      | adrenal gland  | glandular cells     | Medium       | Supported   |
| ENSG00000002586 | CD99      | appendix       | glandular cells     | Not detected | Supported   |
| ENSG00000002586 | CD99      | appendix       | lymphoid tissue     | Not detected | Supported   |
| ENSG00000002586 | CD99      | bone marrow    | hematopoietic cells | High         | Supported   |
| ENSG00000002586 | CD99      | breast         | adipocytes          | Not detected | Supported   |
| ENSG00000003056 | M6PR      | adipose tissue | adipocytes          | High         | Approved    |
| ENSG00000003056 | M6PR      | adrenal gland  | glandular cells     | High         | Approved    |
| ENSG00000003056 | M6PR      | appendix       | glandular cells     | High         | Approved    |
| ENSG00000003056 | M6PR      | appendix       | lymphoid tissue     | High         | Approved    |
| ENSG00000003056 | M6PR      | bone marrow    | hematopoietic cells | High         | Approved    |
+-----------------+-----------+----------------+---------------------+--------------+-------------+
预期产出:


+-----------+--------+-------------------------------+
| Gene name | Level  |            Tissue             |
+-----------+--------+-------------------------------+
| ENPP4     | Low    | adipose tissue, appendix      |
| ENPP4     | High   | adrenal gland, bronchus       |
| ENPP4     | Medium | appendix, breast, bone marrow |
| CD99      | Low    | adipose tissue, appendix      |
| CD99      | High   | bone marrow                   |
| CD99      | Medium | adrenal gland                 |
| ...       | ...    | ...                           |
+-----------+--------+-------------------------------+
使用的代码(从以下位置获取帮助):

错误:
KeyError:('Level',发生在索引172')
我不明白我做错了什么。有什么建议吗?

试试:

df.groupby(['Gene name','Level'], as_index=False)['Cell type'].agg(', '.join)
输出:

|    | Gene name   | Level        | Cell type                                                                                                       |
|---:|:------------|:-------------|:----------------------------------------------------------------------------------------------------------------|
|  0 | CD99        | High         | hematopoietic cells                                                                                             |
|  1 | CD99        | Low          | adipocytes                                                                                                      |
|  2 | CD99        | Medium       | glandular cells                                                                                                 |
|  3 | CD99        | Not detected | glandular cells     ,  lymphoid tissue     ,  adipocytes                                                        |
|  4 | ENPP4       | High         | glandular cells                                                                                                 |
|  5 | ENPP4       | Low          | adipocytes          ,  lymphoid tissue                                                                          |
|  6 | ENPP4       | Medium       | glandular cells     ,  hematopoietic cells                                                                      |
|  7 | M6PR        | High         | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells |
| Gene name   |  High                                                                                                           |  Low                                   |  Medium                                    |  Not detected                                            |
|:------------|:----------------------------------------------------------------------------------------------------------------|:---------------------------------------|:-------------------------------------------|:---------------------------------------------------------|
| CD99        | hematopoietic cells                                                                                             | adipocytes                             | glandular cells                            | glandular cells     ,  lymphoid tissue     ,  adipocytes |
| ENPP4       | glandular cells                                                                                                 | adipocytes          ,  lymphoid tissue | glandular cells     ,  hematopoietic cells | nan                                                      |
| M6PR        | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells | nan                                    | nan                                        | nan                                                      |
根据以下评论添加更新:

(df.groupby(['Gene name','Level'], as_index=False)['Cell type']
   .agg(','.join).set_index(['Gene name','Level'])['Cell type']
   .unstack().reset_index())
输出:

|    | Gene name   | Level        | Cell type                                                                                                       |
|---:|:------------|:-------------|:----------------------------------------------------------------------------------------------------------------|
|  0 | CD99        | High         | hematopoietic cells                                                                                             |
|  1 | CD99        | Low          | adipocytes                                                                                                      |
|  2 | CD99        | Medium       | glandular cells                                                                                                 |
|  3 | CD99        | Not detected | glandular cells     ,  lymphoid tissue     ,  adipocytes                                                        |
|  4 | ENPP4       | High         | glandular cells                                                                                                 |
|  5 | ENPP4       | Low          | adipocytes          ,  lymphoid tissue                                                                          |
|  6 | ENPP4       | Medium       | glandular cells     ,  hematopoietic cells                                                                      |
|  7 | M6PR        | High         | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells |
| Gene name   |  High                                                                                                           |  Low                                   |  Medium                                    |  Not detected                                            |
|:------------|:----------------------------------------------------------------------------------------------------------------|:---------------------------------------|:-------------------------------------------|:---------------------------------------------------------|
| CD99        | hematopoietic cells                                                                                             | adipocytes                             | glandular cells                            | glandular cells     ,  lymphoid tissue     ,  adipocytes |
| ENPP4       | glandular cells                                                                                                 | adipocytes          ,  lymphoid tissue | glandular cells     ,  hematopoietic cells | nan                                                      |
| M6PR        | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells | nan                                    | nan                                        | nan                                                      |
尝试:

输出:

|    | Gene name   | Level        | Cell type                                                                                                       |
|---:|:------------|:-------------|:----------------------------------------------------------------------------------------------------------------|
|  0 | CD99        | High         | hematopoietic cells                                                                                             |
|  1 | CD99        | Low          | adipocytes                                                                                                      |
|  2 | CD99        | Medium       | glandular cells                                                                                                 |
|  3 | CD99        | Not detected | glandular cells     ,  lymphoid tissue     ,  adipocytes                                                        |
|  4 | ENPP4       | High         | glandular cells                                                                                                 |
|  5 | ENPP4       | Low          | adipocytes          ,  lymphoid tissue                                                                          |
|  6 | ENPP4       | Medium       | glandular cells     ,  hematopoietic cells                                                                      |
|  7 | M6PR        | High         | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells |
| Gene name   |  High                                                                                                           |  Low                                   |  Medium                                    |  Not detected                                            |
|:------------|:----------------------------------------------------------------------------------------------------------------|:---------------------------------------|:-------------------------------------------|:---------------------------------------------------------|
| CD99        | hematopoietic cells                                                                                             | adipocytes                             | glandular cells                            | glandular cells     ,  lymphoid tissue     ,  adipocytes |
| ENPP4       | glandular cells                                                                                                 | adipocytes          ,  lymphoid tissue | glandular cells     ,  hematopoietic cells | nan                                                      |
| M6PR        | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells | nan                                    | nan                                        | nan                                                      |
根据以下评论添加更新:

(df.groupby(['Gene name','Level'], as_index=False)['Cell type']
   .agg(','.join).set_index(['Gene name','Level'])['Cell type']
   .unstack().reset_index())
输出:

|    | Gene name   | Level        | Cell type                                                                                                       |
|---:|:------------|:-------------|:----------------------------------------------------------------------------------------------------------------|
|  0 | CD99        | High         | hematopoietic cells                                                                                             |
|  1 | CD99        | Low          | adipocytes                                                                                                      |
|  2 | CD99        | Medium       | glandular cells                                                                                                 |
|  3 | CD99        | Not detected | glandular cells     ,  lymphoid tissue     ,  adipocytes                                                        |
|  4 | ENPP4       | High         | glandular cells                                                                                                 |
|  5 | ENPP4       | Low          | adipocytes          ,  lymphoid tissue                                                                          |
|  6 | ENPP4       | Medium       | glandular cells     ,  hematopoietic cells                                                                      |
|  7 | M6PR        | High         | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells |
| Gene name   |  High                                                                                                           |  Low                                   |  Medium                                    |  Not detected                                            |
|:------------|:----------------------------------------------------------------------------------------------------------------|:---------------------------------------|:-------------------------------------------|:---------------------------------------------------------|
| CD99        | hematopoietic cells                                                                                             | adipocytes                             | glandular cells                            | glandular cells     ,  lymphoid tissue     ,  adipocytes |
| ENPP4       | glandular cells                                                                                                 | adipocytes          ,  lymphoid tissue | glandular cells     ,  hematopoietic cells | nan                                                      |
| M6PR        | adipocytes          ,  glandular cells     ,  glandular cells     ,  lymphoid tissue     ,  hematopoietic cells | nan                                    | nan                                        | nan                                                      |

如果我可以问波士顿,你是如何读取数据的?我试过读取剪贴板(),但结果不是right@sammywemmy我用这些语句
df=pd.read_剪贴板(sep='|',header=None)
df=df.drop([0,7],axis=1)。设置_轴(['Gene','Gene name','Tissue','Cell type','Level','Reliability','axis=1)
Oh。。我没有复制标题内容,只是复制了数据。@sammywemmy我也可以用.str.strip来清理一些空白,但我认为大部分需要做的事情都在上面。@ScottBoston它成功了,谢谢!我正试图给它增加另一个复杂因素。我希望在新的df中得到这样的列:
Gene name | Level:High | Level:Medium | Level:Low
,这些列从第一个df得到相应的值。有什么建议吗?@ScottBoston.。这是一条很棒的班轮。谢谢你的帮助。如果我可以问波士顿,你是如何读取数据的?我试过读取剪贴板(),但结果不是right@sammywemmy我用这些语句
df=pd.read_剪贴板(sep='|',header=None)
df=df.drop([0,7],axis=1)。设置_轴(['Gene','Gene name','Tissue','Cell type','Level','Reliability','axis=1)
Oh。。我没有复制标题内容,只是复制了数据。@sammywemmy我也可以用.str.strip来清理一些空白,但我认为大部分需要做的事情都在上面。@ScottBoston它成功了,谢谢!我正试图给它增加另一个复杂因素。我希望在新的df中得到这样的列:
Gene name | Level:High | Level:Medium | Level:Low
,这些列从第一个df得到相应的值。有什么建议吗?@ScottBoston.。这是一条很棒的班轮。谢谢你的帮助(y)