Python: sorting a DataFrame based on multiple conditions

I have a DataFrame in the following format:

Id    Name    Mag    Out      Des

23    Yah     1.0    base     n-0
23    Yah     1.0    base     n-0
23    Yah     1.0    base     n-0
24    Nah     0.99   base     n-0
24    Nah     1.01   line-2   line-2
24    Nah     0.95   line-3   line-3
24    Nah     1.1    line-4   line-4
25    lol     1.0    line-1   line-1
25    lol     1.1    line-3   line-3
25    lol     0.9    line-4   line-4
25    lol     0.95   line-5   line-5
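
The output must satisfy the following conditions:

  • For rows with the same Id and Name, if the "Out" column contains only base, report the first row once.
  • For rows with the same Id and Name, if the "Out" column contains at least one base entry, report the row(s) corresponding to base together with the rows holding the minimum and maximum of the "Mag" column.
  • The output must be in the following format: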

    Id    Name    Mag    Out      Des
    
    23    Yah     1.0    base     n-0
    24    Nah     0.99   base     n-0
    24    Nah     0.95   line-3   line-3
    24    Nah     1.1    line-4   line-4
    25    lol     0.9    line-4   line-4
    25    lol     0.95   line-5   line-5
    25    lol     1.0    line-1   line-1
    25    lol     1.1    line-3   line-3
    

Here is one approach, broken into a few steps for clarity:

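    def check_base(x):
        # x holds the "Out" values of one (Id, Name) group.
        if all(elem == "base" for elem in x):
            # Only "base" rows: keep the first one, drop the rest.
            return ["keep"] + ["drop"] * (len(x) - 1)
        elif "base" in list(x):
            # Mixed group: keep "base" rows; others survive only at the group's min/max Mag.
            return ["keep" if i == "base" else "maybe" for i in x]
        else:
            # No "base" row at all: keep every row (the scalar is broadcast to the group).
            return "keep"
    
    df["criteria"] = df.groupby(["Id", "Name"], as_index=False).Out.transform(check_base)
    
    # Per-group minimum and maximum of Mag, aligned with the original rows.
    g_min = df.groupby(["Id", "Name"]).Mag.transform("min")
    g_max = df.groupby(["Id", "Name"]).Mag.transform("max")
    
    # Parentheses made explicit: "&" binds more tightly than "|".
    df = df[(df.criteria == "keep") | ((df.criteria == "maybe") & ((df.Mag == g_min) | (df.Mag == g_max)))]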
    
The result is:

        Id Name   Mag     Out     Des criteria
    0   23  Yah  1.00    base     n-0     keep
    3   24  Nah  0.99    base     n-0     keep
    5   24  Nah  0.95  line-3  line-3    maybe
    6   24  Nah  1.10  line-4  line-4    maybe
    7   25  lol  1.00  line-1  line-1     keep
    8   25  lol  1.10  line-3  line-3     keep
    9   25  lol  0.90  line-4  line-4     keep
    10  25  lol  0.95  line-5  line-5     keep
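
The result above keeps the right rows but leaves them in their original order and still carries the helper column, whereas the expected output in the question lists "base" rows first and the remaining rows by ascending Mag. A minimal, self-contained sketch of one way to get there, using boolean masks instead of the label column. The function name `filter_and_sort` and the temporary `_not_base` column are hypothetical, and it assumes that sorting the groups by Id and Name is acceptable:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Id":   [23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25],
    "Name": ["Yah"] * 3 + ["Nah"] * 4 + ["lol"] * 4,
    "Mag":  [1.0, 1.0, 1.0, 0.99, 1.01, 0.95, 1.1, 1.0, 1.1, 0.9, 0.95],
    "Out":  ["base", "base", "base", "base", "line-2", "line-3", "line-4",
             "line-1", "line-3", "line-4", "line-5"],
    "Des":  ["n-0", "n-0", "n-0", "n-0", "line-2", "line-3", "line-4",
             "line-1", "line-3", "line-4", "line-5"],
})


def filter_and_sort(df):
    """Hypothetical helper: apply the question's rules, then order the rows."""
    keys = ["Id", "Name"]
    is_base = df["Out"].eq("base")

    # Per-group flags, broadcast back to every row of the group.
    base_by_group = is_base.groupby([df[k] for k in keys])
    all_base = base_by_group.transform("all")
    any_base = base_by_group.transform("any")

    first_in_group = df.groupby(keys).cumcount().eq(0)
    mag = df.groupby(keys)["Mag"]
    is_extreme = df["Mag"].eq(mag.transform("min")) | df["Mag"].eq(mag.transform("max"))

    keep = (
        (all_base & first_in_group)                          # only "base": first row
        | (any_base & ~all_base & (is_base | is_extreme))    # mixed: base + min/max
        | ~any_base                                          # no "base": every row
    )

    # "base" rows first (False sorts before True), then ascending Mag.
    out = df[keep].assign(_not_base=~is_base[keep])
    out = out.sort_values(keys + ["_not_base", "Mag"])
    return out.drop(columns="_not_base").reset_index(drop=True)


result = filter_and_sort(df)
print(result)
```

Sorting on the temporary boolean column is what puts the 0.99 "base" row ahead of the 0.95 row in group 24; sorting on Mag alone would reverse them.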
    

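Is there an actual question? Please provide one.
For Id/Name combinations that have only "base" rows, which row should be kept?
@Roy2012 Keep the first row.
What have you tried so far?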