Python: create a column that removes the unwanted part of a string based on a condition

Tags: python, pandas, loops, split

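I'm new to Python and I'm stuck here. I have a dataframe like the one below, and I'm trying to create a new column that contains only the macro genre from the 'Genres' column.

DataFrame: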

import pandas as pd
d = {'Genres': ['Finance', 'Arcade', 'Business', 'Photography', 'Entertainment;Brain Games', 'Medical', 'Tools', 'Casual;Brain Games', 'Medical', 'Entertainment'], 
     'Last Updated': ['March 10, 2018', 'May 24, 2018', 'April 11, 2018', 'November 6, 2014', 'March 9, 2018', 'May 17, 2018', 'June 3, 2016', 'April 10, 2016', 'July 16, 2018', 'May 17, 2017']}
df = pd.DataFrame(data=d)
df

                       Genres        Last Updated
0                     Finance      March 10, 2018
1                      Arcade        May 24, 2018
2                    Business      April 11, 2018
3                 Photography    November 6, 2014
4   Entertainment;Brain Games       March 9, 2018
5                     Medical        May 17, 2018
6                       Tools        June 3, 2016
7          Casual;Brain Games      April 10, 2016
8                     Medical       July 16, 2018
9               Entertainment        May 17, 2017
The desired output is something like:

                       Genres          macro_genres        Last Updated
0                     Finance               Finance      March 10, 2018
1                      Arcade                Arcade        May 24, 2018
2                    Business              Business      April 11, 2018
3                 Photography           Photography    November 6, 2014
4   Entertainment;Brain Games         Entertainment       March 9, 2018
5                     Medical               Medical        May 17, 2018
6                       Tools                 Tools        June 3, 2016
7          Casual;Brain Games                Casual      April 10, 2016
8                     Medical               Medical       July 16, 2018
9               Entertainment         Entertainment        May 17, 2017
What I tried:

def macro_genre(i):
    for i in df['Genres']:
        if ';' in i:
            j = i.split(';')[0]
            return j
        else:
            return i
                    
df['macro_genres'] = df['Genres'].apply(macro_genre)
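But it doesn't work. It creates the column, but repeats the first value for the whole column.

When I try that part of the code outside the function, it works fine.

Any hints would be much appreciated! Thanks.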

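You can simply use str.split(';'). If ';' is not present in the string, nothing happens: a list containing the original string is returned, so you can always take element [0].

Prints: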

                      Genres      Last_Updated   macro_genres
0                    Finance    March 10, 2018        Finance
1                     Arcade      May 24, 2018         Arcade
2                   Business    April 11, 2018       Business
3                Photography  November 6, 2014    Photography
4  Entertainment;Brain_Games     March 9, 2018  Entertainment
5                    Medical      May 17, 2018        Medical
6                      Tools      June 3, 2016          Tools
7         Casual;Brain Games    April 10, 2016         Casual
8                    Medical     July 16, 2018        Medical
9              Entertainment      May 17, 2017  Entertainment
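
The code that produced the table above is not preserved here, so the following is only a minimal sketch of the idea the answer describes, not the answerer's exact snippet. It also shows why the function in the question repeats the first value: that function ignores its argument, loops over the whole df['Genres'] column, and returns during the first iteration, so every call yields 'Finance'. The three-row dataframe and the extra column name macro_genres_str are assumptions made for illustration.

```python
import pandas as pd

# Small illustrative frame (an assumption; the question's frame has 10 rows).
df = pd.DataFrame(
    {"Genres": ["Finance", "Entertainment;Brain Games", "Casual;Brain Games"]}
)


def macro_genre(genre):
    # apply() passes one cell value per call, so no loop over the column is needed.
    # split(';') returns a one-element list when ';' is absent, so [0] is always safe.
    return genre.split(";")[0]


df["macro_genres"] = df["Genres"].apply(macro_genre)

# The same result with the vectorized string accessor, without a helper function.
df["macro_genres_str"] = df["Genres"].str.split(";").str[0]

print(df)
```

Both columns should hold the text before the first ';', or the whole string when there is no ';'.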

One possibility is to use map:

df['macro_genres'] = df['Genres'].astype(str).map(lambda x : x.split(';')[0])

Output:

>>> df
                       Genres          macro_genres        Last Updated
0                     Finance               Finance      March 10, 2018
1                      Arcade                Arcade        May 24, 2018
2                    Business              Business      April 11, 2018
3                 Photography           Photography    November 6, 2014
4   Entertainment;Brain Games         Entertainment       March 9, 2018
5                     Medical               Medical        May 17, 2018
6                       Tools                 Tools        June 3, 2016
7          Casual;Brain Games                Casual      April 10, 2016
8                     Medical               Medical       July 16, 2018
9               Entertainment         Entertainment        May 17, 2017
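
The answer does not say why it calls astype(str) first. One plausible reason is to guard against non-string cells, since a lambda that calls x.split fails on a missing value. As a hedged sketch under that assumption, here are two ways to keep missing values as missing instead of converting them; the sample Series is made up for illustration.

```python
import pandas as pd

# Illustrative data with one missing value (an assumption, not from the question).
genres = pd.Series(["Entertainment;Brain Games", None, "Tools"], name="Genres")

# The .str accessor skips missing values and propagates them to the result.
via_str = genres.str.split(";").str[0]

# map() can skip missing values too when na_action="ignore" is passed.
via_map = genres.map(lambda x: x.split(";")[0], na_action="ignore")

print(via_str)
print(via_map)
```

Whether missing values should stay missing or become a placeholder string depends on what the new column is used for afterwards.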
Runtime comparison on a 1k-row dataframe:

#apply method
>>> %timeit -n 1000 df['Genres'].apply(lambda x: x.split(';')[0])
535 µs ± 12.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#str split method (slowest)
>>> %timeit -n 1000 df['Genres'].str.split(';').str[0]
1.36 ms ± 44.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#map method
>>> %timeit -n 1000 df['Genres'].map(lambda x : x.split(';')[0])
527 µs ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Runtime comparison on a 10k-row dataframe:

#apply method
>>> %timeit -n 1000 df['Genres'].apply(lambda x: x.split(';')[0])
3.62 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#str split method (slowest)
>>> %timeit -n 1000 df['Genres'].str.split(';').str[0]
10 ms ± 259 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#map method
>>> %timeit -n 1000 df['Genres'].map(lambda x : x.split(';')[0])
3.47 ms ± 59.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Runtime comparison on a 50k-row dataframe:

#apply method
>>> %timeit -n 1000 df['Genres'].apply(lambda x: x.split(';')[0])
17 ms ± 133 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#map method
>>> %timeit -n 1000 df['Genres'].map(lambda x : x.split(';')[0])
16.7 ms ± 278 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Runtime comparison on a 100k-row dataframe:

#apply method
>>> %timeit -n 1000 df['Genres'].apply(lambda x: x.split(';')[0])
34.1 ms ± 1.16 ms per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#map method 
>>> %timeit -n 1000 df['Genres'].astype(str).map(lambda x : x.split(';')[0])
35.5 ms ± 596 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
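
The answer does not show how the larger dataframes were built, and %timeit only works inside IPython or Jupyter. As a rough sketch for rerunning the comparison in a plain script, the standard-library timeit module can be used. The helper make_genres, the sample values, the row counts, and the repetition count are all assumptions for illustration; absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

# Sample values repeated to reach the requested size (an assumption).
base = ["Finance", "Arcade", "Entertainment;Brain Games", "Casual;Brain Games"]


def make_genres(n_rows):
    """Build a Series of n_rows genre strings by repeating the sample values."""
    repeats = -(-n_rows // len(base))  # ceiling division
    return pd.Series((base * repeats)[:n_rows], name="Genres")


# The three approaches compared in the answer above.
candidates = {
    "apply": lambda s: s.apply(lambda x: x.split(";")[0]),
    "map": lambda s: s.map(lambda x: x.split(";")[0]),
    "str.split": lambda s: s.str.split(";").str[0],
}

for n_rows in (1_000, 10_000):
    genres = make_genres(n_rows)
    for name, func in candidates.items():
        # Average wall-clock time per call over a small number of repetitions.
        seconds = timeit.timeit(lambda: func(genres), number=20) / 20
        print(f"{n_rows:>6} rows  {name:<9} {seconds * 1e3:.3f} ms per call")
```

Adding 50_000 and 100_000 to the tuple of row counts would cover the other sizes listed above.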
Comments:

I'd say df['macro_genres'] = df['Genres'].str.split(';').str[0] might be a better answer, but with a quick %timeit your answer performs better: 258µs instead of 474µs.
@DavidErickson Why not use map? I think it's faster than apply. See the runtime comparison on a 10k dataframe in my answer.
@grayrigel I don't think you'll see a performance difference between apply and map, as your answer shows.
@DavidErickson Yes, you are right. I tested with different lengths. I take back my statement that map is faster. It is slightly better on small dfs but seems to become slower on large dfs.

Upvoted for the benchmark. It would be nice to see a comparison across dataframes of different lengths.
@AndrejKesely Thanks for the upvote. I've updated my answer with 1K, 10K, 50K and 100K dataframes. I take back my statement that map is faster. It is slightly better on small dfs but seems to get slower on large dfs.
Thanks a lot @Grayrigel! I was sure there was a clean and simple solution, but I couldn't get there. It worked perfectly!
Does speed matter? Isn't it better to use the cleaner, more idiomatic approach?
@AMC I think speed matters, especially when working with large dataframes. I'm not sure what could be better than half a line of code. What do you suggest? Do you have another approach?

Please provide a minimal reproducible example.
Sorry about that. I'm new here. Which part do you mean should be minimal and reproducible, the dataframe itself? I entered it that way because it's just a small piece of a larger dataframe.
It should be possible to copy/paste your code and data and be able to run the code right away.
Thanks for the tip, AMC. Although a solution has already been given, I've included the code that generates the dataframe.