Python: keep rows if YYYY is present, otherwise remove the rows

Tags: python, regex, date, dataframe, lambda

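I have a DataFrame with a date column, and I want to remove the rows whose date value does not contain a four-digit year (YYYY, for example 2018; it can be any year). I tried using the apply method with a regular expression, but it does not work: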

df[df.Date.apply(lambda x: re.findall(r'[0-9]{4}', x))]
The Date column can contain values like these:

12/3/2018
March 12, 2018
stackoverflow
Mar 12, 2018
no date text
3/12/2018
So the output here should be:

12/3/2018
March 12, 2018
Mar 12, 2018
3/12/2018

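Here is one approach: use pd.to_datetime with errors="coerce", which turns every value that cannot be parsed as a date into NaT, and then drop those rows. Note that pandas 2.0 and later infer a single format from the first value, so mixed formats like these may also need format="mixed".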

Ex:

import pandas as pd
df = pd.DataFrame({"Col1": ['12/3/2018', 'March 12, 2018', 'stackoverflow', 'Mar 12, 2018', 'no date text', '3/12/2018']})
df["Col1"] = pd.to_datetime(df["Col1"], errors="coerce")
df = df[df["Col1"].notnull()]
print(df)

Output:

        Col1
0 2018-12-03
1 2018-03-12
3 2018-03-12
5 2018-03-12

Or, if you want to keep the original data:
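
A minimal sketch of this variant, assuming the goal is to filter on parseability while leaving the strings in Col1 as they are. The helper name `parsed` is introduced here for illustration, and parsing each value on its own with apply is a choice made so that the mixed formats do not depend on how a given pandas version infers a shared format:

```python
import pandas as pd

df = pd.DataFrame({"Col1": ['12/3/2018', 'March 12, 2018', 'stackoverflow',
                            'Mar 12, 2018', 'no date text', '3/12/2018']})

# Parse each value independently; anything that is not a date becomes NaT.
# `parsed` is a hypothetical helper Series, it is never written back to df.
parsed = df["Col1"].apply(lambda value: pd.to_datetime(value, errors="coerce"))

# Keep only the rows that parsed, with the original strings untouched.
df = df[parsed.notna()]
print(df)
```

Because the parsed dates live in a separate Series, there is no temporary column to drop afterwards.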

Output:

             Col1
0       12/3/2018
1  March 12, 2018
3    Mar 12, 2018
5       3/12/2018
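
The regex attempt from the question can probably be made to work as well. The lambda there returns a list from re.findall for every row, while boolean indexing needs True/False values. A sketch using Series.str.contains, assuming the column is named Date as in the question and that any standalone four-digit number counts as a year (the name `has_year` is introduced here for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Date": ['12/3/2018', 'March 12, 2018', 'stackoverflow',
                            'Mar 12, 2018', 'no date text', '3/12/2018']})

# True where the text contains exactly four consecutive digits bounded by
# non-word characters or the string edges, e.g. the "2018" in "12/3/2018".
has_year = df["Date"].str.contains(r"\b\d{4}\b")

# Boolean indexing keeps the rows flagged True and preserves the original text.
df = df[has_year]
print(df)
```

Unlike the to_datetime approach, this does not check that the value is a real date, so a string such as "room 1234" would also be kept.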
