Python: How to remove string values from a dataframe that has no column or row names
I have a csv file "flagged_dates.csv" that contains string values. Some of them are dates and the others are zeros. I want to get rid of the zeros, but I'm struggling to find a way to do it. I thought about using something like str.rstrip, but that needs column names, which I don't have. Do you have any suggestions? Thanks in advance :) Here is a sample of the dataframe:
flagged_dates = pd.read_csv('/content/drive/MyDrive/shared/data/flag_raster.csv')
print(flagged_dates.iloc[:10, :10].to_csv(index=False)) #The entire dataframe contains 100 rows and columns
Unnamed: 0,0,1,2,3,4,5,6,7,8
0,0,2019-10-10 21:59:17.074007,2015-10-13 00:55:55.544607,2017-05-24 06:00:15.959202,0,2016-12-07 09:01:04.729686,0,0,2019-05-29 11:16:44.130063
1,0,0,0,2019-02-21 07:15:12.114444,2017-04-29 17:44:49.584567,2017-06-28 11:26:30.686198,2019-03-25 10:18:28.397219,2019-05-01 01:27:21.282324,0
2,0,0,2016-09-22 04:08:48.025359,0,0,2016-09-24 17:35:26.833975,0,0,0
3,0,0,0,2015-07-12 21:13:44.182608,0,0,2017-10-03 22:17:52.257038,2019-01-04 08:20:07.684796,0
4,0,0,0,0,0,2016-03-04 10:12:21.341795,0,0,0
5,2016-08-23 09:22:21.965876,2018-05-01 09:12:19.511673,2017-12-12 07:00:04.313859,0,0,2016-10-23 15:30:11.193316,2016-03-01 01:22:12.548658,2015-06-14 21:36:41.142947,2018-08-19 23:37:49.534862
6,0,0,2019-01-29 16:23:27.070208,0,0,0,2016-08-08 01:13:21.147689,0,0
7,0,0,2017-12-04 22:51:46.265644,0,0,2016-05-13 05:11:55.885217,0,0,0
8,0,0,2018-03-16 03:45:21.555053,0,0,0,0,2019-12-04 04:04:20.600046,0
9,0,0,0,0,2018-01-10 08:28:51.902587,0,0,0,2015-05-05 06:25:43.124125
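If you don't have column names, you can simply rename your columns (they are renamed in the order you declare them in the list, and the list must contain exactly one name per column). For a 4-column dataframe: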
df.columns=['col1','col2','col3','col4']
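For a frame wider than four columns, such as the 100-column one in the question, the names can be generated rather than typed out. The following is only a minimal sketch: the inline sample data and the `col1`, `col2`, ... naming scheme are assumptions made for illustration, not part of the original answer.

```python
import io
import pandas as pd

# Hypothetical 2x3 sample standing in for the real file (no header row).
raw = (
    "0,2019-10-10 21:59:17.074007,0\n"
    "2016-08-23 09:22:21.965876,0,0\n"
)

# header=None keeps the first line as data instead of using it as column names.
df = pd.read_csv(io.StringIO(raw), header=None)

# Build one name per column so the list always matches the frame's width.
df.columns = [f"col{i}" for i in range(1, df.shape[1] + 1)]

print(df.columns.tolist())
```

Deriving the list from `df.shape[1]` avoids the length-mismatch error you would get from a hand-written list with the wrong number of names.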
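I bet you will have to deal with it at some point, so it is good practice to settle the column names before you start wrangling the data.
If you are only interested in the dates, you can do the following to get a list of the dates with the zeros excluded: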
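import pandas as pd
# Note: read_csv treats the first line as the header by default;
# pass header=None if the file has no header row.
df1 = pd.read_csv('myCsv.csv')
my_list = df1.values.flatten()        # all cells as one 1-D array
my_list = my_list[my_list != '0']     # keep everything except the '0' strings
print(my_list)  # my_list is a numpy.ndarray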
Input: myCsv.csv
Output:
['2019-02-21 07:15:12.114444' '2017-04-29 17:44:49.584567'
'2017-06-28 11:26:30.686198' '2019-03-25 10:18:28.397219'
'2019-05-01 01:27:21.282324' '2016-09-22 04:08:48.025359'
'2016-09-24 17:35:26.833975' '2015-07-12 21:13:44.182608'
'2017-10-03 22:17:52.257038' '2019-01-04 08:20:07.684796'
'2016-03-04 10:12:21.341795' '2016-08-23 09:22:21.965876'
'2018-05-01 09:12:19.511673' '2017-12-12 07:00:04.313859'
'2016-10-23 15:30:11.193316' '2016-03-01 01:22:12.548658'
'2015-06-14 21:36:41.142947' '2018-08-19 23:37:49.534862'
'2019-01-29 16:23:27.070208' '2016-08-08 01:13:21.147689'
'2017-12-04 22:51:46.265644' '2016-05-13 05:11:55.885217'
'2018-03-16 03:45:21.555053' '2019-12-04 04:04:20.600046'
'2018-01-10 08:28:51.902587' '2015-05-05 06:25:43.124125']
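If the dates are needed as real timestamps rather than strings, the same filter can be followed by `pd.to_datetime`. This is a rough sketch under a few assumptions: the inline sample is made up, the file is taken to have no header row (hence `header=None`), and `dtype=str` is used so that columns containing only zeros are not parsed as integers.

```python
import io
import pandas as pd

# Hypothetical 2x3 sample standing in for the real file (no header row).
raw = (
    "0,2019-10-10 21:59:17.074007,0\n"
    "2016-08-23 09:22:21.965876,0,2018-05-01 09:12:19.511673\n"
)

# dtype=str keeps every cell a string, so each zero is the string "0".
df = pd.read_csv(io.StringIO(raw), header=None, dtype=str)

# Flatten to one dimension, drop the "0" placeholders, then parse the rest.
values = df.to_numpy().ravel()
dates = pd.to_datetime(values[values != "0"])

print(len(dates))
print(dates.min())
```

The result is a `DatetimeIndex`, so sorting, `min`/`max` and date arithmetic work directly on it.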
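What do you want to replace the "0" with?
I want to remove them entirely.
But that will affect the structure of the dataframe. Does the structure not matter? Or do you want an empty string in place of the zeros?
Since I am only interested in the actual dates (a zero means we did not get any information from the satellite), I don't think I need them at all.
Would it be enough to just have the dates as a list?
I was trying to avoid that, but I guess you're right :p Thanks for your answer :)
No problem mate, we're all learning :)
Haha, true :)
Thank you very much, this helped me a lot!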
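On the structure question raised in the comments: if the row/column positions do matter, one option is to turn the zeros into missing values instead of dropping them. This is just a sketch with a made-up inline sample; `header=None` and `dtype=str` are assumptions about how the file would be read.

```python
import io
import pandas as pd

# Hypothetical 2x3 sample standing in for the real file (no header row).
raw = (
    "0,2019-10-10 21:59:17.074007,0\n"
    "2016-08-23 09:22:21.965876,0,2018-05-01 09:12:19.511673\n"
)

df = pd.read_csv(io.StringIO(raw), header=None, dtype=str)

# Cells equal to "0" become missing; the remaining strings are parsed
# column by column, so missing cells end up as NaT.
parsed = df.mask(df == "0").apply(pd.to_datetime)

print(parsed.shape)
print(int(parsed.notna().sum().sum()))
```

The frame keeps its original shape, and `parsed.stack()` would later give just the dates together with the position each one came from.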