Python: remove columns and duplicate data
Tags: python, pandas
I have this dataset:
Route STOP_ID AveOn AveOff AveLd PassingTime Period DAYCODE PATTERN_ID BLK RTE DIR PATTERN_QUALITY VEHICLE_ID STOP_TYPE DWELL_SEC DOOR_OPEN_SEC
0 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3607 ST 0 0
1 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3608 ST 0 0
2 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3664 ST 0 0
3 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3608 ST 0 0
4 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3669 ST 0 0
5 65 9605 2.1 0 24.2 0.3625 AM 0 11065088 6513 65 N 100 3620 ST 0 0
2185 67 35322 8.2 0.2 8 0.318055556 AM 0 20067078 6515 67 S 95 3613 ST 1 1
2187 67 35322 8.2 0.2 8 0.318055556 AM 0 20067078 6515 67 S 95 3674 ST 1 1
3976 67 82237 0.2 0.1 6.6 0.692361111 PM 0 20067078 6508 67 S 95 3676 S 1 0
5203 67 35322 4.7 0 4.7 0.33125 AM 0 20067078 6511 67 S 100 3640 ST 1 1
6723 67 35322 7.5 0 7.5 0.369444444 AM 0 20067078 6507 67 S 100 3658 ST 1 1
6730 67 35322 7.5 0 7.5 0.369444444 AM 0 20067078 6507 67 S 100 3673 ST 1 1
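I need to remove some columns and the duplicate rows, and also remove the rows where the DWELL_SEC column equals 0. My code looks like this: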
import pandas as pd
import numpy as np
from pandas import ExcelWriter
transit="C:\\Users\\Taqwa\\Desktop\\ttest.xlsx"
xlsx = pd.ExcelFile(transit)
df=pd.read_excel(transit,'Sheet1')
df.columns=df.columns.astype(str)
writer=ExcelWriter("C:\\Users\\Taqwa\\Desktop\\ttest2.xlsx")
df1 = df[df.DWELL_SEC != 0]
for name, sub_df in df.groupby("STOP_ID"):
    sub_df.to_excel( writer, str(name))
writer.save()
Can anyone help?
Use drop_duplicates plus any boolean indexing function (I used query). If DWELL_SEC is a string column, convert it to an integer first:
df.DWELL_SEC = df.DWELL_SEC.astype(int)
df = df.drop_duplicates().query('DWELL_SEC != 0')
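To make the two lines above concrete, here is a minimal sketch on a hypothetical miniature of the dataset. The sample values and the choice to store DWELL_SEC as text are assumptions made only to show why the conversion matters.

```python
import pandas as pd

# Hypothetical miniature of the question's data; DWELL_SEC is deliberately stored as text.
df = pd.DataFrame({
    "STOP_ID": [9605, 9605, 35322, 35322, 82237],
    "AveOn": [2.1, 2.1, 8.2, 8.2, 0.2],
    "DWELL_SEC": ["0", "0", "1", "1", "1"],
})

# The string "0" is never equal to the integer 0, so without this
# conversion the filter below would keep every row.
df["DWELL_SEC"] = df["DWELL_SEC"].astype(int)

# Drop fully identical rows, then keep only rows with a non-zero dwell time.
result = df.drop_duplicates().query("DWELL_SEC != 0")
print(result)
```

Note that the result is assigned to a new name; drop_duplicates and query both return new DataFrames and leave the original untouched.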
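Did you preview the formatting before posting this?
I still get rows with DWELL_SEC = 0 in the result, though.
@TaqwaHadidi Check the dtype of that column. It may be a string. Also, make sure you assign the result back: df = df.drop_duplicates().query('DWELL_SEC != 0')
Could you help me with two things? I am trying to remove columns, without success, using the line df1=df[['AveOn','AveOff','AveLd,'DWELL_SEC']], and I also want to remove duplicates in 3 columns.
@TaqwaHadidi So you want df = df[['AveOn','AveOff','AveLd','DWELL_SEC']].drop_duplicates()?
One last question: what if my columns have no names?
Please fix your formatting. Just paste the code, highlight it, and press Ctrl+K.
I had not understood it that way before.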
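The column questions in the comments could be approached roughly as follows. This is only a sketch: the sample frame is made up, and the last two options are one possible reading of the unanswered "columns have no names" question, not something the thread confirms.

```python
import pandas as pd

# Hypothetical sample with a few of the question's columns.
df = pd.DataFrame({
    "STOP_ID": [9605, 9605, 35322],
    "AveOn": [2.1, 2.1, 8.2],
    "AveOff": [0.0, 0.0, 0.2],
    "AveLd": [24.2, 24.2, 8.0],
    "VEHICLE_ID": [3607, 3608, 3613],
    "DWELL_SEC": [0, 0, 1],
})

# Option 1: keep only the wanted columns, then drop rows that are now identical.
kept = df[["AveOn", "AveOff", "AveLd", "DWELL_SEC"]].drop_duplicates()

# Option 2: keep every column, but judge duplicates on three columns only.
by_subset = df.drop_duplicates(subset=["AveOn", "AveOff", "AveLd"])

# Option 3: drop a column by name instead of listing the ones to keep.
without_vehicle = df.drop(columns=["VEHICLE_ID"])

# Columns without names: select by position with iloc...
by_position = df.iloc[:, [1, 2, 3]]

# ...or assign names first so the label-based calls above work.
unnamed = pd.DataFrame([[2.1, 0.0], [2.1, 0.0], [8.2, 0.2]])  # columns are 0 and 1
unnamed.columns = ["AveOn", "AveOff"]
```

Option 1 and Option 2 differ in what survives: the first discards the other columns, the second keeps them and retains the first row of each duplicate group.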
import pandas as pd

transit = "C:\\Users\\Taqwa\\Desktop\\ttest.xlsx"
df = pd.read_excel(transit, sheet_name='Sheet1')
# Make sure DWELL_SEC is numeric before comparing it with 0
df.DWELL_SEC = df.DWELL_SEC.astype(int)
# Drop duplicate rows and the rows where DWELL_SEC is 0
df = df.drop_duplicates().query('DWELL_SEC != 0')
df.columns = df.columns.astype(str)
# Keep only the needed columns, then drop rows that became duplicates
df2 = df[['AveOn','AveOff','AveLd','DWELL_SEC','STOP_ID']].drop_duplicates()
# Write one sheet per STOP_ID; the with block saves the file on exit
# (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter("C:\\Users\\Taqwa\\Desktop\\ttest2.xlsx") as writer:
    for name, sub_df2 in df2.groupby("STOP_ID"):
        sub_df2.to_excel(writer, sheet_name=str(name))
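For readers unsure what the groupby loop hands to each sheet, the following sketch splits a small hypothetical frame the same way but collects the pieces in a dictionary instead of writing an Excel file. The sample values and the name sheets are assumptions for illustration.

```python
import pandas as pd

# Hypothetical filtered data, standing in for df2 above.
df2 = pd.DataFrame({
    "STOP_ID": [35322, 35322, 82237],
    "AveOn": [8.2, 4.7, 0.2],
    "DWELL_SEC": [1, 1, 1],
})

# groupby yields (key, sub-frame) pairs; each key would become one sheet name.
sheets = {str(name): sub_df for name, sub_df in df2.groupby("STOP_ID")}

for sheet_name, sub_df in sheets.items():
    print(sheet_name, len(sub_df))
```

Each sub-frame contains only the rows for one STOP_ID, which is what ends up on the sheet of that name.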