Python: filling a pandas DataFrame with new data

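This is my first post on StackOverflow, so I apologize if I have done something wrong or broken the site's rules. My question: I read a CSV file with pandas in Python. The DataFrame has five columns, named [yday, wday, time, stop, N]:

yday is the day of the year, from 1 to 365
wday is the day of the week, from 1 to 7
time is a number from 1 to 144 (I split the day into 10-minute intervals: 1440 minutes per day / 10 minutes = 144)
stop is the number of the bus stop (1-4)
N is the number of passengers boarding the bus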
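The 1-144 numbering of the time column can be sketched as follows. This is only an illustration: the question does not say where slot 1 begins, so the code assumes it starts at midnight, and the helper names slot_of and slot_start are hypothetical.

```python
from datetime import time as clock_time

INTERVAL_MINUTES = 10
SLOTS_PER_DAY = 24 * 60 // INTERVAL_MINUTES  # 1440 minutes / 10 minutes = 144


def slot_of(t):
    """Return the 1-based 10-minute slot (1..144) that contains clock time t.

    Assumption: slot 1 covers 00:00-00:09.
    """
    return (t.hour * 60 + t.minute) // INTERVAL_MINUTES + 1


def slot_start(slot):
    """Return the clock time at which the given 1-based slot begins."""
    minutes = (slot - 1) * INTERVAL_MINUTES
    return clock_time(minutes // 60, minutes % 60)
```

Under that assumption, the value 81 in the time column would correspond to the interval starting at 13:20.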

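I want one entry for every interval, giving 144 rows per day, but some intervals are missing, as you can see:

My goal is to add new rows that fill in all the time intervals, for example adding (based on the given image):

320,6,81,1,1

Apologies for the lengthy code, but it serves your purpose: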
import pandas as pd

# Specify the input CSV file path here. This assumes the CSV has only the
# columns named in the question (yday, wday, time, stop, N); if not, adjust
# the code below to match your columns.
df = pd.read_csv("Path_to_your_input_csv_file_here")

# Sort by yday, wday and time, then reset the index so that row i+1 follows row i
df = df.sort_values(['yday', 'wday', 'time'], ascending=True)
df = df.reset_index(drop=True)

# Walk through consecutive rows of df; whenever two neighbours fall on the same
# day and their time values are not adjacent, collect one row per missing interval
missing_rows = []
for index, row in df.iterrows():
    if index == len(df) - 1:  # the last row has no successor to compare with
        break
    next_row = df.loc[index + 1]
    if row["yday"] == next_row["yday"] and row["wday"] == next_row["wday"] and row["time"] < next_row["time"]:
        for item in range(int(row["time"]) + 1, int(next_row["time"])):
            # 'NA' marks an interval with no recorded passenger count
            missing_rows.append([row["yday"], row["wday"], item, row["stop"], 'NA'])

# DataFrame.append was removed in pandas 2.0, so add all missing rows with one pd.concat
df2 = pd.concat([df, pd.DataFrame(missing_rows, columns=df.columns)])

# Now sort df2 by yday, wday and time, and reset the index
df2 = df2.sort_values(['yday', 'wday', 'time'], ascending=True)
df2 = df2.reset_index(drop=True)

print(df2)

Cheers
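
Comments:

What is wrong with your reindex? You only need to extend the index with the missing time values, and then df.reindex should work.

Does this answer your question?

Dude, you can't imagine how much I love you right now, this works!!

Output of print(df2) from the answer's code: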
    yday  wday  time  stop   N
0    320     6    81     1   1
1    320     6    82     1  NA
2    320     6    83     1  NA
3    320     6    84     1  NA
4    320     6    85     1   1
5    320     6    86     1  NA
6    320     6    87     1  NA
7    320     6    88     1  NA
8    320     6    89     1   1
9    320     6    90     1  NA
10   320     6    91     1  NA
11   320     6    92     1  NA
12   320     6    93     1   1
13   320     6    94     1  NA
14   320     6    95     1  NA
15   320     6    96     1  NA
16   320     6    97     1   1
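
The reindex idea raised in the comments could look roughly like the sketch below. It is not the commenter's actual code, which is not shown in the thread. The sample data is made up to resemble the question's layout, the names pieces and result are hypothetical, and grouping by stop as well as by day is an assumption, since the answer above does not separate the stops.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data: one day, one stop, gaps in time
df = pd.DataFrame({
    "yday": [320, 320, 320],
    "wday": [6, 6, 6],
    "time": [81, 85, 89],
    "stop": [1, 1, 1],
    "N": [1, 1, 1],
})

pieces = []
for (yday, wday, stop), group in df.groupby(["yday", "wday", "stop"]):
    # Every time value between the first and last observed interval of this group
    times = range(int(group["time"].min()), int(group["time"].max()) + 1)
    # Reindexing on the full range inserts the missing intervals with N = NaN
    filled = (
        group.set_index("time")[["N"]]
        .reindex(times)
        .rename_axis("time")
        .reset_index()
    )
    # Put the group keys back as ordinary columns
    filled = filled.assign(yday=yday, wday=wday, stop=stop)
    pieces.append(filled)

result = pd.concat(pieces, ignore_index=True)[["yday", "wday", "time", "stop", "N"]]
print(result)
```

To get all 144 rows per day rather than only the gaps between observed intervals, range(1, 145) could be used for times instead. The inserted rows hold NaN in N, so N becomes a float column; fillna(0) would turn the gaps into zero counts if that is what the data means.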