Python: How do I repeat 2 columns every N rows in pandas?
I have a df like this:
id rent place
0 Yes colorado
0 yes Mexico
0 yes Brazil
1 yes colorado
1 yes Mexico
1 yes Brazil
2 yes colorado
2 yes Mexico
2 yes Brazil
3 yes colorado
3 yes Mexico
3 yes Brazil
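I need the "id" column to keep increasing by 1, and the values in the "place" column to repeat every 3 rows. I don't know how to do this.

You can generate a new DataFrame like this: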
import pandas as pd
from itertools import cycle

N = 200  # total number of rows
places = cycle(["colorado", "mexico", "brazil"])
# j // 3 increases the id by 1 after every 3 rows
data = {"id": [j // 3 for j in range(N)], "rent": True, "place": [next(places) for _ in range(N)]}
df = pd.DataFrame(data)
Note that I have replaced rent with a boolean instead of text, to make it less error-prone. Output:
id rent place
0 0 True colorado
1 0 True mexico
2 0 True brazil
3 1 True colorado
4 1 True mexico
.. .. ... ...
195 65 True colorado
196 65 True mexico
197 65 True brazil
198 66 True colorado
199 66 True mexico
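The same idea can also be written without an explicit iterator. This is only a minimal sketch, assuming NumPy is available; the `build_frame` helper name is hypothetical and not part of the answer above. Integer division of the row number gives the id, and the remainder picks the place.

```python
import numpy as np
import pandas as pd


def build_frame(n_rows, places):
    """Return n_rows rows whose id grows by 1 every len(places) rows."""
    rows = np.arange(n_rows)
    k = len(places)
    return pd.DataFrame({
        "id": rows // k,                         # 0, 0, 0, 1, 1, 1, ...
        "rent": True,                            # scalar is broadcast to every row
        "place": np.asarray(places)[rows % k],   # cycles through the places
    })


df = build_frame(200, ["colorado", "mexico", "brazil"])
```

Changing the length of the `places` list changes N, so the same helper covers "every N rows" rather than only every 3.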
Alternatively, you can concatenate one DataFrame per place and then sort the result:
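# Here N is the number of distinct ids, so the result has 3 * N rows
frames = []
for place in ["colorado", "mexico", "brazil"]:
    frames.append(pd.DataFrame({"id": range(N), "rent": True, "place": place}))
df = pd.concat(frames, axis=0)
# A stable sort keeps the places in list order within each id
df = df.sort_values(["id"], kind="stable").reset_index(drop=True)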
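Concatenating and sorting amounts to taking every (id, place) pair. As a rough sketch of that view, and not the answerer's own method, `pd.MultiIndex.from_product` can build the pairs directly with the id varying slowest, so no sort is needed. The small `n_ids` value is an assumption chosen for illustration.

```python
import pandas as pd

n_ids = 4  # hypothetical small size for illustration
places = ["colorado", "mexico", "brazil"]

# Every (id, place) pair; the first level varies slowest
index = pd.MultiIndex.from_product([range(n_ids), places], names=["id", "place"])
df = index.to_frame(index=False)

# Put the rent column between id and place
df.insert(1, "rent", True)
```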
You can build the DataFrame row by row, appending the relevant rows as needed:
import pandas as pd

ids = [0, 1, 2, 3]
rent = [123, 'yes', 'yes']
place = ['colorado', 'Mexico', 'Brazil']
rows = []  # collect one-row frames, then concatenate once
for i in ids:
    for r, p in zip(rent, place):
        rows.append(pd.DataFrame({'rent': r, 'place': p}, index=[i]))
# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
df = pd.concat(rows)
df.reset_index(inplace=True)
df.rename(columns={'index': 'id'}, inplace=True)
The output df is:
id rent place
0 0 123 colorado
1 0 yes Mexico
2 0 yes Brazil
3 1 123 colorado
4 1 yes Mexico
5 1 yes Brazil
6 2 123 colorado
7 2 yes Mexico
8 2 yes Brazil
9 3 123 colorado
10 3 yes Mexico
11 3 yes Brazil
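For a rent column that mixes numbers and text, the frame can also be built in one step. The following is a sketch under the assumption that each place has one fixed rent value, as in the example output above. Plain list repetition is used on purpose: passing the mixed list through `np.tile` would first convert it to a NumPy string array and turn 123 into the string '123'.

```python
import pandas as pd

ids = [0, 1, 2, 3]
rent = [123, "yes", "yes"]               # mixed int and str, one value per place
place = ["colorado", "Mexico", "Brazil"]

df = pd.DataFrame({
    # each id repeated once per place: 0, 0, 0, 1, 1, 1, ...
    "id": [i for i in ids for _ in place],
    # list repetition keeps 123 as an int inside an object column
    "rent": rent * len(ids),
    "place": place * len(ids),
})
```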
OK. The code works, but in my actual rent column the values are all different. Sorry, I should have created a better example df. The rent column contains both numbers and text. How do I change this part of the code?
Hi @Lola Akinsehinwa, is this the output you need? The "rent" column now mixes numbers and text.