Python: replicating a DataFrame based on the elements of a collection

python pandas

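I need to "replicate" the rows of a DataFrame once for each element of a collection (say a list, to keep it simple). This may be hard to explain in words, so I will show my code: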
In [1]: data = {char: [] for char in 'abcd'}
In [2]: n = 3
In [3]: properties = [i for i in range(1, n + 1)]
In [4]: l = list(range(1, 11))
In [5]: for e in l:
    ...:     data['a'].append(e + e)
    ...:     data['b'].append(e * e)
    ...:     data['c'].append(e ** e)
    ...:     data['d'].append(1.0 / e)
    ...:
In [6]: df = pd.DataFrame(data)
In [7]: df
Out[7]: 
    a    b            c         d
0   2    1            1  1.000000
1   4    4            4  0.500000
2   6    9           27  0.333333
3   8   16          256  0.250000
4  10   25         3125  0.200000
5  12   36        46656  0.166667
6  14   49       823543  0.142857
7  16   64     16777216  0.125000
8  18   81    387420489  0.111111
9  20  100  10000000000  0.100000

Based on properties, I need to generate the following DataFrame:
     a    b            c         d  property
0    2    1            1  1.000000         1
1    4    4            4  0.500000         1
2    6    9           27  0.333333         1
3    8   16          256  0.250000         1
4   10   25         3125  0.200000         1
5   12   36        46656  0.166667         1
6   14   49       823543  0.142857         1
7   16   64     16777216  0.125000         1
8   18   81    387420489  0.111111         1
9   20  100  10000000000  0.100000         1
10   2    1            1  1.000000         2
11   4    4            4  0.500000         2
12   6    9           27  0.333333         2
13   8   16          256  0.250000         2
14  10   25         3125  0.200000         2
15  12   36        46656  0.166667         2
16  14   49       823543  0.142857         2
17  16   64     16777216  0.125000         2
18  18   81    387420489  0.111111         2
19  20  100  10000000000  0.100000         2
20   2    1            1  1.000000         3
21   4    4            4  0.500000         3
22   6    9           27  0.333333         3
23   8   16          256  0.250000         3
24  10   25         3125  0.200000         3
25  12   36        46656  0.166667         3
26  14   49       823543  0.142857         3
27  16   64     16777216  0.125000         3
28  18   81    387420489  0.111111         3
29  20  100  10000000000  0.100000         3

That is, my data is repeated once per element in properties, and a property column is added. Currently I do this with two nested loops, as follows:
new_data = {'a': [], 'b': [], 'c': [], 'd': [], 'property': []}
properties = [1, 2, 3]
for property_id in properties:
    for e in l:
        new_data['property'].append(property_id)
        new_data['a'].append(e + e)
        new_data['b'].append(e * e)
        new_data['c'].append(e ** e)
        new_data['d'].append(1.0 / e)
new_df = pd.DataFrame(new_data)

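However, I would like a way to simplify this logic by reusing the original data dictionary or df that I already have, and replicating it once for each property I have.

The main goal of this question is to improve the performance of this logic.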
Are you looking for a concat of df with itself?

df = pd.concat(
       [df] * len(properties), ignore_index=True
).assign(property=np.repeat(properties, len(df)))


reindex + tile
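
The thread names reindex + tile without showing code for it. The following is only a sketch of what that variant might look like, not the answerer's exact code. The small df and the two-element properties list are stand-ins invented for illustration, and the approach assumes df has a unique index, since reindex requires that.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's df (assumed data, for illustration only).
df = pd.DataFrame({"a": [2, 4, 6], "b": [1, 4, 9]})
properties = [1, 2]

# np.tile repeats the whole index as a block: [0, 1, 2, 0, 1, 2]
tiled_index = np.tile(df.index, len(properties))

# np.repeat repeats each element in place: [1, 1, 1, 2, 2, 2]
repeated_props = np.repeat(properties, len(df))

result = (
    df.reindex(tiled_index)           # stack copies of df by row label
      .reset_index(drop=True)         # replace the duplicated labels with 0..n-1
      .assign(property=repeated_props)
)
print(result)
```

The two NumPy helpers play different roles here: tile lays out whole copies of the index one after another, while repeat keeps each property value constant across one full copy of df.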
You need a list of DataFrames, each with the new column added, and then you concatenate them:
dfs = [df.assign(property=k)  for k in properties]
data = pd.concat(dfs, ignore_index=True)
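
As a sanity check, the list-of-frames approach can be compared against the nested-loop construction from the question. This is a minimal sketch: the input list is shortened to three elements, and the name expected is introduced here purely for illustration.

```python
import pandas as pd

l = list(range(1, 4))      # shortened input, assumed for illustration
properties = [1, 2, 3]

# Base frame, built once from the same formulas as in the question.
df = pd.DataFrame({
    "a": [e + e for e in l],
    "b": [e * e for e in l],
    "c": [e ** e for e in l],
    "d": [1.0 / e for e in l],
})

# One copy of df per property, then a single concat.
data = pd.concat([df.assign(property=k) for k in properties], ignore_index=True)

# Reference result: the nested-loop logic from the question.
rows = [
    {"a": e + e, "b": e * e, "c": e ** e, "d": 1.0 / e, "property": p}
    for p in properties
    for e in l
]
expected = pd.DataFrame(rows)

# Raises AssertionError if the two frames differ.
pd.testing.assert_frame_equal(data, expected)
```

The per-element arithmetic runs only once here, when df is built. The copies are then made by pandas rather than by a Python-level inner loop.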

I'm amazed by your answer. You make it look so simple. Yes, this is exactly what I was looking for. I knew about concat, but I didn't know about assign.

@lmiguelvargasf No problem. assign is a simple way to create a new column. It is the same as df = pd.concat(...); df['property'] = .... I take the properties list and repeat it as many times as needed, according to the length of df.

@lmiguelvargasf, we found another solution for you... Test it and let us know if it works for you.

@liliscent, thanks for mentioning that solution as well. I tried both solutions with %%time, and concat seems to be a bit faster overall than reindex + tile.

Use ignore_index=True in pd.concat and save yourself the reset_index.

@lmiguelvargasf Sure thing.
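
For anyone who wants to repeat the timing comparison outside a notebook, here is a rough sketch using the standard timeit module instead of %%time. The function names with_concat and with_reindex_tile, the frame size, and the repetition count are all assumptions made for illustration. The reindex variant is a guess at what that solution looked like, and results will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": range(1000), "b": range(1000)})  # assumed size
properties = [1, 2, 3]


def with_concat(df, properties):
    # Concatenate len(properties) copies of df, then label each copy.
    out = pd.concat([df] * len(properties), ignore_index=True)
    return out.assign(property=np.repeat(properties, len(df)))


def with_reindex_tile(df, properties):
    # Repeat the row labels with np.tile, then renumber the index.
    out = df.reindex(np.tile(df.index, len(properties))).reset_index(drop=True)
    return out.assign(property=np.repeat(properties, len(df)))


# Both variants should build the same frame before timings are compared.
pd.testing.assert_frame_equal(
    with_concat(df, properties), with_reindex_tile(df, properties)
)

for fn in (with_concat, with_reindex_tile):
    seconds = timeit.timeit(lambda: fn(df, properties), number=100)
    print(f"{fn.__name__}: {seconds / 100 * 1e3:.3f} ms per call")
```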