Python: DataFrame arithmetic that ignores column labels
DataFrame arithmetic always aligns on both index labels and column names. If I have two DataFrames with the same number of columns but different column names, it seems I cannot do arithmetic between them:
import numpy as np
import pandas as pd
length = pd.DataFrame(data=np.random.normal(size=[5, 2]), index=range(5), columns=['length1', 'length2'])
length
Out[2]:
length1 length2
0 -0.430872 1.087211
1 -0.788218 -0.440801
2 -0.540136 -1.217191
3 -0.561248 0.305545
4 0.158832 0.075283
height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(1,6),columns=['height1','height2'])
height
Out[3]:
height1 height2
1 -1.105751 1.089808
2 -0.360827 -0.803927
3 0.454469 -0.766144
4 0.476534 -0.855870
5 -0.007049 0.038307
length*height
Out[4]:
height1 height2 length1 length2
0 NaN NaN NaN NaN
1 NaN NaN NaN NaN
2 NaN NaN NaN NaN
3 NaN NaN NaN NaN
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
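This is probably a safety measure to make sure you only operate on the data you intend to. Still, I would like to know whether there is a way to perform an operation between two DataFrames (with the same number of columns) that aligns only on the index axis.
Edit: the original example was oversimplified because both DataFrames had the same index [0, 1, 2, 3, 4]. I shifted the second DataFrame's index by 1 to make it a better example.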
ans=pd.DataFrame(length.values * height.values)
This converts both DataFrames to NumPy arrays and multiplies them element-wise:
0 1
0 0.396724 -0.264562
1 -0.460419 -0.285086
2 0.126083 -0.494675
3 -0.272121 0.305155
4 -0.159292 0.444439
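To make the behavior of the array-based approach easier to check, here is a minimal sketch with small fixed values instead of random data. The frames and their values are made up for illustration, and `to_numpy()` is used in place of `.values`; both return the underlying array. Note that the result gets a fresh default index and integer column labels, so the labels of both inputs are lost.

```python
import pandas as pd

# Illustrative frames with fixed values (not the random data from the question)
length = pd.DataFrame(
    {"length1": [1.0, 2.0, 3.0], "length2": [4.0, 5.0, 6.0]}, index=[0, 1, 2]
)
height = pd.DataFrame(
    {"height1": [10.0, 20.0, 30.0], "height2": [40.0, 50.0, 60.0]}, index=[1, 2, 3]
)

# Multiply the raw arrays: rows and columns are paired purely by position
ans = pd.DataFrame(length.to_numpy() * height.to_numpy())
print(ans)
```

Because pairing is positional, the row labeled 0 in `length` is multiplied with the row labeled 1 in `height`, which may or may not be what you want.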
Building on user3589054's approach, I think this code may work for you:
height.multiply(length.values, axis = 0)
Here is my output:
>>> length = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['length1','length2'])
>>> height = pd.DataFrame(data=np.random.normal(size=[5,2]),index=range(5),columns=['height1','height2'])
>>> length
length1 length2
0 1.000865 -0.758316
1 0.285942 -2.000440
2 -0.399625 0.686547
3 0.809561 1.238211
4 2.216696 -1.347227
>>> height
height1 height2
0 0.505477 -0.299634
1 -0.234154 -2.490459
2 -0.134534 1.063768
3 0.010025 0.435895
4 2.290053 -0.096494
>>> height.multiply(length.values, axis = 0)
height1 height2
0 0.505915 0.227217
1 -0.066954 4.982013
2 0.053763 0.730326
3 0.008116 0.539730
4 5.076352 0.129999
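The following sketch, using made-up fixed values, shows what this call does when the two indexes differ. As far as I can tell, passing a same-shaped 2D array to `multiply` keeps the labels of the left-hand DataFrame (`height` here) and pairs the values by position, so treat the labels in the result with care.

```python
import pandas as pd

# Illustrative frames with fixed values and deliberately different indexes
length = pd.DataFrame(
    {"length1": [1.0, 2.0, 3.0], "length2": [4.0, 5.0, 6.0]}, index=[0, 1, 2]
)
height = pd.DataFrame(
    {"height1": [10.0, 20.0, 30.0], "height2": [40.0, 50.0, 60.0]}, index=[1, 2, 3]
)

# The right-hand side is a plain array, so there are no labels to align on;
# the result keeps height's index and column names
result = height.multiply(length.to_numpy(), axis=0)
print(result)
```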
Note that this also ignores index alignment. I agree with YS-L: I don't want to lose index alignment. My example was oversimplified.
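One possible way to keep index alignment while pairing columns by position is to relabel one DataFrame's columns to match the other before operating. This is a sketch under the assumption that both frames have the same number of columns in the intended order; the fixed values and the name `height_relabeled` are made up for illustration.

```python
import pandas as pd

# Illustrative frames with fixed values; the indexes overlap only on 1 and 2
length = pd.DataFrame(
    {"length1": [1.0, 2.0, 3.0], "length2": [4.0, 5.0, 6.0]}, index=[0, 1, 2]
)
height = pd.DataFrame(
    {"height1": [10.0, 20.0, 30.0], "height2": [40.0, 50.0, 60.0]}, index=[1, 2, 3]
)

# Copy height and give it length's column names, so columns pair up by position
height_relabeled = height.copy()
height_relabeled.columns = length.columns

# Ordinary arithmetic now aligns on the index only; rows present in just one
# frame (labels 0 and 3 here) come out as NaN
result = length * height_relabeled
print(result)
```

Rows that exist in only one of the two frames are NaN in the result; calling `dropna()` on it would keep only the rows whose index labels appear in both.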