Python: a Pythonic way to add two Series based on the bin membership of the index


python pandas


Assumptions: the index of `s1` is guaranteed to be monotonically increasing, and `s2` has no index label smaller than the smallest index label of `s1`.

Clarification: the index of `s2` cannot be assumed to be in any particular order.
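For concreteness, the assumptions can be checked on a small pair of Series. This is only a sketch: the values are the example data that appears in the answers in this thread, and the two checks (with the names `s1_is_sorted` and `s2_is_covered`) are illustrative additions, not part of the original question.

```python
import numpy as np
import pandas as pd

# Example data (same values as the example used in the answers)
s1 = pd.Series([100, 1000, 10000], index=[0, 2, 5])
s2 = pd.Series(np.arange(7) * 10)

# Assumption 1: the index of s1 is monotonically increasing
s1_is_sorted = s1.index.is_monotonic_increasing

# Assumption 2: no index label of s2 lies below the smallest label of s1
s2_is_covered = s2.index.min() >= s1.index.min()

print(s1_is_sorted, s2_is_covered)
```

Each label of `s1` then acts as the left edge of a half-open bin: `[0, 2)`, `[2, 5)` and `[5, inf)`.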

Desired result: I want to add the values of `s1` to the values of `s2` as follows (see the comments for an explanation):

>>> result
0      100 # 100 + 0, because index 0 is in [0, 2)
1      110 # 100 + 10, because index 1 is in [0, 2)
2     1020 # 1000 + 20, because index 2 is in [2, 5)
3     1030 # 1000 + 30, because index 3 is in [2, 5)
4     1040 # 1000 + 40, because index 4 is in [2, 5)
5    10050 # 10000 + 50, because index 5 is in [5, inf)
6    10060 # 10000 + 60, because index 6 is in [5, inf)
dtype: int64

Attempt: I created versions of `s1` and `s2` in which the bins are the index:

>>> edges = [*s1.index, np.inf]
>>> s1_binned = pd.Series(s1.values, index=pd.cut(s1.index, bins=edges, right=False))
>>> s2_binned = pd.Series(s2.values, index=pd.cut(s2.index, bins=edges, right=False))
>>> s1_binned
[0.0, 2.0)      100
[2.0, 5.0)     1000
[5.0, inf)    10000
dtype: int64
>>> s2_binned
[0.0, 2.0)     0
[0.0, 2.0)    10
[2.0, 5.0)    20
[2.0, 5.0)    30
[2.0, 5.0)    40
[5.0, inf)    50
[5.0, inf)    60
dtype: int32

Then I index into `s1_binned` with the index of `s2_binned` to get the values to add:

>>> to_add = s1_binned[s2_binned.index]
>>> to_add
[0.0, 2.0)      100
[0.0, 2.0)      100
[2.0, 5.0)     1000
[2.0, 5.0)     1000
[2.0, 5.0)     1000
[5.0, inf)    10000
[5.0, inf)    10000
dtype: int64

Finally, I can add the values of `to_add` to `s2`:

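>>> s2 + to_add.values
0      100
1      110
2     1020
3     1030
4     1040
5    10050
6    10060
dtype: int64

I feel like there is a better solution, but I don't have much experience with "mapping" values to bins.

Your hunch is correct. There is a more Pythonic way: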

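s2+s1.reindex_like(s2).fillna(method='ffill')

Let's break it down: `s1.reindex_like(s2)` returns a Series whose index matches that of `s2`, with `NaN` in the rows for the newly added index labels (i.e. empty rows). `fillna(method='ffill')` then fills those empty rows with the preceding non-null value.

Edit: the OP explained in the comments that the index of `s2` is not assumed to be sorted. For example: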

import pandas as pd
import numpy as np
s1 = pd.Series([100, 1000, 10000], index=[0, 2, 5])
s2 = pd.Series(np.arange(7)*10)
s2 = s2[[2,3,5,1,0,4,6]]

So `s2` is:

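2    20
3    30
5    50
1    10
0     0
4    40
6    60
dtype: int64

Apparently, my method still works:

s2+s1.reindex_like(s2, method='ffill')

2     1020
3     1030
5    10050
1      110
0      100
4     1040
6    10060
dtype: int64

If I understand correctly, this is the desired output.

Use `reindex` with `method='ffill'` to create the new Series, which can then be added to `s2`.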
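Why passing `method='ffill'` to the reindexing call matters for a shuffled `s2` can be seen by comparing it with the two-step "reindex, then forward-fill" variant. The following is a minimal sketch on the example data from this thread; the names `two_step` and `one_call` are illustrative, and `.ffill()` is used in place of `fillna(method='ffill')`, which recent pandas releases deprecate.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([100, 1000, 10000], index=[0, 2, 5])
# s2 with a shuffled index, as in the example above
s2 = pd.Series(np.arange(7) * 10).loc[[2, 3, 5, 1, 0, 4, 6]]

# Two-step variant: reindex first, then forward-fill along the ROW order.
# The fill uses whatever row happens to come before, not the bin of the label.
two_step = s2 + s1.reindex_like(s2).ffill()

# Single call: the fill is resolved against the sorted index of s1,
# so the row order of s2 does not matter.
one_call = s2 + s1.reindex(s2.index, method='ffill')

print(two_step.tolist())
print(one_call.tolist())
```

On this data the single call reproduces the desired values, while the two-step variant gives 10010 for label 1 (it picks up the 10000 from the row above it) instead of 110. With a sorted `s2` that starts at the first label of `s1`, as in the question's example, the two variants agree.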

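Combining @jezrael's answer with mine, you can do it in a single call by using `reindex_like` and passing `method`: `s2+s1.reindex_like(s2, method='ffill')`

Does this require the index of `s2` to be monotonically increasing?

Yes, the documentation says so.

Sorry, the index of `s2` is in no particular order; it only looks sorted in my example.

Apparently it still works even when `s2` is not sorted. Check it out. Also, please edit your question as I suggested so that `s2` has an unsorted index.

Does this require the index of `s2` to be monotonically increasing? Sorry, the index of `s2` is in no particular order; it only looks sorted in my example.

Oops, I tested for a monotonically increasing `s1`.

@actual_panda - hmmm, sorry, the index of `s2` is in no particular order; it only looks sorted in my example.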
Not sure I understand, but I think it is best to sort `s1` and `s2` before processing, mainly for `reindex` and `reindex_like`:

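# Sort both Series by index before reindexing
s1 = s1.sort_index()
s2 = s2.sort_index()

# Forward-fill s1 onto the index of s2, then add the two Series
s = s2.add(s1.reindex(s2.index, method='ffill'))
# Similar solution:
# s = s2.add(s1.reindex_like(s2, method='ffill'))
print(s)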
0      100
1      110
2     1020
3     1030
4     1040
5    10050
6    10060
dtype: int64
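Sorting `s2` changes the row order of the result. If the original order of `s2` matters, one option is to remember the original index and reindex the sum back to it. This is a sketch that assumes the caller wants the original order and that the index of `s2` has no duplicate labels; neither is stated in the answer, and `original_order` is an illustrative name.

```python
import numpy as np
import pandas as pd

s1 = pd.Series([100, 1000, 10000], index=[0, 2, 5])
s2 = pd.Series(np.arange(7) * 10).loc[[2, 3, 5, 1, 0, 4, 6]]

original_order = s2.index  # remember the caller's row order

# Same steps as above: sort, forward-fill s1 onto s2's index, add
s1_sorted = s1.sort_index()
s2_sorted = s2.sort_index()
s_sorted = s2_sorted.add(s1_sorted.reindex(s2_sorted.index, method='ffill'))

# Put the rows back in the order s2 originally had
s = s_sorted.reindex(original_order)
print(s)
```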