Python: given 2 lists of integers, how to find the non-overlapping ranges?

python list pandas numpy

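Given

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

the goal is to iterate through each x_i and find the value of y that is larger than x_i but not larger than x_i+1.

Assume both lists are sorted and all items are unique. The desired output for the given x and y is:

[(5, 8), (30, 35), (58, 60), (72, 73)]

I've tried: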

def per_window(sequence, n=1):
    """
    From http://stackoverflow.com/q/42220614/610569
        >>> list(per_window([1,2,3,4], n=2))
        [(1, 2), (2, 3), (3, 4)]
        >>> list(per_window([1,2,3,4], n=3))
        [(1, 2, 3), (2, 3, 4)]
    """
    start, stop = 0, n
    seq = list(sequence)
    while stop <= len(seq):
        yield tuple(seq[start:stop])
        start += 1
        stop += 1

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

r = []

for xi, xiplus1 in per_window(x, 2):
    for j, yj in enumerate(y):
        if yj > xi and yj < xiplus1:
            r.append((xi, yj))
            break

# For the last x value.
for j, yj in enumerate(y):
    if yj > xiplus1:
        r.append((xiplus1, yj))
        break

But is there a simpler way to achieve the same with numpy, pandas or something else?

You can use numpy.searchsorted with side='right' to find the index of the first value in y that is larger than each value in x, and then extract the elements with that index. A simple version, which assumes that y is sorted and always contains a value larger than any element of x, could be:
import numpy as np

x = np.array([5, 30, 58, 72])
y = np.array([8, 35, 53, 60, 66, 67, 68, 73])

np.column_stack((x, y[np.searchsorted(y, x, side='right')]))
#array([[ 5,  8],
#       [30, 35],
#       [58, 60],
#       [72, 73]])

The call below returns, for each value in x, the index of the first value in y that is larger than it:
np.searchsorted(y, x, side='right')
# array([0, 1, 3, 7])
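The searchsorted version above relies on y containing a value larger than every element of x; if it does not, the returned index equals len(y) and the lookup raises an IndexError. A minimal sketch of one way to guard against that follows. The function name first_greater_pairs is made up for illustration, and dropping the unmatched x values is an assumption about the desired behaviour, not something the question specifies.

```python
import numpy as np

def first_greater_pairs(x, y):
    """Pair each x with the first y strictly greater than it (y must be sorted).

    Hypothetical helper: x values with no greater y are dropped.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    # Index of the first element of y that is strictly greater than each x.
    idx = np.searchsorted(y, x, side='right')
    # An index equal to len(y) means no element of y is greater than that x.
    valid = idx < len(y)
    return list(zip(x[valid].tolist(), y[idx[valid]].tolist()))

pairs = first_greater_pairs([5, 30, 58, 72], [8, 35, 53, 60, 66, 67, 68, 73])
print(pairs)
```

Note that this still does not check the upper bound x_i+1; it only avoids the out-of-range lookup.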

A plain-Python version with explicit loops:

def find(list1, list2):
    final = []
    for i in range(len(list1)):
        is_last = i + 1 == len(list1)
        for value in list2:
            # The last element of list1 has no upper bound to check.
            if list1[i] < value and (is_last or list1[i + 1] > value):
                final.append((list1[i], value))
                break
    return final

You can construct a new list by iterating over x zipped with itself, offset by one index and with the last element of y appended, and then iterating over y, checking the condition on each pass and breaking out of the innermost loop:

# x and y must be plain lists here: + concatenates lists but adds numpy arrays elementwise.
x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

out = []
for x_low, x_high in zip(x, x[1:]+y[-1:]):
    for yy in y:
        if (yy>x_low) and (yy<=x_high):
            out.append((x_low,yy))
            break

out
# returns:
[(5, 8), (30, 35), (58, 60), (72, 73)]

We can use pd.DataFrame with pd.merge_asof on the lists and direction='forward', i.e.:

import pandas as pd

new = pd.merge_asof(pd.DataFrame(x,index=x), pd.DataFrame(y,index=y),on=0,left_index=True,direction='forward')
out = list(zip(new[0],new.index))

Output:
[(5, 8), (30, 35), (58, 60), (72, 73)]

If you do not want exact matches to count as matches, pass allow_exact_matches=False to merge_asof.
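For reference, here is a sketch of the same forward as-of join written with explicit column names instead of on=0 together with left_index=True, since some pandas releases may reject that combination of arguments. The column names "x" and "y" and the variable names are chosen for illustration only. allow_exact_matches=False makes the match strictly greater than x.

```python
import pandas as pd

x = [5, 30, 58, 72]
y = [8, 35, 53, 60, 66, 67, 68, 73]

# Both frames must be sorted on their join keys for merge_asof.
left = pd.DataFrame({"x": x})
right = pd.DataFrame({"y": y})

# direction="forward" picks the first y at or after each x;
# allow_exact_matches=False excludes y == x, leaving y > x.
matched = pd.merge_asof(left, right, left_on="x", right_on="y",
                        direction="forward", allow_exact_matches=False)

# Rows without a match get NaN in "y"; drop them before building tuples.
matched = matched.dropna(subset=["y"])
pairs = list(zip(matched["x"].tolist(), matched["y"].astype(int).tolist()))
print(pairs)
```

Like the searchsorted approach, this does not enforce the upper bound x_i+1.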

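When you have more than one number in y between x[i] and x[i+1], what do you want to happen?
Take the first one.
Actually, nearest may not meet the OP's needs: with y=[8,35,53,57,60,66,67,68,73], where 57 has been inserted, it will not give the expected result.
I see, we can use forward.
If you use forward, it may not work when x=[5,30,58,59,72].
Yes, now I see the drawback.
@Tangfeifan I think that case is what the comments on the question are about, isn't it?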
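The edge case raised in the comments, x=[5,30,58,59,72], comes from the upper bound: the first y greater than 58 is 60, which is larger than the next x value, 59. Below is a minimal standard-library sketch that enforces the bound with bisect. The name bounded_pairs is made up for illustration, "not larger than x_i+1" is read as <=, and skipping an x with no y in its range is an assumption about the desired behaviour.

```python
from bisect import bisect_right

def bounded_pairs(x, y):
    """For each x[i], take the first y with x[i] < y <= x[i+1] (both lists sorted).

    Hypothetical helper: the last x has no upper bound, and an x with
    no y inside its range is skipped.
    """
    pairs = []
    for i, xi in enumerate(x):
        j = bisect_right(y, xi)  # index of the first y strictly greater than xi
        if j == len(y):
            continue  # no y is greater than xi
        is_last = i + 1 == len(x)
        if is_last or y[j] <= x[i + 1]:
            pairs.append((xi, y[j]))
    return pairs

print(bounded_pairs([5, 30, 58, 72], [8, 35, 53, 60, 66, 67, 68, 73]))
print(bounded_pairs([5, 30, 58, 59, 72], [8, 35, 53, 60, 66, 67, 68, 73]))
```

With the second input, 58 produces no pair because 60 is larger than 59, and 59 is paired with 60 instead.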