Python pandas: select all rows between two values when a string matches

python pandas

I have two DataFrames:

import pandas as pd
import numpy as np
d = {'fruit': ['apple', 'pear', 'peach'] * 5, 'values': np.random.randint(0,1000,15)}
df = pd.DataFrame(data=d)

d2 = {'fruit': ['apple', 'pear', 'peach'] * 2, 'min': [43, 196, 143, 174, 510, 450], 'max': [120, 310, 311, 563, 549, 582]}
df2 = pd.DataFrame(data=d2)

I want to select all rows in df whose fruit matches a row in df2 and whose values fall between that row's min and max.
I'm trying something like:
df.loc[df['fruit'].isin(df2['fruit'])].loc[df['values'].between(df2['min'], df2['max'])]

But predictably, this raises ValueError: Can only compare identically-labeled Series objects
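The error comes from the comparison inside between: df['values'] has 15 rows while df2['min'] and df2['max'] have 6, and pandas refuses to compare Series whose indexes differ. A minimal sketch reproducing this, assuming fixed values in place of the random ones (the name error_message is only for illustration):

```python
import pandas as pd

# Fixed values stand in for np.random.randint so the run is repeatable.
df = pd.DataFrame({'fruit': ['apple', 'pear', 'peach'] * 5,
                   'values': list(range(0, 750, 50))})
df2 = pd.DataFrame({'fruit': ['apple', 'pear', 'peach'] * 2,
                    'min': [43, 196, 143, 174, 510, 450],
                    'max': [120, 310, 311, 563, 549, 582]})

try:
    # 15-row Series compared against 6-row Series: the indexes do not match.
    df['values'].between(df2['min'], df2['max'])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

So the bounds have to be brought onto the same rows as the values before they can be compared, which is what a merge on fruit does.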
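Edit: you'll notice that fruit is repeated in df2. This is intentional. I still want the rows between min and max as above, but I don't want to collapse each fruit and take the rows between its absolute min and max.
For example, for rows of df where fruit == "apple", I want all rows whose values fall in either 43-120 or 174-563.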
df3 = df.merge(df2, on='fruit', how='inner') # Thanks to Henry Ecker for suggesting the inner join
df3 = df3.loc[(df3['min'] < df3['values']) & (df3['max'] > df3['values'])]
df3

Output:
    fruit   values  min max
3   apple   883     467 947
6   apple   805     467 947
9   apple   932     467 947
11  peach   331     307 618
12  apple   665     467 947

If we don't want the min and max columns in the output:
df3 = df3.drop(columns=['min', 'max'])
df3

    fruit   values
3   apple   883
6   apple   805
9   apple   932
11  peach   331
12  apple   665
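Because the question uses random values, the output above cannot be reproduced exactly. The following is a sketch of the same merge-then-filter idea on fixed, made-up values, with two assumptions that are not in the answer: the original row labels of df are carried through the merge with reset_index, and a row that falls inside more than one range is kept only once. The helper rows_of_df is hypothetical. It also shows the inclusive variant via Series.between, which includes both bounds by default, next to the strict comparison used in the answer.

```python
import pandas as pd

# Made-up fixed values (an assumption) replacing the random ones.
df = pd.DataFrame({
    'fruit': ['apple', 'pear', 'peach'] * 5,
    'values': [50, 200, 150, 130, 520, 400, 300, 100,
               500, 600, 250, 320, 43, 549, 582],
})
df2 = pd.DataFrame({
    'fruit': ['apple', 'pear', 'peach'] * 2,
    'min': [43, 196, 143, 174, 510, 450],
    'max': [120, 310, 311, 563, 549, 582],
})

# reset_index() stores the original row labels in an 'index' column,
# so they survive the merge (merge itself renumbers the rows).
merged = df.reset_index().merge(df2, on='fruit', how='inner')

# Strict comparison, as in the answer: values equal to a bound are excluded.
strict_mask = (merged['min'] < merged['values']) & (merged['values'] < merged['max'])
# Inclusive comparison: between() includes both bounds by default.
inclusive_mask = merged['values'].between(merged['min'], merged['max'])


def rows_of_df(mask):
    """Return the rows of df matched by mask, each at most once."""
    labels = merged.loc[mask, 'index'].drop_duplicates().sort_values()
    return df.loc[labels]


strict_result = rows_of_df(strict_mask)
inclusive_result = rows_of_df(inclusive_mask)
print(strict_result)
print(inclusive_result)
```

With these values the last three rows (43, 549, 582) sit exactly on a bound, so they appear only in inclusive_result. Which variant is wanted depends on whether "between" in the question is meant to include the endpoints.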

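The merged table ends up with NaN where df has a value but df2 does not. Maybe use an inner merge, since we are looking for values present in both frames?
Merge returns duplicates where duplicates exist, so the answer below should work. If not, please post a DataFrame without random values along with the expected output, so that it is clear.
Good point, I overlooked that.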
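To illustrate the two points raised in these comments, here is a small sketch on made-up data; the fruit 'plum' is a hypothetical value with no entry in df2. It compares a left merge with an inner merge and shows how repeated fruit rows in df2 multiply the merged rows.

```python
import pandas as pd

# 'plum' is a hypothetical fruit that has no range in df2.
df = pd.DataFrame({'fruit': ['apple', 'plum', 'apple'],
                   'values': [50, 60, 300]})
df2 = pd.DataFrame({'fruit': ['apple', 'apple'],
                    'min': [43, 174],
                    'max': [120, 563]})

left = df.merge(df2, on='fruit', how='left')
inner = df.merge(df2, on='fruit', how='inner')

# Each apple row is paired with both apple ranges; plum is kept only
# by the left merge, with NaN in min and max.
print(len(left), len(inner))
print(left['min'].isna().sum())


def keep_in_range(frame):
    # Comparisons against NaN are False, so unmatched rows drop out here.
    mask = (frame['min'] < frame['values']) & (frame['values'] < frame['max'])
    return frame.loc[mask, 'values'].tolist()


left_values = keep_in_range(left)
inner_values = keep_in_range(inner)
print(left_values, inner_values)
```

In this sketch both merges give the same filtered rows, because comparisons with NaN evaluate to False. The inner merge simply avoids carrying the unmatched rows around.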