Python: summarize a dictionary with the top two options


python dictionary


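In the dictionary below, the letter keys are the item names. Inside the [] brackets, the first field is the item category, the second field is the price, and the third field is the number sold.

inventory = {'A':['Toy',3, 1000], 'B':['Toy',8, 1100],
             'C':['Cloth',15, 1200], 'D':['Cloth',9, 1300],
             'E':['Toy',11, 1400], 'F':['Cloth', 18, 1500], 'G':['Appliance', 300, 50]}

I want to pick the two most expensive items (by price) in each category. If a category does not have at least two items, I discard it. So I should get the following result:

inventorySummary = {'B':['Toy',8, 1100], 'E':['Toy',11, 1400],
                    'C':['Cloth',15, 1200], 'F':['Cloth', 18, 1500]}

Can you help me with code to achieve this? I need something I could potentially use not only for the top two items by price, but also for the top three or four. I will eventually use it on a much larger data set, so more generic code would be better. Also, I have a hard time understanding lambda expressions. If you provide code that uses one, please explain how it works so that I can handle changed requirements in the future.

My system only provides the following modules:

bisect, cmath, collections, datetime, functools, heapq, itertools, math, numpy, pandas, pytz, Queue, random, re, scipy, statsmodels, sklearn, talib, time, zipline

To get the top N of any series in the most efficient way, use a heap. You have to create one heap per category.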
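A minimal sketch of the per-category idea, assuming `heapq.nlargest` from the standard library does the selection and that items are ranked on the price field as the question asks. The `rank_index` parameter and the helper variable names are illustrative, not taken from the thread. The lambda is just an unnamed one-line function: it receives one `(code, item)` pair and returns the value to rank on.

```python
import heapq
from collections import defaultdict


def summarize_inventory(inventory, top_n=2, rank_index=1):
    """Keep the top_n entries per category, ranked on item[rank_index].

    Assumes the question's layout [category, price, sold], so the
    default rank_index=1 ranks on price. Categories with fewer than
    top_n entries are dropped. Parameter names are illustrative.
    """
    # Collect the (code, item) pairs of each category in their own list.
    by_category = defaultdict(list)
    for code, item in inventory.items():
        by_category[item[0]].append((code, item))

    summary = {}
    for pairs in by_category.values():
        if len(pairs) < top_n:
            continue  # not enough items in this category
        # key is called once per pair; pair[1] is the [category, price, sold] list.
        best = heapq.nlargest(top_n, pairs, key=lambda pair: pair[1][rank_index])
        summary.update(best)  # best is a list of (code, item) pairs
    return summary


inventory = {'A': ['Toy', 3, 1000], 'B': ['Toy', 8, 1100],
             'C': ['Cloth', 15, 1200], 'D': ['Cloth', 9, 1300],
             'E': ['Toy', 11, 1400], 'F': ['Cloth', 18, 1500],
             'G': ['Appliance', 300, 50]}
top_two = summarize_inventory(inventory)
```

With the question's inventory and the defaults, this keeps B, E, C and F. Ranking on the third field instead (`rank_index=2`) keeps D in place of C.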

The function works for any top N you may want to produce:

>>> summarize_inventory(inventory)
{'B': ['Toy', 8, 1100], 'E': ['Toy', 11, 1400], 'D': ['Cloth', 9, 1300], 'F': ['Cloth', 18, 1500]}
>>> from pprint import pprint
>>> pprint(_)
{'B': ['Toy', 8, 1100],
 'D': ['Cloth', 9, 1300],
 'E': ['Toy', 11, 1400],
 'F': ['Cloth', 18, 1500]}
>>> summarize_inventory(inventory, 3)
{'A': ['Toy', 3, 1000], 'C': ['Cloth', 15, 1200], 'B': ['Toy', 8, 1100], 'E': ['Toy', 11, 1400], 'D': ['Cloth', 9, 1300], 'F': ['Cloth', 18, 1500]}
>>> summarize_inventory(inventory, 1)
{'E': ['Toy', 11, 1400], 'G': ['Appliance', 300, 50], 'F': ['Cloth', 18, 1500]}

You can create the groups using itertools.groupby:

from itertools import groupby

def summarize_inventory(inventory):
    # We use -price so that we are also sorting by descending price
    sort_key = lambda (code, (cat, price, sold)): (cat, -price)
    group_key = lambda (code, (cat, price, sold)): cat

    new_dict = {}
    sorted_inventory = sorted(inventory.iteritems(), key=sort_key)
    for cat, group in groupby(sorted_inventory, key=group_key):
        group = list(group)
        if len(group) > 1:
            for (code, item) in group[:2]:
                new_dict[code] = item
    return new_dict

>>> summarize_inventory(inventory)
{'B': ['Toy', 8, 1100],
 'C': ['Cloth', 15, 1200],
 'E': ['Toy', 11, 1400],
 'F': ['Cloth', 18, 1500]}

I never thought you could do this kind of pattern matching in sort_key and group_key. Very nice! This is an O(N log N) solution: for N elements in the input dictionary, it needs N times log N steps.

Thanks enrico.bacis! But would it be hard to get only the second-highest price, so that you get {'B': ['Toy', 8, 1100], 'C': ['Cloth', 15, 1200]}, while keeping the rest of the logic the same? I think you could change the "for (code, item) in group[:2]:" part to something else.

@UjaeKang: That is very simple: use group[1:2] instead of group[:2].

enrico.bacis beat me to a solution, but in case it helps you, here is my version (I tried to do it in FP style):

import itertools

def summarize_inventory(inventory, top_n=2):
    sort_key = lambda (id, (category, price, sold)): (category, price)
    group_key = lambda (id, (category, price, sold)): category

    # items in inventory grouped by their category
    items_by_category = (
        (category, list(items))
        for category, items in itertools.groupby(
            sorted(inventory.iteritems(), key=sort_key),
            group_key
        )
    )

    # the top_n items from each category if there are >= top_n items
    inventory_summary = dict(itertools.chain.from_iterable(
        items[-1 * top_n:]
        for category, items in items_by_category
        if len(items) >= top_n
    ))

    return inventory_summary
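The lambdas in the groupby-based code unpack a tuple directly in their parameter list, and they iterate with iteritems(); both are Python 2 only. Since the question asks how the lambdas work, here is a hedged sketch of the same sort-then-group approach with the lambdas written out as named functions, which also runs on Python 3. The function names `category_of` and `category_then_price_desc` are illustrative and do not come from the thread. A lambda such as `lambda pair: pair[1][0]` is shorthand for a one-line function that takes `pair` and returns `pair[1][0]`.

```python
from itertools import groupby


def category_of(pair):
    """Grouping key. pair is (code, [category, price, sold])."""
    code, (category, price, sold) = pair
    return category
    # Equivalent lambda: lambda pair: pair[1][0]


def category_then_price_desc(pair):
    """Sort key: by category, then by descending price."""
    code, (category, price, sold) = pair
    return (category, -price)
    # Equivalent lambda: lambda pair: (pair[1][0], -pair[1][1])


def summarize_inventory(inventory, top_n=2):
    # groupby only merges adjacent entries, so sort by category first.
    # Negating the price puts the most expensive item first in each group.
    ordered = sorted(inventory.items(), key=category_then_price_desc)
    summary = {}
    for category, group in groupby(ordered, key=category_of):
        group = list(group)
        if len(group) >= top_n:  # drop categories with too few items
            summary.update(group[:top_n])
    return summary


inventory = {'A': ['Toy', 3, 1000], 'B': ['Toy', 8, 1100],
             'C': ['Cloth', 15, 1200], 'D': ['Cloth', 9, 1300],
             'E': ['Toy', 11, 1400], 'F': ['Cloth', 18, 1500],
             'G': ['Appliance', 300, 50]}
top_two = summarize_inventory(inventory)
top_three = summarize_inventory(inventory, 3)
```

Changing the requirement then means editing one small function: ranking on the number sold, for example, would only need `-sold` in place of `-price` in the sort key.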