Python 3.x: Assigning multiple values to a dictionary key from a file in Python 3
I'm fairly new to Python and haven't been able to find an answer to this question. I'm writing a simple recommendation program, and I need a dictionary in which the cuisine is the key and the restaurant names are the values. In some cases I have to split a string containing several cuisine names and make sure that every other restaurant (value) with the same cuisine is assigned to the same cuisine (key). Here is part of the file:
Georgie Porgie
87%
$$$
Canadian, Pub Food
Queen St. Cafe
82%
$
Malaysian, Thai
Mexican Grill
85%
$$
Mexican
Deep Fried Everything
52%
$
Pub Food
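In this excerpt only the first and the last restaurant share a cuisine, but there are more such cases further down in the file.
Here is my code: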
def new(file):
    file = "/.../Restaurants.txt"
    d = {}
    key = []
    with open(file) as file:
        lines = file.readlines()
    for i in range(len(lines)):
        if i % 5 == 0:
            if "," not in lines[i + 3]:
                d[lines[i + 3].strip()] = [lines[i].strip()]
            else:
                key += (lines[i + 3].strip().split(', '))
                for j in key:
                    if j not in d:
                        d[j] = [lines[i].strip()]
                    else:
                        d[j].append(lines[i].strip())
    return d
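It prints all the keys and values, but it does not assign two values to the same key where it should. Also, with the last "else" statement, the second restaurant gets assigned to the wrong key as a second value. That should not happen. I would appreciate any comments or help.
When there is only one category, you don't check whether the key is already in the dictionary, so an existing list is overwritten. Handle that case the same way you handle multiple categories and it will work.
I don't see why the function takes file as a parameter when it is immediately overwritten.
Also, you should set "key" afresh for each restaurant instead of using += (which appends to the existing "key" list, so cuisines from earlier restaurants carry over).
When you check whether j is in the dictionary, the simplest way is to check whether j is among the keys (d.keys()); writing j in d is equivalent.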
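The points above can be combined into a minimal sketch. It assumes each record spans five lines (name, rating, price, cuisines, then a blank separator), which is what the i % 5 check in the question implies. The function name build_cuisine_index, the record_size parameter, and the in-memory sample are illustrative and not part of the original post.

```python
def build_cuisine_index(lines, record_size=5):
    """Map each cuisine to the list of restaurants that serve it.

    Assumes every record spans `record_size` lines: name, rating, price,
    comma-separated cuisines, and a blank separator line (an assumption
    inferred from the `i % 5` check in the question).
    """
    index = {}
    for start in range(0, len(lines), record_size):
        record = lines[start:start + record_size]
        if len(record) < 4:
            continue  # skip an incomplete trailing record
        name = record[0].strip()
        # Build a fresh list of cuisines for this restaurant only.
        cuisines = [c.strip() for c in record[3].split(",")]
        for cuisine in cuisines:
            # One code path for single and multiple cuisines alike:
            # create the list on first sight, then append.
            index.setdefault(cuisine, []).append(name)
    return index


# In-memory stand-in for the file contents shown in the question.
sample = """Georgie Porgie
87%
$$$
Canadian, Pub Food

Queen St. Cafe
82%
$
Malaysian, Thai

Mexican Grill
85%
$$
Mexican

Deep Fried Everything
52%
$
Pub Food
""".splitlines()

index = build_cuisine_index(sample)
print(index["Pub Food"])  # ['Georgie Porgie', 'Deep Fried Everything']
```

With a real file, the list returned by readlines() could be passed in place of sample.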
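In general, I find that giving dictionary keys descriptive names makes the data easier to work with later. In the example below I return a collection of dictionaries, one per restaurant. I also wrap the value handling in a method called add_value() to make the code more readable.
In my example I use codecs to decode the values. That is not strictly necessary, but depending on the characters you are dealing with it can be useful. I also use itertools to read the file's lines through an iterator. Again, depending on the situation this is not required, but it can help if you are processing very large files.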
import codecs
import copy
import itertools


class RestaurantListParser:
    file_name = "restaurants.txt"
    base_item = {
        "_type": "undefined",
        "_fields": {
            "name": "undefined",
            "nationality": "undefined",
            "rating": "undefined",
            "pricing": "undefined",
        },
    }

    def add_value(self, formatted_item, field_name, field_value):
        # Decode raw bytes (e.g. from a file opened in binary mode) to text.
        if isinstance(field_value, bytes):
            field_value = codecs.decode(field_value, "utf-8")
        if isinstance(field_value, str):
            # Strip whitespace; process the values further as you need.
            formatted_item["_fields"][field_name] = field_value.strip()
        else:
            print('Error parsing field "%s", with value: %s' % (field_name, field_value))

    def generator(self, file_name):
        with open(file_name, encoding="utf-8") as file:
            while True:
                # Read the next 5 lines (one restaurant record) from the file iterator.
                lines = tuple(itertools.islice(file, 5))
                if not lines:
                    break
                # Initialize our dictionary for this item
                formatted_item = copy.deepcopy(self.base_item)
                if "," not in lines[3]:
                    formatted_item["_type"] = lines[3].strip()
                else:
                    # "Nationality, Type": the second part becomes the type.
                    formatted_item["_type"] = lines[3].split(",")[1].strip()
                    self.add_value(formatted_item, "nationality", lines[3].split(",")[0])
                self.add_value(formatted_item, "name", lines[0])
                self.add_value(formatted_item, "rating", lines[1])
                self.add_value(formatted_item, "pricing", lines[2])
                yield formatted_item

    def split_by_type(self):
        d = {}
        for restaurant in self.generator(self.file_name):
            if restaurant["_type"] not in d:
                d[restaurant["_type"]] = [restaurant["_fields"]]
            else:
                d[restaurant["_type"]] += [restaurant["_fields"]]
        return d
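Then, if you run:
p = RestaurantListParser()
print(p.split_by_type())
you should get something like the following (formatted here for readability; key order may differ):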
{
    'Mexican': [{
        'name': 'Mexican Grill',
        'nationality': 'undefined',
        'pricing': '$$',
        'rating': '85%'
    }],
    'Pub Food': [{
        'name': 'Georgie Porgie',
        'nationality': 'Canadian',
        'pricing': '$$$',
        'rating': '87%'
    }, {
        'name': 'Deep Fried Everything',
        'nationality': 'undefined',
        'pricing': '$',
        'rating': '52%'
    }],
    'Thai': [{
        'name': 'Queen St. Cafe',
        'nationality': 'Malaysian',
        'pricing': '$',
        'rating': '82%'
    }]
}
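The chunked reading done with itertools.islice in the generator can be isolated into a small helper. This is only a sketch: read_chunks is a hypothetical name, and io.StringIO stands in for an open text file so the example runs without touching disk.

```python
import io
import itertools


def read_chunks(line_iter, size):
    """Yield tuples of up to `size` consecutive lines from an iterator.

    `line_iter` must be a true iterator (such as an open file); passing a
    list would make islice start from the beginning on every call.
    """
    while True:
        chunk = tuple(itertools.islice(line_iter, size))
        if not chunk:
            return
        yield chunk


# io.StringIO stands in for an open text file in this sketch.
fake_file = io.StringIO("a\nb\nc\nd\ne\n")
chunks = list(read_chunks(fake_file, 2))
print(chunks)  # [('a\n', 'b\n'), ('c\n', 'd\n'), ('e\n',)]
```

Because only one chunk is held in memory at a time, this pattern suits very large files, which is the motivation given above.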
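Your solution is simple, so that is fine. I would like to mention a few ideas that come to mind when I think about this kind of problem. Here is another example that uses defaultdict to simplify things: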
from collections import defaultdict

record_keys = ['name', 'rating', 'price', 'cuisine']


def load(path):
    with open(path) as file:
        data = file.read()
    restaurants = []
    # chop up input on each blank line (2 newlines in a row);
    # strip() first so trailing newlines do not produce an empty record
    for record in data.strip().split("\n\n"):
        fields = record.split("\n")
        # build a dictionary by zipping together the fixed set
        # of field names and the values from this particular record
        restaurant = dict(zip(record_keys, fields))
        # split chops apart the cuisine field on commas, then strip()
        # removes any leading/trailing whitespace on each type of cuisine
        restaurant['cuisine'] = [c.strip() for c in restaurant['cuisine'].split(",")]
        restaurants.append(restaurant)
    return restaurants


def build_index(database, key, value):
    index = defaultdict(set)
    for record in database:
        for v in record.get(key, []):
            # defaultdict creates an empty set the first time a key is seen;
            # after that we simply add to the existing set
            index[v].add(record[value])
    return index


restaurant_db = load('/var/tmp/r')
print(restaurant_db)
by_type = build_index(restaurant_db, 'cuisine', 'name')
print(by_type)
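To show what such an index looks like without needing the input file, here is a small self-contained sketch. The records are written by hand to mirror the sample data in the question, and the name by_cuisine is illustrative. It also shows one caveat of defaultdict: indexing a missing key creates it, so .get() is the safer choice for lookups.

```python
from collections import defaultdict

# Hand-written records mirroring the sample data in the question.
restaurants = [
    {"name": "Georgie Porgie", "cuisine": ["Canadian", "Pub Food"]},
    {"name": "Queen St. Cafe", "cuisine": ["Malaysian", "Thai"]},
    {"name": "Mexican Grill", "cuisine": ["Mexican"]},
    {"name": "Deep Fried Everything", "cuisine": ["Pub Food"]},
]

by_cuisine = defaultdict(set)
for restaurant in restaurants:
    for cuisine in restaurant["cuisine"]:
        # The set is created automatically the first time a cuisine is seen.
        by_cuisine[cuisine].add(restaurant["name"])

# Sets are unordered, so sort before printing for a stable result.
print(sorted(by_cuisine["Pub Food"]))  # ['Deep Fried Everything', 'Georgie Porgie']

# .get() looks up a key without inserting it when it is missing.
missing = by_cuisine.get("Sushi", set())
print(missing, "Sushi" in by_cuisine)  # set() False
```

A set drops duplicate names and does not preserve file order; defaultdict(list) with append would keep the order if that matters.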