Replace all elements in a 2D list with data from another list, using unique IDs (Python)
I have only known Python for a few months, so please forgive my ugly code. I have a dataset made up of unique IDs. Consider 3 rows in this format, each with 3 IDs:
    zList = [[1412, 2521, 53522],
             [52632, 1342, 1453],
             [3413, 342, 25232]]
I am trying to replace each ID with some corresponding data (first name, last name, state, and so on). The ideal output would look like this:

    resultList = [[Bob, Smith, Ohio, Jane, Doe, Texas, John, Smith, Alaska],
                  [Jim, Bob, California, Jack, White, Virginia, John, Smith, Nevada],
                  [Hank, Black, Kentucky, Sarah, Hammy, Florida, Joe, Blow, Mississippi]]
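I realize it would be cleaner to add a new dimension to the result, since I am effectively expanding each ID into a new list. I avoided that because I assumed keeping it flat would be easier, and I am scared of iterating over anything in two dimensions! I am open to all options...

The data I am matching against is what you would expect: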
    matchData = [[1412, Bob, Smith, Ohio, lots of additional data],
                 [2521, Jane, Doe, Texas, lots of additional data],
                 [3411, Jim, Black, New York, lots of additional data],
                 [...etc...]]
Here is how I tried to do it:
    import pprint

    resultList = []
    for i, valz in enumerate(zList):
        for j, ele in enumerate(valz):
            check = False
            for k, valm in enumerate(matchData):
                if ele == valm[0]:
                    resultList.append(valm)
                    check = True
                    break
            if check == False:
                print("error during rebuild")
    pprint.pprint(resultList, width=400)
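Now, while this almost works, it is missing two key things that I can't figure out. My code dumps everything into one big list. I must be able to preserve the order and the logical separation of the original dataset. (Remember, the original dataset is 3 rows of 3 IDs.)

I also need to throw an error if no match is found. You can see my attempt in the code above, but it does not work properly. I tried adding the following after my first if statement:

    elif all(ele not in valm[15):
        check = False

But I got this error:

    TypeError: argument of type 'int' is not iterable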
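I think your main problem is how the lists are structured. By the look of it, there are 3 "entries" per row in zList and per row in resultList. I suggest changing zList to a one-dimensional list and putting each entry of resultList in its own list (inside resultList), like this: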
    zList = [1412, 2521, 53522, 52632, 1342, 1453, 3413, 342, 25232]
    resultList = [["Bob", "Smith", "Ohio"], ["Jane", "Doe", "Texas"], ["John", "Smith", "Alaska"],
                  ["Jim", "Bob", "California"], ["Jack", "White", "Virginia"], ["John", "Smith", "Nevada"],
                  ["Hank", "Black", "Kentucky"], ["Sarah", "Hammy", "Florida"], ["Joe", "Blow", "Mississippi"]]
Now you can check that both lists have the same length (9 in this case).
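A minimal sketch of that length check, using the two lists from this answer. The helper name check_same_length is an assumption made for illustration; it is not part of the original answer.

```python
zList = [1412, 2521, 53522, 52632, 1342, 1453, 3413, 342, 25232]
resultList = [["Bob", "Smith", "Ohio"], ["Jane", "Doe", "Texas"], ["John", "Smith", "Alaska"],
              ["Jim", "Bob", "California"], ["Jack", "White", "Virginia"], ["John", "Smith", "Nevada"],
              ["Hank", "Black", "Kentucky"], ["Sarah", "Hammy", "Florida"], ["Joe", "Blow", "Mississippi"]]


def check_same_length(ids, records):
    """Hypothetical helper: raise ValueError when the lists cannot be paired one-to-one."""
    if len(ids) != len(records):
        raise ValueError("{0} IDs but {1} records".format(len(ids), len(records)))
    return len(ids)


print(check_same_length(zList, resultList))  # 9
```

Failing early here avoids an IndexError later, when the lists are walked by index.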
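Here you can use either a dictionary or a list. As a new programmer you may not be familiar with dictionaries yet, so it is worth reading up on them.

Lists:

Just loop over the length of the list, build a new list for each entry, and append that new list to the output list, like so: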
    zList = [...]
    resultList = [[...]]
    matchList = []  # or whatever you want to call it
    for i in range(len(zList)):  # the index is needed; you can also use enumerate
        element_list = []
        element_list.append(zList[i])  # zList[i] is the second item that enumerate would yield
        for j in resultList[i]:  # the index is not needed here, so iterate over the values
            element_list.append(j)
        matchList.append(element_list)

    >>> print(matchList)  # output split across lines here for clarity
    [[1412, 'Bob', 'Smith', 'Ohio'],
     [2521, 'Jane', 'Doe', 'Texas'],
     [53522, 'John', 'Smith', 'Alaska'],
     [52632, 'Jim', 'Bob', 'California'],
     [1342, 'Jack', 'White', 'Virginia'],
     [1453, 'John', 'Smith', 'Nevada'],
     [3413, 'Hank', 'Black', 'Kentucky'],
     [342, 'Sarah', 'Hammy', 'Florida'],
     [25232, 'Joe', 'Blow', 'Mississippi']]
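The same pairing can probably be written more compactly with the built-in zip, which walks both lists in step so no index is needed. This is a sketch on a shortened copy of the data, not the answerer's own code.

```python
zList = [1412, 2521, 53522]
resultList = [["Bob", "Smith", "Ohio"], ["Jane", "Doe", "Texas"], ["John", "Smith", "Alaska"]]

# zip pairs the n-th ID with the n-th record; each output row is the ID followed by its fields.
matchList = [[identifier] + fields for identifier, fields in zip(zList, resultList)]

for row in matchList:
    print(row)
```

Note that zip stops silently at the shorter list, so this assumes both lists have the same length.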
To add more data, just grow the lists inside resultList. For example, to add a job:

    resultList = [["Bob", "Smith", "Ohio", "Tech Support"], ...
Dictionary:

I think this is the simpler approach. Just create a dict, using the elements of zList as keys and the corresponding elements of resultList as values, like so:
    matchDict = {}
    for n in range(len(zList)):  # need the index, remember?
        matchDict[zList[n]] = resultList[n]

    >>> print(matchDict)
    {1412: ['Bob', 'Smith', 'Ohio'],
     1453: ['John', 'Smith', 'Nevada'],
     25232: ['Joe', 'Blow', 'Mississippi'],
     53522: ['John', 'Smith', 'Alaska'],
     3413: ['Hank', 'Black', 'Kentucky'],
     342: ['Sarah', 'Hammy', 'Florida'],
     52632: ['Jim', 'Bob', 'California'],
     2521: ['Jane', 'Doe', 'Texas'],
     1342: ['Jack', 'White', 'Virginia']}
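To connect the dictionary back to the question's nested rows, one possible sketch is shown below. It keeps one output row per input row and raises as soon as an ID has no match. The function name expand_rows and the shortened sample data are assumptions made for illustration.

```python
matchDict = {
    1412: ["Bob", "Smith", "Ohio"],
    2521: ["Jane", "Doe", "Texas"],
    53522: ["John", "Smith", "Alaska"],
}
nested_ids = [[1412, 2521], [53522, 1412]]  # hypothetical rows, shortened from the question


def expand_rows(id_rows, lookup):
    """Replace every ID with its fields, keeping one flat output row per input row."""
    result = []
    for row in id_rows:
        expanded = []
        for identifier in row:
            if identifier not in lookup:
                raise KeyError("no match for ID {0}".format(identifier))
            expanded.extend(lookup[identifier])
        result.append(expanded)
    return result


print(expand_rows(nested_ids, matchDict))
```

Using expanded.append instead of expanded.extend would give the extra dimension the question mentions: one small list per person inside each row.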
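Note that you can look up elements in the dictionary by key, so print(matchDict[1412]) -> ['Bob', 'Smith', 'Ohio']. Likewise, you can add more information to resultList, as shown above.

For cleaner code, you should consider using classes to encapsulate the data. Let's see: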
    class Person(object):
        def __init__(self, identifier, firstname, name, state):
            self.id = identifier
            self.firstname = firstname
            self.name = name
            self.state = state

        def __repr__(self):
            return "<{0} {1} (id : {2}) living in {3}>".format(self.firstname, self.name, self.id, self.state)

        def as_list(self):
            return [self.firstname, self.name, self.state]


    class PersonList(list):
        def __init__(self, *args, **kwargs):
            list.__init__(self, *args, **kwargs)

        def getById(self, identifier):
            """Return the person in this list whose id equals the requested identifier."""
            # filter(function, iterable) keeps only the elements for which the function returns true.
            # The lambda is an inline function: for any given object x, it returns x.id == identifier.
            # So the list is filtered down to the elements whose id attribute equals identifier.
            # See https://docs.python.org/3.4/library/functions.html#filter
            tmp = list(filter(lambda x: x.id == identifier, self))
            if len(tmp) == 0:
                raise Exception('Searched for a Person whose id is {0}, but no one have this id.'.format(identifier))
            elif len(tmp) > 1:
                raise Exception('Searched for a Person whose id is {0}, and many people seem to share this id. id are supposed to be unique.'.format(identifier))
            return tmp[0]


    ## CONSTANTS ##
    # id list - modified so that we do not have to instantiate 9 Person objects
    ids = [[1412, 2521, 3411],  # bob, jane, jim
           [3411, 1412, 1412],  # jim, bob, bob
           [3411, 2521, 2521]]  # jim, jane, jane

    # person list
    index = PersonList([Person(1412, 'Bob', 'Smith', 'Ohio'),
                        Person(2521, 'Jane', 'Doe', 'Texas'),
                        Person(3411, 'Jim', 'Black', 'New York')])


    def computeResult(id_list, personList):
        # Look up the Person for every id, keeping the row structure of id_list.
        people = [[personList.getById(identifier) for identifier in subList] for subList in id_list]
        resultList = []
        for sublist in people:
            tmp = []
            for person in sublist:
                tmp += person.as_list()
            resultList.append(tmp)
        return resultList


    if __name__ == "__main__":
        print(computeResult(ids, index))
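Or use a plain dictionary, index = {1412: ['Bob', 'Doe', ...], 2500: [...], ...}. That would be more practical, because you could drop the PersonList class and just use index.get(1412) to get the data for an id.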
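As a rough sketch of that dictionary index, the records below are shaped like the question's matchData (ID first, then the fields). The sample values and the choice to key on record[0] are assumptions made for illustration.

```python
# Records shaped like the question's matchData: the ID comes first (sample values only).
matchData = [
    [1412, "Bob", "Smith", "Ohio"],
    [2521, "Jane", "Doe", "Texas"],
    [3411, "Jim", "Black", "New York"],
]

# Key each record by its ID and keep the remaining fields as the value.
index = {record[0]: record[1:] for record in matchData}

print(index.get(1412))  # ['Bob', 'Smith', 'Ohio']
print(index.get(666))   # None, because dict.get does not raise for a missing key
```

Because dict.get returns None for an unknown ID, the caller still has to decide whether that is an error. index[666] would raise KeyError instead.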
Edit: added a trace example, as requested.

This script is saved in a file named "sof.py".

    python3
    >>> import sof
    >>> sof.index
    [<Bob Smith (id : 1412) living in Ohio>, <Jane Doe (id : 2521) living in Texas>, <Jim Black (id : 3411) living in New York>]
    >>> sof.index.getById(666)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/vaisse/Bureau/sof.py", line 25, in getById
        raise Exception('Searched for a Person whose id is {0}, but no one have this id.'.format(identifier))
    Exception: Searched for a Person whose id is 666, but no one have this id.

As you can see, everything stops when an error occurs. If that is not the behavior you want, you can instead return a None value and keep track somewhere of what failed, rather than raising an exception, and then carry on processing the data. You should consider whether you want your application to keep running even in case of an error. If not, simply raising an exception is enough.

Comments:

Why not use a dictionary to match usernames to IDs?

This is helpful, but it is very important that I keep the logical separation of the original zList. For example: (bob, jane and john in the first group), (jim, jack and john in the second group). I am not sure whether your answer takes that into account? I will take a closer look.

I like the ability to raise an exception on error. The original list can have hundreds of "groups of 3", so knowing whether there was an error while matching is important. Curious whether you could post sample output?
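The lenient alternative mentioned in the class-based answer (return None, record what failed, keep going) might look like the sketch below. The function name compute_result_lenient, the plain-dict index, and the deliberately unknown ID 666 in the sample rows are assumptions made for illustration.

```python
index = {
    1412: ["Bob", "Smith", "Ohio"],
    2521: ["Jane", "Doe", "Texas"],
    3411: ["Jim", "Black", "New York"],
}
ids = [[1412, 2521, 666],    # 666 is deliberately unknown
       [3411, 1412, 1412]]


def compute_result_lenient(id_rows, lookup):
    """Expand each row of IDs; record unknown IDs instead of raising."""
    result, missing = [], []
    for row_number, row in enumerate(id_rows):
        flat = []
        for identifier in row:
            record = lookup.get(identifier)
            if record is None:
                missing.append((row_number, identifier))  # remember where the lookup failed
                flat.append(None)                         # placeholder keeps the gap visible
            else:
                flat.extend(record)
        result.append(flat)
    return result, missing


rows, missing = compute_result_lenient(ids, index)
print(rows)
print(missing)  # [(0, 666)]
```

With hundreds of groups, the missing list gives one report of every failed match after a single pass, while the row structure of the input is preserved.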