Simple clustering of a list of lists in Python
I have the following list, which contains 5 entries:
my_lol = [['a', 1.01], ['x',1.00],['k',1.02],['p',3.00], ['b', 3.09]]
I want to "cluster" the list above, roughly as follows:
1. Sort `my_lol` by the numeric value in each entry, in ascending order.
2. Pick the lowest entry in `my_lol` as the key of the first cluster.
3. Calculate the difference between the value of the current entry and that of the previous one.
4. If the difference is less than the threshold, add the current entry as a member of the current cluster;
otherwise make the current entry the key of the next cluster.
5. Repeat until all entries have been processed.
In the end, I would like to get the following dictionary of lists:
dol = {'x':['x','a','k'], 'p':['p','b']}
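Essentially, the dictionary of lists holds two clusters.
I tried the following, but I got stuck from step 3 onward. What is the right way to do it?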
import operator
import json
from collections import defaultdict
my_lol = [['a', 1.01], ['x',1.00],['k',1.02],['p',3.00], ['b', 3.09]]
my_lol_sorted = sorted(my_lol, key=operator.itemgetter(1))
thres = 0.1
tmp_val = 0
tmp_ids = "-"
dol = defaultdict(list)
for ids, val in my_lol_sorted:
    if tmp_ids != "-":
        diff = abs(tmp_val - val)
        if diff < thres:
            print(tmp_ids)
            dol[tmp_ids].append(tmp_ids)
    tmp_ids = ids
    tmp_val = val
print(json.dumps(dol, indent=4))
Try this:
dol = defaultdict(list)
if len(my_lol) > 0:
    thres = 0.1
    tmp_ids, tmp_val = my_lol_sorted[0]
    for ids, val in my_lol_sorted:
        diff = abs(tmp_val - val)
        if diff > thres:
            tmp_ids = ids
        dol[tmp_ids].append(ids)
        tmp_val = val
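For reuse, the loop above could be wrapped in a function. The following is only a minimal sketch under the same assumptions as the question: the name `cluster_by_gap` and its parameters are hypothetical and do not come from the original post. It sorts by value, then starts a new cluster whenever the gap to the previous value exceeds the threshold.

```python
from collections import defaultdict
from operator import itemgetter


def cluster_by_gap(pairs, thres=0.1):
    """Group [label, value] pairs by gaps between consecutive sorted values.

    Hypothetical helper for illustration; not part of the original post.
    """
    clusters = defaultdict(list)
    key = None        # label of the current cluster's first (lowest) entry
    prev_val = None   # value of the previously visited entry
    for label, val in sorted(pairs, key=itemgetter(1)):
        # Start a new cluster on the first entry or when the gap is too large.
        if key is None or abs(val - prev_val) > thres:
            key = label
        clusters[key].append(label)
        prev_val = val
    return dict(clusters)


my_lol = [['a', 1.01], ['x', 1.00], ['k', 1.02], ['p', 3.00], ['b', 3.09]]
print(cluster_by_gap(my_lol))
# {'x': ['x', 'a', 'k'], 'p': ['p', 'b']}
```

Using `None` as the "no cluster yet" marker avoids the special `"-"` sentinel and also makes an empty input return an empty dictionary.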
import operator
import json
from collections import defaultdict
my_lol = [['a', 1.01], ['x',1.00],['k',1.02],['p',3.00], ['b', 3.09]]
my_lol_sorted = sorted(my_lol, key=operator.itemgetter(1))
thres = 0.1
tmp_val = 0
tmp_ids = "-"
dol = defaultdict(list)
for ids, val in my_lol_sorted:
    if tmp_ids == "-":
        tmp_ids = ids
    else:
        diff = abs(tmp_val - val)
        if diff > thres:
            tmp_ids = ids
    dol[tmp_ids].append(ids)
    tmp_val = val
print(json.dumps(dol, indent=4))
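Note that both versions compare each value with the one immediately before it, as step 3 of the question asks. A long run of closely spaced values therefore ends up in a single cluster even when its first and last values differ by more than `thres`. If every member should instead lie within `thres` of the cluster's key, one possible variant is sketched below. This is a different rule from the one in the question; the function name `cluster_by_key_distance` and the sample data are made up for illustration.

```python
from collections import defaultdict
from operator import itemgetter


def cluster_by_key_distance(pairs, thres=0.1):
    """Group [label, value] pairs so each member is within thres of its cluster key.

    Hypothetical variant for illustration; not part of the original post.
    """
    clusters = defaultdict(list)
    key = None      # label of the current cluster's first (lowest) entry
    key_val = None  # value of that first entry
    for label, val in sorted(pairs, key=itemgetter(1)):
        # Compare against the cluster key's value, not the previous entry.
        if key is None or val - key_val > thres:
            key, key_val = label, val
        clusters[key].append(label)
    return dict(clusters)


# Consecutive gaps are all 0.25, below the threshold of 0.3,
# so the previous-entry rule would put everything in one cluster.
data = [['a', 1.0], ['b', 1.25], ['c', 1.5], ['d', 1.75]]
print(cluster_by_key_distance(data, thres=0.3))
# {'a': ['a', 'b'], 'c': ['c', 'd']}
```

On the data from the question both rules give the same two clusters, so the choice only matters when values form longer chains.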