Creating a dict from a string or list in Python


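Background

I want to build a hash table from a given string or list. The hash table uses each element as a key and the number of times it occurs as the value. For example: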

s = 'ababcd'
s = ['a', 'b', 'a', 'b', 'c', 'd']
dict_I_want = {'a':2,'b':2, 'c':1, 'd':1}

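My attempts

I usually use two approaches. One comes from the standard library, but it is not allowed on some coding-practice sites. The other is written from scratch, but I find it too long. Using a dict comprehension, I came up with two more approaches: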

{i:s.count(i) for i in set(s)}
{i:s.count(i) for i in s}
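Both comprehensions build the same dictionary. They differ only in how often `count` is called: once per distinct element when iterating over `set(s)`, once per element (duplicates included) when iterating over `s`. A minimal sketch of that difference follows; `CountingList` is a hypothetical helper introduced only for illustration and is not part of the question.

```python
class CountingList(list):
    """List subclass that records how many times count() is called (illustrative only)."""
    calls = 0

    def count(self, value):
        CountingList.calls += 1
        return super().count(value)


s = CountingList("ababcd")

via_set = {i: s.count(i) for i in set(s)}
calls_via_set = CountingList.calls      # one call per distinct element

CountingList.calls = 0
via_all = {i: s.count(i) for i in s}
calls_via_all = CountingList.calls      # one call per element, duplicates included

print(via_set == via_all)               # the two mappings are equal
print(calls_via_set, calls_via_all)
```

For the sample input this means 4 calls versus 6, and the gap widens as the input contains more repeats.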

Question: I would like to know whether there are other clear or efficient ways to initialize a hash table from a string or list.

Speed comparison of the 4 methods I mentioned

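The best approach is the built-in Counter. Otherwise, you can use defaultdict, which is very similar to your second attempt:

from collections import defaultdict
d = defaultdict(int)  # int() returns 0, so every missing key starts at 0
for letter in s:
    d[letter] += 1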
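For comparison, here is a minimal sketch of the Counter approach recommended above, using the sample string from the question. It relies only on `collections.Counter` from the standard library.

```python
from collections import Counter

s = "ababcd"
counts = Counter(s)          # works the same way for a list such as list(s)

print(dict(counts))          # plain dict, if a Counter instance is not wanted
print(counts["z"])           # a missing key reports 0 instead of raising KeyError
```

Counter is a dict subclass, so it can usually be passed wherever a dict is expected; converting with `dict()` is only needed when the exact type matters.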


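I usually use Counter or defaultdict to build frequency counts.

Surprisingly, the poster's from_set approach outperforms both in most cases.

Observations

from_set (labeled "set" in the plots) has the best overall performance

The various dictionary methods are fastest only for short strings (i.e., length < 100)

The Counter method is fastest only over a narrow range of string lengths

For large strings, from_set is 2.3 times faster than defaultdict and 1.5 times faster than Counter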
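One caveat when reading these observations: `s.count(i)` scans the whole input once per distinct element, so the work done by from_set grows with the number of distinct elements, whereas the dictionary-based loops make a single pass. The benchmark inputs are not shown here, so how much this affects the numbers is unknown. The sketch below is not from the original answer; `scan_steps` is a hypothetical helper that only estimates how many elements from_set inspects.

```python
def from_set(s):
    """Count each distinct element with the built-in count method."""
    return {i: s.count(i) for i in set(s)}


def scan_steps(s):
    """Rough number of elements inspected by from_set: one full scan per distinct element."""
    return len(set(s)) * len(s)


few_distinct = "ab" * 500            # 1000 characters, 2 distinct values
many_distinct = list(range(1000))    # 1000 items, all distinct

print(scan_steps(few_distinct))      # 2 * 1000
print(scan_steps(many_distinct))     # 1000 * 1000
```

With a small alphabet the repeated scans are cheap, which is consistent with from_set doing well on long strings. With mostly unique items a single-pass method is likely the safer choice.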

Algorithms

from collections import Counter
from collections import defaultdict

import random,string,numpy,perfplot

def from_set(s):
    " Use builtin count function for each item in set "
    return {i:s.count(i) for i in set(s)}

def counter(s):
    " Uses counter module "
    return Counter(s)

def normal_dic(s):
    " Update dictionary by checking if item in it or not "
    d = {}
    for i in s:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1

    return d

def setdefault_dic(s):
    " Use setdefault to preset unknown keys "
    d = {}
    for i in s:
        d.setdefault(i, 0)
        d[i] += 1

    return d

def default_dic(s):
    " Used defaultdict from collections module "
    d = defaultdict(int)
    for i in s:
        d[i] += 1
    return d

def try_dic(s):
    " Use try/except to check if item in dictionary "
    d = {}
    for i in s:
        try:
            d[i] += 1
        except KeyError:  # first occurrence of this item
            d[i] = 1

    return d

Test code
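The original test code is not reproduced here. The imports above suggest it used perfplot. As a rough stand-in, the sketch below times a few of the kernels with the standard-library `timeit` module instead. The input generator (random lowercase letters), the chosen sizes, and the `bench` helper are all assumptions made for illustration, not the answer's actual setup.

```python
import random
import string
import timeit
from collections import Counter, defaultdict


def from_set(s):
    return {i: s.count(i) for i in set(s)}


def default_dic(s):
    d = defaultdict(int)
    for i in s:
        d[i] += 1
    return d


def random_string(n):
    # Assumption: inputs are random lowercase ASCII letters.
    return "".join(random.choices(string.ascii_lowercase, k=n))


def bench(kernels, sizes, repeat=3, number=20):
    """Return {size: {kernel name: best seconds per call}}."""
    results = {}
    for n in sizes:
        s = random_string(n)
        row = {}
        for f in kernels:
            times = timeit.repeat(lambda: f(s), repeat=repeat, number=number)
            row[f.__name__] = min(times) / number
        results[n] = row
    return results


kernels = [from_set, Counter, default_dic]
results = bench(kernels, sizes=[16, 256, 4096])
for n, row in results.items():
    print(n, row)
```

Absolute timings depend on the machine and Python version, so only the relative ordering of the kernels at each size is meaningful.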

Charts


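Absolute values

from_set is labeled "set" in the plot. Relative performance is easier to read from the relative plot below than from the absolute plot.

Relative values

from_set is labeled "set" in the plot.

The from_set method is the horizontal line. For larger values, all other methods (including Counter and defaultdict) lie above it (they take more time).

Table

Actual timings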

       n  setdefault     try_dic  defaultdict    counter    from_set
     1.0       799.0       899.0       1299.0     6099.0     1399.0
     2.0      1099.0      1199.0       1599.0     6299.0     1699.0
     4.0      1699.0      1699.0       2199.0     6299.0     2399.0
     8.0      3199.0      3099.0       3499.0     6899.0     3699.0
    16.0      6099.0      5499.0       5899.0     7899.0     5900.0
    32.0     10899.0      9299.0       9899.0     8999.0    10299.0
    64.0     20799.0     15599.0      15999.0    11899.0    15099.0
   128.0     38499.0     25499.0      25899.0    16599.0    21899.0
   256.0     73100.0     44099.0      42700.0    26299.0    30299.0
   512.0    137999.0     77099.0      72699.0    43199.0    46699.0
  1024.0    286599.0    154500.0     144099.0    85700.0    79699.0
  2048.0    549700.0    289999.0     266799.0   157499.0   145699.0
  4096.0   1103899.0    577399.0     535499.0   309399.0   278999.0
  8192.0   2200099.0   1151500.0    1051799.0   606999.0   542499.0
 16384.0   4658199.0   2534399.0    2295300.0  1414199.0  1087799.0
 32768.0   9509200.0   5270200.0    4838000.0  3066600.0  2177200.0
 65536.0  19539500.0  10806300.0    9942100.0  6503299.0  4337599.0
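The speedup figures quoted in the observations can be checked directly against the last row of the table. The short sketch below copies the n = 65536 timings and divides each by the from_set timing. The dictionary name `timings` is just an illustrative choice.

```python
# Timings copied from the n = 65536 row of the table above.
timings = {
    "setdefault": 19539500.0,
    "try_dic": 10806300.0,
    "defaultdict": 9942100.0,
    "counter": 6503299.0,
    "from_set": 4337599.0,
}

baseline = timings["from_set"]
# Ratio > 1 means the method is that many times slower than from_set.
ratios = {name: round(t / baseline, 1) for name, t in timings.items()}
print(ratios)
```

The defaultdict and counter ratios come out at 2.3 and 1.5, matching the figures stated for large strings.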
