Python: I want to find the length of each word in a text file

python dictionary

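I am trying to find the length of each word in my text file. I tried the following code, but it shows how many times each word is used in the file instead.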

text = open(r"C:\Users\israr\Desktop\counter\Bigdata.txt") 
d = dict() 

for line in text: 
    line = line.strip() 
    line = line.lower()
    words = line.split(" ") 

    for word in words: 
        if word in d: 
            d[word] = d[word] + 1
        else: 
            # Add the word to dictionary with count 1 
            d[word] = 1

for key in list(d.keys()): 
    print(key, ":", d[key])

The output looks like this:

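china : 14
result : 1
as : 16
1 : 5
total : 44
phone : 108
world population : 7
first : 2
civilization : 1
year : 26
fertile period : 1
basin : 1
yellow : 1
river : 1
north : 1
ordinary : 1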

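Basically, I want lists of words that have the same length, for example china, first, world : 5, where 5 is the length of all those words, and so on, with words of other lengths in other lists.

If you look at the code that processes each word, you will see your problem: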

for word in words:
    if word in d:
        d[word] = d[word] + 1
    else:
        # Add the word to dictionary with count 1
        d[word] = 1

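Here you check whether the word is already in the dictionary. If it is, you add 1 to its value each time the word is found. If it is not, you initialize its value to 1. This is the core idea behind counting occurrences.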

If you want to compute the length of each word instead, you can simply do:

for word in words:
    if word not in d:
        d[word] = len(word)

To print your dict, you can use:

for k, v in d.items():
    print(k, ":", v)

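If you need the total length contributed by each word separately, you can find it with the following formula:

len(word) * count(word) for every word in words

The equivalent in Python:
d[key] * len(key)

Change the last two lines to the following: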
for key in list(d.keys()):
    print(key, ":", d[key] * len(key))
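
As a rough end-to-end illustration of that formula, the sketch below counts the words with `collections.Counter` and then multiplies each count by the word's length. The sample sentence and the names `counts` and `total_chars` are assumptions for the example, not part of the original code.

```python
from collections import Counter

# Hypothetical sample text; in the question this would come from the file.
sample = "the cat and the dog and the bird"

# Count how often each word occurs (same result as the manual d[word] += 1 loop).
counts = Counter(sample.split())

# Total number of characters contributed by each word: count * length.
total_chars = {word: n * len(word) for word, n in counts.items()}

for word, total in total_chars.items():
    print(word, ":", total)
```

Here "the" occurs three times and has three letters, so it contributes 9 characters in total.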

---- Edit ----
I think this is what you asked for in the comments. The code below gives you groups whose members all have the same length:
    for word in words:
        if len(word) in d:
            if word not in d[len(word)]:
                d[len(word)].append(word)
        else:
            # No list for this length yet: start one with this word
            d[len(word)] = [word]

for key in list(d.keys()):
    print(key, ":", d[key])

Output of this code:
3 : ['the', 'bc,', '(c.', 'who', 'was', '100', 'bc)', 'and', 'xia', 'but', 'not', 'one', 'due', '8th', '221', 'qin', 'shi', 'for', 'his', 'han', '220', '206', 'has', 'war', 'all', 'far']
8 : ['earliest', 'describe', 'writings', 'indicate', 'commonly', 'however,', 'cultural', 'history,', 'regarded', 'external', 'internal', 'culture,', 'troubled', 'imperial', 'selected', 'replaced', 'republic', 'mainland', "people's", 'peoples,', 'multiple', 'kingdoms', 'xinjiang', 'present.', '(carried']
5 : ['known', 'china', 'early', 'shang', 'texts', 'grand', 'ruled', 'river', 'which', 'along', 'these', 'arose', 'years', 'their', 'rule.', 'began', 'first', 'those', 'huang', 'title', 'after', 'until', '1912,', 'tasks', 'elite', 'young', '1949.', 'unity', 'being', 'civil', 'parts', 'other', 'world', 'waves', 'basis']
7 : ['written', 'records', 'history', 'dynasty', 'ancient', 'century', 'mention', 'writing', 'period,', 'xia.[5]', 'valley,', 'chinese', 'various', 'centers', 'yangtze', "world's", 'cradles', 'concept', 'mandate', 'justify', 'central', 'country', 'smaller', 'period.', 'another', 'warring', 'created', 'himself', 'huangdi', 'marking', 'systems', 'enabled', 'emperor', 'control', 'routine', 'handled', 'special', 'through', "china's", 'between', 'periods', 'culture', 'western', 'foreign']
2 : ['of', 'as', 'wu', 'by', 'no', 'is', 'do', 'in', 'to', 'be', 'at', 'or', 'bc', '21', 'ad']
4 : ['date', 'from', '1250', 'bc),', 'king', 'such', 'book', '11th', '(296', 'held', 'both', 'with', 'zhou', 'into', 'much', 'qin,', 'fell', 'soon', '(206', 'ad).', 'that', 'vast', 'were', 'men,', 'last', 'qing', 'then', 'most', 'whom', 'eras', 'have', 'some', 'asia', 'form']
9 : ['1600–1046', 'mentioned', 'documents', 'chapters,', 'historian', '2070–1600', 'existence', 'neolithic', 'millennia', 'thousands', '(1046–256', 'pressures', 'following', 'developed', 'conquered', '"emperor"', 'beginning', 'dynasties', 'directly.', 'centuries', 'carefully', 'difficult', 'political', 'dominated', 'stretched', 'contact),']
6 : ['during', "ding's", '(early', 'bamboo', 'annals', 'before', 'shang,', 'yellow', 'cradle', 'river.', 'shang.', 'oldest', 'heaven', 'weaken', 'states', 'spring', 'autumn', 'became', 'warred', 'times.', 'china.', 'death,', 'peace,', 'failed', 'recent', 'steppe', 'china;', 'tibet,', 'modern']
12 : ['reign,[1][2]', 'twenty-first', 'longer-lived', 'bureaucratic', 'calligraphy,', '(1644–1912),', '(1927–1949).', 'occasionally', 'immigration,']
11 : ['same.[3][4]', 'independent', 'traditional', 'territories', 'well-versed', 'literature,', 'philosophy,', 'assimilated', 'population.', 'warlordism,']
10 : ['historical', 'originated', 'continuous', 'supplanted', 'introduced', 'government', 'eventually', 'splintered', 'literature', 'philosophy', 'oppressive', 'successive', 'alternated', 'influences', 'expansion,']
1 : ['a', '–']
13 : ['civilization.', 'civilizations', 'examinations.', 'statehood—the', 'assimilation,']
17 : ['civilizations,[6]']
16 : ['civilization.[7]']
0 : ['']
14 : ['administrative']
18 : ['scholar-officials.']


Here is the complete version of the code:
text = open("bigdata.txt")
d = dict()

for line in text:
    line = line.strip()
    line = line.lower()
    words = line.split(" ")

    for word in words:
        if len(word) in d:
            if word not in d[len(word)]:
                d[len(word)].append(word)
        else:
            d[len(word)] = [word]

for key in list(d.keys()):
    print(key, ":", d[key])
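
The output shown earlier contains a `0 : ['']` entry, words with punctuation still attached (such as 'bc,'), and keys in the order they were first seen. The empty string appears because `split(" ")` returns `['']` for a blank line and also produces empty strings between consecutive spaces. The variant below is one possible way to tidy this up, not the only one: it assumes you want punctuation stripped from the ends of words and the groups printed in increasing length. `group_by_length` and `sample` are hypothetical names.

```python
import string
from collections import defaultdict


def group_by_length(text):
    """Group the distinct words of `text` by their length."""
    groups = defaultdict(list)
    # split() without arguments never yields empty strings.
    for raw in text.lower().split():
        # Assumption: leading/trailing punctuation is not part of the word.
        word = raw.strip(string.punctuation)
        if word and word not in groups[len(word)]:
            groups[len(word)].append(word)
    return dict(groups)


# Hypothetical sample text standing in for the file contents.
sample = "China, the first; the world.  (early) history"
groups = group_by_length(sample)

# Print the groups from the shortest words to the longest.
for length in sorted(groups):
    print(length, ":", groups[length])
```

For a real file you could pass `open("bigdata.txt").read()` (ideally inside a `with` block) to the function instead of `sample`.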

You can build a list of word lengths and then process it with Python's built-in Counter:
from collections import Counter

with open("mytext.txt", "r") as f:
    words = f.read().split()
    words_lengths = [len(word) for word in words]
    counter = Counter(words_lengths)

The output is something like:
In[1]:counter
Out[1]:Counter({7: 146, 9: 73, 5: 73, 4: 146, 1: 73})

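Here the keys are word lengths and the values are the number of times each length occurs.
You can use it just like a regular dictionary.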
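
To make "like a regular dictionary" concrete, here is a small sketch of common operations on such a counter. The sample string is made up; `Counter`, `most_common` and item lookup are standard parts of `collections.Counter`.

```python
from collections import Counter

# Hypothetical sample text with one 1-letter, two 2-letter and three 3-letter words.
sample = "a bb cc ddd eee fff"
counter = Counter(len(word) for word in sample.split())

# Look up how many words have a given length.
print(counter[3])

# Unlike a plain dict, a missing key gives 0 instead of raising KeyError.
print(counter[10])

# Iterate over the lengths in increasing order.
for length, n in sorted(counter.items()):
    print(length, ":", n)

# The most frequent word length and its count.
print(counter.most_common(1))
```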
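Can you give an example of the output you expect? Have you tried changing d[key] to len(key) when printing the result?

Basically I want lists of words that have the same length, for example china, first, world : 5, where 5 is the length of all those words, and so on, with words of other lengths in other lists.

@IsrarAwan I edited the solution, but do you know what you really want?

@lapestand Yes, I edited my question. Sorry, I am actually new here and this is only my second question, but next time I will ask more clearly. Thanks.

This is what I get in the output: 0 : ['']

I added the output I received. Can you share your text file so that I can try it as well? @IsrarAwan

Yes, this works, but it starts with the words of length 5, whereas it should start from 0. We run the loop, it goes through each word, finds its length, and then saves the word in the list for that length.

Hey, thanks for your help, this works, but now I want to store all words that have the same length in one list....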