Assigning values to words in a string in Python (tags: python, string, numbers, compression)

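Hi, I have a compression task in Python and need to develop code so that, if the input is

'hello its me, hello can you hear me, hello are you listening'

then the output should be

1,2,3,1,4,5,6,3,1,7,5,8

Basically, each word is assigned a numeric value, and if a word is repeated, its number is repeated as well.
The code has to be written in Python. Please help, thanks.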

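A simple approach is to use a dict: when you find a new word, add a key/value pair using an incrementing variable; when you have seen the word before, just yield the value already stored in the dict: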

s = 'hello its me, hello can you hear me, hello are you listening'


def cyc(s):
    # start numbering at 1
    i = 1
    # split into words on whitespace
    it = s.split()
    # nothing to yield for an empty string
    if not it:
        return
    # create first key/value pair
    seen = {it[0]: i}
    # yield 1 for the first word
    yield i
    # for every word after the first
    for word in it[1:]:
        # if we have seen this word already, use its value from our dict
        if word in seen:
            yield seen[word]
        # else first time seeing it, so increment the count
        # and create a new k/v pairing
        else:
            i += 1
            yield i
            seen[word] = i


print(list(cyc(s)))
Output:

[1, 2, 3, 1, 4, 5, 6, 3, 1, 7, 5, 8]
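You can also avoid the slice by using iter and calling next to pop the first word. And if you want a word with trailing punctuation to count as the same word (for example, foo, and foo), you need to strip the punctuation from the words, which can be done with str.rstrip.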
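The code that originally followed this remark is not available here, so what follows is only a minimal sketch of the idea, not the answerer's own version. The function name cyc_stripped and the use of string.punctuation as the set of characters to strip are assumptions made for illustration.

```python
import string


def cyc_stripped(s):
    # Strip trailing punctuation so that 'me,' and 'me' count as the same word
    # (string.punctuation as the strip set is an assumption of this sketch).
    words = (w.rstrip(string.punctuation) for w in s.split())
    # Skip tokens that consisted of punctuation only.
    it = iter(w for w in words if w)
    # Pop the first word with next(); the default avoids StopIteration on empty input.
    first = next(it, None)
    if first is None:
        return
    i = 1
    seen = {first: i}
    yield i
    # The iterator has already moved past the first word, so no slice is needed.
    for word in it:
        if word in seen:
            yield seen[word]
        else:
            i += 1
            seen[word] = i
            yield i


s = 'hello its me, hello can you hear me, hello are you listening'
print(list(cyc_stripped(s)))
```

For the sample string this gives the same numbers as before, because both occurrences of me carry the same trailing comma; the difference only shows up for input such as 'foo, foo', where both tokens now map to 1.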


How about building a dict with an item:index mapping?

>>> s
'hello its me, hello can you hear me, hello are you listening'
>>> 
>>> l = s.split()
>>> d = {}
>>> i = 1
>>> for x in l:
        if x not in d:
            d[x]=i
            i += 1


>>> d
{'its': 2, 'listening': 8, 'hear': 6, 'hello': 1, 'are': 7, 'you': 5, 'me,': 3, 'can': 4}
>>> for x in l:
        print(x, d[x])


hello 1
its 2
me, 3
hello 1
can 4
you 5
hear 6
me, 3
hello 1
are 7
you 5
listening 8
>>> 
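The interactive session above can also be wrapped in a small function. This is just one possible sketch: the name word_codes is made up for illustration, and dict.setdefault stands in for the explicit "if x not in d" check.

```python
def word_codes(s):
    codes = {}
    result = []
    for word in s.split():
        # setdefault stores len(codes) + 1 only when the word is new
        # and returns the stored number either way.
        result.append(codes.setdefault(word, len(codes) + 1))
    return result


print(word_codes('hello its me, hello can you hear me, hello are you listening'))
```

Using len(codes) + 1 as the next number removes the need for a separate counter, since the dict grows by exactly one entry per new word.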
If you don't want any punctuation in the split list, you can do something like this:

>>> import re
>>> l = re.split(r'(?:,|\s)\s*', s)
>>> l
['hello', 'its', 'me', 'hello', 'can', 'you', 'hear', 'me', 'hello', 'are', 'you', 'listening']
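Putting the two pieces together, the regex split and the word-to-number mapping can produce the comma-separated output the question asks for. This is a sketch under assumptions: the function name compress is hypothetical, and empty tokens (which re.split can return when the string ends with a comma or space) are simply dropped.

```python
import re


def compress(s):
    # Split on a comma or whitespace character plus any following whitespace,
    # then drop the empty tokens re.split may leave at the edges.
    words = [w for w in re.split(r'(?:,|\s)\s*', s) if w]
    codes = {}
    numbers = []
    for w in words:
        # Give each new word the next unused number.
        if w not in codes:
            codes[w] = len(codes) + 1
        numbers.append(codes[w])
    # Join the numbers into the comma-separated form used in the question.
    return ','.join(map(str, numbers))


print(compress('hello its me, hello can you hear me, hello are you listening'))
# prints 1,2,3,1,4,5,6,3,1,7,5,8
```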

Have you tried anything? StackOverflow is not a code-writing service.