Python: looping over standard input while using the previous item

python bash

I want to compare each line of input with the previous line, without storing anything in memory (no dictionary).

Sample data:

a   2
file    1
file    2
file    4
for 1
has 1
is  2
lines   1
small   1
small   2
test    1
test    2
this    1
this    2
two 1

Pseudocode:

for line in sys.stdin:
    word, count = line.split()
    if word == previous_word:
        print(word, count1+count2)

I know I would use enumerate or dict.iteritems on an array, but I can't use them on sys.stdin.

Expected output:

a   2
file    7
for 1
has 1
is  2
lines   1
small   3
test    3
this    3
two 1

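"I want to compare each line of input with the previous line, without storing anything in memory (no dictionary)."

To sum the counts of all preceding lines that share the same word, you need to maintain some state.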
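The state in question can be as small as the previous item. A minimal sketch of that idea follows, assuming a hypothetical helper named with_previous (not something from the question or from the standard library) that wraps any iterable of lines and hands back each line together with the one before it. The demo uses an in-memory list; sys.stdin could be passed in the same way, since it is also an iterable of lines.

```python
def with_previous(items):
    """Yield (previous, current) pairs; previous is None for the first item."""
    previous = None
    for current in items:
        yield previous, current
        previous = current  # the only state kept between iterations


# Demo on an in-memory list (assumed data, taken from the sample above).
lines = ["file 1", "file 2", "for 1"]
pairs = list(with_previous(lines))
for previous, current in pairs:
    print(repr(previous), "->", repr(current))
```

Only one line is remembered at any time, so memory use does not grow with the size of the input.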

This kind of job is usually a good fit for awk. You could consider this command:

awk '{a[$1] += $2} p && p != $1{print p, a[p]; delete a[p]} {p = $1} 
END { print p, a[p] }' file

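Because of the delete, this solution does not hold the whole file in memory. State is kept only while processing lines that share the same first word.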
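For comparison, the same "keep state only for the current run of equal words" idea can be sketched in Python with itertools.groupby. This is a swapped-in technique rather than the approach of the awk command above, and the function name sum_by_word is hypothetical. It assumes, as in the sample data, that lines with the same word are adjacent and that each line holds exactly a word and an integer count.

```python
from itertools import groupby


def sum_by_word(lines):
    """Yield (word, total) for each run of consecutive lines sharing a word."""
    # Split lazily and skip blank lines; nothing is collected into a list.
    rows = (line.split() for line in lines if line.strip())
    for word, group in groupby(rows, key=lambda row: row[0]):
        yield word, sum(int(count) for _, count in group)


# Demo on an in-memory list (assumed data, taken from the sample above).
sample = ["a 2", "file 1", "file 2", "file 4", "for 1"]
totals = list(sum_by_word(sample))
for word, total in totals:
    print(word, total)
```

Passing sys.stdin instead of the list would process the input as a stream, because groupby consumes its input one item at a time and only tracks the current group.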

stdin_data = [
    "a   2",
    "file    1",
    "file    2",
    "file    4",
    "for 1",
    "has 1",
    "is  2",
    "lines   1",
    "small   1",
    "small   2",
    "test    1",
    "test    2",
    "this    1",
    "this    2",
    "two 1",
]  

previous_word = ""
word_ct = 0

for line in stdin_data:
    word, count = line.split()
    if word == previous_word:
        word_ct += int(count)
    else:
        if previous_word != "":
            print(previous_word, word_ct)
        previous_word = word
        word_ct = int(count)

# Print the final word and count
print(previous_word, word_ct)

Output:

a 2
file 7
for 1
has 1
is 2
lines 1
small 3
test 3
this 3
two 1

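import sys

prev_word, prev_count = None, 0
for line in sys.stdin:
    word, count = line.split()
    count = int(count)
    if word == prev_word:
        prev_count += count
    else:
        # A new word starts: print the finished group, then begin the next one.
        if prev_word is not None:
            print(prev_word, prev_count)
        prev_word, prev_count = word, count

# Print the last group, which no following line would otherwise trigger.
if prev_word is not None:
    print(prev_word, prev_count)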
