Python: trying to count the number of times each unique hour occurs

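I am trying to read a file that contains two different time formats and then count how many times each hour occurs in the second time format. This is my first Python script, and after thinking I had made significant progress I am now a bit lost. I get the unique hours in my output file, but no counts, and I can't figure out what is going wrong.

I would really appreciate any help you can offer. Thanks.

Here is a sample of my file: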

KABH, 11:17:00, 04:30:00  
KABH, 11:18:00, 04:31:00  
KABH, 11:19:00, 04:33:00  
KABH, 11:20:00, 05:34:00  
KABH, 11:32:00, 05:46:00  
KABH, 11:33:00, 02:47:00  
KABH, 11:34:00, 02:48:00   
KABH, 11:35:00, 02:49:00

This is the code I am currently running to get my results:

# Python libs
import sys, glob, os, subprocess, calendar, string

# Input file
infile = "test.txt"

# Open file
fin = open(infile,"r")
data = fin.readlines()

# Lists to hold counts
stn = []
cnts = []
UTC = []
NST = []
HRS = []


# Loop over each line and collect the unique hours
for lines in data:

  d = string.split(lines,", " )
  if not d[0] in stn:
    stn.append(d[0])
    UTC.append(d[1])
    NST.append(d[2])

  t = d[2].split(":")
  if not t[0] in HRS:
    HRS.append(t[0])

# Loop over all the unique hours and count the number of occurrences
for h in HRS:
  cnt = 0
  for l in data:
    t2 = string.split(l,":")
    if t2[0] == h:
      cnt = cnt + 1
  cnts.append(cnt)

# Open a file and write the info
fout = open("data.csv","w")
cnt = 0
while cnt < len(HRS):
  fout.write('%02d,%02d\n' % (int(HRS[cnt]),int(cnts[cnt])))
  cnt = cnt + 1
fout.close()

The resulting data.csv contains:

04,00
05,00
02,00

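You can use a dictionary with the hour as the key: the first time a key is encountered, create an empty list for it, and append 1 to that list for every occurrence. At the end, check the length of each key's list.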

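# Python 2 code (print statement, dict.iteritems)
counter_dict = dict()
with open("sample.csv") as inputs:
    for line in inputs:
        column1, time1, time2 = line.split(",")
        # Key on the hour of the second time column; append a 1 per occurrence
        counter_dict.setdefault(time2.split(":")[0].strip(), list()).append(1)

# The length of each list is the number of occurrences of that hour
for key, value in counter_dict.iteritems():
    print "{0},{1}".format(key, len(value))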

The output is:

02,3
04,3
05,2
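On Python 3 the same idea can be sketched with collections.Counter, which keeps the tally directly instead of building lists of 1s. This is only a minimal sketch, assuming the three-column, comma-separated layout of the sample in the question. The function name count_hours and the inline sample list are illustrative and not part of the original answer.

```python
from collections import Counter


def count_hours(lines):
    """Count how often each hour occurs in the third column.

    Assumes every non-blank line looks like 'KABH, 11:17:00, 04:30:00'.
    """
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Split into the three columns and drop surrounding whitespace
        _station, _utc, local = (field.strip() for field in line.split(","))
        # The hour is everything before the first colon of the third column
        counts[local.split(":")[0]] += 1
    return counts


# Sample lines copied from the question (hypothetical in-memory stand-in for the file)
sample = [
    "KABH, 11:17:00, 04:30:00",
    "KABH, 11:18:00, 04:31:00",
    "KABH, 11:19:00, 04:33:00",
    "KABH, 11:20:00, 05:34:00",
    "KABH, 11:32:00, 05:46:00",
    "KABH, 11:33:00, 02:47:00",
    "KABH, 11:34:00, 02:48:00   ",
    "KABH, 11:35:00, 02:49:00",
]

for hour, n in sorted(count_hours(sample).items()):
    print("{0},{1}".format(hour, n))
```

To read from a file instead, an open file object can be passed in place of the list, for example count_hours(open("test.txt")), since the function only iterates over lines.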

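As your code is indented right now, it is wrong. Can you fix it? Otherwise we will waste time commenting on the badly indented parts.

Isn't this because, when you compare t2[0] with h, t2[0] is the hour of the first time column rather than the second? On the first pass through "for l in data:", t2[0] is 'KABH, 11', not '04', right? Print t2 right after you assign it and you will see what I mean.

You are splitting the whole line on a colon, so t2[0] is everything to the left of the first colon. As a result, t2[0]==h always evaluates to false and cnt is never incremented.

@jp You are absolutely right! I can't believe I let that trip me up. For whatever reason I thought I was splitting only the third column, but I am obviously splitting the whole string. Thanks for pointing that out. I knew it would be something silly and simple. Thanks.

Wow! That is so much easier than what I was doing. Thank you very much for this.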
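To make the diagnosis from the comments concrete, here is a small Python 3 sketch that contrasts the two splits on one sample line. The helper name hour_of is illustrative and does not appear in the original thread. It assumes the columns are separated by a comma followed by a space, as in the sample file.

```python
line = "KABH, 11:17:00, 04:30:00"

# What the original counting loop did: split the whole line on ":".
# The first piece is everything left of the first colon.
whole_line_parts = line.split(":")
print(whole_line_parts[0])  # KABH, 11


def hour_of(line):
    """Return the hour of the third column of one 'STN, HH:MM:SS, HH:MM:SS' line."""
    third_column = line.strip().split(", ")[2]
    return third_column.split(":")[0]


# What was intended: isolate the third column first, then split it on ":".
print(hour_of(line))  # 04
```

With this helper, the comparison in the inner loop of the question would become hour_of(l) == h, which can actually match the hours collected in HRS.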