Python: storing each iteration of an array after a loop completes

I'm very new to Python. I have searched extensively for a solution to my problem, but I'm stuck.

I generated a series of arrays with the following code:

fh = open(short_seq, 'r')
line_counter = 0
pos = [0]
array = [0.0 for x in range(101)]
for line in fh:
    line_counter += 1.0   
    for i in line:
        score = ord(i) - 33.0
        array[pos] += score
        pos += 1
Printing inside the loop gives me a long series of arrays:

[1,2,3,4.....]
[2,3,4,5,6.....]
[3,4,5,6,7,8.....100]
...
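I want to use NumPy to run stats on each column. The arrays print out in a particular alignment, but once I'm outside the loop I can only access the sum over the whole loop. I tried np.concatenate, but I'm still left with the summed array. If I use NumPy inside the loop, I can only run stats on each column one iteration at a time, not over the whole series. My next idea was to append each iteration to a 2d matrix, but I don't know how to keep the alignment.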

Any help would be greatly appreciated.

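Edit: here is a sample of my data (in a text editor, each of the four strings sits directly beneath the previous one). I'm trying to convert several thousand lines of ASCII to numeric values. Each line has to go into an array 100 characters long, and then I need to run stats on each column.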

CCCFFFHHHHHHHHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIIIfGfgiiIHGHGHGHGHEHHFDFFFFFDDDDDBDDDDDDDEEDD CCCFFFHHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ CCCFFFHHHHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJ< BCCFFFDFHHHHHJJJJJJJJJJJIIJJJI@HGIIIJJJJJIJJIJIIJJJJJJJJJHHHHHHFFFDDDDDDDDDDDDDDDD?BDDDD@CDDDDDBDDDDD

array = [0.0 for x in range(101)]
is a list.
array = np.zeros((101,), float)
is an array of the same size.
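
As a minimal sketch of the practical difference (the names as_list and as_array are illustrative and do not appear in the question): a list is updated one element at a time, while an array supports elementwise arithmetic in a single expression.

```python
import numpy as np

as_list = [0.0 for x in range(101)]   # plain Python list of floats
as_array = np.zeros((101,), float)    # NumPy array of the same length

# A list is updated one element at a time.
as_list[0] += 1.5

# An array can be updated elementwise in one expression.
as_array += 1.5

print(type(as_list).__name__, len(as_list))
print(type(as_array).__name__, as_array.shape)
```

Both hold 101 floats, so either can serve as the running total in the question's loop; the array form only pays off once whole rows or columns are processed at once.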

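The loop header
for line in fh:
gives you one line at a time, as a string. I would expect
for i in line:
to iterate over the characters of that string. Is that really what you want?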

for i in line:
    score = ord(i) - 33.0
    array[pos] += score
    pos += 1
Usually, when people read a text file, they expect the column values to be separated by spaces or commas, for example:

 123, 345, 344, 233
 343, 342, 343, 343
We use line.split(',') to split the string into substrings, and float or int to convert them to numbers, for example:

 data = [float(substring) for substring in line.split(',')]
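Show us some of your data file, or a simplified version; that will make it easier to help. A key question is whether the number of "columns" is consistent across lines.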

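Usually, when we iterate over the lines of a file, we collect each line's values in a list. If the sublists all have the same number of elements, we can convert the result to a 2d array.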

 lines = []
 for line in fh:
     data = [float(i) for i in line.split(',')]
     lines.append(data)
 print(lines)
 # A = np.array(lines) 
===============================

With your sample lines, I can do:

In [258]: with open('stack38175089.txt') as f:
    lines=f.readlines()
   .....:     

In [259]: [len(l) for l in lines]
Out[259]: [102, 102, 102, 102]

In [260]: data=np.array([[ord(i) for i in l.strip()] for l in lines])

In [261]: data.shape
Out[261]: (4, 101)

In [262]: data
Out[262]: 
array([[67, 67, 67, 70, 70, 70, 70, 70, 72, 72, 72, 72, 72, 73, 74, 74, 74,
        74, 74, 74, 73, 74, 74, 74, 74, 74, 74, 74, 74, 73, 74, 74, 74, 73,
        74, 74, 74, 74, 74, 74, 74, 73, 74, 74, 73, 74, 74, 71, 73, 73, 73,
        72, 73, 73, 73, 70, 71, 73, 71, 70, 72, 70, 71, 73, 73, 73, 72, 73,
        72, 72, 71, 69, 72, 72, 70, 68, 70, 70, 70, 70, 70, 68, 68, 68, 68,
        68, 66, 68, 68, 68, 68, 68, 68, 68, 68, 69, 68, 69, 69, 68, 68],
       ...
       [66, 67, 67, 70, 70, 70, 68, 70, 72, 72, 72, 72, 72, 74, 74, 74, 74,
        74, 74, 74, 74, 74, 74, 74, 73, 73, 74, 74, 74, 73, 64, 72, 71, 73,
        73, 73, 74, 74, 74, 74, 74, 73, 74, 74, 73, 74, 73, 73, 74, 74, 74,
        74, 74, 74, 74, 74, 74, 72, 72, 72, 72, 72, 72, 70, 70, 70, 68, 68,
        68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 63, 66, 68,
        68, 68, 68, 64, 67, 68, 68, 68, 68, 68, 66, 68, 68, 68, 68, 68]])
With a 2d array like this, I can easily shift the values (-33) and apply statistical calculations to the rows or columns.
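
A short sketch of that shift and of per-column statistics follows. The two five-character strings are made-up stand-ins for the real file lines, and the variable names are illustrative; the axis argument selects whether a statistic runs down the columns or along the rows.

```python
import numpy as np

# Two short made-up lines standing in for the real file contents.
lines = ["CCCFF\n", "BCCFH\n"]

# One row per line, one column per character position.
data = np.array([[ord(c) for c in line.strip()] for line in lines])

scores = data - 33               # shift every value at once

col_mean = scores.mean(axis=0)   # one value per column (across lines)
col_std = scores.std(axis=0)     # spread of each column
row_mean = scores.mean(axis=1)   # one value per row (per line)

print(scores.shape)
print(col_mean)
```

Because every row is kept, the statistics cover the whole series rather than only a running sum.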


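I could read the lines one at a time and collect the values in a list of lists. But this sample, and I suspect your whole file, is small enough to use readlines.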
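
For a file too large to read in one go, the line-by-line variant might look like the sketch below. The helper read_scores and its offset parameter are hypothetical names introduced for illustration, and io.StringIO stands in for an open file so that the example runs on its own.

```python
import io
import numpy as np

def read_scores(fh, offset=33):
    """Collect one list of shifted character codes per non-empty line."""
    rows = []
    for line in fh:
        line = line.rstrip("\n")
        if line:
            rows.append([ord(c) - offset for c in line])
    return rows

# io.StringIO stands in for an open file (an assumption for this sketch).
fh = io.StringIO("CCCFF\nBCCFH\n")
rows = read_scores(fh)

# Converting to a 2d array only works cleanly if every row has the same length.
lengths = {len(r) for r in rows}
if len(lengths) == 1:
    scores = np.array(rows)
    print(scores.sum(axis=0))   # column sums across all lines
```

Checking the row lengths first makes the alignment problem visible before NumPy is involved.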

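Try numpy.sum(array, axis=0).

Thanks for the reply. The raw data (ASCII characters) is consistent across the lines of the file; however, once I convert the characters and start filling the array in the loop, it is skewed, but only at the start.

I loaded the sample into a 2d array.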