Python: loop over txt files and delete the first few lines of strings

python numpy text

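I have text files, each containing 90 columns of timeseries data of varying lengths. Before these 90 columns there are 6 lines of junk string data that I want to delete. From line 7 onward, all of the data are floats.

I tried the following, but it did not change my files: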

folder = '/Users/LR/Desktop/S2'
files = os.listdir(folder)
for filename in files:
    lines = open(filename).readlines()
    open(filename, 'w').writelines(lines[6:])

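I also tried loading the file and skipping the first 6 lines, but numpy.loadtxt does not work unless I set dtype to 'str'. That successfully cuts off the first 6 lines, but the data are imported as an ndarray of strings, and I don't know how to convert it to a float array.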

data = np.loadtxt('STS2.txt', delimiter = '\t', skiprows=6, dtype='str')
data = data.astype(float) # this gives the error: ValueError: could not convert string to float:

When I set dtype=float, I get the same ValueError:

data_float = np.loadtxt('STS2.txt', delimiter='\t', dtype=float, skiprows=7) # this gives the error: ValueError: could not convert string to float:

Does anyone know a way to solve this?

You can use pandas to help with this. Use the following code:

import pandas as pd
import numpy as np

df = pd.read_csv('STS1.txt', delimiter='\t', skiprows=[0,1,2], index_col=0)
df = df.T.set_index(np.nan, append=True).T

I was able to load the table this way.

Note that your columns are now hierarchical. You can check the dtypes:

df.dtypes

Output:

1      float64
2      float64
3      float64
4      float64
...

You can also easily convert the data to int:

df = df.fillna(0).astype(int)
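For comparison, here is a minimal pandas sketch for the layout described in the question (6 junk lines, then tab-separated floats). It is not the answerer's exact call: it assumes there is no header row, and `N_COLS` and the in-memory sample text are hypothetical stand-ins for the 90 columns and the real file.

```python
import io

import pandas as pd

N_COLS = 3  # hypothetical stand-in for the 90 data columns

# Hypothetical sample: 6 junk lines, then tab-separated floats with a trailing tab
raw = "junk\n" * 6 + "1.0\t2.0\t3.0\t\n4.0\t5.0\t6.0\t\n"

df = pd.read_csv(
    io.StringIO(raw),
    sep="\t",
    skiprows=6,                   # skip the junk lines
    header=None,                  # assumption: the data has no header row
    usecols=list(range(N_COLS)),  # ignore the empty trailing field
)
values = df.to_numpy()  # plain float ndarray
```

For a real file, the path would replace `io.StringIO(raw)`.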

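The last field of each line is an empty string, so numpy cannot parse it as a float. You are only interested in the first 90 columns, so add usecols=range(90) to the np.loadtxt call.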

(Of course, if you have already cut off the first six lines, you can now drop skiprows=6.)

Edit

Since the first column appears to be just an index, you can use usecols=range(1, 90) to ignore it.
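To illustrate why the trailing field matters, here is a small sketch on a made-up file. The sample contents, `N_COLS` (standing in for 90) and `sample_path` are assumptions for the demo, since the real file was never posted.

```python
import os
import tempfile

import numpy as np

N_COLS = 3  # hypothetical stand-in for the 90 columns

# Made-up file: 6 junk lines, then rows that end with a trailing tab
rows = ["junk line %d" % i for i in range(6)]
rows += ["0\t1.5\t2.5\t", "1\t3.5\t4.5\t"]
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "w") as f:
    f.write("\n".join(rows) + "\n")

# Splitting on tabs leaves an empty last field, which float() cannot parse
last_fields = "0\t1.5\t2.5\t".split("\t")

# Restricting the columns sidesteps the empty field
data = np.loadtxt(sample_path, delimiter="\t", skiprows=6,
                  usecols=range(N_COLS))

# Starting at column 1 also drops the leading index column
no_index = np.loadtxt(sample_path, delimiter="\t", skiprows=6,
                      usecols=range(1, N_COLS))
```

If your files end each row differently, inspecting one line with `repr(line)` should show what the parser actually sees.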

You probably need to use os.path.join(folder, filename).

Where should I add that in my code? Sorry, I'm new to Python.

In general, use open(os.path.join(folder, filename)) and open(os.path.join(folder, filename), 'w').
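Put together, the loop from the question might look like the sketch below. The helper name `strip_header_lines`, the `.txt` filter and the throwaway demo directory are assumptions added for illustration; the key change is joining the folder with each name that os.listdir returns.

```python
import os
import tempfile


def strip_header_lines(folder, n_skip=6, suffix=".txt"):
    """Remove the first n_skip lines from every matching file in folder."""
    changed = []
    for filename in sorted(os.listdir(folder)):
        if not filename.endswith(suffix):
            continue
        # os.listdir returns bare names, so build the full path first
        path = os.path.join(folder, filename)
        with open(path) as f:
            lines = f.readlines()
        with open(path, "w") as f:
            f.writelines(lines[n_skip:])
        changed.append(filename)
    return changed


# Demo on a throwaway directory rather than the real data folder
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.txt")
with open(demo_path, "w") as f:
    f.write("".join("junk %d\n" % i for i in range(6)))
    f.write("1.0\t2.0\n3.0\t4.0\n")

changed_files = strip_header_lines(demo_dir)
with open(demo_path) as f:
    remaining = f.read()
```

Note that this rewrites files in place, so running it twice would remove 12 lines; working on a copy of the folder first is safer.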

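This worked well for deleting the first 6 lines, thank you! But when I try to read in the text file, I still get the same ValueError: could not convert string to float, even though the data are just numbers.

I think you will have to share at least some of the file's contents for anyone to be able to help.

Perfect. Thank you so much. How do I overwrite the old txt file with the cleaned one?

I don't understand what you mean. Which file was cleaned?

@thymeandspace What exactly do you want?

How do I save the new numpy array without the 6 junk lines at the top, or how do I save the extra columns so that they overwrite the old file? I edited my post to show where my problem is. Thanks again for your help.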

np.loadtxt('STS2.txt', skiprows=6, usecols=range(90))
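On the follow-up about overwriting the old file: one possible approach, not confirmed anywhere in the thread, is to write the cleaned array back to the same path with np.savetxt. The sample file, `N_COLS` and the `fmt` string below are assumptions for the demo.

```python
import os
import tempfile

import numpy as np

N_COLS = 3  # hypothetical stand-in for the 90 columns

# Made-up file: 6 junk lines, then float rows with a trailing tab
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("junk\n" * 6 + "1.0\t2.0\t3.0\t\n4.0\t5.0\t6.0\t\n")

cleaned = np.loadtxt(path, delimiter="\t", skiprows=6, usecols=range(N_COLS))

# Overwrite the original file with only the cleaned numeric data
np.savetxt(path, cleaned, delimiter="\t", fmt="%.6f")

# The rewritten file now loads without skiprows or usecols
reloaded = np.loadtxt(path, delimiter="\t")
```

The `fmt` string controls how many digits are written, so pick one that matches the precision of your data, and keep a backup since the overwrite is destructive.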