How to skip certain rows in a large number of CSV files with Python & csv?


python pandas csv


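I have many CSV files in one folder. I want to first skip a certain row (for example, row 10) and then keep one row out of every five.
I can do the first step, but I don't know how to do the second.

Thanks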

import pandas as pd
import csv, os


# Loop through every file in the current working directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Now let's read the dataframe
    # total row number
    total_line = len(open('path' + csvFilename).readlines())
    # put the first and last to a list
    line_list = [total_line] + [1]
    df = pd.read_csv('path' + csvFilename, skiprows=line_list)
    new_file_name = csvFilename

    # And output
    df.to_csv('path' + new_file_name, index=False)

The correct code is as follows:

import numpy as np
import pandas as pd
import csv, os

# Loop through every file in the current working directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Now let's read the dataframe
    total_line = len(open('path' + csvFilename).readlines())
    skip = np.arange(total_line)
    # skip 5 rows
    skip = np.delete(skip, np.arange(0, total_line, 5))
    # skip the certain row you would like, e.g. 10
    skip = np.append(skip, 10)
    df = pd.read_csv('path' + csvFilename, skiprows=skip)

    new_file_name = '2' + csvFilename
    # And output
    df.to_csv('path' + new_file_name, index=False)
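To make it easier to see what the skip array above ends up containing, here is a minimal sketch that runs the same steps on a small in-memory CSV instead of a folder. The helper name build_skip and the 20-row sample are hypothetical and only serve the illustration; the array is converted to a plain list before it is handed to read_csv, which is a cautious choice rather than a requirement stated in the thread.

```python
import io

import numpy as np
import pandas as pd


def build_skip(total_line, step=5, extra=10):
    """Return the 0-based line numbers to skip.

    Every line is skipped except lines 0, step, 2*step, ...;
    the line `extra` is then skipped as well.
    (Hypothetical helper, mirrors the three numpy steps above.)
    """
    skip = np.arange(total_line)
    # Remove the lines we want to KEEP from the skip array.
    skip = np.delete(skip, np.arange(0, total_line, step))
    # Additionally skip one specific line.
    skip = np.append(skip, extra)
    return skip.tolist()


# Hypothetical sample: a header line plus data rows 1..20,
# so the value on line n is simply n.
text = "value\n" + "\n".join(str(i) for i in range(1, 21)) + "\n"
total_line = len(text.splitlines())

skip = build_skip(total_line)
df = pd.read_csv(io.StringIO(text), skiprows=skip)

# Lines 0 (header), 5, 15 and 20 survive; line 10 is dropped explicitly.
kept_values = df["value"].tolist()
print(kept_values)
```

Because line 0 is never skipped, the header row is preserved; if the files had no header, header=None would have to be passed to read_csv.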

You can pass a function to skiprows.

I have edited your code below:

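import numpy as np
import pandas as pd
import csv, os

# Loop through every file in the current working directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Now let's read the dataframe
    total_line = len(open('path' + csvFilename).readlines())
    df = pd.read_csv('path' + csvFilename, skiprows=lambda x: x in list(range(total_line))[1:-1:5])
    new_file_name = csvFilename
    # And output
    df.to_csv('path' + new_file_name, index=False)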
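A callable passed to skiprows can also express the original goal directly (drop line 10, then keep every fifth line) without counting the lines of each file first. The sketch below is one possible way to do that, not the answerer's code: the names should_skip and thin_csv_folder, the "2" output prefix borrowed from the question, and the temporary sample file are all assumptions made for the illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd


def should_skip(line_no, step=5, skipped=10):
    """Return True for lines to drop: the single line `skipped`, and
    every line that is not a multiple of `step` (line 0 is the header)."""
    return line_no == skipped or line_no % step != 0


def thin_csv_folder(folder, prefix="2"):
    """Write a thinned copy of every CSV in `folder` as prefix + name."""
    folder = Path(folder)
    written = []
    # sorted() materialises the listing first, so the files created
    # below are not picked up again by the same loop.
    for src in sorted(folder.glob("*.csv")):
        df = pd.read_csv(src, skiprows=should_skip)
        dst = src.with_name(prefix + src.name)
        df.to_csv(dst, index=False)
        written.append(dst)
    return written


# Demo on a throwaway folder holding one hypothetical file:
# a header line plus data rows 1..20.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text("value\n" + "\n".join(str(i) for i in range(1, 21)) + "\n")

    outputs = thin_csv_folder(tmp)
    output_names = [p.name for p in outputs]
    kept_values = pd.read_csv(outputs[0])["value"].tolist()

print(output_names, kept_values)
```

Writing to a new file name rather than overwriting the source keeps the originals intact, which makes it safe to rerun with a different step while experimenting.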

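Does this answer your question? If you want to add something, you can answer the question, or if you have a solution, you can post it (answering your own question is fine too). If the question I linked answers yours, you can accept the duplicate. :)

Thanks for your help. I have updated my code, but there are still some problems.

No problem. skip contains the rows to be skipped, so you need to remove the lines np.delete(skip, total_line-1, 0) and np.delete(skip, 1, 0). For the last one, you should probably start from 1: np.delete(skip, np.arange(1, total_line, 5)). As for the last row, you need to make sure it is in the skip list, or you can use the skipfooter parameter of read_csv.

Thanks for your help. I have solved the problem; something had gone wrong. If I do it this way, it skips what I actually want.

You can change the "[1:-1:5]" part of the code to "[1:-1:6]" or "[1:-1:4]" and you will get what you want.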
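The slice and the skipfooter parameter discussed in these comments are easy to misread, so the following sketch prints which line numbers the slice [1:-1:5] actually selects and shows skipfooter dropping the last line. The 20-row sample is hypothetical. engine="python" is passed explicitly because skipfooter is not supported by the default C parser.

```python
import io

import pandas as pd

# Hypothetical sample: a header line plus data rows 1..20.
text = "value\n" + "\n".join(str(i) for i in range(1, 21))
total_line = len(text.splitlines())

# The slice selects the line numbers that will be SKIPPED, not kept.
skipped_by_slice = list(range(total_line))[1:-1:5]
print(skipped_by_slice)  # [1, 6, 11, 16]

# With the slice as the skip set, every other row is still read.
df_slice = pd.read_csv(
    io.StringIO(text),
    skiprows=lambda x: x in skipped_by_slice,
)

# skipfooter=1 drops the last line of the file.
df_footer = pd.read_csv(io.StringIO(text), skipfooter=1, engine="python")

print(len(df_slice), df_footer["value"].iloc[-1])
```

This is why the asker reports that the slice "skips what I actually want": to keep every fifth row, the membership test has to be inverted (x not in the selected lines) or the rows to keep have to be removed from a full skip list, as in the numpy version.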