Checking a CSV every 5 rows with Python 3.x

CSV data:

c1,v1,c2,v2,Time
13.9,412.1,29.7,177.2,14:42:01
13.9,412.1,29.7,177.2,14:42:02
13.9,412.1,29.7,177.2,14:42:03
13.9,412.1,29.7,177.2,14:42:04
13.9,412.1,29.7,177.2,14:42:05
0.1,415.1,1.3,-0.9,14:42:06
0.1,408.5,1.2,-0.9,14:42:07
13.9,412.1,29.7,177.2,14:42:08
0.1,413.4,1.3,-0.9,14:42:09
0.1,413.8,1.3,-0.9,14:42:10

The code I currently have:

import pandas as pd
import csv 
import datetime as dt


#Read .csv file, get timestamp and split it into date and time separately
Data = pd.read_csv('filedata.csv', parse_dates=['Time_Stamp'], infer_datetime_format=True)
Data['Date'] = Data.Time_Stamp.dt.date
Data['Time'] = Data.Time_Stamp.dt.time
#print (Data)
print (Data['Time_Stamp'])
Data['Time_Stamp'] = pd.to_datetime(Data['Time_Stamp'])
#Read timestamp within a certain range
mask = (Data['Time_Stamp'] > '2017-06-12 10:48:00') & (Data['Time_Stamp']<= '2017-06-12 11:48:00')
june13 = Data.loc[mask]
#print (june13)
I don't know the modules built around CSV files, so my answer may look simplistic, and I'm not quite sure what you are trying to achieve here, but have you thought about processing the file as plain text?

From what I understand, you need to read each c1, check its value, and modify it.

To read and modify the file, you can do something like this:

with open('filedata.csv', 'r+') as csv_file:
    lines = csv_file.readlines()

    # For each data line, isolate the fields, then check (and modify if needed) the first one.
    # I'm seriously not sure; you might have wanted to read only one out of five lines.
    # For that, just do a while loop with an index that steps through lines by 5.
    for i in range(1, len(lines)):  # start at 1 to skip the header row
        fields = lines[i].split(',')  # split comma-separated values

        # Check the condition and apply the needed change.
        if float(fields[0]) >= 10:
            fields[0] = "0"  # directly as a string

        # Transform the list back into a single string and store it.
        lines[i] = ",".join(fields)

    # Rewrite the file.
    csv_file.seek(0)
    csv_file.writelines(lines)
    csv_file.truncate()  # drop leftover bytes, since the new content can be shorter

    # Here you are ready to use the file just like you were already doing.
    # Of course, the above code could be put in a function for known advantages.
(I don't have Python here, so I couldn't test it; there may be typos.)

If you only need the dataframe and don't need to modify the file:

It's honestly about the same. Instead of writing the file at the end, you can do the following:

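from io import StringIO  # pandas needs a file-like object (StringIO) rather than a plain string.

import pandas as pd

# Above code here, but without the final part that rewrites the file.

Data = pd.read_csv(
    StringIO("".join(lines)),  # lines from readlines() already end with "\n"
    parse_dates=['Time_Stamp'],
    infer_datetime_format=True
)
This gives you the data you had, with the values changed where needed.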

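Hopefully this isn't completely off the mark. Also, some people may find this approach awful: working modules have already been written for this kind of thing, so why bother modifying and handling crude raw data yourself? Personally, I think going this way is often much easier than learning every external module I will ever use without first trying to understand how to work with a file's text representation. Your opinion may differ.

Also, this code may be slower, since the text is iterated over twice (pandas does it again when reading). However, I don't think you would get faster results by reading the CSV as you did and then iterating through the data to check the condition. (You might save the string-to-number conversion needed to check the c1 value, but the difference is small, and depending on the current state of optimizations, iterating through a pandas dataframe may also be slower than iterating through a list.)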


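Of course, if you don't really need the pandas dataframe format, you can do it all by hand. It takes only a few more lines (or not, tbh) and shouldn't be slow, since the number of iterations is minimized: you can check the condition on the data while reading it. It's getting late and I'm sure you can figure this out yourself, so I won't code it in my great editor (called stackoverflow); ask if anything is unclear!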
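For what it's worth, here is a minimal sketch of that single-pass idea, using the standard-library csv module. It assumes the column is named c1 as in the sample data; the function name read_with_threshold and the in-memory sample are only illustrative and are not part of the original post.

```python
import csv
import io


def read_with_threshold(source, threshold=10.0):
    """Read CSV rows in one pass, setting c1 to "0" where it is >= threshold."""
    rows = []
    for row in csv.DictReader(source):
        # The check happens while reading, so the data is traversed only once.
        if float(row["c1"]) >= threshold:
            row["c1"] = "0"
        rows.append(row)
    return rows


# Hypothetical in-memory sample; in practice `source` would be an open file.
sample = io.StringIO(
    "c1,v1,c2,v2,Time\n"
    "13.9,412.1,29.7,177.2,14:42:05\n"
    "0.1,415.1,1.3,-0.9,14:42:06\n"
)
rows = read_with_threshold(sample)
print([row["c1"] for row in rows])
```

With a real file, passing open('filedata.csv', newline='') as the source should behave the same way.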

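Just a side note: putting a space between a function name and its arguments is considered bad practice, e.g. write print(Data) rather than print (Data). I think you can find something about this in PEP 8.

I'll clean up the whitespace and check out the link. Thanks for letting me know.

I also spent some more time trying to answer, but I may have completely misunderstood what you are trying to do. If so, just say so; if you give some more details and I can find another solution, I'll edit my answer.

I do need to read every row of c1, but what I actually want to do is read 5 rows at a time, check whether only 1 of the c1 values in those 5 rows is greater than 10.0, and, if that condition is met, replace that c1 value (>10.0) with 0.0. Thanks for sharing your answer, but I don't think it's what I'm trying to get.

I first started implementing what you said in my head, and then I realized: if the values you are changing are the ones greater than 10.0, what is the difference between reading 5 at a time and reading them one by one to check the same condition? ---- Anyway, if you really want to do it that way, you can still use an index variable and its end equivalent (index+5) (careful, the index may go out of range) to iterate over five-element lists, using slicing...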
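A rough sketch of the index and index+5 slicing idea, in case it helps. It assumes "only 1 value greater than 10.0" means exactly one value per block of five, and it works on a plain list of c1 values; the function name zero_lone_spikes is made up for illustration.

```python
def zero_lone_spikes(values, size=5, threshold=10.0):
    """Return a copy of values where, in each block of `size` items, the value
    above `threshold` is set to 0.0 only if it is the single such value."""
    result = list(values)
    for index in range(0, len(result), size):
        # Slicing past the end of a list is safe: the last block is just shorter.
        block = result[index:index + size]
        high = [offset for offset, value in enumerate(block) if value > threshold]
        if len(high) == 1:
            result[index + high[0]] = 0.0
    return result


# c1 column from the sample data in the question.
c1 = [13.9, 13.9, 13.9, 13.9, 13.9, 0.1, 0.1, 13.9, 0.1, 0.1]
print(zero_lone_spikes(c1))
# -> [13.9, 13.9, 13.9, 13.9, 13.9, 0.1, 0.1, 0.0, 0.1, 0.1]
```

The first block of five is left alone because all five values exceed 10.0, while the second block has a single value above 10.0, so only that one is replaced. This is also where grouping by five differs from checking each row on its own.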