Python concatenation

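I am downloading data and writing it to CSV files, which are later read into Python. Once in Python (via pandas.read_csv), specific columns of data are concatenated. However, sometimes the data I download into a CSV is unavailable, so the CSV file ends up not existing. When a file is missing, the concatenation step (pandas.concat) chokes with "NameError: 'xyz' is not defined". Is there a way to have pandas.concat ignore such missing data?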

Here is my script:

#!/usr/bin/python3.5

import pandas
import time
import os

GFS = pandas.read_csv('/home/user/NWP/TEXT/GFS/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_GFS_temps.csv', index_col=1)
ARW = pandas.read_csv('/home/user/NWP/TEXT/ARW/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_ARW_temps.csv', index_col=1)
HRDPS = pandas.read_csv('/home/user/NWP/TEXT/HRDPS/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_HRDPS_temps.csv', index_col=1)
NAM4 = pandas.read_csv('/home/user/NWP/TEXT/NAM4/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_NAM4_temps.csv', index_col=1)
GFSMOS = pandas.read_csv('/home/user/NWP/TEXT/GFSMOS/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_GFSMOS_temps.csv', index_col=1)
ICON = pandas.read_csv('/home/user/NWP/TEXT/ICON/PJMS/' + time.strftime("%Y%m%d") + '/' + 'PJMS_ICON_temps.csv', index_col=1)


COMP = pandas.concat([GFS['PJMS GFS Temp'], ARW['PJMS ARW Temp'], HRDPS['PJMS HRDPS Temp'], NAM4['PJMS NAM4 Temp'], GFSMOS['PJMS GFSMOS Temp'], ICON['PJMS ICON Temp']], axis=1)

path = "/home/user/NWP/TEXT/COMPOSITES/" + time.strftime("%Y%m%d")
COMP.to_csv(os.path.join(path,'PJMS_ALL_temps.csv'))

exit()
And a sample of one of the CSVs. They are all formatted the same way.

,Date,SHD Temp,HEF Temp,OFP Temp,NTU Temp,PGV Temp,ROA Temp,ADW Temp,PJMS GFS Temp
0,2017040101,47.93,44.87,51.53,58.37,59.09,50.27,47.21,51.073885
1,2017040102,47.75,45.59,51.35,58.01,57.47,50.45,45.77,51.058891
2,2017040103,46.85,45.05,51.17,57.11,56.39,49.19,46.31,50.441292999999995
3,2017040104,46.85,45.23,50.09,55.85,56.03,49.01,46.31,49.91994100000001
4,2017040105,47.21,42.71,49.91,54.77,54.23,49.01,46.13,48.67237900000001
5,2017040106,47.75,43.79,50.09,53.69,53.15,49.73,45.41,48.780829
6,2017040107,47.93,44.51,49.55,53.15,52.25,50.09,44.51,48.728466999999995
7,2017040108,48.11,44.87,49.01,52.43,51.53,50.27,44.33,48.527857000000004
8,2017040109,48.29,45.95,48.83,51.53,50.99,50.09,43.97,48.578959000000005
9,2017040110,48.83,48.11,48.11,51.17,50.81,49.37,43.97,49.049803000000004
10,2017040111,48.83,48.11,47.21,50.99,50.63,48.29,45.23,48.790405
11,2017040112,49.37,47.39,49.19,52.25,52.43,48.11,47.03,49.451023000000006
This is the resulting error:


COMP = pandas.concat([GFS['PJMS GFS Temp'], ARW['PJMS ARW Temp'], HRDPS['PJMS HRDPS Temp'], NAM4['PJMS NAM4 Temp'], GFSMOS['PJMS GFSMOS Temp'], ICON['PJMS ICON Temp'], ICON['PJMS ICON Temp']], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'GFSMOS' is not defined

Here is some code that investigates various things related to read_csv:

import pandas as pd

try:
    df = pd.read_csv('nosuchfile.csv')
    print("df=",df)
except Exception as e:
    print("no file raised:", e)

try:
    f = open('empty.csv', 'w')
    f.close()
    df = pd.read_csv('empty.csv')
    print(df)
except Exception as e:
    print('empty file raised:', e)

df = pd.DataFrame({'a': [1,2,3], 'b':[4,5,6]})
print(df)

if 'a' in df:
    print("yes a")
else:
    print("no a")

if 'xyz' in df:
    print("yes xyz")
else:
    print("no xyz")
The output looks like this:

no file raised: File b'nosuchfile.csv' does not exist
empty file raised: No columns to parse from file
   a  b
0  1  4
1  2  5
2  3  6
yes a
no xyz
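This tells me that your input CSV files are neither empty nor missing (since you are not asking about those errors), and that you can use Python's in operator to check for a column ahead of time.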

(Of course, EAFP applies too, so you could simply catch the exception and go from there...)
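As a rough sketch of the two styles just mentioned, the snippet below selects columns from a small in-memory DataFrame either by testing with in first or by catching the KeyError. The helper names pick_column_lbyl and pick_column_eafp are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})


def pick_column_lbyl(frame, name):
    """Look before you leap: test column membership with `in` first."""
    if name in frame:
        return frame[name]
    return None


def pick_column_eafp(frame, name):
    """EAFP: attempt the lookup and handle the KeyError if it fails."""
    try:
        return frame[name]
    except KeyError:
        return None


# 'xyz' is not a column, so it is silently skipped.
wanted = ['a', 'xyz', 'b']
found = [col for col in (pick_column_lbyl(df, n) for n in wanted) if col is not None]
result = pd.concat(found, axis=1)
print(result)
```

Either helper gives the same outcome here; the choice is mostly a matter of taste and of how often the column is expected to be missing.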

Update

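Based on the comments, I modified the code to store everything in a dictionary, indexed by the obvious string names, rather than in separate variables. This makes it easier to work with, and easier to operate on in a loop.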

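The modified version ignores FileNotFoundError and skips the data that is unavailable. I don't know whether that is actually useful, but it seems to be what you want.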

import pandas
import time
import os

Csv_fmt = "{home}/TEXT/{type}/PJMS/{timestamp}/PJMS_{type}_temps.csv"
Home_dir = os.environ['HOME']  # os.environ is a mapping, so index it rather than call it
Timestamp = time.strftime("%Y%m%d")

# Note: ordering here taken from your concat statement.
Temp_types = "GFS ARW HRDPS NAM4 GFSMOS ICON".split()

Temp_dataframes = {}

for temp in Temp_types:
    try:
        path = Csv_fmt.format(home=Home_dir, type=temp, timestamp=Timestamp)
        df = pandas.read_csv(path, index_col=1)
        Temp_dataframes[temp] = df
    except FileNotFoundError:
        # Skip sources whose CSV is missing for this date.
        print("File not found:", temp)

Col_fmt = 'PJMS {} Temp'

concat_cols = [Temp_dataframes[tt][Col_fmt.format(tt)] for tt in Temp_types if tt in Temp_dataframes]
COMP = pandas.concat(concat_cols, axis=1)  # axis=1 puts each source in its own column

#COMP = pandas.concat([GFS['PJMS GFS Temp'], ARW['PJMS ARW Temp'], HRDPS['PJMS HRDPS Temp'], NAM4['PJMS NAM4 Temp'], GFSMOS['PJMS GFSMOS Temp'], ICON['PJMS ICON Temp']], axis=1)

path = "/home/user/NWP/TEXT/COMPOSITES/" + time.strftime("%Y%m%d")
COMP.to_csv(os.path.join(path,'PJMS_ALL_temps.csv'))

exit()
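One detail that may be worth illustrating: when pandas.concat is given a list of Series, the default axis=0 stacks them end to end into one long Series, whereas axis=1 places them side by side as columns aligned on the index. The sketch below uses made-up values and a shortened index (neither is taken from the real files) to show the difference, including what happens when one source has fewer rows.

```python
import pandas as pd

# Illustrative data only; the second Series is deliberately one row short.
idx = [2017040101, 2017040102, 2017040103]
gfs = pd.Series([51.07, 51.06, 50.44], index=idx, name='PJMS GFS Temp')
arw = pd.Series([50.10, 49.80], index=idx[:2], name='PJMS ARW Temp')

stacked = pd.concat([gfs, arw])               # default axis=0: one long Series
side_by_side = pd.concat([gfs, arw], axis=1)  # one column per source

print(stacked.shape)
print(side_by_side)
```

With axis=1 the rows are matched by index label, so a timestamp missing from one source shows up as NaN in that column instead of shifting the other values.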

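You need to show your code and a sample of your data. A nonexistent file will raise an exception. A completely empty file will raise one too. If you are not getting an exception, then either the file is not empty or you are catching the error and suppressing it. Show us your code and data.

Added code and data.

What does the GFSMOS csv file look like?

In this case it doesn't exist, because the source data was unavailable. FileNotFoundError: File b'/home/user/NWP/TEXT/GFSMOS/PJMS/20170401/PJMS_GFSMOS_temps.csv' does not exist >> the associated Python error.

I had to adjust the Home_dir line (Home_dir = os.environ["HOME"]) and the COMP line (COMP = pandas.concat(concat_cols, axis=1)), but after that, this is exactly what I needed. Thanks.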