Python: merge multiple CSV files on 'Date' and save to disk

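I am completely new to Python and am trying to write code for this problem:

A) A directory contains multiple *.csv files, all with the same column headers and structure. Example file names: Google.csv, Alphabet.csv, Teva.csv, Bosch.csv

Sample contents of the file named Google.csv: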

Date,Open,High,Low,Close
2000-01-06,15,32,33.7,49.2
2000-01-07,33.1,10.1,57.3,62
2000-01-10,221,62.4,66.9,790.5
2000-01-11,3.3,1.78,43.2,52.1
2000-01-12,73.2,54.0,121.6,89.4

Sample contents of the file named Teva.csv:

Date,Open,High,Low,Close
2000-01-01,115,312,332.7,449.2
2000-01-02,33.1,10.1,59.3,662
2000-01-03,22.1,623.4,663.9,794.5
2000-01-06,34.3,13.78,43.2,52.1
2000-01-07,703.2,504.0,121.6,879.4

B) There is a file "List.csv" that contains some company names, a subset of the csv files in the directory above. Sample contents:

Company
Google
Teva

C) There is another file, "Dates.txt", that contains only some dates. Sample contents:

Date,
2000-01-01,
2000-01-02,
2000-01-03,
2000-01-06,
2000-01-07,
2000-01-08,
2000-01-09,
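
My goal is to merge only those *.csv files (A) that are listed in List.csv (B), using "Date" from Dates.txt (C) as the key, keep only the column titled "Low", and save the result to disk as a csv file.

The final csv file saved to disk should look like this: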

Date,Google,Teva
2000-01-01,,332.7
2000-01-02,,59.3
2000-01-03,,663.9
2000-01-06,33.7,43.2
2000-01-07,57.3,121.6

This is the code I have managed to piece together:

import os; import numpy as np; import csv; import pandas as pd; from shutil import copyfile
pd.set_option('display.max_rows', 500); pd.set_option('display.max_columns', 500); pd.set_option('display.width', 1000)
os.chdir('D:/SO/'); #print (os.getcwd())

open('temp.txt', 'a').close()
dst = 'Dates.txt';   temp1 = 'temp.txt'
path = "D:/SO/dir/";   directory = os.fsencode(path)

with open('temp.txt', 'w', newline='') as temp_date:
    copyfile(dst, temp1)
    f1 = pd.read_csv('Dates.txt', index_col = 1);  df1 = pd.DataFrame(f1);  # Read the dates in Dates.txt for joining
    with open('List.csv','r') as mylist:
        data = csv.reader(mylist, delimiter = ",")
        #next(data, None) # discard the header
        for i in data:
            c =i[0] + '.csv';  #print (c)#Add .csv to each line (CompanyName) in List.txt for searching the directory
            for file in os.listdir(path):       # Search for the file in directory
                if c in file:                 # if found,
                    print (file)
                    f2 = pd.read_csv(os.path.join(path, file));     df2 = pd.DataFrame(f2);  #print(df2.head(5))
                    f3= f1.merge(f2, how='left',on=['Date']); df3 = pd.DataFrame(f3); 
                    df3 = df3.drop(df3.columns[[1,2,4]], axis=1);  print(df3.head(10), '\n')  # merge
            continue

Output so far:

Google.csv
         Date   Low
0  2000-01-01   NaN
1  2000-01-02   NaN
2  2000-01-03   NaN
3  2000-01-06  33.7
4  2000-01-07  57.3
5  2000-01-08   NaN
6  2000-01-09   NaN 

Teva.csv
         Date    Low
0  2000-01-01  332.7
1  2000-01-02   59.3
2  2000-01-03  663.9
3  2000-01-06   43.2
4  2000-01-07  121.6
5  2000-01-08    NaN
6  2000-01-09    NaN 
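
Question: the code above does join/merge Dates.txt with each required file separately. However, what I need is a single csv file with the dates in the first column, the first company in the second column, the second company in the third column, and so on. Can anyone help? I know very little Python and could not find any Q&A on this forum about this problem.

Using Python 3.8.0 on Windows.

Update: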

As suggested, by converting the list of lists to a simple list, I was able to achieve what I wanted:

with open('temp.txt', 'r') as List_txt:
    list_csv = csv.reader(List_txt);     #print(reader, '\n');
    flat_list = [val for sublist in list_csv for val in sublist];   #print(flat_list, '\n');
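
The flattening step above can be illustrated in isolation. The sketch below is only an illustration: it uses an in-memory stand-in for List.csv (an assumption, so that it runs without files on disk) and the names `rows` and `companies` are hypothetical. It shows that csv.reader yields one list per row, which is why a flattening comprehension is needed to get a plain list of names.

```python
import csv
import io

# In-memory stand-in for List.csv, using the sample contents from the question.
list_csv_text = "Company\nGoogle\nTeva\n"

reader = csv.reader(io.StringIO(list_csv_text))
next(reader, None)   # skip the "Company" header row
rows = list(reader)  # csv.reader yields one list per row: a list of lists

# Flatten the single-item rows into a plain list of company names.
companies = [value for row in rows for value in row]

print(rows)
print(companies)
```

With a plain list of names, building a path such as f'{name}.csv' gives Google.csv rather than ['Google'].csv.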

Using pandas and a list comprehension, you can do the following:

import pandas as pd

# List of csv to retrieve
list_csv = pd.read_csv('../List.csv').tolist()

# List of dates
dates = pd.read_csv('../Dates.txt').tolist()

#Load only the csv's in the list
df = pd.concat([pd.read_csv(f'../{ticker}.csv', index_col='Date', usecols=['Date', 'Low']).rename(columns={'Low': ticker}) for ticker in list_csv], axis=1)

# Filter dates
df = df[df.index.isin(dates)]

# Write to a new csv
df.to_csv('../merged_file.csv')
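
To see what the concat and isin steps do, here is a small sketch with in-memory frames instead of files (the frames and the `dates` list are made-up stand-ins based on the sample data in the question). It assumes each frame is indexed by Date; pd.concat with axis=1 then aligns rows on that index and leaves NaN where a company has no row for a date.

```python
import pandas as pd

# Stand-ins for two company files, already reduced to Date and Low.
google = pd.DataFrame(
    {"Date": ["2000-01-06", "2000-01-07"], "Low": [33.7, 57.3]}
).set_index("Date")
teva = pd.DataFrame(
    {"Date": ["2000-01-03", "2000-01-06", "2000-01-07"], "Low": [663.9, 43.2, 121.6]}
).set_index("Date")
frames = {"Google": google, "Teva": teva}

# Rename each 'Low' column to the company name, then align side by side on the Date index.
df = pd.concat(
    [frame.rename(columns={"Low": name}) for name, frame in frames.items()],
    axis=1,
).sort_index()

# Keep only the rows whose Date appears in the list of wanted dates.
dates = ["2000-01-03", "2000-01-06", "2000-01-08"]
df = df[df.index.isin(dates)]
print(df)
```

Note that isin only filters rows that already exist: a wanted date that appears in none of the company frames (2000-01-08 here) does not show up in the result.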

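Lines 1 and 2 give the error that a DataFrame object has no attribute 'tolist'. So I changed it to
pd.read_csv('List.csv').values.tolist()
However, the third line
df = pd.concat([pd.read_csv(f'D:/SO/dir/{ticker}.csv', index_col='Date', usecols=['Date', 'Low']).rename(columns={'Low': ticker}) for ticker in list_csv], axis=1)
gives the following error: File b"D:/SO/dir/['Google'].csv" does not exist: b"D:/SO/dir/['Google'].csv"

The fact that you see 'Google' inside brackets makes me think you have a list of lists rather than a list of tickers. Since I don't know the format of your data, you will need to work out how to handle that format. For example, you could load the file, access the column to get a pandas Series, and then use .tolist() as I intended.

Done! Thanks for your help!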
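
Putting the points from this thread together, one possible end-to-end sketch follows. It is not the code either poster ended up with: the helper name `merge_low_columns` is hypothetical, it uses a left merge onto the dates rather than concat, and it assumes List.csv has a "Company" header and Dates.txt a "Date" header as in the samples. Dropping dates for which no listed company has a value is also an assumption, made to match the expected output shown in the question. The demo writes shortened sample files to a temporary directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path

import pandas as pd


def merge_low_columns(data_dir, list_file, dates_file, out_file):
    """Left-merge the 'Low' column of each listed company onto the wanted dates."""
    # Selecting the 'Company' column gives a Series, so .tolist() returns plain strings.
    companies = pd.read_csv(list_file)["Company"].tolist()
    # usecols skips the empty second column created by the trailing commas in Dates.txt.
    merged = pd.read_csv(dates_file, usecols=["Date"])
    for company in companies:
        low = pd.read_csv(Path(data_dir) / f"{company}.csv", usecols=["Date", "Low"])
        merged = merged.merge(low.rename(columns={"Low": company}), on="Date", how="left")
    # Assumption: drop dates for which none of the companies has a value.
    merged = merged.dropna(how="all", subset=companies).reset_index(drop=True)
    merged.to_csv(out_file, index=False)
    return merged


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "Google.csv").write_text(
        "Date,Open,High,Low,Close\n"
        "2000-01-06,15,32,33.7,49.2\n"
        "2000-01-07,33.1,10.1,57.3,62\n"
    )
    (tmp / "Teva.csv").write_text(
        "Date,Open,High,Low,Close\n"
        "2000-01-01,115,312,332.7,449.2\n"
        "2000-01-06,34.3,13.78,43.2,52.1\n"
    )
    (tmp / "List.csv").write_text("Company\nGoogle\nTeva\n")
    (tmp / "Dates.txt").write_text("Date,\n2000-01-01,\n2000-01-06,\n2000-01-07,\n2000-01-08,\n")

    result = merge_low_columns(tmp, tmp / "List.csv", tmp / "Dates.txt", tmp / "merged_file.csv")
    written = (tmp / "merged_file.csv").read_text()

print(result)
```

Because the merge starts from the dates, the row order follows Dates.txt, and each company contributes exactly one column named after its file.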