Subtracting DataFrames in Python
I am trying to subtract the second columns of two csv files (mycsv.csv, mycsv2.csv) while keeping the first column, which is the same in both files. As you can see below, the latter works fine, but the price columns (2 and 4) only return NaN.
      col2  col4
col1
MMM    NaN   NaN
WBAI   NaN   NaN
WUBA   NaN   NaN
EGHT   NaN   NaN
AHC    NaN   NaN
I don't know where this error comes from, so I apologize for posting so much code. Thanks for any help you can give.
import csv
import time

import pandas as pd

# get_quotes is not defined in the post; presumably it comes from the asker's quote API client.
data_sheet1 = pd.read_excel('C:\\Users\\sss\\Downloads\\Book1.xlsx')
data_impor = data_sheet1['DDD'].tolist()

def get_ohlc(symbols):
    data = get_quotes(symbol=symbols)
    symbols_and_lastPrices = []  # create empty list
    for symbol in symbols:
        # append [symbol, lastPrice] pairs to list
        symbols_and_lastPrices.append([symbol, data[symbol]['lastPrice']])
    return symbols_and_lastPrices  # return list

csv_data = get_ohlc(data_impor)  # save returned list
# write csv
with open("mycsv.csv", "w", newline='') as f:
    thewriter = csv.writer(f)
    thewriter.writerow(['col1', 'col2'])
    thewriter.writerows(csv_data)  # write all data rows at the same time
with open('mycsv.csv', 'r') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row)
time.sleep(2)
csv2_data = get_ohlc(data_impor)  # save returned list
# write csv
with open("mycsv2.csv", "w", newline='') as f:
    thewriter = csv.writer(f)
    thewriter.writerow(['col3', 'col4'])
    thewriter.writerows(csv2_data)  # write all data rows at the same time
with open('mycsv2.csv', 'r') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(row)
df1 = pd.read_csv('mycsv.csv', index_col='col1')
df2 = pd.read_csv('mycsv2.csv', index_col='col3')
df3 = df1.sub(df2)
print(df3.head())
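pandas performs most operations using index alignment; see this example. Both the row index and the column headers of a DataFrame are pd.Index objects, so pandas aligns the data horizontally (on row labels) and vertically (on column labels).
Note that the result contains both col1 and col2, and the values are NaN because of alignment.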
print(df1 - df2.rename(columns={'col1':'col2'}))
#       col2  col4
# No
# MMM    5.0   5.0
# N300   NaN   NaN
# SCOT   NaN   NaN
# WBAI   5.0   5.0
# WUBA   5.0   5.0
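Here the columns are aligned correctly, but two index labels do not align.

Comments:
pandas uses data alignment on the indexes for most operations, meaning both the row index and the column headers. So I think your problem is that you are importing csv files with different column headers. To subtract the two, you need to rename the columns of one of the DataFrames you imported from csv so that they match the other's column headers.
Thanks! What you are saying confuses me, though. So I need to rename what I call the DataFrame? I really don't see what this has to do with the column headers.
See the example in the answer below.
Now I understand, thanks!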
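To make the renaming advice concrete for the question's two files, here is a minimal sketch. It builds the frames in memory instead of reading mycsv.csv and mycsv2.csv, and the prices are made-up illustrative values, not real quotes. It shows the all-NaN result from mismatched column labels, then two possible ways around it.

```python
import pandas as pd

# Illustrative prices only; the real values would come from mycsv.csv and mycsv2.csv.
symbols = ["MMM", "WBAI", "WUBA"]
df1 = pd.DataFrame({"col2": [10.0, 20.0, 30.0]},
                   index=pd.Index(symbols, name="col1"))
df2 = pd.DataFrame({"col4": [9.0, 18.0, 27.0]},
                   index=pd.Index(symbols, name="col3"))

# The column labels differ (col2 vs col4), so no cell has a partner
# and the result holds both columns filled with NaN.
misaligned = df1.sub(df2)

# Option 1: rename the column so both frames share the label col2.
renamed = df1.sub(df2.rename(columns={"col4": "col2"}))

# Option 2: subtract the two Series directly; only row labels are aligned.
diff = df1["col2"] - df2["col4"]

print(misaligned)
print(renamed)
print(diff)
```

Writing the same header row (for example `['col1', 'col2']`) to both csv files would avoid the rename entirely, since the two frames would then share their column labels from the start.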