Python: pd.concat turns values into NaN


I have the following code:

gg=df_met[['Less','Middle','Greater']].resample('h').mean()
Filtered_mean=Filtered[['Conc']].resample('h').mean()

result = pd.concat([Filtered_mean, gg], axis=1, join_axes=[df1.index])
Reduced_result=result.dropna(axis=0,how='any')
where gg is:

                         Less    Middle   Greater
Date
2004-02-27 00:00:00  0.000000  1.000000  0.000000
2004-02-27 01:00:00  0.000000  1.000000  0.000000
2004-02-27 02:00:00  0.000000  1.000000  0.000000
2004-02-27 03:00:00  0.083333  0.916667  0.000000
2004-02-27 04:00:00  0.583333  0.416667  0.000000
2004-02-27 05:00:00  0.083333  0.916667  0.000000
2004-02-27 06:00:00  0.666667  0.333333  0.000000
2004-02-27 07:00:00  0.750000  0.250000  0.000000
2004-02-27 08:00:00  0.250000  0.750000  0.000000
2004-02-27 09:00:00  1.000000  0.000000  0.000000
2004-02-27 10:00:00  0.250000  0.750000  0.000000
2004-02-27 11:00:00  1.000000  0.000000  0.000000
2004-02-27 12:00:00  0.916667  0.083333  0.000000
2004-02-27 13:00:00  0.000000  1.000000  0.000000
2004-02-27 14:00:00  0.000000  1.000000  0.000000
2004-02-27 15:00:00  0.000000  1.000000  0.000000
2004-02-27 16:00:00  0.000000  1.000000  0.000000
2004-02-27 17:00:00  0.000000  1.000000  0.000000
2004-02-27 18:00:00  0.000000  1.000000  0.000000
2004-02-27 19:00:00  0.083333  0.916667  0.000000
2004-02-27 20:00:00  0.000000  0.500000  0.500000
2004-02-27 21:00:00  0.000000  0.000000  1.000000
2004-02-27 22:00:00  0.000000  0.000000  1.000000
2004-02-27 23:00:00  0.000000  0.000000  1.000000
2004-02-28 00:00:00  0.000000  0.666667  0.333333
2004-02-28 01:00:00  0.000000  0.833333  0.166667
2004-02-28 02:00:00  0.000000  0.166667  0.833333
2004-02-28 03:00:00  0.000000  0.000000  1.000000
2004-02-28 04:00:00  0.000000  0.000000  1.000000
2004-02-28 05:00:00  0.000000  0.000000  1.000000

and Filtered_mean is:

                       Conc
2004-02-27 15:00  30.166667
2004-02-27 16:00  24.218182
2004-02-27 17:00  44.781818
2004-02-27 18:00  15.200000
2004-02-27 19:00  33.490000
2004-02-27 20:00  17.100000
2004-02-27 21:00  15.470000
2004-02-27 22:00  13.100000
2004-02-27 23:00  17.736364
2004-02-28 00:00  19.225000
2004-02-28 01:00   9.760000
2004-02-28 02:00   2.737500
2004-02-28 03:00   4.175000
2004-02-28 04:00   2.990000
2004-02-28 05:00   4.983333
2004-02-28 06:00   3.370000
2004-02-28 07:00   2.983333
2004-02-28 08:00   3.508333
2004-02-28 09:00   2.641667
2004-02-28 10:00   4.916667
2004-02-28 11:00   7.100000
2004-02-28 12:00  11.609091
2004-02-28 13:00   5.540000
2004-02-28 14:00   3.025000
2004-02-28 15:00   5.127273
2004-02-28 16:00  11.660000
2004-02-28 17:00   5.833333
2004-02-28 18:00   8.183333
2004-02-28 19:00  -0.158333
2004-02-28 20:00   6.575000
When I concat them, I get:

                      Conc  Less  Middle  Greater
Date                                              
2004-02-27 15:00  30.166667   NaN     NaN      NaN
2004-02-27 15:00  30.166667   NaN     NaN      NaN
2004-02-27 15:00  30.166667   NaN     NaN      NaN
2004-02-27 16:00  24.218182   NaN     NaN      NaN
Is this because the index of one frame is made of integers?

dtype='int64', length=34342, freq='H')
whereas for gg it is a datetime:

dtype='datetime64[ns]', name='Date', length=42479, freq='H')
If so, how do I convert the index of one frame to match the other?

Full code:

import pandas as pd
import datetime as dt
import io 
import numpy as np
names=['Date','Wind Speed','Wind Direction']
df2 = pd.read_csv('Met_12_13.csv', index_col=0, names=names, parse_dates=[0])
df_met=df2
df_met.insert(2,'Less','Nan')
df_met.insert(3,'Middle','Nan')
df_met.insert(4,'Greater','Nan')
for line in df2:
    flag1=(df2['Wind Speed']<4)
    flag1=flag1.astype(int)
    flag2=(df2['Wind Speed']>=4 ) & (df2['Wind Speed']<=10)
    flag2=flag2.astype(int)
    flag3=(df2['Wind Speed']>10)
    flag3=flag3.astype(int)

    df_met['Less']=flag1
    df_met['Middle']=flag2
    df_met['Greater']=flag3



aethalometer=['Date','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7']
#df1=pd.read_csv('result.txt', index_col=0,sep='\n', names=aethalometer, parse_dates=[0])
df1 = pd.read_csv('Ath_12_13.csv', sep=',', names=aethalometer ) #Spirows=1
df1['Date'] = pd.to_datetime(df1['Date'], errors='coerce')
for y in range (0,6):
    x=y+1
    df1[aethalometer[x]]= pd.to_numeric(df1[aethalometer[x]], errors='coerce')
    df1=df1[df1[aethalometer[x]]>-250]
    df1=df1[df1[aethalometer[x]]<500]
    df1['Date'] = pd.to_datetime(df1['Date'], errors='coerce')
    df1.index



print(len(df1))
#df1 = pd.read_csv(io.StringIO('Output14.csv'), parse_dates=[0], names=['Date','A','B','C','D','E','F','G', 'H'])
#df_mean = df1[['Conc']].resample('h').mean()
print("here")

#df1.index = df1.index.to_period('h')
df_met['per'] = df_met.index.to_period('h')

#df_mean.index=df_mean.index.to_period('h')
#print(len(df_mean)) 
pers = df_met.loc[(df2['Wind Direction'] > 340) | (df_met['Wind Direction'] < 12) , 'per'].unique()

print (pers)
print("here")
#%%
Filtered=df1.drop(pers)
#del Filtered['Date']

a=Filtered['Chanel1']
a.index = pd.to_datetime(a.index, errors='coerce')

b=Filtered['Chanel2']
b.index = pd.to_datetime(b.index, errors='coerce')

c=Filtered['Chanel3']
c.index = pd.to_datetime(c.index, errors='coerce')

d=Filtered['Chanel4']
d.index = pd.to_datetime(d.index, errors='coerce')

e=Filtered['Chanel5']
e.index = pd.to_datetime(e.index, errors='coerce')

f=Filtered['Chanel0']
f.index = pd.to_datetime(f.index, errors='coerce')

g=Filtered['Chanel7']
g.index = pd.to_datetime(g.index, errors='coerce')

a=a.resample('h').mean()
a_median=a.resample('h').median()  #This is how you would make it median
b=b.resample('h').mean()
c=c.resample('h').mean()
d=d.resample('h').mean()
e=e.resample('h').mean()
f=f.resample('h').mean()
g = pd.to_numeric(g, errors='coerce')
g=g.resample('h').mean()

Series=pd.concat([a,b,c,d,e,f,g],join='outer',axis=1)

gg=df_met[['Less','Middle','Greater']].resample('h').mean()

result_mean = pd.concat([Series, gg], axis=1, join_axes=[gg.index])
Reduced_result_mean=result_mean.dropna(axis=0,how='any')

Reduced_result_mean.to_csv("Final2012-13.csv")

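Yes, it is. The two DataFrames need the same index type: pd.concat with axis=1 aligns rows by index label, and an integer label never matches a datetime label.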
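A minimal sketch of the symptom, using made-up data. The two small frames are hypothetical stand-ins for Filtered_mean and gg, not the asker's real files. One has a default integer index and the other a DatetimeIndex, so no label is shared and every row is missing half of its columns:

```python
import pandas as pd

# Hypothetical stand-in for gg: hourly DatetimeIndex
hours = pd.date_range("2004-02-27 15:00", periods=3, freq="h", name="Date")
gg = pd.DataFrame({"Less": [0.0, 0.0, 0.0]}, index=hours)

# Hypothetical stand-in for Filtered_mean: default integer index 0, 1, 2
filtered_mean = pd.DataFrame({"Conc": [30.17, 24.22, 44.78]})

# Inspect the index types first; this is the quickest diagnosis
print(type(filtered_mean.index).__name__, filtered_mean.index.dtype)
print(type(gg.index).__name__, gg.index.dtype)

# concat aligns on index labels; integers and timestamps never match,
# so each row carries NaN in the columns coming from the other frame
mismatched = pd.concat([filtered_mean, gg], axis=1)
print(mismatched)

# Dropping rows with any NaN therefore leaves nothing
print(mismatched.dropna(axis=0, how="any"))
```

Comparing the two index types before concatenating is usually enough to spot this kind of problem.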

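Use:

filtered_mean.reset_index(inplace=True)
filtered_mean['date'] = pd.to_datetime(filtered_mean['date'])
filtered_mean.set_index('date', inplace=True)

Now filtered_mean and gg should both have a datetime index.

Comments:

It doesn't recognize what 'date' is. How do I convert filtered_mean so this works on the index itself? I also did this for a test file and it worked fine, but in that case, for some reason, both indexes were detected as integers rather than dates. So could I turn gg from a date into an integer instead?

Actually, it's probably better for both to be datetime.

@SLE, column names are case-sensitive. Check whether your date variable is called 'date' or 'Date'.

It's 'Date', so I used that, but I think it's due to the number of columns.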
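Building on the suggestion to give both frames a datetime index, here is a hedged sketch of one way to arrange the pipeline. The data is invented for illustration, and `raw` is a hypothetical stand-in for a frame like df1 whose 'Date' is still an ordinary column. The timestamps become the index before resampling. Since `join_axes` was removed from `pd.concat` in pandas 1.0, `reindex` is used here to select the rows instead:

```python
import pandas as pd

# Hypothetical raw measurements; 'Date' is a plain column, not the index
raw = pd.DataFrame({
    "Date": ["2004-02-27 15:00", "2004-02-27 15:30",
             "2004-02-27 16:00", "2004-02-27 16:30"],
    "Conc": [30.0, 32.0, 24.0, 26.0],
})
raw["Date"] = pd.to_datetime(raw["Date"], errors="coerce")

# Move the timestamps into the index BEFORE resampling to hourly means
filtered_mean = raw.set_index("Date")[["Conc"]].resample("h").mean()

# Hypothetical stand-in for gg, already on an hourly DatetimeIndex
hours = pd.date_range("2004-02-27 14:00", periods=4, freq="h", name="Date")
gg = pd.DataFrame(
    {"Less": [0.0] * 4, "Middle": [1.0] * 4, "Greater": [0.0] * 4},
    index=hours,
)

# Both indexes are now datetime, so rows align by timestamp;
# reindex(gg.index) replaces the removed join_axes=[gg.index]
result = pd.concat([filtered_mean, gg], axis=1).reindex(gg.index)
reduced = result.dropna(axis=0, how="any")
print(reduced)
```

Only the hours present in both frames survive the final `dropna`, which is the behaviour the original code appears to be aiming for.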