Python: extracting multiple lat/long points from a NetCDF file using xarray
I have an NC file with the dimensions time, lat, long, and I am trying to extract time series for multiple station lat/long points. To do that, I read the station coordinates and extract the nearest values from the NC file like this:
import pandas as pd
import xarray as xr

nc_file = r"C:\Users\lab\Desktop\harvey\example.nc"
NC = xr.open_dataset(nc_file)

csv = r"C:\Users\lab\Desktop\harvey\stations.csv"
df = pd.read_csv(csv, delimiter=',')

Newdf = pd.DataFrame([])

# grid point lists
lat = df["Lat"]
lon = df["Lon"]
point_list = zip(lat, lon)

for i, j in point_list:
    dsloc = NC.sel(lat=i, lon=j, method='nearest')
    DT = dsloc.to_dataframe()
    Newdf = Newdf.append(DT, sort=True)
The code works fine and returns the following:
                         EVP     lat       lon
time
2019-01-01 19:00:00   0.0546  40.063   -88.313
2019-01-01 23:00:00   0.0049  40.063   -88.313
2019-01-01 19:00:00   0.0052  41.938   -93.688
2019-01-01 23:00:00   0.0029  41.938   -93.688
2019-01-01 19:00:00   0.0101  52.938  -124.938
2019-01-01 23:00:00   0.0200  52.938  -124.938
2019-01-01 19:00:00   0.1644  39.063   -79.438
2019-01-01 23:00:00  -0.0027  39.063   -79.438
However, I need to attach the station ID from the original lat/long file to each coordinate, like this:
   Station-ID       Lat         Lon            time      EVP     lat       lon
0         Bo1  40.00620   -88.29040  1/1/2019 19:00   0.0546  40.063   -88.313
1                                    1/1/2019 23:00   0.0049  40.063   -88.313
2         Br1  41.97490   -93.69060  1/1/2019 19:00   0.0052  41.938   -93.688
3                                    1/1/2019 23:00   0.0029  41.938   -93.688
4         Brw  71.32250  -156.60917  1/1/2019 19:00   0.0101  52.938  -124.938
5                                    1/1/2019 23:00   0.0200  52.938  -124.938
6         CaV  39.06333   -79.42083  1/1/2019 19:00   0.1644  39.063   -79.438
7                                    1/1/2019 23:00  -0.0027  39.063   -79.438
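Any ideas on how to merge my dataframes as in the example provided?

How about including the station name in the zip command and then inserting the ID into the rows of the pandas dataframe, like this? By the way, I can't access your CSV file, so I simplified the example a little by using dummy lists.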
import pandas as pd
import xarray as xr

nc_file = "example.nc"
NC = xr.open_dataset(nc_file)

# dummy locations and station IDs, as I can't access the CSV
lat = [40, 42, 41]
lon = [-100, -105, -99]
name = ["a", "b", "c"]

Newdf = pd.DataFrame([])
for i, j, station_id in zip(lat, lon, name):
    dsloc = NC.sel(lat=i, lon=j, method='nearest')
    DT = dsloc.to_dataframe()
    # insert the name with your preferred column title:
    DT.insert(loc=0, column="station", value=station_id)
    # note: DataFrame.append was removed in pandas 2.0; use pd.concat on newer versions
    Newdf = Newdf.append(DT, sort=True)
print(Newdf)
This gives me:
                        EVP     lat       lon station
time
2019-01-01 19:00:00  0.0527  39.938   -99.938       a
2019-01-01 23:00:00  0.0232  39.938   -99.938       a
2019-01-01 19:00:00  0.0125  41.938  -104.938       b
2019-01-01 23:00:00  0.0055  41.938  -104.938       b
2019-01-01 19:00:00  0.0527  40.938   -98.938       c
2019-01-01 23:00:00  0.0184  40.938   -98.938       c
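On pandas 2.0 and later, `DataFrame.append` no longer exists, so the same loop has to collect the per-station frames and join them with `pd.concat`. The following is only a sketch of that variant, which also carries the station's original `Lat`/`Lon` along so that the result resembles the table asked for in the question. The in-memory dataset, the `make_demo_dataset` and `extract_stations` helpers, and the two station rows are assumptions standing in for `example.nc` and `stations.csv`; only the column names `Station-ID`, `Lat`, `Lon` and the variable `EVP` come from the question.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_demo_dataset():
    """Hypothetical stand-in for example.nc: EVP on a tiny time/lat/lon grid."""
    time = pd.to_datetime(["2019-01-01 19:00", "2019-01-01 23:00"])
    lat = np.array([39.0, 40.0, 41.0, 42.0])
    lon = np.array([-94.0, -89.0, -88.0, -79.0])
    values = np.arange(time.size * lat.size * lon.size, dtype=float)
    evp = values.reshape(time.size, lat.size, lon.size)
    return xr.Dataset(
        {"EVP": (("time", "lat", "lon"), evp)},
        coords={"time": time, "lat": lat, "lon": lon},
    )


def extract_stations(ds, stations):
    """Return one long DataFrame with the nearest-grid-point series per station."""
    frames = []
    columns = zip(stations["Station-ID"], stations["Lat"], stations["Lon"])
    for station_id, lat, lon in columns:
        point = ds.sel(lat=lat, lon=lon, method="nearest")
        frame = point.to_dataframe().reset_index()
        # Prepend the station metadata from the station table.
        frame.insert(0, "Station-ID", station_id)
        frame.insert(1, "Lat", lat)
        frame.insert(2, "Lon", lon)
        frames.append(frame)
    # pd.concat replaces the repeated DataFrame.append calls.
    return pd.concat(frames, ignore_index=True)


# Assumed station table; in the question this would come from pd.read_csv.
stations = pd.DataFrame(
    {
        "Station-ID": ["Bo1", "CaV"],
        "Lat": [40.0062, 39.06333],
        "Lon": [-88.2904, -79.42083],
    }
)
result = extract_stations(make_demo_dataset(), stations)
print(result)
```

Collecting the frames in a list and concatenating once also avoids copying the growing dataframe on every iteration, which matters when there are many stations.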
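Nice idea! Yes, zipping in the ID and inserting it into the dataframe works really well. Thank you.
Yes, I will upvote as soon as I have 15 reputation. I have voted now, but it won't show until I reach 15 reputation!
I didn't know that! By the way, welcome to stackexchange.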
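As an alternative to looping over stations, xarray can select all points in one call when the `lat` and `lon` indexers are `DataArray` objects sharing a new dimension. This is a different technique from the one in the answer, sketched here under assumptions: the small in-memory dataset and the station table are made up for illustration, and the dimension name `station` is an arbitrary choice.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for example.nc.
time = pd.to_datetime(["2019-01-01 19:00", "2019-01-01 23:00"])
lat = np.array([39.0, 40.0, 41.0, 42.0])
lon = np.array([-94.0, -89.0, -88.0, -79.0])
evp = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
ds = xr.Dataset(
    {"EVP": (("time", "lat", "lon"), evp)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Assumed station table; in the question this would come from stations.csv.
stations = pd.DataFrame(
    {
        "Station-ID": ["Bo1", "CaV"],
        "Lat": [40.0062, 39.06333],
        "Lon": [-88.2904, -79.42083],
    }
)

# Indexers that share the "station" dimension are matched pairwise,
# giving one grid point per station instead of a lat x lon block.
lat_idx = xr.DataArray(stations["Lat"].to_numpy(), dims="station")
lon_idx = xr.DataArray(stations["Lon"].to_numpy(), dims="station")
points = ds.sel(lat=lat_idx, lon=lon_idx, method="nearest")

# Label the new dimension with the station IDs, then flatten to a table.
points = points.assign_coords(station=("station", stations["Station-ID"].tolist()))
table = points.to_dataframe().reset_index()
table = table.sort_values(["station", "time"]).reset_index(drop=True)
print(table)
```

The resulting table has one row per station and time step, with the matched grid `lat`/`lon` next to `EVP`; the original station coordinates can be joined back on the `station` column if needed.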