How to read a NetCDF file and write to CSV using Python

My goal is to access data from a NetCDF file and write it to a CSV file in the following format.

Latitude  Longitude Date1  Date2  Date3
100       200       <-- MIN_SFC values -->

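So far I have accessed the variables, written the header to the file, and populated the lat/lons.

How can I access the MIN_SFC values for the specified lon,lat coordinates and dates, and then write them to the CSV file?

I am a Python newbie, so if there is a better way to go about this, please let me know.

NetCDF file info: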

Dimensions:
  time = 7 
  latitude = 292
  longitude =341

Variables:
  float MIN_SFC (time=7, latitude = 292, longitude = 341)

Here is what I have tried:

from netCDF4 import Dataset, num2date
import csv

filename = "C:/filename.nc"

nc = Dataset(filename, 'r')
print(nc.variables)

print('Variable List')

for name, var in nc.variables.items():
    # not every variable is guaranteed to carry a units attribute
    print(name, getattr(var, 'units', None), var.shape)

# get coordinate variables
lats = nc.variables['latitude'][:]
lons = nc.variables['longitude'][:]

sfc = nc.variables['MIN_SFC'][:]
times = nc.variables['time'][:]

# convert dates; how to keep only the date and strip away the time?
print("Converting Dates")
units = nc.variables['time'].units
dates = num2date(times[:], units=units, calendar='365_day')

#print([date.strftime('%Y%m%d%H') for date in dates])

header = ['Latitude', 'Longitude']

# append dates to the header row
for d in dates:
    print(d)
    header.append(d)

# write to file
with open('Output.csv', 'w', newline='') as csvFile:
    outputwriter = csv.writer(csvFile, delimiter=',')
    outputwriter.writerow(header)
    for lat, lon in zip(lats, lons):
        outputwriter.writerow([lat, lon])

# the with block has already closed the output file; close the NetCDF dataset
nc.close()

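Update:

I have updated the code that writes the CSV file; there is an attribute error because lat/lon are doubles.

AttributeError: 'numpy.float32' object has no attribute 'append'

Is there any way to convert to a string in Python? Do you think that would work?

When I printed the values to the console, I noticed that some values were returned as "--". I wonder whether this represents the fillValue or missingValue, which is defined as -32767.0.

I am also wondering whether I should access the variables of the 3D dataset with lats = nc.variables['latitude'][:][:] or lats = nc.variables['latitude'][:][:,]?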

# the csv file is closed when you leave the block
with open('output.csv', 'w', newline='') as csvFile:
    outputwriter = csv.writer(csvFile, delimiter=',')
    for time_index, time in enumerate(times):  # pull the dates out for the header
        t = num2date(time, units=units, calendar='365_day')
        header.append(t)
    outputwriter.writerow(header)
    for lat_index, lat in enumerate(lats):
        content = lat
        print(lat_index)
        for lon_index, lon in enumerate(lons):
            content.append(lon)  # raises the AttributeError quoted above
            print(lon_index)
            for time_index, time in enumerate(times):  # for a date
                # pull out the data
                data = sfc[time_index, lat_index, lon_index]
                content.append(data)
                outputwriter.writerow(content)
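
I would load the data into Pandas, which facilitates the analysis and plotting of time series data, as well as writing to CSV.

So here is a real working example that extracts a time series of wave heights at a specified lon,lat location from a global forecast model dataset.

Note: here we access an OPeNDAP dataset, so we can extract just the data we need from a remote server without downloading files. But netCDF4 works exactly the same way for a remote OPeNDAP dataset or a local NetCDF file, which is a very useful feature!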

import datetime as dt

import netCDF4
import pandas as pd
import matplotlib.pyplot as plt

# NetCDF4-Python can read a remote OPeNDAP dataset or a local NetCDF file:
url = 'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/WW3/Global/Best'
nc = netCDF4.Dataset(url)
print(nc.variables.keys())

lat = nc.variables['lat'][:]
lon = nc.variables['lon'][:]
time_var = nc.variables['time']
dtime = netCDF4.num2date(time_var[:], time_var.units)

# determine what longitude convention is being used [-180,180], [0,360]
print(lon.min(), lon.max())

# specify some location to extract time series
lati = 41.4; loni = -67.8 + 360.0  # Georges Bank

# find closest index to specified value
def near(array, value):
    idx = (abs(array - value)).argmin()
    return idx

# Find nearest point to desired location (could also interpolate, but more work)
ix = near(lon, loni)
iy = near(lat, lati)

# Extract desired times.
# 1. Select -+some days around the current time:
start = dt.datetime.utcnow() - dt.timedelta(days=3)
stop = dt.datetime.utcnow() + dt.timedelta(days=3)
#       OR
# 2. Specify the exact time period you want:
#start = dt.datetime(2013,6,2,0,0,0)
#stop = dt.datetime(2013,6,3,0,0,0)

istart = netCDF4.date2index(start, time_var, select='nearest')
istop = netCDF4.date2index(stop, time_var, select='nearest')
print(istart, istop)

# Get all time records of variable [vname] at indices [iy,ix]
vname = 'Significant_height_of_wind_waves_surface'
#vname = 'surf_el'
var = nc.variables[vname]
hs = var[istart:istop, iy, ix]
tim = dtime[istart:istop]

# Create Pandas time series object
ts = pd.Series(hs, index=tim, name=vname)

# Use Pandas time series plot method
ts.plot(figsize=(12, 4),
   title='Location: Lon=%.2f, Lat=%.2f' % (lon[ix], lat[iy]), legend=True)
plt.ylabel(var.units)

# write to a CSV file
ts.to_csv('time_series_from_netcdf.csv')

This both creates a plot, so you can verify that you got the data you wanted, and writes the desired CSV file time_series_from_netcdf.csv to disk.

You can also view, download and/or run this example on Wakari.
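
For the layout asked for in the question (one row per latitude/longitude pair, one column per date), a minimal sketch might look like the following. It uses small synthetic arrays in place of the real file, so `lats`, `lons`, `dates`, `sfc`, the helper `to_wide_frame` and the output name `min_sfc_wide.csv` are illustrative assumptions, not values from the original dataset. The "--" entries seen when printing are how NumPy displays masked elements, which netCDF4 typically returns where a value equals the fill or missing value; here they are turned into NaN so they end up as empty CSV cells.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Synthetic stand-ins for the arrays read from the NetCDF file.
# Assumed shape here: time=3, latitude=2, longitude=4 (the real file is 7 x 292 x 341).
lats = np.array([10.0, 20.0])
lons = np.array([100.0, 110.0, 120.0, 130.0])
dates = [dt.datetime(2015, 2, 1) + dt.timedelta(days=i) for i in range(3)]
# Pretend 5.0 is the fill value, so that one cell is masked.
sfc = np.ma.masked_equal(np.arange(24, dtype=float).reshape(3, 2, 4), 5.0)


def to_wide_frame(sfc, lats, lons, dates):
    """Return one row per (lat, lon) pair and one column per date."""
    # Replace masked (missing) cells with NaN so they become empty CSV fields.
    data = np.ma.filled(sfc.astype(float), np.nan)
    # Every (lat, lon) combination, latitude varying slowest, which matches
    # the (time, latitude, longitude) layout of the data array.
    lat_grid, lon_grid = np.meshgrid(lats, lons, indexing="ij")
    frame = pd.DataFrame(
        {"Latitude": lat_grid.ravel(), "Longitude": lon_grid.ravel()}
    )
    for i, date in enumerate(dates):
        # Keep only the date part in the column header.
        frame[date.strftime("%Y-%m-%d")] = data[i].ravel()
    return frame


wide = to_wide_frame(sfc, lats, lons, dates)
wide.to_csv("min_sfc_wide.csv", index=False)
print(wide.head())
```

With the real file, `sfc`, `lats` and `lons` would come from `nc.variables[...][:]` and `dates` from `num2date`. If you prefer the `csv` module, note that the AttributeError in the question arises because `content = lat` is a NumPy scalar rather than a list; starting each row as `[lat, lon]` avoids it.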

https://stackoverflow.com/questions/28420988/how-to-read-netcdf-file-and-write-to-csv-using-python
