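【Author】 Li Ru (李儒)
【Supervisors】 Tong Qingxi (童庆禧); Zhang Bing (张兵); Zheng Lanfen (郑兰芬)
【Degree Year】 2008
【Thesis Level】 Master's
【Keywords (translated from the Chinese)】 vegetation index time series, Fourier harmonic analysis, HANTS, number of frequencies, outlier detection
【Key words】 vegetation index time series of remotely sensed data, harmonic analysis, HANTS, number of frequencies, outlier detection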
【Abstract (translated from the Chinese)】
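Vegetation index time series derived from satellite remote sensing data reflect vegetation growth conditions, land cover change, and related information at scales ranging from global to regional. They are therefore very widely used and have become an important data source for many scientific studies and engineering projects, including vegetation growth monitoring, land cover type interpretation and monitoring, phenological feature identification and information extraction, terrestrial ecosystem modeling at global, continental, and regional scales, and even studies of how changes in plant phenology affect GPP and NPP. Using such data requires attention to the spatiotemporal consistency of the series. The main factors that degrade this consistency are the satellite sensor itself, cloud, atmospheric conditions, view angle, and geometric correction. Several algorithms have been developed to address these problems (for example, maximum value compositing), but the processed data products still contain fairly serious residual noise, which restricts their further use. The residual noise must therefore be removed correctly and effectively, and the time series reconstructed, before the data are applied. Many algorithms have been developed for this purpose, including best index slope extraction, asymmetric Gaussian function fitting, fitting based on Savitzky-Golay filtering, and Fourier harmonic analysis. All of these methods have shortcomings, however. The most prominent is that key parameters must be chosen manually through experiment and experience, which introduces additional error and disturbs the original data. Among them, Fourier harmonic analysis has attracted particular attention because, when fitting the time series curve with harmonics of different periods, it can use the harmonic periods to model certain phenological regularities rather than performing a purely mathematical operation.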
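Building on this class of methods, this thesis improves the original method by introducing new parameters and strategies, aiming to reduce human influence and excessive disturbance of the original data so that the algorithm's results are more objective. The main ideas of the improved algorithm are: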
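(1) Principles for filtering and reconstructing time series data are proposed: a. Disturb the data as little as possible. The more a filtering and reconstruction algorithm disturbs the original data, the less credible its result, and some important regularities may even be erased. b. Use as little auxiliary data as possible. Auxiliary data such as MODIS cloud images give some guidance for filtering and reconstruction, but uncertainties in data acquisition and production mean that the auxiliary data themselves contain errors of complex types; using them increases program overhead to some extent and may introduce new errors (or uncertainties). c. Consider the physical meaning of the parameters as much as possible. The strength of vegetation index time series is that seasonal and phenological regularities can be extracted by analyzing them, so a filtering and reconstruction process that takes this into account makes the data processing more meaningful; this is also the advantage of the Fourier harmonic analysis family of algorithms. d. Balance algorithm efficiency and effectiveness.
(2) Outlier detection is added to make the fitted data more realistic. Because of residual error, true values and outliers are hard to tell apart by inspection. The traditional approach of handling abrupt values (suspected outliers) with a preset threshold is effective to a degree, but it inevitably both rejects valid points and accepts invalid ones. It is therefore necessary to work from within the data and detect outliers in the series by analyzing its internal regularities. The improved algorithm introduces an outlier detection algorithm that detects outliers during the fitting iterations, reducing the human error and new uncertainty that the traditional method introduces when it removes invalid points using distance-based weights.
(3) The number of frequencies is selected automatically, and the fitting-effect index is computed automatically at each iteration. The number of frequencies controls the fitting result and is also the means by which Fourier harmonic analysis expresses the physical meaning of the series. The traditional method, however, sets this number manually by trial and uses the same fixed value globally, which in physical terms assumes that the whole processing region follows a single vegetation growth pattern. The improved method preprocesses the data before iteration to estimate dynamically the number of peaks of each series to be processed, that is, its number of frequencies, which is then used in the subsequent fitting. The other key parameter is the fitting-effect index, borrowed from the fitting method based on Savitzky-Golay filtering. The index is computed for every fit, and iteration stops automatically when it reaches its minimum (or a local minimum), in place of the manually set tolerance threshold used in the traditional method.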
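The three ideas listed above can be illustrated with a minimal Python sketch for one evenly sampled pixel series covering a full year. This is not the thesis's implementation: the function names, the 3-point running-maximum envelope used to count peaks, the median-absolute-deviation outlier rule, and the unweighted form of the fitting-effect index are all assumptions made for illustration.

```python
import math
import statistics

def estimate_nof(y):
    """Assumed rule: count peaks of a 3-point running-maximum envelope above the series mean."""
    n = len(y)
    env = [max(y[max(i - 1, 0):i + 2]) for i in range(n)]
    floor = sum(y) / n
    peaks, rising = 0, False
    for i in range(1, n):
        if env[i] > env[i - 1]:
            rising = True
        elif env[i] < env[i - 1]:
            if rising and env[i - 1] >= floor:
                peaks += 1
            rising = False
    return max(peaks, 1)

def harmonic_fit(y, nof):
    """Least-squares fit of mean + nof harmonics; valid for evenly spaced samples, nof < len(y) / 2."""
    n = len(y)
    fitted = [sum(y) / n] * n
    for k in range(1, nof + 1):
        w = 2 * math.pi * k / n
        a = 2 / n * sum(y[t] * math.cos(w * t) for t in range(n))
        b = 2 / n * sum(y[t] * math.sin(w * t) for t in range(n))
        fitted = [fitted[t] + a * math.cos(w * t) + b * math.sin(w * t) for t in range(n)]
    return fitted

def find_outliers(residuals, k=3.0):
    """Assumed rule: flag residuals more than k scaled MADs away from the median residual."""
    med = statistics.median(residuals.values())
    scale = 1.4826 * statistics.median(abs(r - med) for r in residuals.values())
    if scale == 0:
        return set()
    return {i for i, r in residuals.items() if abs(r - med) > k * scale}

def reconstruct(y, max_iter=10):
    nof = estimate_nof(y)
    work, flagged = list(y), set()
    best, best_index = None, math.inf
    for _ in range(max_iter):
        fitted = harmonic_fit(work, nof)
        trusted = [i for i in range(len(y)) if i not in flagged]
        # Simplified fitting-effect index: total deviation from the original trusted values.
        index = sum(abs(fitted[i] - y[i]) for i in trusted)
        if index >= best_index:  # index stopped decreasing: keep the previous fit
            break
        best, best_index = fitted, index
        flagged |= find_outliers({i: y[i] - fitted[i] for i in trusted})
        for i in flagged:  # replace flagged points by the fit so they stop pulling the curve
            work[i] = fitted[i]
    return nof, sorted(flagged), best

# Synthetic two-season series (24 samples) with one cloud-like drop at index 14.
clean = [0.4 + 0.2 * math.cos(math.pi * (t - 3) / 6) for t in range(24)]
noisy = list(clean)
noisy[14] = 0.1
nof, outliers, fitted = reconstruct(noisy)
print(nof, outliers, round(fitted[14], 4))
```

Flagged points are replaced by the current fitted value before the next fit, so the series stays evenly spaced and the closed-form harmonic coefficients remain a least-squares solution. The iteration cap is a safeguard for this noise-free example, where the index keeps shrinking instead of reaching a clear minimum.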
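Comparison with the results of the HANTS software, in terms of data disturbance, outlier detection, automatic frequency selection, and number of iterations, shows that the improved method achieved some experimental success. In certain respects, such as the disturbance of the real data, the improved algorithm is clearly better than the HANTS software. Its general applicability, however, still needs further experimental verification. The analysis also found that the HANTS software results tend toward a negative bias (fitted values smaller than the original values), with the fitted curve tending to lie below the original curve, which is useful to know for future applications of the HANTS software.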
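One simple way to quantify such a bias, sketched here in Python, is the mean signed difference between fitted and original values together with the share of points where the fit lies below the original. The function name and the sample values are hypothetical and are not taken from the thesis.

```python
def bias_summary(original, fitted):
    """Return (mean signed difference, share of points where fitted < original)."""
    diffs = [f - o for o, f in zip(original, fitted)]
    mean_bias = sum(diffs) / len(diffs)
    share_below = sum(d < 0 for d in diffs) / len(diffs)
    return mean_bias, share_below

# Hypothetical values: a negative mean indicates that the fit tends to sit below the data.
mean_bias, share_below = bias_summary([0.2, 0.4, 0.6, 0.4], [0.1, 0.3, 0.6, 0.5])
print(mean_bias, share_below)
```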
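Finally, the filtered and reconstructed vegetation index time series is used to identify two types of bare land in Beijing and its surrounding area: perennial bare land and seasonal bare land. The analysis finds a large amount of perennial bare land, together with scattered seasonal bare land, to the northwest of Beijing. This area lies in the path of Beijing's northwesterly winds and on the route that dust must take to enter Beijing, so it is very likely a dust source area around Beijing. In southern Beijing, Mentougou, Fangshan, and Daxing contain a large amount of seasonal bare land. Cropland there is exposed in winter and spring, and the area lies in the path of northerly winds in southern Beijing, so it is also very likely one of Beijing's local dust source areas.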
【Abstract】
Vegetation index time series derived from remotely sensed data, such as Normalized Difference Vegetation Index (NDVI) products from NOAA/AVHRR, SPOT/VEGETATION, and TERRA or AQUA/MODIS, carry valuable information about land-surface properties at a range of scales. They have become increasingly important and are now one of the main data sources for many applications, including scientific research and engineering projects. NDVI data sets at various scales, for example, have been used to detect long-term land use/cover change, to model terrestrial ecosystems at global, continental, and regional scales, to extract seasonal metrics of vegetation phenology for classifying vegetation or land cover types, and even to estimate gross primary productivity (GPP) and net primary productivity (NPP).
However, because they are disturbed by cloud contamination, atmospheric variability, and bi-directional effects, vegetation index time series contain serious noise. The most frequently used data sets are maximum value composite (MVC) products, such as the MODIS 16-day NDVI/EVI data sets, but even these still include considerable residual noise. For this reason, many methods for reconstructing high-quality time-series data sets have been developed, including the Best Index Slope Extraction algorithm (BISE), the Asymmetric Gaussian Function Fitting approach (AGFF), the algorithm based on Savitzky-Golay filtering (S-GF), and the Harmonic Analysis Algorithm based on the Fourier transform (HAA). These methods, however, have several drawbacks that limit their further application. The most serious is that, for almost all of them, the key parameters must be obtained through many trials, which are easily influenced by the operator and bring new noise or uncertainty into the data sets. Among these algorithms, the Harmonic Analysis Algorithm takes account of the physical meaning of the original data while fitting the curve, so it relates well to the temporal patterns and spatial distribution characteristics shown in the time-series data sets. Based on this algorithm, this thesis therefore presents a new method for reducing residual noise and constructing high-quality time-series data sets for further application. The improved algorithm tries to introduce less subjective noise into the data sets in the following ways:
1. Four rules are proposed to guide the improvement of the algorithm: less disturbance of the original data sets, less use of auxiliary data, more consideration of the physical meaning of the data, and a good balance between effect and efficiency.
2. An outlier detection algorithm is used to find data points that should not take part in the next curve fitting.
3. Key parameters are generated automatically instead of being obtained by trial and experience. There are two important parameters, the Number of Frequencies (NOF) and the Fitting-effect Index (Fk). The former controls the fitting result and also reflects phenological regularities implied in the data sets. The latter decides when to terminate the iteration.
Comparison with the HANTS results, in terms of disturbance of the data sets, outlier detection, and automatic generation of key parameters, shows that the improved method performs well in some respects, such as less disturbance of the original data. New problems, such as the applicability of the algorithm, were also found and need further improvement.
Finally, the vegetation index time series were applied to detect the source regions of dust weather. Analysis of the detection results together with local meteorological data identified two possible dust and sand source regions for Beijing dust weather. One is the junction of Inner Mongolia, Hebei, and Shanxi, northwest of Beijing. The other is in southern Beijing, mainly in Mentougou, Fangshan, and Daxing districts. More attention should be paid to these areas to find a balance between economic and ecological benefits, which would help to reduce Beijing's dust weather.