【Author】 Shang Kun (尚坤)
【Supervisors】 Tong Qingxi (童庆禧); Zhang Xia (张霞)
【Degree year】 2014
【Thesis level】 Doctoral
【Key words】 hyperspectral remote sensing, detailed vegetation classification, vegetation feature band set, transductive support vector machine, active learning
【Chinese abstract (English translation)】
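Hyperspectral remote sensing imagery has long been an important data source for detailed vegetation classification. Because it combines image and spectrum, many classes that cannot be separated in multispectral data become separable, which gives it broad application prospects in crop variety identification, invasive species monitoring, and precision agriculture. However, hyperspectral data pose three main problems for detailed vegetation classification. First, as classes become increasingly fine-grained, classification based on spectral information alone no longer meets application requirements. Second, the hundreds to thousands of spectral bands bring a heavy computational load and greatly increase the number of training samples that supervised classification needs. Third, as sensor spatial resolution improves, salt-and-pepper noise in the classification results seriously limits their usefulness.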
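Semi-supervised classification algorithms developed in recent years are of great significance for small-sample classification of hyperspectral remote sensing data. They use a small number of labeled samples together with a large number of unlabeled samples to classify the whole data set. Most current semi-supervised algorithms for hyperspectral data, however, rely only on the statistical characteristics of ground-object spectra and implement semi-supervised learning purely from a machine learning perspective. This thesis therefore focuses on algorithms for constructing and optimizing a vegetation feature band set and on improved semi-supervised support vector machine classification for small samples. It proposes an improved transductive support vector machine and an active learning support vector machine, and it runs detailed vegetation classification experiments on ground-based imaging spectral data and airborne hyperspectral data, both of high spatial resolution, to verify the effectiveness of the proposed algorithms at ground and airborne scales. The main results and conclusions are as follows: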
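1) A vegetation classification strategy based on the construction and optimization of a vegetation feature band set is proposed: in addition to the original spectral features and spatial texture features, it adds 50 spectral index features that are sensitive to changes in vegetation chlorophyll, carotenoid, anthocyanin, and nitrogen content, which effectively improves the separability between vegetation classes;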
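2) A spectral-dimension optimization algorithm for the vegetation feature band set based on class-pair separability (CPS) is proposed: the algorithm focuses on the separability of different vegetation classes in different feature bands, i.e. "class-pair separability". During spectral-dimension optimization it retains, for each class pair, the original bands, texture features, and spectral index features with the largest Bhattacharyya distance, computes the Jeffries-Matusita distance of each class pair from the selected features, iterates until every class reaches a high classification accuracy, and then applies the optimum index factor method to further reduce inter-band correlation. Experimental results show that the algorithm effectively reduces data redundancy and improves classification efficiency;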
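3) A spatial-dimension optimization algorithm for the vegetation feature band set based on neighborhood pixels' spectral angle distance (NPSAD) is proposed: the algorithm is motivated by the fact that ground objects are usually spatially continuous, so adjacent pixels are likely to belong to the same class. When training samples are available, the algorithm sets its thresholds automatically from them; otherwise the user must set the thresholds from prior knowledge. In comparative experiments against other spatial-dimension optimization algorithms, it removed salt-and-pepper noise from the classification results while avoiding edge effects and preserving detail, and it improved classification accuracy.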
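4) A progressive transductive support vector machine based on dual discrimination by spectral angle distance and Euclidean distance (SAD/ED-PTSVM) is proposed: building on the traditional progressive transductive support vector machine, the algorithm judges the label of each unlabeled sample separately by spectral angle distance and by Euclidean distance, and it combines these judgments with the sample's distance to the separating boundary to label unlabeled samples "automatically". It makes effective use of the spectral-dimension information in hyperspectral data and lowers the probability of mislabeling unlabeled samples, which in turn cuts the time cost of later label resets. It also simplifies the parameter settings of the traditional transductive support vector machine, reducing the time needed for parameter optimization and improving classification efficiency.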
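5) A support vector machine with active learning guided by spectral unmixing (SUAL-SVM) is proposed: the algorithm combines spectral unmixing from hyperspectral research with active learning strategies from machine learning, making full use of each pixel's abundance information to support classification decisions. A weight coefficient balances the "most evenly mixed" pixels against the "most easily misclassified" pixels. When adding samples during active learning, the algorithm allocates more samples to class pairs that are harder to separate and fewer to those that are easier, so that the additions are better targeted. The algorithm effectively improves recognition accuracy for classes that are sparsely distributed in the image, achieves higher overall classification accuracy with fewer labeled samples, and reduces both the labeling workload and the classifier training time.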
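6) Detailed vegetation classification experiments at ground and airborne scales show that classification based on the construction and optimization of the vegetation feature band set effectively improves the classification accuracy of different crop and weed species at both scales and extracts more complete leaves and crop plots. SAD/ED-PTSVM and SUAL-SVM reach high classification accuracy with only 25 labeled samples per class, improving classification accuracy and efficiency while reducing the labor and resources needed for sample labeling. The results indicate that the proposed algorithms have broad application prospects in detailed vegetation classification at ground and airborne scales.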
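The two class-pair separability measures named in item 2, the Bhattacharyya distance and the Jeffries-Matusita (JM) distance, can be illustrated with a minimal sketch. It assumes each class is modeled as a multivariate Gaussian and uses the common convention JM = 2(1 − e^(−B)), which ranges from 0 to 2. The function names and the synthetic samples are hypothetical, and this is not the thesis's own implementation.

```python
import numpy as np


def bhattacharyya_distance(x1, x2):
    """Bhattacharyya distance between two classes under a Gaussian assumption.

    x1, x2: 2-D arrays of shape (n_samples, n_features), one per class.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    # atleast_2d keeps the single-feature case working (np.cov returns a scalar).
    c1 = np.atleast_2d(np.cov(x1, rowvar=False))
    c2 = np.atleast_2d(np.cov(x2, rowvar=False))
    c = (c1 + c2) / 2.0
    diff = m1 - m2
    # Mean-separation term: (1/8) * diff^T * C^-1 * diff
    mean_term = diff @ np.linalg.solve(c, diff) / 8.0
    # Covariance term: (1/2) * ln(|C| / sqrt(|C1| * |C2|)), via log-determinants.
    _, logdet_c = np.linalg.slogdet(c)
    _, logdet_c1 = np.linalg.slogdet(c1)
    _, logdet_c2 = np.linalg.slogdet(c2)
    cov_term = 0.5 * (logdet_c - 0.5 * (logdet_c1 + logdet_c2))
    return mean_term + cov_term


def jeffries_matusita(b):
    """JM distance from a Bhattacharyya distance b (convention with range [0, 2])."""
    return 2.0 * (1.0 - np.exp(-b))


# Synthetic, hypothetical samples: two well-separated classes in three features.
rng = np.random.default_rng(0)
class_a = rng.normal(0.0, 1.0, size=(200, 3))
class_b = rng.normal(10.0, 1.0, size=(200, 3))
jm_ab = jeffries_matusita(bhattacharyya_distance(class_a, class_b))
jm_aa = jeffries_matusita(bhattacharyya_distance(class_a, class_a))
```

A JM value near 2 indicates a class pair that the chosen features separate well, and a value near 0 indicates a pair they cannot separate, which is why a threshold on JM is a natural stopping criterion for iterative feature selection.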
【Abstract】
As one of the most important data sources for detailed vegetation classification (DVC), hyperspectral remote sensing (HRS) data make some classes distinguishable that cannot be separated in multispectral images, because they combine imagery with spectra. HRS data therefore have broad application prospects in crop species identification, invasive species monitoring, and precision agriculture. However, HRS data also pose problems for DVC. First, as classes become increasingly fine-grained, classification based on spectral information alone cannot meet application requirements. Second, the large number of spectral bands brings a heavy computational load and a corresponding growth in the training samples needed for supervised classification. Third, as the spatial resolution of hyperspectral sensors improves, salt-and-pepper noise in the classification results seriously limits their use.
Semi-supervised classification algorithms developed in recent years play a key role in HRS classification with limited samples. They perform classification using a small number of labeled samples together with a large number of unlabeled samples. However, most current semi-supervised classification algorithms for HRS data exploit only the statistical characteristics of ground-object spectra. Building on a review of current research in China and abroad, this study focuses on the construction and optimization of a vegetation feature band set and on semi-supervised classification based on the support vector machine (SVM) for limited labeled samples, to address the difficulty of acquiring labeled samples in hyperspectral data classification. Classification experiments on field imaging spectral data and airborne hyperspectral data, both with high spatial resolution, are performed to verify the effectiveness of the proposed methods for detailed vegetation classification at ground and airborne scales.
The main results and conclusions of this study are presented as follows.
(i) A DVC strategy based on vegetation feature band set (FBS) construction and optimization is proposed: in addition to the spectral and texture features of the original images, 50 spectral indices that are sensitive to chlorophyll, carotenoid, anthocyanin, and nitrogen content are added to the vegetation FBS. Results show that this strategy effectively improves the separability between vegetation classes.
(ii) A spectral-dimension optimization algorithm for the FBS based on class-pair separability (CPS) is proposed. This method focuses on the separability of different vegetation classes at different feature bands, i.e. CPS. For each class pair (CP), it preserves the original bands, texture features, and spectral index features that have the largest Bhattacharyya distance, and it iteratively calculates the Jeffries-Matusita distance to ensure that each CP remains well separated throughout the spectral-dimension optimization of the FBS. The Optimum Index Factor is then used to remove highly correlated feature bands. The proposed method reduces data redundancy and improves classification efficiency.
(iii) A spatial-dimension optimization algorithm for the FBS based on neighborhood pixels' spectral angle distance (NPSAD) is proposed, considering that adjacent pixels generally have a relatively high probability of belonging to the same class. The method sets thresholds automatically from training samples when they are available; otherwise users must set thresholds based on prior knowledge. Comparison of classification results shows that the proposed method removes salt-and-pepper noise from the classification results while avoiding the “edge effect” and preserving detail, which helps to increase classification accuracy.
(iv) A progressive transductive support vector machine method based on discrimination by both spectral angle distance and Euclidean distance (SAD/ED-PTSVM) is proposed. Building on the traditional PTSVM method, it judges the labels of unlabeled samples by calculating SAD and ED separately and labels them “automatically” according to their distances to the separating hyperplane. SAD/ED-PTSVM makes good use of the spectral information in HRS data, reducing the risk of incorrect labeling and thus the time cost of later label resets. The method also simplifies the parameter settings of traditional PTSVM, reducing the time spent on parameter optimization and improving classification efficiency.
(v) A support vector machine classification method based on active learning using spectral unmixing (SUAL-SVM) is proposed. It combines spectral unmixing from hyperspectral research with active learning from machine learning and makes full use of the abundance of each pixel to improve classification. Weighting factors adjust the ratio of the most homogeneously mixed pixels to the most easily misclassified pixels. During active learning, samples are added according to how difficult each class pair is to distinguish, which makes the newly added samples more targeted. The method effectively improves identification accuracy for sparsely distributed classes, achieves higher overall accuracy with fewer labeled samples, and reduces both the sample-labeling workload and the training time.
(vi) Detailed vegetation classification experiments at ground and airborne scales show that the classification method based on vegetation FBS construction and optimization increases the classification accuracy of different types of crops and weeds at both scales and extracts more complete leaves and crop plots. SAD/ED-PTSVM and SUAL-SVM effectively improve classification accuracy and efficiency, reaching relatively high accuracy with only 25 labeled samples per class. The experimental results indicate that the proposed methods have broad application prospects in DVC at both ground and airborne scales.
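The dual SAD/ED check described in item (iv) can be sketched as follows. The abstract does not give the exact decision rule, so this sketch makes a simplifying assumption: an unlabeled sample receives a label only when the nearest class mean by spectral angle and the nearest class mean by Euclidean distance agree, and is otherwise left unlabeled (-1). The SVM margin step is omitted, and all function names and data are hypothetical.

```python
import numpy as np


def spectral_angle(x, refs):
    """Spectral angle (radians) between each row of x and each row of refs.

    Assumes no spectrum is the all-zero vector.
    """
    x = np.asarray(x, dtype=float)
    refs = np.asarray(refs, dtype=float)
    dots = x @ refs.T
    norms = np.linalg.norm(x, axis=1)[:, None] * np.linalg.norm(refs, axis=1)[None, :]
    # Clip guards against rounding slightly outside [-1, 1].
    return np.arccos(np.clip(dots / norms, -1.0, 1.0))


def euclidean_distance(x, refs):
    """Euclidean distance between each row of x and each row of refs."""
    x = np.asarray(x, dtype=float)
    refs = np.asarray(refs, dtype=float)
    return np.linalg.norm(x[:, None, :] - refs[None, :, :], axis=2)


def dual_check_labels(x, class_means):
    """Label a sample only if SAD and ED pick the same nearest class mean.

    Returns the class index per sample, or -1 where the two measures disagree.
    This agreement rule is an assumption, not the thesis's stated rule.
    """
    sad_label = spectral_angle(x, class_means).argmin(axis=1)
    ed_label = euclidean_distance(x, class_means).argmin(axis=1)
    return np.where(sad_label == ed_label, sad_label, -1)


# Hypothetical two-band class means and three unlabeled spectra.
class_means = np.array([[1.0, 1.0], [10.0, 2.0]])
unlabeled = np.array([[1.1, 0.9], [9.0, 2.0], [8.0, 8.0]])
labels = dual_check_labels(unlabeled, class_means)
```

The third spectrum has the same shape as class 0 (zero spectral angle) but its magnitude puts it closer to class 1 in Euclidean terms, so the two measures disagree and the sample stays unlabeled. Withholding labels in such cases is one way a dual check could lower the mislabeling risk that item (iv) mentions.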