    Thesis title (Chinese) 以近紅外線光譜與機器視覺鑑別水稻品種

    Thesis title (English) Classifying Paddy Rice Cultivars by Near-Infrared Spectroscopy and Machine Vision

    Graduate student (Chinese) 劉昌群

    Graduate student (English) Chang-Chun Liu

    Student ID D89631002

    Degree Doctoral

    Year of publication 2007

    Number of pages 103

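    University National Taiwan University

    College College of Bioresources and Agriculture

    Department Graduate Institute of Bio-Industrial Mechatronics Engineering

    Advisor 蕭介宗

    Language English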

    Keywords (Chinese) 水稻 ; 近紅外線 ; 機器視覺 ; 品種鑑別

    Keywords (English) Paddy Rice ; Near Infrared Spectroscopy ; Machine Vision ; Classify

    Access rights Authorized for browsing and printing of the electronic full text; publicly available from 2007-02-02.

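    Absorbance values of the near-infrared reflectance spectrum from 1100 to 2500 nm, taken every 3 nm, were used as variables. Using all 351 variables, discriminant analysis and backpropagation neural network classification models gave average classification rates of 98.1% and 92.5%, respectively. With 69 variables selected by stepwise elimination, the discriminant analysis and backpropagation neural network models gave average classification rates of 98.5% and 85.5%, respectively. With 69 variables selected from the correlation matrix of the variables, the two models gave average classification rates of 72.0% and 72.2%, respectively. With 69 variables selected by their loading values on the first and second principal component axes, the two models gave average classification rates of 69.1% and 60.6%, respectively. Among the near-infrared wavelength selection methods, models built on variables chosen by stepwise elimination classified better than those built with the correlation matrix method or the loading value method; stepwise elimination reduced the number of variables while still reaching the accuracy obtained with all 351 variables. With the same variables, discriminant analysis classified better than the neural network, and the difference was significant.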
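    The external shape and color of the five rice cultivars were used as variables: for each kernel, the area, perimeter, shape factor (4π × area / perimeter²), area/perimeter, maximum width, longest-axis length, longest-axis length/maximum width, mean red (R), green (G), and blue (B) intensities, and 50 equally spaced widths along the longest axis, for a total of 60 features. Classification models were trained with a backpropagation neural network. Model 1 used all 60 variables and had an average classification rate of 92%. Model 2 used the 50 variables with the largest loading values on the first principal component axis and had an average classification rate of 90.0%. Model 3 used 35 less-correlated variables selected from the correlation matrix and had an average classification rate of 91.0%. Model 4 used the 20 variables with the greatest influence on network training, chosen in descending order of influence, and had an average classification rate of 91.8%. Among the variable selection methods, the model built on variables chosen by their influence on network training not only used fewer variables but was also more stable when classifying rice cultivars.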
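    Stepwise elimination selected 17 of the 60 external shape and color variables and 54 of the 351 near-infrared reflectance absorbance variables, giving 71 variables in total, which were used to build backpropagation neural network classification models. For Model 1, the variables were ranked by loading value on the first principal component axis from high to low, and 10 to 71 variables were used in steps of 5. The average validation rates were 76.2%, 82.4%, 92.9%, 92.7%, 95.5%, 95.8%, 96.5%, 94.1%, 95.5%, 95.5%, 95.5%, 95.5%, and 96.5%, and the average classification rates were 78.0%, 78.5%, 83.5%, 51.9%, 66.8%, 51.2%, 77.1%, 73.7%, 59.0%, 82.0%, 83.2%, 78.6%, and 79.3%, respectively. For Model 2, the variables were ranked by correlation from low to high, and 10 to 71 variables were used in steps of 5. The average validation rates were 83.3%, 83.5%, 83.5%, 93.9%, 93.4%, 96.2%, 96.0%, 96.7%, 97.2%, 95.8%, 97.2%, 96.7%, and 96.2%, and the average classification rates were 87.0%, 85.4%, 76.4%, 70.4%, 54.4%, 45.0%, 49.8%, 55.3%, 62.0%, 62.0%, 61.9%, 76.8%, and 79.0%, respectively. For Model 3, the variables were ranked by their influence on neural network training from high to low, and 10 to 71 variables were used in steps of 5. The average validation rates were 85.2%, 91.5%, 91.1%, 92.5%, 94.1%, 95.3%, 96.2%, 95.5%, 96.0%, 95.5%, 96.0%, 96.0%, and 96.2%, and the average classification rates were 65.1%, 66.3%, 64.9%, 62.0%, 74.7%, 70.8%, 66.6%, 76.4%, 78.8%, 78.5%, 72.5%, 71.8%, and 85.1%, respectively. When few variables were used, models that combined near-infrared reflectance absorbance with external shape and color variables classified worse than models using near-infrared absorbance alone or external shape and color alone. The presumed main reason is that the near-infrared absorbance was an average obtained by scanning a bulk sample of about 8 g (about 90 to 100 kernels) placed in a container, whereas the external shape and color variables were averages over 10 kernels imaged one at a time. Combining data obtained with different sample handling methods may cause mutual interference and lower the classification rate. Further research should examine whether scanning single kernels for near-infrared absorbance can raise the classification rate. In summary, selecting variables on the basis of stepwise elimination, correlation between variables, loading values on the principal component axes, and influence on neural network training helps pick out variables that carry cultivar characteristics and discriminating power, and reduces the number of variables a classification model needs. Classification models built with discriminant analysis and backpropagation neural networks can effectively classify rice cultivars.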

    Five paddy rice cultivars, Tainung Sen 20, Taichung Sen 10, Tainung 67, Taikeng 8, and Taikeng 9, were grown in central, eastern, and southern Taiwan and harvested in the summers of 1997, 1998, and 1999. From these samples, calibration models were established by discriminant analysis and a backpropagation neural network program, using selected near-infrared absorbance values and external morphological and color features. The calibrated models were then used to classify the same five cultivars harvested in the same areas in the summer of 2000.
    Reflectance spectrum absorbance values from 1100 nm to 2500 nm, collected in 3-nm steps, were used as variables. With all 351 variables, the discriminant analysis and backpropagation neural network models had average classification rates of 98.1% and 92.5%, respectively. With 69 variables selected by stepwise discrimination, the discriminant analysis and backpropagation neural network models had average classification rates of 98.5% and 85.5%, respectively. With 69 variables selected using the correlation matrix, the two models had average classification rates of 72.0% and 72.3%, respectively. With 69 variables selected by loading on the first and second principal components, the two models had average classification rates of 69.1% and 60.6%, respectively. In selecting near-infrared wavelengths for establishing models, the stepwise discrimination method gave higher classification rates than the correlation matrix and loading value methods. It reduced the 351 variables to 69 while still maintaining the classification rate. With the same variables, discriminant analysis classified better than the backpropagation neural network.
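    The abstract does not state exactly how the correlation matrix was used to pick variables. As one possible reading, the sketch below applies a greedy filter that keeps a variable only when its absolute Pearson correlation with every variable already kept is below a threshold. The function names, the threshold value, and the toy data are illustrative assumptions, not the procedure used in the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column is treated as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def select_low_correlation(columns, threshold):
    """Greedy filter (assumed approach): keep column j only if its |r|
    with every column kept so far is below `threshold`."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: each inner list is one variable measured on four samples.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with column 0
    [1.0, -1.0, 1.0, -1.0],  # weakly correlated with column 0
]
selected = select_low_correlation(columns, threshold=0.9)
print(selected)
```

    In this toy case the second column is dropped because it duplicates the first, which mirrors the stated goal of discarding redundant wavelengths.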
    The morphological and color features of the five paddy rice cultivars gave 60 variables: single-kernel area, perimeter, shape factor (4π × area / perimeter²), area/perimeter, maximum width, maximum length, maximum length/maximum width, average intensities of red, green, and blue, and 50 widths along the maximum length. Classification models were trained with a backpropagation neural network program. With all 60 features, the average classification rate of Model 1 was 92%. With the 50 most effective features by loading on the first principal component, the average classification rate of Model 2 was 90.0%. With 35 features selected from the correlation coefficient matrix, the average classification rate of Model 3 was 91.0%. Model 4 used the 20 features that contributed most to the training model: area, area/perimeter, 48th width, shape factor, maximum length/maximum width, average intensity of blue, maximum length, average intensity of green, 47th width, 50th width, average intensity of red, 1st width, 19th width, 5th width, 6th width, 29th width, perimeter, 46th width, 42nd width, and 4th width. Its average classification rate was 91.8%, and it is recommended for classifying the five paddy rice cultivars when setting trading prices because it requires fewer features and holds a stable classification rate.
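    The derived shape features named above follow directly from their definitions. The sketch below computes the shape factor, area/perimeter, and maximum length/maximum width for one kernel; the function name and dictionary keys are hypothetical, and the inputs are assumed to be measurements already extracted from a kernel image.

```python
import math


def shape_features(area, perimeter, max_length, max_width):
    """Derived morphological features for one kernel (illustrative helper)."""
    if perimeter <= 0 or max_width <= 0:
        raise ValueError("perimeter and max_width must be positive")
    return {
        # 4*pi*area / perimeter^2: equals 1 for a circle, smaller for
        # elongated or irregular outlines
        "shape_factor": 4.0 * math.pi * area / perimeter ** 2,
        "area_per_perimeter": area / perimeter,
        "length_per_width": max_length / max_width,
    }


# Sanity check with a circle of radius 10 (arbitrary units, e.g. pixels).
r = 10.0
circle = shape_features(
    area=math.pi * r ** 2,
    perimeter=2.0 * math.pi * r,
    max_length=2.0 * r,
    max_width=2.0 * r,
)
print(circle)
```

    A circle is a convenient check because its shape factor is exactly 1; a slender rice kernel would give a smaller value.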
    Using the stepwise discriminant method, 17 variables were selected from the 60 morphological and color features and 54 variables from the 351 near-infrared reflectance absorbance values. The resulting 71 variables were input into a backpropagation neural network to establish classification models. For Model 1, the 71 variables were ranked by loading value on the first principal component from high to low, and models were built using from 10 to 71 variables in steps of 5. The average validation rates were 76.2%, 82.4%, 92.9%, 92.7%, 95.5%, 95.8%, 96.5%, 94.1%, 95.5%, 95.5%, 95.5%, 95.5%, and 96.5%, respectively, and the average classification rates were 78.0%, 78.5%, 83.5%, 51.9%, 66.8%, 51.2%, 77.1%, 73.7%, 59.0%, 82.0%, 83.2%, 78.6%, and 79.3%, respectively. For Model 2, the 71 variables were ranked by correlation coefficient from low to high, and models were built using from 10 to 71 variables in steps of 5. The average validation rates were 83.3%, 83.5%, 83.5%, 93.9%, 93.4%, 96.2%, 96.0%, 96.7%, 97.2%, 95.8%, 97.2%, 96.7%, and 96.2%, respectively, and the average classification rates were 87.0%, 85.4%, 76.4%, 70.4%, 54.4%, 45.0%, 49.8%, 55.3%, 62.0%, 62.0%, 61.9%, 76.8%, and 79.0%, respectively. For Model 3, the variables were ranked by their contribution to the backpropagation neural network model from high to low, and models were built using from 10 to 71 variables in steps of 5. The average validation rates were 85.2%, 91.5%, 91.1%, 92.5%, 94.1%, 95.3%, 96.2%, 95.5%, 96.0%, 95.5%, 96.0%, 96.0%, and 96.2%, respectively, and the average classification rates were 65.1%, 66.3%, 64.9%, 62.0%, 74.7%, 70.8%, 66.6%, 76.4%, 78.8%, 78.5%, 72.5%, 71.8%, and 85.1%, respectively.
Image features were collected from single kernels and averaged over 10 kernels, whereas near-infrared absorbance was scanned from a bulk sample in a vessel (8 g, about 90-100 kernels). Combining data from these two different sample treatments to establish models may introduce interference, so that the classification rates fall below those obtained with image features alone or near-infrared absorbance alone. Further study should examine whether scanning the near-infrared absorbance of single kernels can improve the classification rates.

