Crop Residue Cover Percentage Estimation from RGB Images Using Transfer Learning and Ensemble Ordinal Regression
Parth C. Upadhyay1, T.A.P. Lagaunne2, John A. Lory2,*, Guilherme N. DeSouza1
Published in Journal of the ASABE 67(4): 943-953 (doi: 10.13031/ja.15655). Copyright 2024 American Society of Agricultural and Biological Engineers.
1 Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA.
2 Division of Plant Science and Technology, University of Missouri, Columbia, Missouri, USA.
* Correspondence: loryj@missouri.edu
Submitted for review on 1 May 2023 as manuscript number NRES 15655; approved for publication as a Research Article and as part of the Soil Erosion Research Symposium Collection by Associate Editor Dr. Kyle Mankin and Community Editor Dr. Kyle Mankin of the Natural Resources & Environmental Systems Community of ASABE on 10 February 2024.
Citation: Upadhyay, P. C., Lagaunne, T. A. P., Lory, J. A., & DeSouza, G. N. (2024). Crop residue cover percentage estimation from RGB images using transfer learning and ensemble ordinal regression. J. ASABE, 67(4), 943-953. https://doi.org/10.13031/ja.15655
Highlights
- A transfer learning strategy improved residue estimates from high-resolution RGB imagery.
- The best method used probabilistic estimates of expert classifiers to estimate residue cover.
- This research confirms the utility of RGB imagery to quantify residue cover in agricultural fields.
Abstract. Plant residue on the soil surface increases the sustainability of food and fiber production in agricultural systems. Automated assessments of residue cover based on imagery have the potential to reduce labor and human bias associated with in-field measurements. We evaluated the capacity of a transfer learning strategy to improve the determination of residue level from high-resolution RGB images. The imagery for the study was collected from 88 field locations in 40 row crop fields in five Missouri counties between mid-April and early July in 2018 and 2019. At each field location, 50 contiguous 0.3 m × 0.2 m region of interest (ROI) images (ground sampling distance of 0.014 cm pixel⁻¹) were extracted from imagery, resulting in a dataset of 4,400 ROI images; 3,000 were used for cross-validation and training (data collected in 2018) and 1,400 were used for testing (data collected in 2019). The percentage residue for each ROI image (ground truth) was determined by a bullseye grid method (n = 100). Features were extracted from ROI images using the VGGNet-16 CNN model, a pre-trained convolutional neural network model. We extracted 1,472 features per ROI using a global averaging and pooling strategy. The optimum feature set was identified using recursive feature elimination using a support vector machine (RFE-SVM). To estimate crop residue percentage using selected features, expert two-class SVMs were trained to separate adjacent levels of residue cover, where the rationale of the ensemble was to allow each of the two-class SVMs to find the hyperplanes that maximize the margin between the corresponding two consecutive classes. Based on the distance of the samples to these hyperplanes, probabilistic estimates of the data-point belonging to the class were computed. With the combined knowledge of probabilistic estimates from each expert classifier, the percentage crop residue cover of each ROI image was calculated. We tested our approach with 3-, 4-, 5-, and 8-class problems, achieving the best results with the 8-class problem with r2 = 0.93 at the ROI level, r2 = 0.97 at the field-location level, and minimal bias in residue estimates in low residue conditions. These results are superior to other reported estimates of percent residue derived from imagery. This research confirms the utility of high-resolution RGB imagery to quantify residue cover in agricultural systems.
Keywords. Convolutional neural network, Soil erosion, Support vector machine, Transfer learning.

Plant residue on the soil surface provides eco-service benefits, including reduced evaporation, protection of soil from water and wind erosion, and improved soil structure and infiltration through increases in soil organic matter (Bronick and Lal, 2005; Cherubin et al., 2018; Ranaivoson et al., 2017; Searle and Bitnere, 2017; Singh and Rengel, 2007). The importance of residue in agricultural systems is represented by its role as a key input in soil erosion models such as the Watershed Erosion Prediction Project (WEPP), Revised Universal Soil Loss Equation (RUSLE), and RUSLE2 (Dabney et al., 2011; Flanagan et al., 2007; Weltz et al., 2020). The Food Security Act of 1985 (Glaser, 1985) established a requirement to maintain “sustainable erosion rates on cropland, hay land, and pasture” defined as “highly erodible land” (HEL) with the objective of protecting the Nation’s long-term capability to produce food and fiber. Residue assessments are part of the documentation of farmer HEL compliance; when farmers fail to comply with requirements from the Food Security Act of 1985, they may lose access to assistance payments, conservation program benefits, and other federal subsidies.
The Natural Resource Conservation Service (NRCS) is the primary agency documenting farmer HEL compliance. They assess the percentage of plant residue cover as a critical compliance variable in the Conservation and Compliance assessments (USDA-NRCS, 2011). To determine percent residue cover, trained technical staff visit fields and make three to five determinations of residue cover in representative areas using the line-transect method (USDA-NRCS, 2011). The line-transect method requires NRCS personnel to read 100 points evenly distributed along a 15.2- or 30.4-m tape laid at a 45-degree angle to the direction of farming (USDA-NRCS, 2011). In practice, the line-transect method requires careful attention to correctly read the tape and is prone to reader bias to overestimate residue (Laamrani et al., 2017; Laflen et al., 1981; Lory et al., 2021; Richards et al., 1984).
An automated system based on RGB imagery can limit human judgment and bias in residue estimation while providing imagery to document the assessment (e.g., Upadhyay et al., 2022). Traditional machine-learning methods have been used to build a prediction model from RGB images based on known color, shape, and texture features extracted from the image (Kavoosi et al., 2020; Najafi et al., 2021; Upadhyay et al., 2022). Recent interest in RGB imagery is likely driven by easy access to low-cost RGB images from tools like an unmanned aerial vehicle (UAV) or a smart phone.
Automated systems using RGB images have applied a diverse set of strategies with mixed results (Bauer and Strauss, 2014; Kavoosi et al., 2020; Laamrani et al., 2018; Riegler-Nurscher et al., 2018). These strategies typically relied on limited datasets with sub-meter resolution and reported r2 values between 0.75 and 0.90. Hyperspectral and multispectral methods have used much lower resolution imagery (>30 m), were typically tested in lab and field conditions with more extensive datasets, and achieved an r2 between 0.77 and 0.96. While both approaches have shown promise, an RGB-based system has the benefit that practitioners can obtain relatively low-cost, high-resolution images during field visits using a hand-held camera, smart phone, or UAV (Laamrani et al., 2018; Upadhyay et al., 2022).
Transfer learning-based models use knowledge from a previously trained model to solve a new problem, often more quickly and with better results than with models built with known features (Kaya et al., 2019; Weiss et al., 2016; Zhuang et al., 2021). A typical transfer learning approach uses weights obtained from a pre-trained neural network model as extracted features (Kaya et al., 2019; Zhuang et al., 2021). Lagaunne et al. (2023) estimated the percent residue cover in RGB images using a transfer learning strategy based on features derived from a pre-trained visual geometry group net-16 convolutional neural network (VGGNet-16 CNN) model previously trained on the ImageNet dataset (Simonyan and Zisserman, 2015). The results were superior to results from the same dataset using known features reported by Upadhyay et al. (2022).
The support vector machine (SVM) has been successfully used for field-crop related classification problems such as crop and weed identification in corn fields and plant density estimation of wheat (Guerrero et al., 2012; Jin et al., 2017). Similarly, Upadhyay et al. (2022) showed that a three-class SVM classifier is effective for residue estimation from RGB imagery at different ground sampling distances (GSDs). In a classification pathway for data like percent residue cover, where labels are in rank order (ordinal data), most misclassification occurs with the data-points near class boundaries. Upadhyay et al. (2021) improved the classification of residue levels using an ensemble SVM classifier method. In the ensemble approach, for each classification boundary between consecutive classes, an expert binary classifier was developed that provided probabilistic estimates of a data-point classification. This information was then propagated into the final multi-class classifier.
The objective of this study was to integrate lessons learned from previous work estimating residue cover from high-resolution RGB images to improve the accuracy and precision of residue estimates and the efficiency and simplicity of implementing the method for common residue assessment problems. The first goal was to compare an SVM with an ensemble SVM strategy using features derived from transfer learning, as Lagaunne (2022) and Lagaunne et al. (2023) suggested. The second goal was to compare the ensemble SVM strategy with an ensemble ordinal regression strategy to simplify the estimate of percent residue at a field location from classification analysis. The ensemble ordinal regression strategy was an extension of the ensemble SVM strategy, where the probabilistic estimates of each expert SVM classifier were used to directly estimate percentage residue cover. Finally, for all three strategies, we optimized the number of classes used.
Materials and Methods
Data Collection
This research used the same image dataset with the associated ground truth as Upadhyay et al. (2022); details of field data collection and assigning ground truth to images are reported there. A summary of the key points of obtaining images and ground truth is provided here for context.
From late April through early July in 2018 and 2019, the project team obtained residue imagery of corn (Zea mays L.) and soybean (Glycine max [L.] Merr.) from fields in five central Missouri counties; imagery was retained from 60 field locations in 20 row-crop fields in 2018, and 28 field locations in 20 additional row crop fields in 2019 (Upadhyay et al., 2022). One to four field locations in each field were selected by NRCS personnel to represent the diversity of residue cover conditions in the row crop field. Our objective was to visit fields after planting but prior to corn or soybean reaching growth stage V3. At each field location, a 15.2-m tape was placed at 45 degrees to the planted crop row direction in accordance with the line-transect method for estimating residue cover (USDA-NRCS, 2011). The imagery was obtained at an elevation of 0.93 m above the surface of the soil using a tripod-mounted Canon EOS Rebel T6i Digital SLR camera (Canon USA, Melville, NY) with a 24 mm lens and 24.2 MP resolution, producing a 6,000- × 4,000-pixel image. The images had a calculated GSD of 0.014 cm pixel⁻¹. We obtained 50 or 51 images along each tape by centering the camera over the 0 point on the tape horizontal to the soil surface, taking an image, and then repeating the process every 30 cm along the length of the tape. Using Adobe Photoshop (Adobe Inc., San Jose, CA), 50 sequential 0.305 m × 0.20 m images were cropped from the part of the image contiguous to, but not including, the tape, starting at the 0 point on the tape (Upadhyay et al., 2022). The cropped images were obtained from the side of the tape with an oblique angle to the sun to eliminate tape shadow, and the cropped portion was from the image where the target area was most central to the image to minimize the parallax effect (Upadhyay et al., 2022). The resulting dataset had 4,400 images (88 field locations × 50 images) defined as the region of interest (ROI) image dataset (fig. 1). The resulting ROI images had a mean size of 2,319 pixels wide (range 1,820 to 2,690) × 1,551 pixels tall (range 1,270 to 1,820 pixels).
Figure 1. Example region of interest (ROI) images with corn, soybean, and winter small grain residue.

A bullseye grid method with n = 100 grid points (Lory et al., 2021) was used to estimate the percentage residue cover (ground truth) for each of the 4,400 cropped ROI images. We estimated the percentage residue cover only, ignoring live plant contributions (e.g., grain crop and weeds) to ground cover. The field-location percentage residue estimate was then calculated as the mean of the 50 cropped images along the tape. Figure 2 summarizes the distribution of ROI and field-location images for 2018 (designated as the training dataset) and 2019 (designated as the testing dataset). The dominant residue in 2018 was corn (28 locations), soybean (23 locations), or winter small grains (nine locations), and in 2019 it was corn (four locations), soybean (19 locations), or winter small grains (five locations).
Post Data Collection
The post-data collection project workflow is summarized in figure 3. Details of the key differences among three classification pathways tested are described in detail below.
Feature Extraction (Step 1)
For feature extraction, we used the transfer learning approach suggested by Lagaunne (2022) and Lagaunne et al. (2023). Features were extracted from the 4,400 ROI images using a VGGNet-16 CNN model pre-trained on the ImageNet dataset (Simonyan and Zisserman, 2015). We used the ImageNet-trained version because its 1,000 training classes include some class objects that resemble residue (e.g., hay) or are agriculturally related (e.g., plants). The VGGNet-16 CNN contains 13 convolution layers, five max pooling layers, and three fully connected layers (Simonyan and Zisserman, 2015). We derived features from the five max-pooling layers for this study because max-pooling is spatially invariant (Nagi et al., 2011).
The VGGNet-16 model required input images sized to 224 × 224 pixels. Consequently, the original 4,400 ROI images were cropped into multiples of 224 pixels in height and width in preparation for processing. This resulted in ROI images being represented by arrays of eight to 12 sub-images wide by five to eight sub-images tall, depending on the exact size of the ROI. The majority of the ROI images contained 60 to 70 sub-images (10 × 6 or 10 × 7), each sized at 224 × 224 pixels. Figure 4 represents an ROI divided into 70 sub-images, as an example.
In VGGNet-16 CNN, the first pooling layer has a dimension of 112 × 112 cells with 64 kernels. After extraction, sub-images from the first pooling layer were recombined into a matrix, and the average value for all the cells in the layer was determined. For example, for an image with 70 sub-images, the first pooling layer was combined into a matrix with a dimension of 1,120 (10 × 112) × 784 (7 × 112) with 64 kernels (fig. 4). For each kernel, we calculated the average value of the 878,080 cells in the 1,120 × 784 matrix. Consequently, the first max pooling layer of an image was represented by 64 values, where each value was the average of the cells derived from the sub-images obtained from the ROI image. This was repeated for the other four max-pooling layers with kernel numbers of 128, 256, 512, and 512, respectively (fig. 4). Consequently, each of the 4,400 images was then represented by a 1 × 1,472 feature matrix (64 + 128 + 256 + 512 + 512 = 1,472; see fig. 4).
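This extraction step can be sketched as follows, assuming TensorFlow/Keras and its bundled ImageNet-trained VGG16 weights; the function name roi_features and the batching of sub-images are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
import tensorflow as tf

# VGGNet-16 pre-trained on ImageNet, without the fully connected layers.
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
# The five max-pooling layers have 64, 128, 256, 512, and 512 kernels.
pool_names = ["block1_pool", "block2_pool", "block3_pool",
              "block4_pool", "block5_pool"]
extractor = tf.keras.Model(
    inputs=base.input,
    outputs=[base.get_layer(n).output for n in pool_names])

def roi_features(sub_images):
    """sub_images: (k, 224, 224, 3) array of the k sub-images cropped
    from one ROI image. Returns a 1,472-element feature vector."""
    x = tf.keras.applications.vgg16.preprocess_input(
        sub_images.astype("float32"))
    pooled = extractor.predict(x, verbose=0)
    # Averaging each kernel over all spatial cells of all sub-images is
    # equivalent to recombining the sub-images into one matrix and then
    # averaging, because every sub-image contributes equally many cells.
    return np.concatenate([p.mean(axis=(0, 1, 2)) for p in pooled])
```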
Figure 2. Distribution of dominant residue type based on three categories (SB = Soybean, C = Corn, WSG = Winter Small Grain) for data collected in 2018 (left) and 2019 (right). Reported are the number of region-of-interest (ROI) images (top) and field locations (bottom).

Figure 3. Post-data-collection project flowchart defining the three methods of estimating residue that were evaluated.

The dataset was then divided into the training dataset (2018 data; 3,000 images) and the testing dataset (2019 data; 1,400 images). Using 2018 data for training and 2019 data for testing ensured the training and testing datasets were independent, providing a robust test of the trained model's ability to perform on different fields under the different environmental conditions associated with a different year.
In preparation for feature selection analysis, using the 2018 training data, the minimum and maximum values were determined for each of the 1,472 features, and all feature values were scaled to a range of 0 to 1 based on the min-max range. The same min-max range was then used to scale the 2019 dataset. All subsequent analyses used scaled data.
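For example, this scaling can be performed with scikit-learn's MinMaxScaler (the array names are hypothetical):

```python
from sklearn.preprocessing import MinMaxScaler

# X_train_raw: hypothetical (3,000 x 1,472) array of 2018 features;
# X_test_raw: hypothetical (1,400 x 1,472) array of 2019 features.
scaler = MinMaxScaler()                      # learns per-feature min and max
X_train = scaler.fit_transform(X_train_raw)  # scaled to 0-1 on the 2018 ranges
X_test = scaler.transform(X_test_raw)        # 2019 data scaled with the same ranges
```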
Feature Selection (Step 2)
A recursive feature elimination using support vector machine (RFE-SVM; Guyon et al., 2002) method was used to determine an optimized set of features to be used for building a model, as this method minimizes potential over-fitting and showed superior results over other methods (Upadhyay et al., 2022). The RFE-SVM, as we applied it, required transforming our numeric percent residue cover into class data. Our goal was to evaluate four distinct classification scenarios: 3-, 4-, 5-, and 8-class problems. In preparation for classification analysis, using the percentage residue estimates attained by the bullseye method, each ROI image was assigned class values for the 3-, 4-, 5-, and 8-class datasets, where each n-class problem contained n evenly divided intervals. For example, for the 3-class dataset, ROI images with 0% to 33.3% residue were assigned as class 1, ROI images with >33.3% to 66.6% residue were assigned as class 2, and ROI images with >66.6% residue were assigned as class 3. The number of ROI images in each class for the four classification datasets is presented in table 1 for both the training and testing datasets.
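A minimal sketch of this class assignment, assuming the ground truth is a percent residue value from 0 to 100 (the function name is illustrative):

```python
import numpy as np

def assign_class(residue_pct, n_classes):
    """Map a percent residue value (0-100) to class 1..n_classes using
    n_classes evenly divided intervals; 0% falls in class 1."""
    width = 100.0 / n_classes
    return max(int(np.ceil(residue_pct / width)), 1)
```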
Figure 4. Flow chart showing how 1,472 features were extracted from five max pooling layers. Features were initially extracted from sub-images using the five max pooling layers of VGGNet-16 CNN, which were then averaged across the sub-images, resulting in 1,472 features (64 + 128 + 256 + 512 + 512 = 1,472). Example pictured is for an image divided into 70 sub-images.
Table 1. Number of training and testing data-samples per class for the evaluated classification problems.

             Number of Training Data-Samples         Number of Testing Data-Samples
Class No.    3-Class  4-Class  5-Class  8-Class      3-Class  4-Class  5-Class  8-Class
Class-1      1522     1258     1097     808          762      658      586      457
Class-2      662      633      580      474          320      248      229      212
Class-3      816      408      384      344          318      265      200      134
Class-4               701      316      281                   229      219      113
Class-5                        623      206                            166      128
Class-6                                 200                                     138
Class-7                                 223                                     139
Class-8                                 464                                     79

The RFE-SVM feature selection process was performed on the training dataset (3,000 ROI images with 1,472 features) for each of the four classification problems (3-, 4-, 5-, and 8-class). First, the 1,472 features were put in rank order of importance using the RFE-SVM feature selection method with an SVM classifier with the linear kernel. Ten-fold cross-validation was then used to determine the cross-validation score (reported as accuracy: the number of correctly classified ROI images divided by the total number of ROI images) using an SVM classifier with the radial basis function kernel, the penalty parameter C set to 1, and the kernel parameter γ set to (number of features)⁻¹. The accuracy score was calculated for the addition of each ranked feature, starting from the first-ranked feature and evaluating up to the 50th-ranked feature. The optimum feature subset was determined from the 10-fold cross-validation scores, stopping when mean accuracy increased by less than 0.005 for an added feature. Classification operations based on the SVM model used the ‘SVC’ package in the “sklearn” library (Pedregosa et al., 2011) and were implemented in Jupyter Notebook (ver. 6.2.0; Kluyver et al., 2016) using Python (ver. 3.11.3).
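A sketch of this selection loop with scikit-learn is shown below. Whether γ was computed from the full feature count or the current subset size is not specified; gamma="auto" (1 divided by the number of input features) is assumed here, and X_train and y_train are hypothetical NumPy arrays of scaled features and class labels:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

# Rank all 1,472 features with recursive feature elimination (RFE)
# driven by a linear-kernel SVM.
rfe = RFE(SVC(kernel="linear", C=1), n_features_to_select=1, step=1)
rfe.fit(X_train, y_train)            # y_train: 3-, 4-, 5-, or 8-class labels
order = np.argsort(rfe.ranking_)     # feature indices, best-ranked first

# Add ranked features one at a time (up to 50); stop when the mean
# 10-fold cross-validation accuracy improves by less than 0.005.
best_score, subset = float("-inf"), None
for i in range(1, 51):
    cols = order[:i]
    clf = SVC(kernel="rbf", C=1, gamma="auto")   # gamma = 1 / (number of features)
    score = cross_val_score(clf, X_train[:, cols], y_train, cv=10).mean()
    if score - best_score < 0.005:
        break
    best_score, subset = score, cols
```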
Model Building for ROI Image-Wise Testing (Step 3)
The training dataset with the optimum feature subset for each class problem was then used to develop the models for each classification pathway (fig. 3). The 3,000 training images (2018 dataset) were used for training the respective model-building strategies, and each model was then tested on the 1,400 testing images (2019 dataset). The specifics of the three model building pathways are described in the following subsections.
Simple SVM Classification
This pathway followed the strategy detailed in Upadhyay et al. (2022), which classified ROI images into three classes using an SVM classifier. In contrast to that study, we used the optimum transfer learning feature sets for model building and optimized the number of classes by comparing outcomes for the 3-, 4-, 5-, and 8-class approaches.
Ensemble SVM Classification
The classification with ensemble SVM pathway is based on previous work documenting that an ensemble SVM approach improved classification accuracy compared to the simple SVM classifier strategy in a residue problem (Upadhyay et al., 2021). SVM was originally conceived for binary classification tasks: it seeks a hyperplane in an N-dimensional space (N = number of features) that maximizes the separating margin between the data-points of the two classes (Boser et al., 1992; Cortes and Vapnik, 1995). SVM does not generally output probabilities, but probability calibration methods such as Platt scaling can convert the output to class probabilities (Platt, 1999). Accordingly, the SVM outputs were converted to probabilistic estimates of a data-point belonging to each class, and such estimates were computed for all two-class (binary) SVMs. To compute these probabilistic estimates, the distance dk(x) of each data-point x to the separating hyperplane corresponding to the relevant pair of classes k was computed. Based on these distances dk(x), probabilistic estimates pk(x) of a data-point belonging to one of the classes in the pair k were calculated using a sigmoid function (eq. 1; Platt, 1999):
$$p_k(x) = \frac{1}{1 + \exp\left(A_k d_k(x) + B_k\right)} \qquad (1)$$
where Ak and Bk are parameters determined for each k of the two-class SVM. These parameters are derived by minimizing the mean square error between the original label and the output of the sigmoid function applied to the training data.
Figure 5a outlines the development of the expert classifiers for ensemble classification. First, the data are transformed from an n-class ordinal problem into n-1 new binary classification problems using the selected features for the ROI images in the respective paired classes. Once each binary classifier is trained, the probabilistic estimate is calculated (fig. 5a). For ensemble classification, a final model for classifying ROI images was then developed with the optimum feature set plus the n-1 expert-derived probabilities for the training dataset using SVM (Upadhyay et al., 2021). This was done for the 3-, 4-, 5-, and 8-class problems.
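A minimal sketch of the ensemble training, assuming scikit-learn's Platt-scaled probabilities (probability=True) and RBF-kernel SVMs; the kernel settings and function name are illustrative assumptions, not the authors' exact configuration:

```python
import numpy as np
from sklearn.svm import SVC

def train_ensemble(X, y, n_classes):
    """Train n-1 expert binary SVMs on adjacent class pairs, then a final
    multi-class SVM on the features augmented with expert probabilities."""
    experts = []
    for k in range(1, n_classes):
        mask = np.isin(y, [k, k + 1])        # samples from classes k and k+1 only
        expert = SVC(kernel="rbf", C=1, probability=True)  # Platt scaling (eq. 1)
        expert.fit(X[mask], (y[mask] > k).astype(int))
        experts.append(expert)
    # P(class > k) for every training sample, from each expert classifier.
    probs = np.column_stack([e.predict_proba(X)[:, 1] for e in experts])
    final = SVC(kernel="rbf", C=1).fit(np.hstack([X, probs]), y)
    return experts, final
```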
(a) SVM ensemble training.
(b) SVM ensemble for prediction.
Figure 5. Training (a) and testing (b) algorithms developed for the proposed ensemble SVM for ordinal regression to estimate crop residue cover percentage.

Ensemble SVM for Ordinal Regression
The goal of ensemble SVM for ordinal regression is to develop a percent residue estimate for the ROI images using the probabilistic estimates calculated for each of the n-1 expert classifiers from the ensemble SVM training process (fig. 5). In contrast to the ensemble SVM classification strategy, after the binary classifiers are trained, the probability estimates for an unknown image (p1, p2, ..., pn-1) are used directly to estimate the percentage residue of the ROI image (fig. 5b). To predict the percentage value of an unseen instance, the probabilistic estimates from the n-1 models are first summed (eq. 2); the ROI image percentage residue is then calculated using equation 3.
$$S(x) = \sum_{k=1}^{n-1} p_k(x) \qquad (2)$$

$$\widehat{R}(x) = \frac{S(x)}{n-1} \times 100 \qquad (3)$$

where $\widehat{R}(x)$ is the estimated percent residue cover of ROI image x.
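Under the equation forms given above, the prediction step reduces to a few lines; this sketch assumes the experts list trained in the previous step, and the function name is illustrative:

```python
import numpy as np

def predict_residue_pct(experts, X_new, n_classes):
    """Ensemble SVM ordinal regression: sum the n-1 expert probabilities
    (eq. 2) and rescale the sum to a 0-100% residue estimate (eq. 3)."""
    probs = np.column_stack([e.predict_proba(X_new)[:, 1] for e in experts])
    s = probs.sum(axis=1)                 # eq. 2
    return 100.0 * s / (n_classes - 1)    # eq. 3
```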
Testing ROI Image Results
For the classification methods (simple SVM classification and ensemble SVM classification), the performance of the ROI image classification was reported as a classification accuracy score using the appropriate ground truth classification values for the testing dataset. For the ensemble ordinal regression, the ROI image percent residue estimate was evaluated using linear regression fit statistics to the ground truth percent residue estimate (r2 and mean absolute error (MAE)) using the testing dataset.
Location-Wise Assessment (Step 4)
The final step of the process was to estimate the location percent residue based on the results for the 50 ROI images from Step 3 (see fig. 3). To integrate ROI classification values into a field-location percent residue cover estimate, a Bayesian multinomial Gaussian model was developed (Upadhyay et al., 2022; Vasko et al., 2000). The model was developed using the 2018 training dataset, and it estimated location-wise residue values based on probability distributions derived from image-wise classification scores. These scores were determined by the final classification models developed in Step 3. The number of images in each class was converted to the percentage of images in each class to facilitate using the model in situations where the number of ROI images did not match the 50 ROI images per field location used during model development. The scaling factor (alpha) was set to one, the optimum factors (betas) were assumed to be normally distributed, the tolerance factors (gammas) were assumed to have a gamma distribution, and non-informative priors were used, consistent with Vasko et al. (2000). This Bayesian model was then used to obtain field location residue estimates for the 2019 testing dataset using 50 ROI image-wise classification estimates from Step 3 as the input (fig. 3). The model was developed using PROC MCMC (SAS ver. 9.4, SAS Institute Inc., Cary, NC).
For the ensemble ordinal regression approach, field-location residue estimates were the mean of the 50 ROI image-wise estimates of percent residue. Field-location model performance for all three methods was then evaluated and compared using regression fit statistics to the field-location ground truth data (r2 and MAE). Outliers were defined as field-location percent residue estimates differing from ground truth by more than 10 percentage points.
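For the ordinal regression pathway, this aggregation reduces to a simple mean, sketched below; roi_pct (one row of 50 ROI-wise estimates per field location) and ground_truth are hypothetical arrays:

```python
import numpy as np

location_pct = roi_pct.mean(axis=1)        # mean of the 50 ROI-wise estimates per location
mae = np.abs(location_pct - ground_truth).mean()
outliers = np.abs(location_pct - ground_truth) > 10   # off by >10 percentage points
```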
Results
Feature Selection for Each Classification Problem
Our first objective was to select the optimum subset of features from the 1,472 features generated using the pre-trained VGGNet-16 CNN for each of the 3-, 4-, 5-, and 8-class classification models. For the 3-, 4-, 5-, and 8-class problems, the optimum feature subsets consisted of 10, 9, 18, and 16 features, respectively, using the criterion of less than a 0.005-unit improvement in mean accuracy (fig. 6). More than 85% of the selected features in each classification set were from max pooling layers 4 and 5 of the VGGNet-16 CNN.
Figure 6. The mean classification accuracy across ten folds of cross-validation, computed for the first 25 chosen features in the classification pathways (RFE-SVM).

ROI Residue Estimates
Our second objective was to estimate residue for each ROI image: as classes using the classification models and as percentage levels using the ensemble ordinal regression models. Results for the two class-based classification methods are reported in table 2, and results for the ensemble ordinal regression method are shown in table 3.
Table 2. ROI image-wise classification model scores (percentage accuracy).

                       Cross-Validation (%)    Training (%)          Testing (%)
Classes  # Features    Simple    Ensemble      Simple    Ensemble    Simple    Ensemble
3        10            90.3      92.9          92.2      92.8        88.6      90.2
4        9             85.5      88.9          88.3      89.0        76.6      78.1
5        18            81.3      86.7          86.1      87.1        76.8      77.1
8        16            71.0      76.7          77.9      77.9        57.0      57.9
Table 3. Model fit statistics for ROI image-wise estimates of percent residue using the ensemble ordinal regression strategy compared to the ground truth.

Classes   Train r2   Test r2   Train MAE   Test MAE
3         0.94       0.91      10.0        10.4
4         0.96       0.89      7.9         9.2
5         0.97       0.93      6.4         6.6
8         0.98       0.93      4.9         6.1

The ensemble SVM classifier performed better than the simple SVM classifier in all classification problems (table 2). There is a clear reduction in the accuracy score as the number of ordinal classes increases. This decrease was linearly related to the number of classes, implying it was associated with the number of boundaries in the classification problem (data not shown) and implying no clear differences among the different class problems. In contrast, ensemble ordinal regression model fit statistics improved with the number of classes (table 3). The ROI testing data are plotted in figure 7 for the 8-class ensemble ordinal regression problem. We also tested conventional regression strategies, including the linear regression model, lasso model, support vector regression, and gradient boosting regressor; none outperformed ensemble ordinal regression using the 8-class problem when considering both r2 and MAE metrics (data not shown).
Figure 7. The relationship between predicted % residue for testing ROI images and the ground truth (GT) for the ensemble ordinal regression 8-class model.

Location-Wise Residue Estimates
Our final objective was to estimate the field location-wise percentage residue (table 4, fig. 8).
The number of classes had an impact on all three methods, with the most classes (8) performing worst for the two classification strategies and the fewest classes performing worst for the ensemble ordinal regression strategy (table 4). The 3- and 5-class ensemble SVM classification strategies had the highest r2 and the lowest MAE, with one (3-class) and no (5-class) outliers. The simple SVM + Bayesian model performed marginally worse than the ensemble SVM + Bayesian model, with lower r2, higher MAE, and more outliers. The outcome from 8-class ensemble ordinal regression showed a slight decrease in r2 and the presence of one outlier. However, the ensemble ordinal regression outcome had no evidence of bias in the estimation of percent residue cover (fig. 8; slope not different from one and intercept not different from zero, p > 0.5). The pure classification methods over-estimated residue at the intercept and had slopes significantly less than one (p < 0.01; fig. 8). Consequently, the superior strategy was the 8-class ensemble ordinal regression method. The pure regression models substantially underperformed the three SVM strategies (data not shown).
Table 4. Field location-wise scores for each method on the testing dataset.

Method                            Classes   r2     MAE    Field-Outliers
Simple SVM + Bayesian             3         0.97   3.91   2
                                  4         0.94   5.74   5
                                  5         0.97   4.32   1
                                  8         0.95   5.13   4
Ensemble SVM + Bayesian           3         0.98   3.38   1
                                  4         0.94   5.46   4
                                  5         0.98   3.41   0
                                  8         0.95   5.26   4
Ensemble SVM Ordinal Regression   3         0.97   7.84   9
                                  4         0.94   7.29   6
                                  5         0.97   4.32   1
                                  8         0.97   3.83   1
Figure 8. The relationship between predicted % residue for testing field locations and the ground truth readings for (a) simple SVM (5-class) + Bayesian, (b) ensemble SVM (5-class) + Bayesian, and (c) ensemble SVM (8-class) with ordinal regression.

Discussion
This study further documented the benefit of the ensemble SVM for estimating residue cover. Both the ROI image accuracy scores (table 2) and the field-location fit statistics (table 4) were better for the ensemble SVM model. Additionally, the resulting slope and intercept comparing the estimated results to the ground truth had less bias than the simple SVM results (fig. 8). Increasing the number of classes did not ensure enhanced model performance. The Bayesian methodology for converting ROI image class values into a field-location estimate of percent residue could have accounted for some inherent biases in the classification system. Yet the regression model statistics comparing the field-location ground truth to the model estimates for the Bayesian methods show clear bias, over-estimating residue percentage in low-residue conditions (fig. 8). Relative to the training dataset, the testing dataset over-represented soybean residue and under-represented corn residue (fig. 2). There was also a drop in ROI image classification scores from training to testing (table 2). A similar pattern was observed for a three-class model derived primarily from known texture features using the same dataset (Upadhyay et al., 2022). It is impossible to definitively determine whether the bias documented in the testing dataset derived from issues with the ROI classification or from challenges posed by the shift in residue types affecting the integration of ROI image classifications into a location estimate using the Bayesian model.
The success of the ensemble SVM ordinal regression method suggests this approach is the most robust. This approach provided similar fit statistics as the classification plus Bayesian model pathways (table 4), combined with limited evidence of bias when comparing the field-location estimates to the ground truth (fig. 8). Another benefit of this approach is that it simplifies the estimate of residue by eliminating the need for the Bayesian model. The Bayesian model is based on a distribution of residue associated with the sampling of residue along a transect in the field. Deploying the method differently, such as estimating residue at a point location in a field, may not be compatible with the assumptions of the Bayesian model. There are no such restrictions on the ensemble ordinal regression method. The ensemble ordinal regression method, using simple averaging results of the ROI image to obtain field-location estimates of percent residue, is simple to implement and simplifies the conceptualization of using this strategy to estimate residue for alternative sampling strategies.
In contrast to the pure SVM classification methods, the ensemble SVM ordinal regression method improved with an increasing number of classes, both at the ROI image scale (table 2) and at the field location scale (table 4). With fewer classes (e.g., a 3-class problem), the probability calculation (eq. 2) lacked sensitivity, overclassifying ROI images as 0%, 50%, or 100% residue cover. Increasing to eight classes resolved this issue. We also tested 10 classes, which showed some decline in performance compared to eight classes (data not shown). This could reflect insufficient data in some classes to effectively train the expert classifier.
Estimates of percent residue cover from this study compare favorably with previously published research estimating residue cover using high-resolution RGB images, which reported r2 values between 0.75 and 0.90 (Bauer and Strauss, 2014; Kavoosi et al., 2020; Laamrani et al., 2018; Riegler-Nurscher et al., 2018; Upadhyay et al., 2022). Hively et al. (2019), using WorldView-3 satellite imagery, mapped crop residue cover with 92% (+/- 10%) accuracy.
The dataset used in this research was more extensive than most previously reported work, based on 4,400 ROI images from 88 field locations spanning two years in mid-Missouri. Our previous work using the same dataset and known features reported a testing r2 = 0.90 with seven outliers for a process equivalent to the three-class simple SVM model; in this research, the same approach using transfer learning features achieved r2 = 0.97 with two outliers (table 4). This documents the apparent superiority of the selected features in the transfer learning strategy. The success of the transfer learning method suggests that the global average pooling strategy to reduce the number of features extracted from the CNN was successful at reducing the redundancy associated with the CNN feature set. The success of this study suggests an opportunity to improve outcomes by testing alternatives to global average pooling for reducing the number of features and by investigating alternative pre-trained models for relevant features.
The data from this experiment were derived from images taken along a 15-m transect placed at a 45-degree angle to the direction of cropping. In practice, when duplicating the tape-transect method with high-resolution RGB images from a camera similar to the one used in this study, we suggest obtaining 18 contiguous images, which will cover approximately 15 m. Each image should be divided into six 1,792- × 1,792-pixel ROI sub-images (a two-tall by three-wide array per image), with the top three and bottom three ROI sub-images assigned to different sets. This will result in two sets of 54 ROI sub-images (3 ROI sub-images × 18 images = 54 ROIs) for processing to estimate field-location residue. Each ROI image is then divided into an 8 × 8 array of 224- × 224-pixel images for processing by the CNN. The extracted features are processed in the ensemble SVM ordinal regression model to estimate the percent residue for the ROI. The ROI percent residue estimates are then averaged within each set of 54 ROI images to obtain two estimates of the percent residue for the field location. If a camera with different specifications is used, the camera height should be determined from the ground area needed to obtain a 0.014 cm pixel⁻¹ GSD. The number of contiguous images needed to cover approximately 15 m should then be calculated. Lastly, the best way to divide each image into ROI images that approximate our approach should be identified.
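A sketch of the suggested image division, assuming a 6,000 × 4,000-pixel image stored as an H × W × 3 array and crops anchored at the top-left corner (the exact placement of the crop region within the image is an assumption):

```python
def split_into_roi_sets(image):
    """Divide one 6,000 x 4,000 RGB image into six 1,792 x 1,792-pixel ROI
    sub-images (2 rows x 3 columns; 1,792 = 8 x 224). The top and bottom
    rows go to separate sets, giving two sets of 54 ROIs over 18 images."""
    s = 1792
    top = [image[0:s, j * s:(j + 1) * s] for j in range(3)]
    bottom = [image[s:2 * s, j * s:(j + 1) * s] for j in range(3)]
    return top, bottom
```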
There are some clear limits to the application of these results to other situations. The model was developed in agricultural fields in mid-Missouri dominated by silt loam soils and corn-soybean rotations. We used a tripod to collect the data to limit variability in GSD. Further testing is required to assess the sensitivity of these vaguely defined features obtained from the transfer learning process to variations in RGB images due to differences in GSD, or variations associated with RGB images obtained by different SLR cameras, smart phones, or aerial imagery with comparable resolutions. Others using the proposed method are encouraged to verify residue cover in a subset of their images to ensure the system is working correctly.
Conclusion
The combination of a transfer learning strategy for extracting features using the pre-trained VGGNet-16 CNN and a model based on ensemble SVM using eight classes and ordinal regression provided superior results for residue cover prediction. Features derived from transfer learning provided substantially better results than previous work using primarily known texture features, suggesting that the global average pooling strategy successfully reduced the number of features extracted from the CNN. The ensemble SVM improved performance by exploiting the ordering information in the classes. It also enabled an ordinal regression method for assigning percent residue estimates to the ROI images, facilitating simple averaging to obtain field-location estimates of percent residue cover. This strategy reduced the bias observed in other methods, which overestimated residue in low-residue situations, and supports a more intuitive application of the method to different image sampling strategies for residue cover.
More work is needed to test the algorithm on a broader range of soil and residue types. The current application uses a tripod to limit variability in GSD. Additional work is needed to determine the sensitivity of the method to small variations in GSD that are likely when using handheld and UAV-based cameras. Users of this method should validate the accuracy of residue estimates on a subsample of images to confirm it is working under their specific conditions.
Overall, this research confirms the utility of high-resolution RGB imagery to quantify residue cover. Expanding access to low-cost, high-resolution RGB images through technologies such as smart phones and UAVs suggests this will be an important strategy to document compliance with residue requirements in agricultural systems.
Acknowledgments
We thank the Missouri USDA NRCS for financial and logistical support, with particular appreciation to Ron Miller and Glenn Davis.
References
Bauer, T., & Strauss, P. (2014). A rule-based image analysis approach for calculating residues and vegetation cover under field conditions. CATENA, 113, 363-369. https://doi.org/10.1016/j.catena.2013.08.022
Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers. Proc. 5th Annual Workshop on Computational Learning Theory (pp. 144–152). Association for Computing Machinery. https://doi.org/10.1145/130385.130401
Bronick, C. J., & Lal, R. (2005). Soil structure and management: A review. Geoderma, 124(1), 3-22. https://doi.org/10.1016/j.geoderma.2004.03.005
Cherubin, M. R., Oliveira, D. M., Feigl, B. J., Pimentel, L. G., Lisboa, I. P., Gmach, M. R.,... Cerri, C. C. (2018). Crop residue harvest for bioenergy production and its implications on soil functioning and plant growth: A review. Scientia Agricola, 75(3), 255-272. https://doi.org/10.1590/1678-992x-2016-0459
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Mach. Learn., 20(3), 273-297. https://doi.org/10.1007/BF00994018
Dabney, S. M., Yoder, D. C., Vieira, D. A., & Bingner, R. L. (2011). Enhancing RUSLE to include runoff-driven phenomena. Hydrol. Process., 25(9), 1373-1390. https://doi.org/10.1002/hyp.7897
Flanagan, D. C., Gilley, J. E., & Franti, T. G. (2007). Water Erosion Prediction Project (WEPP): Development history, model capabilities, and future enhancements. Trans. ASABE, 50(5), 1603-1612. https://doi.org/10.13031/2013.23968
Glaser, L. K. (1985). Provisions of the Food Security Act of 1985. USDA-ERS. Retrieved from https://www.ers.usda.gov/webdocs/publications/41995/15133_aib498_1_.pdf?v=153.7
Guerrero, J. M., Pajares, G., Montalvo, M., Romeo, J., & Guijarro, M. (2012). Support Vector Machines for crop/weeds identification in maize fields. Expert Syst. Appl., 39(12), 11149-11155. https://doi.org/10.1016/j.eswa.2012.03.040
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Mach. Learn., 46(1), 389-422. https://doi.org/10.1023/A:1012487302797
Hively, W. D., Shermeyer, J., Lamb, B. T., Daughtry, C. T., Quemada, M., & Keppler, J. (2019). Mapping crop residue by combining landsat and worldview-3 satellite imagery. Remote Sens., 11(16), 1857. https://doi.org/10.3390/rs11161857
Jin, X., Liu, S., Baret, F., Hemerlé, M., & Comar, A. (2017). Estimates of plant density of wheat crops at emergence from very low altitude UAV imagery. Remote Sens. Environ., 198, 105-114. https://doi.org/10.1016/j.rse.2017.06.007
Kavoosi, Z., Raoufat, M. H., Dehghani, M., Jafari, A., Kazemeini, S. A., & Nazemossadat, M. J. (2020). Feasibility of satellite and drone images for monitoring soil residue cover. J. Saudi Soc. Agric. Sci., 19(1), 56-64. https://doi.org/10.1016/j.jssas.2018.06.001
Kaya, A., Keceli, A. S., Catal, C., Yalic, H. Y., Temucin, H., & Tekinerdogan, B. (2019). Analysis of transfer learning for deep neural network based plant classification models. Comput. Electron. Agric., 158, 20-29. https://doi.org/10.1016/j.compag.2019.01.041
Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B. E., Bussonnier, M., Frederic, J.,... Willing, C. (2016). Jupyter Notebooks - A publishing format for reproducible computational workflows. Proc. 20th Int. Conf. on Electronic Publishing (ElPub 2016). IOS Press. https://doi.org/10.3233/978-1-61499-649-1-87
Laamrani, A., Joosse, P., & Feisthauer, N. (2017). Determining the number of measurements required to estimate crop residue cover by different methods. J. Soil Water Conserv., 72(5), 471-479. https://doi.org/10.2489/jswc.72.5.471
Laamrani, A., Pardo Lara, R., Berg, A. A., Branson, D., & Joosse, P. (2018). Using a mobile device “app” and proximal remote sensing technologies to assess soil cover fractions on agricultural fields. Sensors, 18(3), 708. https://doi.org/10.3390/s18030708
Laflen, J. M., Amemiya, M., & Hintz, E. A. (1981). Measuring crop residue cover. J. Soil Water Conserv., 36(6), 341-343. Retrieved from https://www.jswconline.org/content/jswc/36/6/341.full.pdf
Lagaunne, T. A. (2022). Estimation of crop residue cover in high-resolution RGB images using feature from a pre-trained convolutional neural network. MS thesis. Columbia, MO: University of Missouri, Div. Plant Sci. https://doi.org/10.32469/10355/91517
Lagaunne, T. A., Upadhyay, P. C., Lory, J. A., DeSouza, G. N., & Reece, D. A. (2023). Estimation of crop residue cover in high-resolution RGB images using features from a pre-trained convolution neural network. Proc. Soil Erosion Research Under a Changing Climate. St. Joseph, MI: ASABE. https://doi.org/10.13031/soil.23092
Lory, J. A., Upadhyay, P., Lagaunne, T. A., Spinka, C., Miller, R., Davis, G., & DeSouza, G. N. (2021). Capability of high-resolution RGB imagery to accurately document residue in row-crop fields. J. Soil Water Conserv., 76(5), 403-413. https://doi.org/10.2489/jswc.2021.00193
Nagi, J., Ducatelle, F., Di Caro, G. A., Ciresan, D., Meier, U., Giusti, A.,... Gambardella, L. M. (2011). Max-pooling convolutional neural networks for vision-based hand gesture recognition. Proc. 2011 IEEE Int. Conf. on Signal and Image Processing Applications (ICSIPA) (pp. 342-347). IEEE. https://doi.org/10.1109/ICSIPA.2011.6144164
Najafi, P., Feizizadeh, B., & Navid, H. (2021). A comparative approach of fuzzy object based image analysis and machine learning techniques which are applied to crop residue cover mapping by using Sentinel-2 satellite and UAV imagery. Remote Sens., 13(5), 937. https://doi.org/10.3390/rs13050937
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O.,... Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res., 12, 2825-2830.
Platt, J. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Adv. Large Marg. Classif., 10(3), 61-74.
Ranaivoson, L., Naudin, K., Ripoche, A., Affholder, F., Rabeharisoa, L., & Corbeels, M. (2017). Agro-ecological functions of crop residues under conservation agriculture. A review. Agron. Sustain. Dev., 37(4). https://doi.org/10.1007/s13593-017-0432-z
Richards, B. K., Wafter, M. F., & Muck, R. E. (1984). Variation in line transect measurements of crop residue cover. J. Soil Water Conserv., 39(1), 60-61. Retrieved from https://www.jswconline.org/content/jswc/39/1/60.full.pdf
Riegler-Nurscher, P., Prankl, J., Bauer, T., Strauss, P., & Prankl, H. (2018). A machine learning approach for pixel wise classification of residue and vegetation cover under field conditions. Biosyst. Eng., 169, 188-198. https://doi.org/10.1016/j.biosystemseng.2018.02.011
Searle, S., & Bitnere, K. (2017). Working paper: Review of the impact of crop residue management on soil organic carbon in Europe. Int. Council Clean Transportation. Retrieved from https://theicct.org/publications/impact-of-crop-residue-mgmt-EU
Simonyan, K., & Zisserman, A. (2015). Very deep convolutional networks for large-scale image recognition. Proc. 3rd Int. Conf. on Learning Representations (ICLR 2015). https://doi.org/10.48550/arXiv.1409.1556
Singh, B., & Rengel, Z. (2007). The role of crop residues in improving soil fertility. In P. Marschner, & Z. Rengel (Eds.), Nutrient cycling in terrestrial ecosystems (pp. 183-214). Berlin, Heidelberg: Springer. https://doi.org/10.1007/978-3-540-68027-7_7
Upadhyay, P. C., Karanam, L., Lory, J. A., & DeSouza, G. N. (2021). Classifying cover crop residue from RGB Images: A simple SVM versus a SVM ensemble. Proc. 2021 IEEE Symp. Series on Computational Intelligence (SSCI) (pp. 1-7). IEEE. https://doi.org/10.1109/SSCI50451.2021.9660147
Upadhyay, P. C., Lory, J. A., DeSouza, G. N., Lagaunne, T. A., & Spinka, C. M. (2022). Classification of crop residue cover in high-resolution RGB images using machine learning. J. ASABE, 65(1), 75-86. https://doi.org/10.13031/ja.14572
USDA-NRCS. (2011). National Agronomy Manual. Washington, DC: USDA Natural Resources Conservation Service.
Vasko, K., Toivonen, H. T., & Korhola, A. (2000). A Bayesian multinomial Gaussian response model for organism-based environmental reconstruction. J. Paleolimnol., 24(3), 243-250. https://doi.org/10.1023/A:1008180500301
Weiss, K., Khoshgoftaar, T. M., & Wang, D. (2016). A survey of transfer learning. J. Big Data, 3(1), 9. https://doi.org/10.1186/s40537-016-0043-6
Weltz, M. A., Huang, C.-H., Newingham, B. A., Tatarko, J., Nouwakpo, S. K., & Tsegaye, T. (2020). A strategic plan for future USDA Agricultural Research Service erosion research and model development. J. Soil Water Conserv., 75(6), 137A-143A. https://doi.org/10.2489/jswc.2020.0805A
Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H.,... He, Q. (2021). A comprehensive survey on transfer learning. Proc. IEEE, 109(1), 43-76. https://doi.org/10.1109/JPROC.2020.3004555