Variety Trial Validation: A Framework to Incorporate on-Farm Data

Published by the American Society of Agricultural and Biological Engineers, St. Joseph, Michigan

Citation: 2021 ASABE Annual International Virtual Meeting 2101206. (doi:10.13031/aim.202101206)
Keywords: Validation, Trial, Variety, Farm, data, framework, environment, cultivar


Variety trial validation is a framework to incorporate on-farm data. This study presents a predictive model for official cultivar trial yields, an innovative solution to the absence of cultivar selection tools based on on-farm data. Historical data indicate that cultivar selection is widely accepted as the most important decision a farmer makes during the year. This decision establishes the maximum possible yield and quality, based on the genetic potential of the cultivar. Studies based on the National Variety Trial Data (NCVT) from 2013-2018 indicate that the environment contributes strongly to the variability in yield. This study therefore considered several environmental factors that influence yield, such as soil texture, pH, and weather data. The public databases NRCS Soil Surveys and NOAA National Climatic Data Center (RNOAA) served as the main sources of environmental data. To identify the association between environmental data and trial data, a k-means cluster analysis (an unsupervised clustering technique) was used to group trial data per state. The clustering results were then used to train and test a predictive model, which was evaluated using a confusion matrix generated by comparing the model's results against the unsupervised clustering results. For the state of Texas, given 2014 information only, the model had an accuracy of 95.4% when predicting the yield at a given trial site. The 2014 cultivar-trial-based model was then used to predict 2015 Texas outcomes, where it had an accuracy of 34.4%; this percentage represents the ratio of successes generated by the model. The model can be further improved by incorporating weather data.
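As an illustration of the evaluation step described above, the following minimal Python sketch builds a confusion matrix from two label lists and reports accuracy as the ratio of successes. The function names, the three yield groups, and the six example labels are hypothetical and are not taken from the study, which does not state the software or class definitions it used.

```python
from collections import Counter


def confusion_matrix(reference, predicted, labels):
    """Rows are reference (cluster) labels, columns are predicted labels."""
    counts = Counter(zip(reference, predicted))
    return [[counts[(r, p)] for p in labels] for r in labels]


def accuracy(matrix):
    """Share of cases on the diagonal, i.e. predictions matching the reference."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical yield-group labels for six trial sites (not data from the study).
cluster_labels = ["low", "low", "mid", "mid", "high", "high"]  # from clustering
model_labels = ["low", "mid", "mid", "mid", "high", "low"]     # from the model
groups = ["low", "mid", "high"]

matrix = confusion_matrix(cluster_labels, model_labels, groups)
score = accuracy(matrix)

for group, row in zip(groups, matrix):
    print(f"{group:>4}: {row}")
print(f"accuracy = {score:.3f}")
```

In this sketch the clustering labels play the role of the reference, so the accuracy measures agreement between the model and the clustering rather than agreement with an independent ground truth.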
Currently, cultivar trial data include neither planting dates nor harvest dates, so a method needs to be developed to predict these dates indirectly (e.g., from degree days). A new predictive model can then be developed that takes weather data into account and bases its results on sufficient environmental data.
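To make the degree-day idea concrete, the sketch below accumulates daily degree days using the common simple-average method, max(0, (Tmax + Tmin) / 2 - Tbase), and finds the first day on which a running total reaches a threshold. The base temperature, the threshold, and the four days of temperatures are illustrative assumptions; the abstract does not specify a degree-day formula or any crop-specific parameters.

```python
def daily_degree_days(t_max, t_min, t_base):
    """Simple-average degree days for one day, floored at zero."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def accumulated_degree_days(daily_temps, t_base):
    """Running degree-day totals for a sequence of (t_max, t_min) pairs."""
    total = 0.0
    running = []
    for t_max, t_min in daily_temps:
        total += daily_degree_days(t_max, t_min, t_base)
        running.append(total)
    return running


def first_day_reaching(running_totals, threshold):
    """Index of the first day whose running total meets the threshold, else None."""
    for day, total in enumerate(running_totals):
        if total >= threshold:
            return day
    return None


# Hypothetical daily (max, min) temperatures in degrees Fahrenheit.
temps = [(70, 50), (80, 60), (90, 70), (60, 40)]
T_BASE = 60.0  # assumed base temperature, not taken from the study

running = accumulated_degree_days(temps, T_BASE)
day = first_day_reaching(running, threshold=25.0)  # assumed threshold
print(running, day)
```

A threshold-crossing day of this kind is one way a planting or harvest date could be estimated from weather records alone, which is the indirect approach the abstract proposes.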
