Improving Yield Data Analysis Using Contextual Data

Published by the American Society of Agricultural and Biological Engineers, St. Joseph, Michigan www.asabe.org

Citation:  Applied Engineering in Agriculture. 39(4): 391-398. (doi: 10.13031/aea.14655) @2023
Authors:   Elizabeth M. Hawkins, Dennis R. Buckmaster
Keywords:   Combine yield monitor, Context, Data analysis, Integrity zones, Management zones, Metadata, Precision agriculture, Yield, Yield data.

Highlights

Context-driven yield data cleaning resulted in more accurate whole field yield estimates

Using a context-driven yield data cleaning method can improve yield estimates for zones within fields

Identifying error-prone areas in a field where data quality is likely to be low and removing those data in bulk can reduce data cleaning bias

Abstract. As agriculture becomes more data driven, decision-making has become the focus of the industry, and data quality will be increasingly important. Traditionally, yield data cleaning techniques have removed individual data points based on criteria primarily focused on the yield values themselves. However, when these methods are used, the underlying causes of the errors are often overlooked and, as a result, these techniques may fail to remove all of the inaccurate (error-prone) data and/or remove legitimate data. In this research, an alternative to data cleaning was developed. Data integrity zones (DIZ) within each field were identified by evaluating metadata, which included data collected by the combine that reported the operating conditions of the machinery (e.g., travel speed, crop mass flow), data about the field environment (e.g., soil type, topography, weather), and data of field operations (e.g., field logs, as-applied maps). Data in DIZ were isolated using buffers and the analysis of the reduced datasets was compared to the raw data. The amount of data removed depended on the amount of variability (e.g., soil characteristics, topography) in the field. Statistical comparisons of the data showed the mean yield estimates for soil type polygons increased by an average of 1.4 Mg/ha for corn when DIZ data were used compared to raw data. On average, the confidence around the mean remained similar even with a large amount (70%) of data removed. Notably, none of the mean estimates derived from raw datasets were contained in the confidence intervals produced from DIZ data. This metadata (context-driven) alternative to data cleaning effectively removed errors and artifacts from yield data which would only be identified when looking beyond the yield measurements themselves.
When similarly reduced datasets are used to analyze historical yield data, they should provide a clearer picture of true yield effects of treatments, management zones, soil types, etc.; this will improve decisions on input and resource allocation, support wiser adoption of precision agricultural technologies, and refine future data collection.
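
The comparison described in the abstract (isolate data away from error-prone areas with a buffer, then compare the mean and its confidence interval against the raw data) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything in it is an assumption made for illustration: the synthetic rectangular field, the choice of field edges as the only error-prone area, the 30 m buffer, the hypothetical helper names `make_points`, `keep_in_integrity_zone`, and `mean_ci`, and the normal-approximation 95% interval.

```python
import math
import statistics


def make_points(field_width=200, field_length=400, step=10, edge_m=30):
    """Build synthetic yield points (Mg/ha) on a grid.

    Assumption for illustration only: points within edge_m of a field edge
    read low (8 Mg/ha) and interior points read near 12 Mg/ha.
    """
    points = []
    for x in range(0, field_width + 1, step):
        for y in range(0, field_length + 1, step):
            near_edge = min(x, field_width - x, y, field_length - y) < edge_m
            base = 8.0 if near_edge else 12.0
            # Small deterministic variation so the standard deviation is nonzero.
            wiggle = 0.5 * math.sin(x * 0.1) * math.cos(y * 0.07)
            points.append({"x": x, "y": y, "yield_mg_ha": base + wiggle})
    return points


def keep_in_integrity_zone(points, field_width, field_length, buffer_m):
    """Keep points at least buffer_m from every field edge (hypothetical rule)."""
    return [
        p for p in points
        if buffer_m <= p["x"] <= field_width - buffer_m
        and buffer_m <= p["y"] <= field_length - buffer_m
    ]


def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    m = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, m - half_width, m + half_width


raw_points = make_points()
diz_points = keep_in_integrity_zone(raw_points, 200, 400, buffer_m=30)

raw_mean, raw_lo, raw_hi = mean_ci([p["yield_mg_ha"] for p in raw_points])
diz_mean, diz_lo, diz_hi = mean_ci([p["yield_mg_ha"] for p in diz_points])

removed_fraction = 1 - len(diz_points) / len(raw_points)
raw_mean_inside_diz_ci = diz_lo <= raw_mean <= diz_hi

print(f"removed fraction of points: {removed_fraction:.2f}")
print(f"raw mean: {raw_mean:.2f} Mg/ha")
print(f"DIZ mean: {diz_mean:.2f} Mg/ha, interval: ({diz_lo:.2f}, {diz_hi:.2f})")
print(f"raw mean inside DIZ interval: {raw_mean_inside_diz_ci}")
```

In practice the buffers would be drawn around whatever the metadata flags (for example soil boundaries or areas of changing travel speed) rather than only the field edges, and the numbers produced by this synthetic field say nothing about the study's results.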
