Key Dates
- Solutions Due: October 7, 2020
- Virtual Pitch: November 9, 2020
- Winners Announced: November 16, 2020
The challenge is closed.
Background
Cereal-based products rely on raw grain from crops grown by farmers. Growing location and environmental conditions are known to influence the physical and compositional properties of the grain, which can affect processing and final product quality. The goal of this exercise is to create a model that predicts the effect of growing location, soil type, fertilizer, and crop growth and development parameters on product assessment.
Data
The Crop and Grain dataset consists of field trials conducted over several years in 5 different locations identified by the Site ID column. Each of the trials had several replicates on which 28 assessment types were measured. Note that the year appended within the Site ID is the harvest year.
The other columns in the dataset are:
- Growth Stage: The crop growth stage on the date that the assessments were conducted; values are ordinal.
- Variety: B and M are the crop varieties grown in the trials.
- Assessment Type: The types of assessments (or KPIs - key performance indicators) measured on the crop.
- Assessment Date: The dates when the assessment types were evaluated.
- Assessment Score: The evaluation scores on each KPI.
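Because the harvest year is embedded in the Site ID, it can be useful to separate it into its own feature. The sketch below assumes the Site ID ends with a four-digit year (for example "SiteA_2017"); that format, the example IDs, and the function name `split_site_id` are assumptions for illustration, so the pattern should be checked against the actual Site ID values.

```python
import re

# Assumption: the Site ID ends with a four-digit harvest year, e.g. "SiteA_2017".
# The real format must be confirmed against the dataset.
SITE_ID_PATTERN = re.compile(r"^(?P<site>.*?)[\s_-]*(?P<year>(?:19|20)\d{2})$")


def split_site_id(site_id):
    """Return (site, harvest_year); harvest_year is None if no year is found."""
    cleaned = site_id.strip()
    match = SITE_ID_PATTERN.match(cleaned)
    if match is None:
        return cleaned, None
    return match.group("site"), int(match.group("year"))


print(split_site_id("SiteA_2017"))  # ('SiteA', 2017)
print(split_site_id("North"))       # ('North', None)
```

Keeping the site and the harvest year as separate columns makes it possible to join trials to the site metadata and to compare seasons at the same location.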
Site metadata are also provided and include:
- Latitude and Elevation of the trial sites
- Sowing and Harvest dates of the trial crops
- Relevant soil parameters and fertilizer quantities; Soil Parameter A has values at the nominal level (unordered categories).
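Since Growth Stage is ordinal and Soil Parameter A is nominal, the two are usually encoded differently before modeling. The following is a minimal sketch of one common approach, rank encoding for the ordinal column and one-hot encoding for the nominal one. The level names ("GS1", "loam", and so on) and the helper functions are hypothetical; the real levels must come from the dataset.

```python
# Hypothetical values: the real growth-stage codes and Soil Parameter A
# categories must be taken from the dataset itself.
GROWTH_STAGE_ORDER = ["GS1", "GS2", "GS3"]    # assumed ordering, earliest first
SOIL_A_CATEGORIES = ["clay", "loam", "sand"]  # assumed category labels


def encode_ordinal(value, ordered_levels):
    """Map an ordinal level to its rank (0 = lowest)."""
    return ordered_levels.index(value)


def encode_one_hot(value, categories):
    """Map a nominal value to a 0/1 indicator vector with no implied order."""
    return [1 if value == category else 0 for category in categories]


row = {"Growth Stage": "GS2", "Soil Parameter A": "loam"}
features = [encode_ordinal(row["Growth Stage"], GROWTH_STAGE_ORDER)]
features += encode_one_hot(row["Soil Parameter A"], SOIL_A_CATEGORIES)
print(features)  # [1, 0, 1, 0]
```

Rank encoding preserves the order of the growth stages, while one-hot encoding avoids suggesting an order among soil categories that does not exist.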
Weather data from weather stations installed at each of the trial sites are also included. There are 6 relevant weather parameters in the data. The dates when the data were gathered are included in the weather worksheet.
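Dated weather readings can be turned into model features by summarizing them over a window tied to each trial, for example from the sowing date to the assessment date. The sketch below illustrates that idea under stated assumptions: the record layout, the parameter name "temp_c", and the function `summarize_weather` are invented for illustration and do not reflect the actual worksheet columns.

```python
from datetime import date
from statistics import mean


def summarize_weather(records, start, end, parameter):
    """Average one weather parameter over the inclusive window [start, end].

    Returns None when no reading falls inside the window.
    """
    values = [r[parameter] for r in records if start <= r["date"] <= end]
    return mean(values) if values else None


# Hypothetical daily records; "temp_c" is an invented parameter name.
weather = [
    {"date": date(2017, 4, 1), "temp_c": 10.0},
    {"date": date(2017, 4, 2), "temp_c": 12.0},
    {"date": date(2017, 4, 3), "temp_c": 14.0},
    {"date": date(2017, 4, 4), "temp_c": 20.0},
]
sowing = date(2017, 4, 1)
assessment = date(2017, 4, 3)
print(summarize_weather(weather, sowing, assessment, "temp_c"))  # 12.0
```

Restricting the window to dates on or before the assessment date keeps weather recorded after the assessment from leaking into the features.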
The Challenge
A model predicting the assessment score would allow growers to assess product quality. Given the provided data sets, derive a model that would predict the assessment scores as accurately as possible using relevant features or predictor variables.
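Before building a richer model, it can help to establish a baseline and an evaluation scheme. The sketch below is one possible starting point, not the challenge's prescribed method: it predicts the mean score for each assessment type and measures root-mean-square error while holding out one site at a time. The row layout, key names, and data values are invented for illustration.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def fit_baseline(rows):
    """Learn the mean score per assessment type, plus a global fallback."""
    by_type = defaultdict(list)
    for row in rows:
        by_type[row["type"]].append(row["score"])
    type_means = {t: mean(scores) for t, scores in by_type.items()}
    global_mean = mean(row["score"] for row in rows)
    return type_means, global_mean


def predict(model, row):
    """Predict the type mean, or the global mean for an unseen type."""
    type_means, global_mean = model
    return type_means.get(row["type"], global_mean)


def rmse(model, rows):
    """Root-mean-square error of the model on the given rows."""
    return sqrt(mean((predict(model, r) - r["score"]) ** 2 for r in rows))


def leave_one_site_out(rows):
    """Hold out each site in turn so no site appears in both train and test."""
    scores = {}
    for site in sorted({r["site"] for r in rows}):
        train = [r for r in rows if r["site"] != site]
        test = [r for r in rows if r["site"] == site]
        scores[site] = rmse(fit_baseline(train), test)
    return scores


# Invented rows for illustration only.
data = [
    {"site": "S1", "type": "KPI_1", "score": 4.0},
    {"site": "S1", "type": "KPI_2", "score": 10.0},
    {"site": "S2", "type": "KPI_1", "score": 6.0},
    {"site": "S2", "type": "KPI_2", "score": 12.0},
    {"site": "S3", "type": "KPI_1", "score": 5.0},
    {"site": "S3", "type": "KPI_2", "score": 11.0},
]
print(leave_one_site_out(data))
```

Holding out whole sites, rather than random rows, gives a more honest estimate of how a model would perform at a location it has not seen. Any model that uses location, soil, fertilizer, or weather features should beat this baseline to justify its added complexity.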