Prediction of Wheat Yield in Uttar Pradesh using Multiple Linear Regression Approach

Author: Nitesh Kumar Yadav*, R. Vasanthi, R. Pangayar Selvi and Tmt. G. Vanitha

Journal Name: Biological Forum – An International Journal

Abstract

Accurate crop production estimates are essential for developing effective agricultural and food policy at the regional and global levels. The multiple regression approach has been widely used to predict crop yield. Thirty years (1991-2020) of weather and wheat yield data for thirteen districts of Uttar Pradesh, which are the main producers of wheat, were taken from the Directorate of Economics and Statistics and from NASA meteorological data and used to develop yield prediction equations. The MLR models were validated by predicting yield for three years (2018, 2019 and 2020). The results indicated that Muzaffarnagar district has a lower Mean Absolute Percentage Error (0.07376), Mean Absolute Error (0.25552) and Standard Error (0.3622) than the remaining districts. Using backward regression in SPSS, the variable minimum temperature (Tmin) was found to be significant and to influence wheat yield in Muzaffarnagar district. For Aligarh district, relative humidity was found to be significant for wheat yield prediction. On the basis of these results, MLR is found to be the best fit for Muzaffarnagar and Aligarh districts, since they have lower MAPE values than the other districts. The model can be used to some extent to forecast the yield in various districts of Uttar Pradesh. However, Uttar Pradesh is one of the main wheat-producing regions in both India and the rest of the world, and wheat yields in this area will be impacted by climate change due to variations in temperature and precipitation as well as decreasing water availability for irrigation, posing serious questions about the security of the food supply on a national and international level.

Keywords

Multiple linear regression, Mean Absolute Error, Mean Absolute Percentage Error, Standard Error, Weather Indices

Conclusion

Yield forecasts for the wheat crop have been made for thirteen districts in Uttar Pradesh. The MLR model is found to be the best fit for Muzaffarnagar district because it has lower MAPE (0.16), RMSE (0.55) and MAE (0.45) values. The backward method in SPSS further shows that minimum temperature and relative humidity influence wheat yield in Muzaffarnagar and Aligarh districts, respectively. The MLR models are also validated, because the predicted yield was close to the observed yield. As a result, Multiple Linear Regression is found to be the best-fitted model and could be used to predict wheat yield for all thirteen districts of Uttar Pradesh.

INTRODUCTION

Agriculture is the backbone of the Indian economy, and the expansion of agriculture and related industries continues to play a vital role in the economy's overall performance. For effective planning and policymaking in the nation's agriculture sector, two components, crop acreage estimation and crop yield forecasting, are essential (Lal et al., 1998). The estimation of agricultural output at the regional level serves as the foundation for planning crop production forecasts at the national level. Models based on weather variables can provide accurate predictions of crop output before harvest and early warning of pest and disease attacks, allowing for the timely implementation of relevant plant protection measures (Agrawal and Mehta, 2007). An important project called FASAL, carried out by the Ministry of Agriculture of the Government of India in conjunction with the Space Application Centre (SAC), the Institute of Economic Growth (IEG) and the India Meteorological Department (IMD), forecasts agricultural output using space, agrometeorology and land-based observations. As part of the FASAL project, IMD develops intra-seasonal operational yield forecasts at district and state levels for 13 important crops of India throughout the kharif and rabi seasons using a statistical model, in partnership with 46 Agromet Field Units (AMFU) situated in various regions of the nation (Ghosh et al., 2014).

Wheat is commonly grown for its seed, a cereal grain that is a staple food all across the world. The genus Triticum has numerous wheat species, and the most frequently grown is Triticum aestivum. Wheat is grown on more land than any other food crop. India's wheat production was 106.41 million tonnes in the 2021-22 crop season, which ends in June. Wheat and rice are the main crops produced in India, and these two crops play an important role in the economic growth of the country.
For the purpose of predicting the growth and yield of four cultivars of rice (Oryza sativa L.) under three transplanting dates (25 May, 10 June and 25 June), three agroclimatic models based on growing degree-days, helio-thermal units and photo-thermal units were developed at Srinagar in 2004 and 2005. These cultivars were "Jhelum", "K 39", "Shalimar rice 1" and "China 1007" (Singh et al., 2010).

Wheat production reached 761 million tonnes in 2020, making it the second most widely grown cereal after maize. With a yearly production volume of 131,696,392 tonnes, China is the world's largest wheat producer, and India is second with a yearly production of 93,500,000 tonnes. Wheat is grown in three agroclimatic zones: western Uttar Pradesh (3.29 million hectares), eastern Uttar Pradesh (5.24 million hectares) and central Uttar Pradesh (0.68 million hectares). The total area is 9.2 million hectares, with a total production of 24.5 million tonnes and a yield of 2.7 tonnes per hectare. Uttar Pradesh produces the most wheat among all the states in India. It is situated in the extremely fertile basin of the Ganga. The state produces 300,00 tonnes of wheat. Wheat is grown on 96 lakh hectares in the state, which is 37% of the whole in India.

METHODOLOGY

Data Collection: Data for 13 districts in Uttar Pradesh for 1991-2020 (30 years) were collected from the NASA research centre for solar and meteorological data and from the Directorate of Economics and Statistics. The variables considered in this study are yield, maximum temperature (Tmax), minimum temperature (Tmin), relative humidity, rainfall (mm) and Ph. These 13 districts were selected because of their higher production in Uttar Pradesh.

Statistical Method: The stepwise regression approach was used to pick the optimum regression equation from a large number of independent variables. The data were analysed using RStudio software, with a probability level of 0.05 for entering and 0.1 for removing variables.
To predict the yield of wheat for the subsequent years, a regression model was fitted using the variables entered through the individual stepwise regression analysis. The multiple linear regression analysis was carried out to examine the Standard Error (SE) of the estimated values resulting from the various weather parameters. The statistical measures Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are used to assess how precisely each fitted relationship reproduces the measured data. Yield forecast models have been created for all thirteen wheat-producing districts.

Model Performance Metrics: The effectiveness of the developed statistical models is examined using the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Root Mean Square Error (RMSE). The following formulas were used to calculate them:

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right| \]

\[ \mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right| \]

\[ \mathrm{RMSE} = \left[ \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 \right]^{1/2} \]

where \(Y_i\) is the actual value, \(\hat{Y}_i\) is the model output and \(n\) is the number of observations.

The generated models perform better when the RMSE approaches 0, and the model fits the data better when the MAE and MAPE values are lower.

MAE: Mean Absolute Error is a model evaluation metric used with regression models. The mean absolute error of a model with respect to a test set is the mean of the absolute values of the individual prediction errors over all instances in the test set. Each prediction error is the difference between the true value and the predicted value for the instance.

\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right| \]

MAPE: The mean absolute percentage error (MAPE), also called the mean absolute percentage deviation (MAPD), measures the accuracy of a forecast system.
It measures this accuracy as a percentage and can be calculated as

\[ \mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{Y_i - \hat{Y}_i}{Y_i} \right| \]

MAPE is the most common measure of forecast error, probably because the variable's units are scaled to percentage units, which makes it easier to understand. It works best if there are no extremes in the data (and no zeros). It is often used as a loss function in regression analysis and model evaluation.

RMSE: Root Mean Square Error (RMSE) is the standard deviation of the residuals (prediction errors). Residuals are a measure of how far data points are from the regression line; RMSE is a measure of how spread out these residuals are. In other words, it tells you how concentrated the data are around the line of best fit. Root mean square error is commonly used in climatology, forecasting and regression analysis to verify experimental results.

\[ \mathrm{RMSE} = \left[ \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 \right]^{1/2} \]

RESULTS AND DISCUSSION

Table 1 shows the yield variation explained by the model for the wheat crop, along with the standard error. Across all thirteen districts of Uttar Pradesh, the SE ranged between 0.3622 (Muzaffarnagar) and 0.6023 (Kanpur). The MAPE ranged between 0.07367 (Muzaffarnagar) and 0.15322 (Gorakhpur). The MAE ranged between 0.25552 (Muzaffarnagar) and 0.42938 (Gorakhpur). The MLR model is found to be the best fit for Muzaffarnagar district, since it has the lowest MAPE value (0.07367). Among the remaining 12 districts, MLR suits Aligarh district (0.09032), which has the next-lowest MAPE after Muzaffarnagar. This is comparable with the result of Singh et al. (2011). The validation of the prediction model for wheat, following Singh et al. (2011) and Timbadia et al. (2021), for different districts in the years 2018, 2019 and 2020 is shown in Table 2. Predicted yield was close to observed yield; hence the model can be utilised for yield forecasting and planning purposes.
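The district-level error figures above follow from the MAE, MAPE and RMSE formulas given under Model Performance Metrics. As a minimal sketch of how those three formulas translate into code, the Python below implements them directly. The function names and the observed/predicted yields are invented for illustration and are not taken from the study's tables. Note that the MAPE formula as written multiplies by 100, so this function returns a percentage.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: mean of |Y_i - Yhat_i|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error: (100 / n) * sum of |(Y_i - Yhat_i) / Y_i|.

    Undefined when any actual value is zero.
    """
    return 100.0 / len(actual) * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    )


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical observed and predicted yields (t/ha) for three validation years.
observed = [3.0, 4.0, 5.0]
forecast = [3.3, 3.6, 5.0]

print("MAE :", mae(observed, forecast))
print("MAPE:", mape(observed, forecast))
print("RMSE:", rmse(observed, forecast))
```

All three functions assume the two sequences have the same length and, for MAPE, that no observed yield is zero.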
The results showed that, for the main wheat-growing districts, the agro-meteorological yield model adequately predicted the yield variability caused by differences in minimum and maximum temperatures, rainfall and relative humidity. According to Singh et al. (2022) and Kumar et al. (2018), the per-hectare yield of wheat in India has decreased over the past few years as a result of temperatures progressively rising in January, February and March (this three-month period is the most crucial for the wheat crop). According to Lal et al. (1998) and Saxena et al. (2016), a rise of 0.5 °C in winter temperature in India is predicted to lower wheat production by 0.45 t ha⁻¹, because maximum and minimum temperatures are particularly sensitive weather parameters for the wheat crop. The maximum and minimum temperatures that prevailed during the cropping season have a significant impact on this region's wheat-growing belts. The results revealed that the agrometeorological yield model explained the yield variability due to variations in temperature, rainfall and relative humidity during the different stages (tillering, panicle initiation, booting and physiological maturity). Maximum and minimum temperatures were found to be common agrometeorological indices for most of the districts of this region. However, rainfall together with relative humidity also proved to be an important agrometeorological index for some of the districts of Uttar Pradesh (Naushad Khan et al., 2020).

Table 3 shows the result of MLR by the backward method in SPSS. Minimum temperature is found to be significant at the 5% level, and hence minimum temperature plays a significant role in the yield of wheat in Muzaffarnagar district. Table 4 shows the result of MLR by the backward method in SPSS. Relative humidity is found to be significant at the 5% level, and hence relative humidity plays a significant role in the yield of wheat in Aligarh district.
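The backward method reported for Tables 3 and 4 was run in SPSS. The sketch below only illustrates the general idea in plain Python and is not the authors' procedure. It fits an ordinary least squares model, then repeatedly drops the predictor with the smallest absolute t-statistic until every remaining predictor reaches a cut-off. The cut-off of 2.0 (a rough stand-in for a 5% two-sided test, used here instead of exact p-values), the function names and the toy data are all assumptions made for illustration.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by least squares.

    rows holds one list of predictor values per observation.
    Returns (coefficients, t_statistics), intercept first.
    """
    x = [[1.0] + list(r) for r in rows]
    n, k = len(x), len(x[0])
    xtx = [[sum(x[i][a] * x[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(x[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve_linear_system(xtx, xty)
    resid = [y[i] - sum(beta[j] * x[i][j] for j in range(k)) for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    t_stats = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        inv_jj = solve_linear_system(xtx, unit)[j]  # j-th diagonal of (X'X)^-1
        t_stats.append(beta[j] / math.sqrt(sigma2 * inv_jj))
    return beta, t_stats


def backward_eliminate(data, y, t_threshold=2.0):
    """Drop the weakest predictor until all remaining |t| >= t_threshold.

    data maps a predictor name to its list of values.
    Returns (kept_names, coefficients), intercept first.
    """
    kept = list(data)
    while kept:
        rows = [[data[name][i] for name in kept] for i in range(len(y))]
        beta, t_stats = fit_ols(rows, y)
        # Index 0 is the intercept, which is never removed.
        weakest = min(range(len(kept)), key=lambda j: abs(t_stats[j + 1]))
        if abs(t_stats[weakest + 1]) >= t_threshold:
            return kept, beta
        kept.pop(weakest)
    return kept, [sum(y) / len(y)]


# Toy data (invented): yield falls with minimum temperature, rainfall is irrelevant.
tmin = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
rain = [60.0, 40.0, 40.0, 60.0, 60.0, 40.0, 40.0, 60.0]
noise = [0.05, -0.05, 0.05, -0.05, -0.05, 0.05, -0.05, 0.05]
yield_t_ha = [5.0 - 0.2 * t + e for t, e in zip(tmin, noise)]

kept, beta = backward_eliminate({"tmin": tmin, "rain": rain}, yield_t_ha)
print("retained predictors:", kept)
print("coefficients (intercept first):", beta)
```

A statistical package would normally compare exact p-values against the entry and removal probabilities. The fixed t cut-off here keeps the example free of external dependencies.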

How to cite this article

Nitesh Kumar Yadav*, R. Vasanthi, R. Pangayar Selvi and Tmt. G. Vanitha (2022). Prediction of Wheat Yield in Uttar Pradesh using Multiple Linear Regression Approach. Biological Forum – An International Journal, 14(2a): 266-270.