Forecasting of Area, Yield and Production of Mustard in Odisha

Authors: Gourav Sahu and Abhiram Dash

Journal Name: Biological Forum – An International Journal

Abstract

Mustard (Brassica juncea) is one of the important oilseed crops in Odisha, where it is generally grown as a rabi season crop after rice. Varieties such as RLM 619, Pusa Bold, Varuna, Pusa Bahar and Kranti grow very well under Odisha conditions. Forecasting mustard production is of utmost importance to agricultural planners framing policies on the package of practices for the crop. The present research forecasts the area, yield and production of mustard in Odisha using the ARIMA model, the most widely used model for forecasting. Data on area, yield and production of mustard from 1970-71 to 2019-20 are used to fit the models found suitable from the ACF and PACF plots, which are obtained from the stationarized data. The best fit model is selected on the basis of the significance of the estimated coefficients, model diagnostic tests and model fit statistics. The selected model is cross validated by refitting it after leaving out the last 5 years, the last 4 years, and so on down to the last 1 year of data, and obtaining one-step-ahead forecasts for the years 2015-16 to 2019-20. After successful cross validation, the selected model is used to forecast the area, yield and production of mustard in Odisha for the future years 2020-21, 2021-22 and 2022-23. The best fit models for area, yield and production of mustard are ARIMA(1,0,0), ARIMA(0,1,1) and ARIMA(2,0,2) respectively. The area and production models are fitted with a constant, while the yield model is fitted without a constant because the constant term is insignificant in that case. The forecast area under mustard increases in the future years, which is responsible for the increase in forecast production even though the yield remains stagnant.

Keywords

ARIMA, cross validation, forecast, model diagnostics, model fit statistics

Conclusion

The ARIMA (1,0,0) with constant model, the ARIMA (0,1,1) without constant model and the ARIMA (2,0,2) with constant model are found to be the best fit models for area, yield and production of mustard in Odisha, respectively. These selected models are used to forecast the area, yield and production of mustard in Odisha. The forecast values show that the area under mustard is likely to increase in future years, whereas the yield will remain stagnant. Production of mustard in Odisha is forecast to increase in future years, with variation between the lower and upper limits of the forecast intervals. This increase in forecast production is due to the increase in area.

References

INTRODUCTION

Oilseeds are broadly divided into two groups. The primary group consists of the edible oilseeds, viz. groundnut, rapeseed (toria, mustard and sarson), soybean, sunflower, sesame, safflower and niger, and the secondary group consists of the non-edible oilseeds, viz. castor and linseed. In India, among the 9 oilseed crops, mustard is grown on about 24 per cent of the area under oilseeds and produces 27 per cent of total oilseed production, followed by groundnut with 20 per cent of the area and 27 per cent of the production (oilseeds.dac.gov.in/). In Odisha, the crop is cultivated on 10 per cent of the area under oilseeds and produces about 4 per cent of total oilseed production. Kalahandi, Kandhamal and Sundargarh now lead mustard production in the state, followed by Balasore, Angul, Keonjhar, Mayurbhanj and Sambalpur.

In mustard, the oil content varies from 37 to 49%. The seed and oil are used as condiments in the preparation of pickles, curries and vegetables, and in hair oils, medicines and the manufacture of greases. The oil cake is used as feed and manure. The leaves of young plants are used as green vegetables, and the green stems and leaves are a good source of green fodder for cattle. In the tanning industry, mustard oil is used for softening leather.

Forecasting is one of the main tools in agricultural decision-making for framing effective growth policies and successful economic plans. Because oilseeds make an important contribution to the livelihood of farmers and to the overall agrarian economy, accurate and timely forecasts of production and yield are needed for policy decisions on storage, pricing, marketing, import-export, etc.

Several researchers have studied the forecasting of area, production and yield of different crops. Chaudhari et al. (2013) examined three types of model using meteorological variables and mustard crop yield (productivity) in Gandhinagar district of Gujarat. Ravita et al. (2016) applied ARIMA modelling for mustard yield prediction in Haryana. Shah et al. (2017) conducted a study to forecast the production of substantial food crops in Khyber Pakhtunkhwa, Pakistan, applying the ARIMA forecasting strategy to secondary data; they found the outcome of the ARIMA model to be sufficient. Daka et al. (2019) identified the most suitable pre-harvest forecasting model for the mustard crop in Banaskantha district of Gujarat. Ajay and Urmil (2020) forecasted the yield of mustard in Haryana using the ARIMA model. Kumari et al. (2022) compared different statistical models for predicting the area, production and yield of citrus in Gujarat state and found the ARIMA model superior in explaining all three. This study focuses on forecasting the area, production and yield of mustard in Odisha using ARIMA models.

MATERIALS AND METHODS

The secondary data on area, yield and production of mustard are collected for the state of Odisha (kharif and rabi seasons combined) for the period 1970-71 to 2019-20 from Five Decades of Odisha Agriculture Statistics, published by the Directorate of Agriculture and Food Production, Odisha.

An autoregressive integrated moving average model is a statistical model used to predict future trends. An ARMA model that also includes the order of differencing (done to stationarize the data) is known as an autoregressive integrated moving average (ARIMA) model. A non-seasonal ARIMA model is classified as an "ARIMA (p,d,q)" model, where the parameters p, d and q are non-negative integers: p is the number of autoregressive terms, d is the number of non-seasonal differences necessary to stationarize the data, and q is the number of moving average terms.
Thus, the ARIMA (p, d, q) model can be represented by the following general forecasting equation:

$$Y_t = \mu + \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t$$

where $Y_t$ is the series after differencing d times, $\mu$ is a mean, $\phi_1, \phi_2, \ldots, \phi_p$ and $\theta_1, \theta_2, \ldots, \theta_q$ are the parameters of the model, p is the order of the autoregressive term, q is the order of the moving average term, and $\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}$ are noise error terms.

Model identification: The ARIMA model is fitted to stationary data, i.e. data having constant mean and variance. Stationarity can be tested using the Augmented Dickey-Fuller test. If the series is not stationary, it is converted into a stationary series by differencing at a suitable lag. Usually, the data become stationary after 1 or 2 differencings. After stationarizing the data, the autocorrelation function (ACF) and partial autocorrelation function (PACF) plots are used to identify tentative autoregressive (AR) and moving average (MA) orders, denoted by p and q respectively. Various tentative models based on the identified AR and MA orders are fitted and their parameters are estimated.

After the tentative models are fitted for a variable (area, yield or production), the estimated coefficients are tested for significance, and the normality and independence of the residuals are checked using the Shapiro-Wilk test statistic and the Box-Ljung test statistic respectively. The models in which all estimated coefficients are significant and the residuals satisfy normality and independence are then compared on the basis of model fit statistics: Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE) and the corrected Akaike Information Criterion (AICc). The model with the lowest values of these fit statistics is considered the best fit model for the variable.
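
The identification and residual-independence steps described above can be sketched in a few lines. The following is only an illustration under stated assumptions: the function names (`difference`, `sample_acf`, `ljung_box_q`) and the toy series are hypothetical and are not taken from the study, differencing is read here as ordinary d-th order differencing, and the Augmented Dickey-Fuller test, the PACF and the Shapiro-Wilk test are left out because they need a statistics library.

```python
import math


def difference(series, d=1):
    """Return the series after taking first differences d times."""
    out = list(series)
    for _ in range(d):
        out = [cur - prev for prev, cur in zip(out, out[1:])]
    return out


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Box-Ljung statistic Q = n(n+2) * sum_k r_k^2 / (n-k)."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical values for illustration, NOT the Odisha mustard data.
toy_series = [12.0, 13.5, 13.1, 15.2, 16.0, 15.4, 17.9, 18.3, 19.6, 19.1]
stationarized = difference(toy_series, d=1)
bound = 1.96 / math.sqrt(len(stationarized))  # approximate 95% limit for r_k
for lag, r in enumerate(sample_acf(stationarized, max_lag=3), start=1):
    flag = "significant" if abs(r) > bound else "not significant"
    print(f"lag {lag}: r = {r:.3f} ({flag})")
print(f"Box-Ljung Q (3 lags) = {ljung_box_q(stationarized, 3):.3f}")
```

When Q is computed on the residuals of a fitted ARIMA(p, d, q) model with h lags, it is compared with a chi-square distribution with h - p - q degrees of freedom; a small Q (large p-value) is consistent with independent residuals.
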
The model fit statistics MAPE, RMSE and AICc are defined mathematically as follows.

Mean absolute percentage error (MAPE):

$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$$

Root mean square error (RMSE) (Mishra et al.):

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} (\hat{y}_t - y_t)^2}{n}}$$

where $\hat{y}_t$ is the forecasted value, $y_t$ is the actual value and n is the number of observations.

Corrected Akaike information criterion (AICc):

$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k^2 + 2k}{n - k - 1}$$

where AIC is the Akaike information criterion, k denotes the number of parameters and n denotes the sample size, i.e. the number of observations. The model with the lowest RMSE, MAPE and AICc values is selected as the best fit ARIMA model among the tentative models and is used for forecasting.

RESULTS AND DISCUSSION

The data on area, yield and production of the mustard crop were tested for stationarity using the Augmented Dickey-Fuller test, and the results are presented in Table 1. The test results confirmed that the data were not stationary; they were made stationary by differencing at lag 2. In the next step, the orders of the AR and MA terms, p and q, were identified using the ACF and PACF plots shown in Fig. 1, 2 and 3. Different tentative models were identified using these orders.

The tentative models of area and their estimated coefficients, along with error measures, are shown in Table 2. The table shows that the ARIMA (1,0,0) with constant model has all its estimated coefficients significant. Table 3 shows the model diagnostic tests and model fit statistics for the fitted ARIMA models. The ARIMA (1,0,0) model satisfies both the normality and independence tests of residuals. Thus this model is selected as the best fit model for area under the mustard crop. Fig. 4 also shows that none of the autocorrelations and partial autocorrelations of the residuals are significant, which further confirms the selection of the best fit model.

The tentative ARIMA models of yield and their estimated coefficients are shown in Table 4.
The table shows that ARIMA (1, 1, 0) and ARIMA (0, 1, 1), with and without constant, have all their estimated coefficients significant. Table 5 shows the model diagnostic tests and model fit statistics for the ARIMA models fitted to the data on yield of mustard. The ARIMA (0, 1, 1) without constant model satisfies both the normality and independence tests of residuals, and its RMSE, MAPE and AICc are the lowest. Thus, the ARIMA (0, 1, 1) without constant model is selected as the best fit model for yield of the mustard crop. Fig. 5 also shows that none of the autocorrelations and partial autocorrelations of the residuals are significant, which further confirms the selection of the best fit model.

The tentative models of production and their estimated coefficients, along with error measures, are shown in Table 6. The table shows that ARIMA (2, 0, 0) without constant, and ARIMA (2, 0, 2) and ARIMA (1, 0, 1) with constant, have all their estimated coefficients significant. Table 7 shows the model fit statistics and model diagnostic tests for the ARIMA models fitted to production of mustard. RMSE and MAPE are lower for the ARIMA (1, 1, 0) without constant model, but the ARIMA (2, 0, 2) with constant model satisfies both the normality and independence tests of residuals and also has the lowest AICc value. Thus the ARIMA (2, 0, 2) with constant model is selected as the best fit model for production of the mustard crop. Fig. 6 also shows that none of the autocorrelations and partial autocorrelations of the residuals are significant, which further confirms the selection of the best fit model.

Table 8 presents the results of cross validation of the selected best fit ARIMA models by one-step-ahead forecasting. The APE (absolute percentage error) for area of mustard is found to range between 1 and 21, and the MAPE (mean APE) is found to be 7.1254 for area of the mustard crop.
Similarly, for yield the APE ranges between 0.2 and 2.7 and the MAPE is 1.2567, and for production the APE ranges between 3 and 38 and the MAPE is 13.464. These results show that the selected ARIMA models are successfully cross validated.

The selected ARIMA models presented in the previous tables were used to forecast the area, yield and production of the mustard crop in Odisha for the years 2020-21, 2021-22 and 2022-23. Figs. 7, 8 and 9 show the actual, fitted and forecast values of area, yield and production of mustard in Odisha.
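
The fit statistics from Materials and Methods and the leave-out cross validation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimation procedure used in the study: a least-squares AR(1) model without constant stands in for the selected ARIMA models so that the example needs no statistics library, and every function name and the toy series are hypothetical.

```python
import math


def ape(actual, forecast):
    """Absolute percentage error of a single forecast."""
    return 100.0 * abs((actual - forecast) / actual)


def mape(actuals, forecasts):
    """Mean absolute percentage error."""
    return sum(ape(a, f) for a, f in zip(actuals, forecasts)) / len(actuals)


def rmse(actuals, forecasts):
    """Root mean square error."""
    squared = sum((f - a) ** 2 for a, f in zip(actuals, forecasts))
    return math.sqrt(squared / len(actuals))


def aicc(aic, k, n):
    """Corrected AIC for k parameters and n observations."""
    return aic + (2 * k * k + 2 * k) / (n - k - 1)


def fit_ar1_no_constant(series):
    """Least-squares AR(1) coefficient (stand-in for a full ARIMA fit)."""
    num = sum(cur * prev for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def rolling_one_step_validation(series, holdout=5):
    """Refit after leaving out the last `holdout`, ..., 1 observations and
    forecast the first omitted observation each time.
    Returns a list of (actual, forecast, APE) tuples."""
    results = []
    for leave_out in range(holdout, 0, -1):
        split = len(series) - leave_out
        train, actual = series[:split], series[split]
        forecast = fit_ar1_no_constant(train) * train[-1]
        results.append((actual, forecast, ape(actual, forecast)))
    return results


# Hypothetical values for illustration, NOT the Odisha mustard data.
toy_series = [10.0, 10.8, 11.5, 11.1, 12.4, 12.9, 13.6, 13.2, 14.5, 15.1]
results = rolling_one_step_validation(toy_series, holdout=5)
for actual, forecast, err in results:
    print(f"actual = {actual:.2f}, forecast = {forecast:.2f}, APE = {err:.2f}")
print(f"MAPE = {sum(err for _, _, err in results) / len(results):.2f}")
```

With `holdout=5` on a series ending in 2019-20, the five forecasts correspond to the years 2015-16 to 2019-20, each produced by a model that has not seen the year being forecast.
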

How to cite this article

Gourav Sahu and Abhiram Dash (2022). Forecasting of Area, Yield and Production of Mustard in Odisha. Biological Forum – An International Journal, 14(3): 1157-1163.