Using ARIMA Model to Forecast the Area, Yield and Production of Arhar in Odisha

Author: Balaga Divya and Abhiram Dash

Journal Name: Biological Forum – An International Journal


Abstract

Arhar is one of the important pulse crops of Odisha. It has very high nutritive value and thus contributes towards the nutritional security of the state. Forecasting of arhar production is necessary to enable agricultural planners to formulate appropriate policies regarding the cultivation of the crop. The present research forecasts the area, yield and production of arhar in Odisha using the ARIMA model, a widely used model for forecasting. The data on area, yield and production of arhar for the period 1970-71 to 2019-20 are used to fit the models found suitable from the ACF and PACF plots, which are obtained from the stationarized data. The best fit model is selected on the basis of significance of estimated coefficients, model diagnostic tests and model fit statistics. The selected best fit model is cross validated by refitting it after leaving out the last 5 years, 4 years, and so on down to the last 1 year of data, and obtaining one-step-ahead forecasts for the years 2015-16 to 2019-20. After successful cross validation, the selected best fit model is used for forecasting the area, yield and production of arhar in Odisha for the future years 2020-21, 2021-22 and 2022-23. The ARIMA models found to be the best fit for area, yield and production of arhar are ARIMA(1,1,2), ARIMA(1,1,0) and ARIMA(1,1,0) respectively. All these selected models are fitted without constant, as the constant term is insignificant in all these cases. The forecast values show that area, yield and production of arhar in Odisha remain stagnant in future years, with variation in the lower and upper limits of the forecast intervals.

Keywords

ARIMA, cross validation, forecast, model diagnostics, model fit statistics

Conclusion

ARIMA (1,1,2) without constant, ARIMA (1,1,0) without constant and ARIMA (1,1,0) without constant are found to be the best fit models for area, yield and production of arhar in Odisha respectively. These selected models are used for forecasting the area, yield and production of arhar in Odisha. The forecast values show that area, yield and thus production of arhar in Odisha remain stagnant in future years, with variation in the lower and upper limits of the forecast intervals.
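For reference, the two model forms named above can be written out explicitly. This is only a sketch: it assumes the sign convention of the general forecasting equation in the Materials and Methods section (moving average terms entering with a plus sign), and the symbols $y_t$ for the observed series and $w_t$ for its first difference are introduced here for illustration and are not the authors' notation. No estimated coefficient values are implied.

```latex
\begin{align*}
w_t &= y_t - y_{t-1} \\[4pt]
\text{ARIMA}(1,1,0)\ \text{without constant:}\quad
  w_t &= \alpha_1 w_{t-1} + \varepsilon_t \\
\text{equivalently}\quad
  y_t &= y_{t-1} + \alpha_1\,(y_{t-1} - y_{t-2}) + \varepsilon_t \\[4pt]
\text{ARIMA}(1,1,2)\ \text{without constant:}\quad
  w_t &= \alpha_1 w_{t-1} + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \varepsilon_t \\[4pt]
\text{one-step-ahead forecast, ARIMA}(1,1,0):\quad
  \hat{y}_{t+1} &= y_t + \alpha_1\,(y_t - y_{t-1})
\end{align*}
```

Because the constant term is dropped, the differenced series has no drift, which is consistent with forecasts that stay close to the most recent level.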

References

INTRODUCTION

The production of pulses plays a pivotal role in the nutritional security as well as the agrarian economy of the state of Odisha. The major pulses grown in Odisha are green gram, arhar, horse gram, etc. In Odisha the area under arhar is 130 thousand hectares and its production is 140 thousand MT, contributing 7.02% and 10.76% of the overall area and production of pulses in the state respectively [Agricultural Statistics at a Glance, 2020]. The cultivation of arhar is concentrated in Cuttack, Puri, Kalahandi, Koraput, Dhenkanal, Balangir, Rayagada, Naupada and Sambalpur districts. Ganjam district stands first with respect to area and production of arhar in Odisha. As arhar occupies an important position among the pulse crops in Odisha, a timely and accurate forecast of its area and production is valuable for agricultural policy decisions and for the food and nutritional security of the people.

Various researchers have contributed to this area of research. Mishra et al. (2021) studied the trend in the production of total pulses in the major growing states of India using ARIMA. Vishwajit et al. (2018) studied the modelling and forecasting of arhar in the major arhar growing states of India using ARIMA and other models. Shah et al. (2017) forecast substantial food crop production in Khyber Pakhtunkhwa, Pakistan, by applying the ARIMA forecasting strategy to secondary data, and found the results of the ARIMA model to be adequate. Kumari et al. (2022) compared different statistical models for predicting the area, production and yield of citrus in Gujarat state and found that the ARIMA model was superior in explaining all three. The present study focuses on forecasting the area, yield and production of arhar crop in Odisha using ARIMA models.
MATERIALS AND METHODS

The secondary data on area, yield and production of arhar are collected for the state of Odisha (kharif and rabi seasons combined) for the period 1970-71 to 2019-20 from Five Decades of Odisha Agriculture Statistics published by the Directorate of Agriculture and Food Production, Odisha.

The Autoregressive Integrated Moving Average (ARIMA) model is a statistical model used to predict future values of a time series. An ARMA model that includes an order of differencing (applied to stationarize the data) is known as an ARIMA model. A non-seasonal ARIMA model is denoted "ARIMA (p,d,q)", where p, d and q are non-negative integers: p is the number of autoregressive terms, d is the number of nonseasonal differences necessary for stationarizing the data, and q is the number of moving average terms. Thus, the ARIMA (p,d,q) model can be represented by the following general forecasting equation, in which $Y_t$ denotes the series after d differences:

\[ Y_t = \mu + \sum_{i=1}^{p} \alpha_i Y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t \]

where $\mu$ is the constant (mean) term, $\alpha_1, \alpha_2, \ldots, \alpha_p$ and $\theta_1, \theta_2, \ldots, \theta_q$ are the parameters of the model, p is the order of the autoregressive term, q is the order of the moving average term, and $\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}$ are noise error terms.

Model identification: The ARIMA model is estimated only if the data under study are stationary, which can be tested using the Augmented Dickey-Fuller test. If the series is not stationary, it should be converted into a stationary series by differencing. Usually the data are stationarized after 1 or 2 differences. After stationarizing the data, the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots are used to identify tentative autoregressive (AR) and moving average (MA) orders. Various tentative models based on the identified AR and MA orders are fitted and their parameters are estimated.
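The identification step described above (difference the series, then inspect the sample ACF and PACF) can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' procedure: the function names `difference`, `acf` and `pacf` are hypothetical, the PACF is computed with the Durbin-Levinson recursion as one common choice, the example series is made up and is not the arhar data, and the Augmented Dickey-Fuller test itself is not reproduced here (in practice a statistics library routine such as `adfuller` in statsmodels would be used for it).

```python
from math import sqrt


def difference(series, d=1):
    """Apply first differencing d times (the 'd' of ARIMA(p,d,q))."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    phi_prev = []      # coefficients phi_{k-1,1} .. phi_{k-1,k-1}
    out = [1.0]        # lag 0 by convention
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


if __name__ == "__main__":
    # Hypothetical yearly values, for illustration only (NOT the arhar data).
    series = [120, 123, 121, 126, 130, 128, 133, 131, 136, 140,
              138, 141, 139, 144, 147, 145, 149, 152, 150, 154]
    w = difference(series, d=1)
    bound = 1.96 / sqrt(len(w))  # approximate 95% significance bound
    for lag, (a, p) in enumerate(zip(acf(w, 5), pacf(w, 5))):
        print(f"lag {lag}: ACF={a:+.3f}  PACF={p:+.3f}  (bound ±{bound:.3f})")
```

Lags at which the PACF exceeds the bound suggest candidate AR orders, and lags at which the ACF exceeds it suggest candidate MA orders; these only give tentative models, which still have to be fitted and checked.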
After fitting the tentative models, the normality and independence of the residuals of the fitted models are tested using the Shapiro-Wilk test and the Box-Pierce test respectively. Then the tentative models satisfying normality and independence of residuals are compared using model fit statistics such as mean absolute percentage error (MAPE), root mean square error (RMSE) and Akaike's information criterion corrected (AICc), which are defined as follows:

Root mean square error (RMSE):

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} (\hat{y}_t - y_t)^2}{n}} \]

Mean absolute percentage error (MAPE) (Mishra et al., 2020):

\[ \mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \]

where $\hat{y}_t$ = forecasted value, $y_t$ = actual value and n = number of observations over which the sum is taken.

Akaike's information criterion corrected (AICc):

\[ \mathrm{AICc} = \mathrm{AIC} + \frac{2k^2 + 2k}{n - k - 1} \]

where k denotes the number of parameters and n denotes the sample size. The model with the lowest RMSE, MAPE and AICc values is selected as the best fit ARIMA model among the selected tentative models and is taken for forecasting.

RESULTS AND DISCUSSION

The data on area, yield and production of arhar crop were tested for stationarity using the Augmented Dickey-Fuller test, and the results are presented in Table 1. The test results confirmed that the data were not stationary, and the series were made stationary by first-order differencing. The next step was to identify the orders of the AR and MA terms, p and q, using the ACF and PACF plots. Different tentative models were identified using the orders of the AR and MA terms. Figs. 1, 2 and 3 show the ACF and PACF plots of the first-order differences of area, yield and production of arhar crop in Odisha.

The tentative models of area and their estimated coefficients along with error measures are shown in Table 2. The table reveals that the ARIMA (1,1,2) without constant model has all its estimated coefficients significant. Table 3 shows the model diagnostic tests and model fit statistics for the fitted ARIMA models.
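The model fit statistics reported alongside the diagnostic tests follow the RMSE, MAPE and AICc definitions from the Materials and Methods section. A minimal sketch of those three formulas is given below; the function names are hypothetical, the AIC value is taken as an input because the source does not spell out how it is computed, and the numbers in the usage lines are invented for illustration rather than taken from any of the tables.

```python
from math import sqrt


def rmse(actual, fitted):
    """Root mean square error: sqrt(sum((yhat_t - y_t)^2) / n)."""
    n = len(actual)
    return sqrt(sum((f - a) ** 2 for a, f in zip(actual, fitted)) / n)


def mape(actual, fitted):
    """Mean absolute percentage error: (100 / n) * sum(|(y_t - yhat_t) / y_t|)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, fitted))


def aicc(aic, k, n):
    """Corrected AIC: AIC + (2k^2 + 2k) / (n - k - 1).

    k is the number of estimated parameters and n the sample size.
    """
    return aic + (2 * k * k + 2 * k) / (n - k - 1)


if __name__ == "__main__":
    # Invented values, for illustration only.
    actual = [100.0, 200.0, 150.0]
    fitted = [110.0, 190.0, 153.0]
    print("RMSE:", rmse(actual, fitted))
    print("MAPE:", mape(actual, fitted))
    print("AICc:", aicc(aic=100.0, k=2, n=49))
```

RMSE is in the units of the series while MAPE is scale-free, so the two can rank candidate models differently; AICc additionally penalises the number of parameters, which matters with a short series such as 50 yearly observations.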
The ARIMA (1,1,2) without constant model satisfies both the tests of normality and independence of residuals. Thus this model is selected as the best fit model for area under arhar crop.

The tentative ARIMA models of yield and their estimated coefficients along with error measures are shown in Table 4. The table reveals that the ARIMA (1,1,0) and ARIMA (0,1,1) without constant models have all their estimated coefficients significant. Table 5 shows the model diagnostic tests and model fit statistics for the fitted ARIMA models for yield of arhar. The ARIMA (1,1,0) without constant model satisfies both the tests of normality and independence of residuals. The RMSE, MAPE and AICc are also lower for the ARIMA (1,1,0) without constant model. Thus, this model is selected as the best fit model for yield of arhar crop. Fig. 5 also shows that none of the autocorrelations and partial autocorrelations of residuals are significant. This further confirms the selection of the respective best fit models.

The tentative models of production and their estimated coefficients along with error measures are shown in Table 6. The table reveals that the ARIMA (0,1,1) and ARIMA (1,1,0) without constant models have their estimated coefficients significant. Table 7 shows the model fit statistics and model diagnostic tests for the fitted ARIMA models for production of arhar. The ARIMA (1,1,0) without constant model satisfies both the tests of normality and independence of residuals. Thus this model is selected as the best fit model for production of arhar crop.

Cross Validation. Stepwise cross validation was carried out using the best fit ARIMA model of each variable of arhar crop of Odisha, and the results are presented in the following table. The APE (absolute percentage error) of area under arhar is found to range between 0 and 11, and the MAPE (mean APE) is found to be 3.8544 for area under arhar crop.
Similarly, for yield the APE ranges between 0.5 and 13 and the MAPE is 5.688, and for production the APE ranges between 1 and 17 and the MAPE is 5.0659. These results show that the selected ARIMA models are successfully cross validated.

The appropriate ARIMA models presented in the previous tables were used to forecast the area, yield and production of arhar crop in Odisha for the years 2020-21, 2021-22 and 2022-23.
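The stepwise cross validation described above (drop the last 5, 4, ..., 1 years, refit, and forecast one step ahead) can be sketched as follows for an ARIMA (1,1,0) without constant model. Several things here are assumptions rather than the authors' method: the coefficient is estimated by conditional least squares on the first differences as a simple stand-in (the source does not state the estimation method), the function names are hypothetical, and the example series is made up and is not the arhar data.

```python
def fit_ar1_on_differences(series):
    """Estimate alpha_1 of ARIMA(1,1,0) without constant.

    Stand-in estimator: least squares regression of w_t on w_{t-1}
    (no intercept), where w_t is the first difference of the series.
    """
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    w = [b - a for a, b in zip(series, series[1:])]
    num = sum(w[t] * w[t - 1] for t in range(1, len(w)))
    den = sum(v * v for v in w[:-1])
    return num / den


def one_step_forecast(series, alpha):
    """Forecast the next value: y_T + alpha * (y_T - y_{T-1})."""
    return series[-1] + alpha * (series[-1] - series[-2])


def rolling_cross_validation(series, holdout=5):
    """Refit leaving out the last h = holdout, ..., 1 points; forecast one step.

    Returns a list of (actual, forecast, APE) rows and the mean APE (MAPE).
    """
    rows = []
    for h in range(holdout, 0, -1):
        train = series[:len(series) - h]
        actual = series[len(series) - h]
        alpha = fit_ar1_on_differences(train)
        forecast = one_step_forecast(train, alpha)
        ape = 100.0 * abs(actual - forecast) / abs(actual)
        rows.append((actual, forecast, ape))
    mean_ape = sum(row[2] for row in rows) / len(rows)
    return rows, mean_ape


if __name__ == "__main__":
    # Hypothetical yearly values, for illustration only (NOT the arhar data).
    series = [120, 123, 121, 126, 130, 128, 133, 131, 136, 140,
              138, 141, 139, 144, 147, 145, 149, 152, 150, 154]
    rows, mean_ape = rolling_cross_validation(series, holdout=5)
    for actual, forecast, ape in rows:
        print(f"actual={actual:.1f}  forecast={forecast:.2f}  APE={ape:.2f}")
    print(f"MAPE={mean_ape:.4f}")
```

Each forecast uses only data available before the year being predicted, so the resulting APE values measure out-of-sample accuracy rather than in-sample fit.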

How to cite this article

Balaga Divya and Abhiram Dash (2022). Using ARIMA Model to Forecast the Area, Yield and Production of Arhar in Odisha. Biological Forum – An International Journal, 14(3): 1179-1185.