Methods for identification of probability distribution of random variables from data samples with R statistical computing language

Authors

  • Oleksandr O. Dykhovychnyi, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, Ukraine
  • Nataliia V. Kruglova, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, Ukraine
  • Olha I. Virstiuk, Igor Sikorsky Kyiv Polytechnic Institute, Kyiv, Ukraine

DOI:

https://doi.org/10.20535/mmtu-2018.1-091

Keywords:

R language, Chentsov field, Gaussian process, Method of moments, Method of quantiles, Maximum likelihood estimation, Minimum distance estimation, Statistical tests

Abstract

This article discusses methods for fitting probability distributions to simulated data using the R statistical computing language. Graphical methods were considered, including histograms, plots of empirical and theoretical density functions, and P-P and Q-Q plots. Estimators of the distribution parameters were investigated using the method of moments, the method of quantiles, maximum likelihood estimation, and minimum distance estimation. Hypotheses about the probability distribution were verified with the Kolmogorov–Smirnov test, and candidate distributions were compared using the AIC and BIC criteria. The data set used to illustrate these methods was a sample from the distribution of the maximum of the restriction of a Chentsov field to a particular curve. This distribution was simulated in R with an original algorithm.
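As a rough illustration of the workflow the abstract describes, the sketch below fits a distribution to a sample using only base R. It is not the authors' code: the gamma sample is a stand-in for the Chentsov-field data, the gamma model itself is an assumption, and all variable names (`x`, `mom`, `mle`, `negloglik`) are chosen here for illustration. It shows method-of-moments and maximum likelihood estimates, a Kolmogorov–Smirnov test, and AIC and BIC values computed from the maximised log-likelihood.

```r
# Illustrative sketch only: the gamma sample is a stand-in for the
# article's simulated data, and the gamma model is an assumption.
set.seed(1)
x <- rgamma(500, shape = 2, rate = 1.5)
n <- length(x)

# Method of moments for a gamma model:
# mean = shape / rate, variance = shape / rate^2.
m <- mean(x)
v <- var(x)
mom <- c(shape = m^2 / v, rate = m / v)

# Maximum likelihood: minimise the negative log-likelihood, starting from
# the moment estimates. Parameters are optimised on the log scale so that
# they stay positive.
negloglik <- function(logpar) {
  -sum(dgamma(x, shape = exp(logpar[1]), rate = exp(logpar[2]), log = TRUE))
}
fit <- optim(log(mom), negloglik)
mle <- exp(fit$par)
names(mle) <- c("shape", "rate")

# Kolmogorov-Smirnov test against the fitted gamma distribution.
# The p-value is only approximate because the parameters were
# estimated from the same sample.
ks <- ks.test(x, "pgamma", shape = mle[["shape"]], rate = mle[["rate"]])

# Information criteria from the maximised log-likelihood (2 parameters).
loglik <- -fit$value
aic <- 2 * 2 - 2 * loglik
bic <- 2 * log(n) - 2 * loglik

print(rbind(moments = mom, mle = mle))
print(ks$p.value)
print(c(AIC = aic, BIC = bic))
```

To compare several candidate families, the same steps can be repeated with a different density and distribution function (for example `dlnorm` and `plnorm`); the candidate with the smaller AIC or BIC is preferred. A Q-Q plot of the fit can be drawn with `qqplot(qgamma(ppoints(n), shape = mle[["shape"]], rate = mle[["rate"]]), x)`.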

References

Chentsov, N. N. (1956). Wiener random fields depending on several parameters [in Russian]. Doklady Akademii Nauk SSSR, 106(4), 607–609.

The Comprehensive R Archive Network. (n.d.). https://cran.cnr.berkeley.edu/

Cramér, H. (1946). Mathematical methods of statistics. Princeton, NJ: Princeton University Press.

Dykhovychnyi, O. O., & Kruglova, N. V. (2018). Simulation of a Gaussian process with correlation function of a special form. In Abstracts of International conference "Stochastic Equations, Limit Theorems and Statistics of Stochastic Processes" dedicated to the 100th anniversary of I. I. Gikhman, 2018, September 17–22, Kyiv, Ukraine (pp. 18–19). http://matan.kpi.ua/gikhman100conf/g100-abstracts.pdf

Kobzar, A. I. (2006). Applied mathematical statistics [in Russian]. Moscow: Fizmatlit.

Lapko, A. V., Chentsov, S. V., Krokhov, S. I., & Feldman, L. A. (1996). Self-learning systems for information processing and decision making [in Russian]. Novosibirsk: Nauka.

Park, C., & Paranjape, S. R. (1974). Probabilities of Wiener paths crossing differentiable curves. Pacific Journal of Mathematics, 53(2), 579–583. https://projecteuclid.org/euclid.pjm/1102911625

Syzrantsev, V. N., Nevelev, Y. P., & Golofast, S. L. (2006). Adaptive method for probability density function reconstruction [in Russian]. Proceedings of Higher Educational Institutions. Machine Building, 2006(12), 3–11.

Wolfowitz, J. (1957). The minimum distance method. The Annals of Mathematical Statistics, 28(1), 75–88. https://doi.org/10.1214/aoms/1177707038

Wolverton, C. T., & Wagner, T. J. (1969). Asymptotically optimal discriminant functions for pattern classification. IEEE Transactions on Information Theory, 15(2), 258–265. https://doi.org/10.1109/TIT.1969.1054295

Yeh, J. (1960). Wiener measure in a space of functions of two variables. Transactions of the American Mathematical Society, 95(3), 433–450. https://doi.org/10.1090/S0002-9947-1960-0125433-1

Section

Application of mathematics in related sciences