Data Augmentation Using Gaussian Mixture Model on CSV Files

Authors: Ashish Arora, Niloufar Shoeibi, Vishwani Sati, Alfonso González-Briones, Pablo Chamoso

DOI: 10.1007/978-3-030-53036-5_28

Keywords:

Abstract: One of the biggest challenges in training supervised models is the lack of labeled data for training the model, which leads to overfitting and underfitting problems. One solution to this problem is data augmentation. There have been many developments in data augmentation for image files, especially in medical image datasets, where changes such as random cropping, flipping, and rotating are applied to the original file in order to make a new sample file. Alternatively, deep learning methods such as Generative Adversarial Networks and Convolutional Neural Networks are used to generate similar samples. For numerical datasets, however, there have not been enough advances. In this paper, we propose using Gaussian Mixture Models (GMMs) to augment very small numerical datasets with more data. The results demonstrate that the Mean Absolute Error decreases, meaning that the regression model became more accurate.
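As a rough illustration of the idea in the abstract, the sketch below fits a Gaussian mixture to the rows of a small numerical CSV and samples synthetic rows from it. It is not the authors' implementation: the use of scikit-learn's `GaussianMixture`, the helper name `augment_with_gmm`, the toy `x,y` data, and the choices of two components and 100 new rows are all assumptions made for this example.

```python
import io

import numpy as np
from sklearn.mixture import GaussianMixture


def augment_with_gmm(data, n_new, n_components=2, seed=0):
    """Hypothetical helper: append n_new GMM-sampled rows to a numeric table."""
    # Fit the mixture on features and target jointly, so that sampled rows
    # keep the relationship between the columns.
    gmm = GaussianMixture(
        n_components=n_components,
        covariance_type="full",
        random_state=seed,
    )
    gmm.fit(data)
    # sample() returns the drawn rows and the component label of each row.
    synthetic, _ = gmm.sample(n_new)
    return np.vstack([data, synthetic])


# Toy stand-in for a very small numerical CSV file (assumed data, 30 rows).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=30)
y = 3.0 * x + rng.normal(0.0, 1.0, size=30)
csv_text = "x,y\n" + "\n".join(f"{a:.4f},{b:.4f}" for a, b in zip(x, y))

# Load the CSV, skipping the header row, then augment it.
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)
augmented = augment_with_gmm(data, n_new=100)
print(data.shape, augmented.shape)  # (30, 2) (130, 2)
```

In practice the number of mixture components would have to be chosen for the dataset at hand (for example by comparing `gmm.bic(data)` across candidates), and whether the synthetic rows actually lower a regressor's Mean Absolute Error should be checked on held-out real data only.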
