Meta-Learning for Automated Hyperparameter Optimization of Variational Autoencoders
Keywords:
Meta-learning, automated hyperparameter optimization, variational autoencoders, machine learning
Abstract
Variational Autoencoders (VAEs) are powerful deep generative models widely used for representation learning, data generation, and anomaly detection. However, their performance is highly sensitive to hyperparameter choices, such as the dimensionality of the latent space, the weighting of the Kullback-Leibler (KL) divergence term, and network architecture specifics. Manual tuning of these hyperparameters is time-consuming and often suboptimal, while automated methods like Bayesian Optimization or Random Search can be computationally expensive, especially for complex VAE architectures and large datasets. This article proposes a novel meta-learning approach to automate the hyperparameter optimization (HPO) process for VAEs. By learning from the HPO experiences across a diverse collection of previous tasks (datasets), the meta-learner can predict promising hyperparameter configurations for new, unseen tasks, significantly accelerating the optimization process. We demonstrate the effectiveness of this approach by building a meta-dataset of VAE performance across various data characteristics and training a meta-model to recommend optimal hyperparameters. Our results show that the meta-learning framework can efficiently identify near-optimal VAE hyperparameters, leading to substantial computational savings while maintaining competitive model performance, thereby advancing the field of automated machine learning for generative models.
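To make the proposed recommendation step concrete, below is a minimal, hedged sketch (in Python, using scikit-learn) of one way a meta-learner could map dataset meta-features to promising VAE hyperparameters such as the latent dimensionality and the weight beta on the KL term, in the sense of the beta-VAE objective. This is not the paper's implementation: the choice of meta-features, the hyperparameter ranges, and the k-nearest-neighbour meta-model are illustrative assumptions only.

# Hedged sketch, not the paper's pipeline: a k-nearest-neighbour meta-learner
# that recommends VAE hyperparameters (latent dimensionality and the KL weight
# beta) for an unseen dataset, given a meta-dataset of prior HPO outcomes.
# All names, meta-features, and value ranges below are illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Meta-dataset from previous HPO runs: one row per prior task (dataset).
# Illustrative meta-features: e.g. n_instances, n_features, mean correlation.
meta_features = rng.normal(size=(50, 3))

# Best configuration found by a full HPO run on each prior task:
# column 0 = latent_dim, column 1 = beta (weight on the KL-divergence term).
best_configs = np.column_stack([
    rng.integers(2, 65, size=50).astype(float),  # latent_dim in [2, 64]
    rng.uniform(0.1, 4.0, size=50),              # beta in [0.1, 4.0]
])

# Meta-model: predict a promising configuration from meta-features alone,
# so a new task can skip, or warm-start, the expensive search.
meta_model = KNeighborsRegressor(n_neighbors=5)
meta_model.fit(meta_features, best_configs)

new_task = rng.normal(size=(1, 3))  # meta-features of the unseen dataset
latent_dim, beta = meta_model.predict(new_task)[0]
print(f"recommended latent_dim = {latent_dim:.0f}, beta = {beta:.2f}")

In practice the meta-dataset would be populated by running a conventional HPO method (e.g., Random Search or Bayesian Optimization) once per training task, and the recommended configuration could either be used directly or serve to warm-start a much shorter search on the new task, which is the source of the computational savings the abstract claims.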