Machine Learning 

VIMIMA27  |  Computer Engineering MSc  |  Semester: 1  |  Credit: 5

Objectives, learning outcomes and obtained knowledge

The course deals with the computer implementation of one of the fundamental abilities of intelligent systems: learning. It introduces the types of machine learning, summarizes its theoretical foundations, and analyses the most important learning architectures in detail. The subject examines machine learning within a unified probabilistic framework, touching upon mathematical, philosophical, and programming aspects. Beyond presenting theoretical foundations, the course aims to develop practical problem-solving skills through a unified approach and complex application examples. The methods learned in the course serve as a foundation for solving research and development tasks.

Lecturers

Péter Antal
associate professor, course coordinator

Levente Alekszejenkó
research assistant

László Fetter
PhD student

Dániel Hadházi
research assistant

Gábor Hullám
deputy head of department, associate professor

Synopsis

  1. Introduction. Artificial intelligence, machine learning, and data science. Machine learning as inference. Learning from observations and interventions. Trustworthy and explainable machine learning.
  2. Basic concepts of Bayesian probability theory. Probability, prior, likelihood, posterior. Maximum likelihood (ML), maximum a posteriori (MAP), fully Bayesian inference, model averaging. The difficulties of fully Bayesian inference (examples where an analytical solution exists). Conjugate priors (examples of their use).
  3. Basic concepts of machine learning. Generative and discriminative models, discriminative functions in machine learning (examples). Bias-variance decomposition, underfitting, overfitting, regularization. Probabilistic derivation of commonly used loss functions and regularization schemes. Evaluation (CV, AUC, AUPR).
  4. Regression. The basic task, the probabilistic model of linear regression, ML and MAP estimation, derivation of analytical formulas for these estimates, the solution process, numerical aspects (see the MAP regression sketch after this list). Fully Bayesian inference. Non-linear extensions: application of basis functions, commonly used basis functions.
  5. Classification. The basic task, the probabilistic model of logistic regression. Derivation of the perceptron using Bayes' theorem, ML and MAP estimation, derivation of iterative formulas (sigmoid function, gradient), the solution process, numerical aspects.
  6. Neural networks. MLP architecture, ML and MAP estimation, derivation of the backpropagation algorithm. Activation functions used in neural models, methods of regularization. Convolutional and recurrent architectures, the types of layers used in them, example applications.
  7. Optimization in neural models. The difficulties of optimization, analytical and numerical aspects. Basic principles of optimization algorithms (batch, momentum, adaptive learning rate, higher-order methods). Notable algorithms.
  8. Variational methods. Approximate Bayesian inference, ELBO+KL decomposition, the basic principle of variational methods. BBVI, stochastic gradient-based optimization. Reparametrization trick, VAE. The idea of adversarial training, the basic principle of GAN architectures.
  9. MCMC. The basic principle of MCMC methods. Properties of Markov chains. A sufficient condition for the existence of the equilibrium distribution. The Metropolis and Metropolis-Hastings algorithms (see the sampling sketch after this list). Gibbs sampling, conjugate priors. Example: Bayesian linear regression with Gibbs sampling.
  10. Probabilistic graphical models. Bayesian networks and Markov random fields for modelling conditional dependencies among variables. Inference, network structure learning, and parameter estimation, with emphasis on practical applications.
  11. Transformers. The transformer architecture and its impact on deep learning, especially in NLP. Self-attention mechanisms, positional encoding, and advances in machine translation and text summarization, with an outlook on recent developments.
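
To make the MAP estimation in item 4 concrete, the following sketch fits a linear model under a zero-mean Gaussian prior on the weights and Gaussian observation noise; the synthetic data, the precision values alpha and beta, and all variable names are illustrative assumptions rather than course material. Under these assumptions the MAP estimate coincides with ridge regression, w_MAP = (X^T X + (alpha/beta) I)^-1 X^T y.

    # Illustrative MAP (ridge) estimate for Bayesian linear regression.
    # Prior: w ~ N(0, alpha^-1 I); noise precision: beta. All values are assumed.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))        # design matrix: 100 samples, 3 features
    w_true = np.array([1.5, -2.0, 0.5])  # ground-truth weights for the toy data
    y = X @ w_true + 0.1 * rng.normal(size=100)

    alpha, beta = 1.0, 100.0             # prior precision and noise precision
    # w_MAP = (X^T X + (alpha/beta) I)^-1 X^T y
    A = X.T @ X + (alpha / beta) * np.eye(X.shape[1])
    w_map = np.linalg.solve(A, X.T @ y)  # solve the linear system; avoids an explicit inverse
    print(w_map)                         # close to w_true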
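
For item 9, a minimal random-walk Metropolis sampler (the special case of Metropolis-Hastings with a symmetric proposal, where the proposal densities cancel in the acceptance ratio) can be sketched as follows; the two-component Gaussian mixture target and the step size are arbitrary illustrative choices.

    # Illustrative random-walk Metropolis sampler for an unnormalized 1-D target.
    import numpy as np

    def log_target(x):
        # unnormalized log-density of a mixture of two unit-variance Gaussians
        return np.logaddexp(-0.5 * (x - 2.0) ** 2, -0.5 * (x + 2.0) ** 2)

    rng = np.random.default_rng(1)
    x, samples = 0.0, []
    for _ in range(10_000):
        proposal = x + 0.8 * rng.normal()    # symmetric Gaussian random walk
        # accept with probability min(1, p(proposal) / p(x))
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)                    # keep the current state either way
    print(np.mean(samples), np.std(samples))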

Detailed topics of the exercises:

  1. Bayesian Thinking. Maximum likelihood estimation, posterior calculation in conjugate models (see the Beta-Binomial sketch after this list).
  2. Linear Models. Bayesian models of regression and classification, calculating posterior and predictive distributions, numerical stability of implementation.
  3. Neural Networks. Implementation of MLPs and convolutional networks using PyTorch/TensorFlow, optimization, evaluating predictive performance (see the MLP sketch after this list).
  4. Variational Inference. Inference in non-conjugate models, generative modelling with variational autoencoders.
  5. MCMC. Gibbs sampling in hierarchical models, time series analysis, changepoint models.
  6. Bayesian Networks. Constructing, querying, and learning Bayesian networks: students will learn how to build Bayesian networks to represent probabilistic relationships among variables. Exercises include constructing networks from real-world data, performing inference to calculate conditional probabilities, and using software tools to query the networks and perform diagnostic reasoning.
  7. Transformers. Implementing a transformer model: students will gain hands-on experience with the transformer architecture by implementing a simple transformer model, covering key components such as self-attention, and using standard frameworks to build and train the model on a dataset. They will then evaluate the model's performance and explore the impact of different hyperparameters (see the self-attention sketch after this list).
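
A minimal sketch of the ML estimate versus the conjugate posterior from exercise 1, assuming a Beta prior on a coin's bias and Binomial coin-flip data; the hyperparameters and counts are made up for illustration. With a Beta(a, b) prior and h heads out of n flips, the posterior is Beta(a + h, b + n - h).

    # Illustrative Beta-Binomial conjugate update (all numbers are assumed).
    a, b = 2.0, 2.0                  # Beta prior hyperparameters
    heads, tails = 7, 3              # observed coin flips
    a_post, b_post = a + heads, b + tails

    mle = heads / (heads + tails)                            # maximum likelihood estimate
    posterior_mean = a_post / (a_post + b_post)              # Bayesian point estimate
    posterior_mode = (a_post - 1) / (a_post + b_post - 2)    # MAP estimate
    print(mle, posterior_mean, posterior_mode)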
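
A minimal MLP sketch for exercise 3, assuming PyTorch; the layer sizes, synthetic data, and hyperparameters are placeholders, and the same exercise could equally be done in TensorFlow.

    # Illustrative two-hidden-layer MLP trained on toy data with PyTorch.
    import torch
    import torch.nn as nn

    model = nn.Sequential(
        nn.Linear(20, 64), nn.ReLU(),
        nn.Linear(64, 64), nn.ReLU(),
        nn.Linear(64, 2),                 # logits for two classes
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    X = torch.randn(256, 20)              # toy inputs
    y = torch.randint(0, 2, (256,))       # toy class labels

    for _ in range(100):                  # full-batch training loop
        optimizer.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()                   # backpropagation
        optimizer.step()
    print(float(loss))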
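
A single-head scaled dot-product self-attention sketch for exercise 7, written in plain NumPy with randomly initialized projection matrices; real transformer implementations add multiple heads, masking, positional encoding, and learned parameters.

    # Illustrative scaled dot-product self-attention over a toy sequence.
    import numpy as np

    def softmax(z, axis=-1):
        z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
        e = np.exp(z)
        return e / e.sum(axis=axis, keepdims=True)

    rng = np.random.default_rng(0)
    seq_len, d_model = 5, 16
    x = rng.normal(size=(seq_len, d_model))       # token embeddings

    Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
    Q, K, V = x @ Wq, x @ Wk, x @ Wv              # query, key, value projections
    scores = Q @ K.T / np.sqrt(d_model)           # scaled dot products
    weights = softmax(scores, axis=-1)            # attention weights per token
    output = weights @ V                          # weighted sum of values
    print(output.shape)                           # (5, 16)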