Intelligent Data Analysis and Decision Support
VIMIMB09 | Computer Engineering MSc | Semester: 3 | Credit: 5
Objectives, learning outcomes and obtained knowledge
Intelligent Data Analysis and Decision Support presents advanced approaches at the forefront of machine learning and deep learning research that help solve a wider range of real-world engineering problems. We first review the Bayesian statistical and decision-theoretic frameworks, which provide a unified approach to using background knowledge, handling incomplete and uncertain data, applying complex models and intelligent forms of inference, and collecting data adaptively.
Among intelligent data analysis methods, we present techniques that, used as a pre-processing step, can improve the efficiency and quality of the analysis. Dimension reduction and representation learning methods both improve efficiency, the latter offering a more abstract solution, and clustering is an important part of the data analysis process. The performance of machine learning methods for data analysis can be improved with ensemble methods, and more robust performance on real test sets can be achieved through regularisation. We describe in detail data-driven decision support with these machine learning methods and the process of evaluating decisions, and demonstrate their use in practice on different types of data (simple, hierarchical, time-series, unstructured).
We present probabilistic graphical models and the associated decision networks and causal networks, as well as probabilistic, causal and counterfactual inference methods for handling interventional data and supporting intelligent data mining. We describe approximate computational methods for Bayesian inference, particularly Markov chain Monte Carlo methods. We present modern machine learning methods for causal models and the role of background knowledge in learning, including data and knowledge fusion. Within the framework of adaptive data mining, we present active learning, reinforcement learning, and multi-armed bandits, and their applications in recommender systems and discovery systems.
Lecturers
Péter Antal
associate professor
Course coordinator

Gábor Hullám
deputy head of department, associate professor

Dániel Sándor
PhD student
Synopsis
- Estimation and decision theory, optimal decision and properties of human decisions, types of utility functions. Intelligent inference types: probabilistic, causal and counterfactual inference. Value of information and optimal information gathering strategies.
- Intelligent data analysis methods, data analysis on different types of data (tabular, time series, unstructured).
- Regression type decision problems. Regularized regression methods: ridge, lasso, elastic net.
- Non-linear dimension reduction methods (autoencoder, manifold). Applications of dimensionality reduction.
- Clustering as a task in its own right and as a preprocessing step for classification problems. Biclustering, spectral clustering methods.
- Improving the performance (accuracy) of ML methods. Ensemble (ECOC) machine learning methods.
- Types of recommender systems and data analysis methods. Matrix factorization and collaborative filtering in recommender systems.
- Data-driven decision support with machine learning models. Decision evaluation process.
- Definitions, parametric and structural semantics of probabilistic graphical models, use of sparse representations, inference algorithms, notable classes of models (naive Bayes nets, Hidden Markov Models). Extensions to first-order probabilistic logics and stochastic grammars.
- Derivation of causal models, notion of observational equivalence. Modelling interventions using do(.) semantics and graph truncation. The notion of correction in causal power estimation. Counterfactual inference.
- Conjugacy and sufficient statistics in exact Bayesian inference. Approximation methods for Bayesian inference. Monte Carlo methods, rejection sampling and importance sampling. Markov chain Monte Carlo (MCMC) methods: convergence and confidence diagnostics, multi-chain methods, Metropolis-coupled MCMC. Hybrid MCMC.
- Learning causal models from observational and interventional data. Learning with background knowledge, data and knowledge fusion in learning system models. Bayesian learning of model properties.
- Active learning, learning with cost. k-armed bandits, Monte Carlo tree search. Reinforcement learning, deep reinforcement learning.
- Recommender systems, handling noise and informative missingness. Discovery systems, early discovery performance measures, expected utility of experiment, adaptive experiment design.
- Decision model construction. Optimal decision and value of information.
- Advanced regression exercise in Python
- Spectral clustering on images (Python)
- Ensemble machine learning methods (computational exercise)
- Constructing a causal model. Probabilistic, causal and counterfactual inference testing.
- Examination of Markov Chain Monte Carlo methods: Gibbs and hybrid MCMC sampling.
- Hyperparameter optimization with k-armed bandits and deep learning Monte Carlo tree search.