Trustworthy AI and Data Analysis

VIMIMB10 | Computer Engineering MSc | Semester: 3 | Credit: 5

Objectives, learning outcomes and obtained knowledge

The results of artificial intelligence, machine learning, and data analytics are increasingly used in real-life applications as services embedded in complex IT systems. The operational safety of these systems, however, is often not addressed: their correct functioning is typically not guaranteed, there are no standardized development and testing methods, their robustness is not ensured, and they are not protected against accidental or malicious input errors. At the same time, a wide range of research and regulatory activity aims to improve reliability, and it has led to new ethical, legal, technological, and theoretical approaches to managing societal-level risks.

The objective of this course is to introduce the approaches, concepts, and engineering best practices of trustworthy data analysis, machine learning, and artificial intelligence. The course will also review issues related to the integration of intelligent algorithms into IT systems, methods for data-driven solutions to technical problems, and integration of these into development/operations processes.

The course will introduce the human-centered approach to data analytics and artificial intelligence at a societal level: its ethical background, legal regulation, representation in standards, and implementation in engineering practice. For both data analytics and AI, it will present the potential and limitations of interpretability, explainability, testability, and sensitivity analysis. It will also describe the comprehensive formalization of the data analysis workflow and of the lifecycle of creating an AI service/product, with particular attention to validated documentation, including the potential use of blockchain tools, and to the auditing of results.

Synopsis

Detailed topics of the lectures:

Fundamental concepts of trustworthiness in data analysis and artificial intelligence. Approaches to reliable data analysis and artificial intelligence, human-centered artificial intelligence. Ethical background of data analysis and AI, legal regulations, standardization, and integration in engineering best practices.
Data quality and veracity. Validation of input datasets: goals and applications of exploratory data analysis. Measurement of data quality, data processing, tidy data, ETL / ELT frameworks, automated data processing and visualization. Use of engineering assumptions in data analysis: considering causal, temporal, and topological relationships.
Understanding and explainability of data through data visualization: comparison, trend analysis, outlier detection, determining relationships, clustering. Use cases of visualization and their supporting technologies: monitoring/dashboard, business reporting, evaluation of alternatives/hypotheses, reproducible research.
Evaluation, testing, and assurance of data analysis and machine learning models: defining performance metrics, evaluating alternatives, visual support for evaluating results and parameterization. Sensitivity analysis, examination of variable importance.
Data analysis lifecycle. Cloud-based systems. Application of blockchain in the data sharing process.
Use of qualitative models to describe the construction and changes of reliable systems. Validation of qualitative models/model details based on measured data.
Data-driven model building. Methods and applications of process mining: model building, conformance checking, log analysis, fraud detection. Data-based parameterization of business rule systems, rule mining.
Use of intelligent learning methods in critical systems. Application of fault-tolerant patterns. Test generation for AI services.
Reliable and explainable artificial intelligence: black-box and white-box approaches. Probabilistic and causal models.
Reliable probabilistic, causal, decision-theoretic, and counterfactual reasoning.
Interpretable AI models in the formalization of AI: explainability, utility, fairness.
Lifecycle of white-box models. Auditing, evaluation, and risk analysis of models: ALTAI approach, process of model acceptance/adoption, analytical/hybrid methods, model testing, explanation generation.
Explainability of black-box models, model derivation.
Reliable human-machine hybrid systems, "human in the loop" approach, reliable multi-agent systems.

Detailed topics of the exercises:

Data quality evaluation, transformation and validation of input data, data profiling.
Visual Exploratory Data Analysis, automated visualization derivation.
Application of process mining in model building and validation.
Test generation for black box testing of AI models.
Sensitivity analysis of models, examination of variable importance: ceteris paribus (CP) profiles, partial dependence plots (PDP), Shapley values, DALEX.
Derivation of interpretable models, representation of dependencies and causal relationships.
Methods of explanation generation, generating logical, probabilistic, and causal explanations.

BME-MIT

Lecturers

László Gönczy

Nada Akel

Péter Antal

András Földvári

Gábor Hullám

Tamás Mészáros

Gábor Révy

György Strausz

Mihály Vetró
