Computer Architectures

VIHIAA03  |  Computer Engineering BSc  |  Credit: 5

Objectives, learning outcomes and obtained knowledge

The aim of the course is to teach the design, operation and features of modern computers. Knowing the main characteristics of the hardware enables the development of efficient software that makes better use of the available hardware resources.
Learning outcomes of the course
-    Information processing models, control flow architectures, instruction sets
-    Fundamentals of I/O peripherals, traffic control, interrupts, DMA, interconnects
-    PCI, PCI Express and USB interfaces
-    Design, operation, and performance analysis of mass storage devices
-    Memory technologies, DRAM based memory systems, design, operation and performance analysis
-    Virtual memory management, concepts, operation, basic data structures, performance implications
-    Cache memory, organization, management
-    Locality-aware programming techniques
-    Pipeline based instruction execution, optimization
-    Basic algorithms for out-of-order instruction execution, register renaming
-    Wide pipelines, superscalar processors
-    Branch prediction algorithms, branch prediction-aware programming techniques
-    Forms of parallel processing, Flynn taxonomy
-    Data parallelism, vector processors, SIMD instruction sets
-    Classification of multiprocessor systems, basic concepts
-    Fundamental problems of distributed memory management, cache coherence and memory consistency

Lecturers

Gábor Horváth

honorary professor

Course coordinator

Synopsis

Introduction (2 lectures). Introduction to information processing models. Control flow architectures: Neumann architecture, Harvard architecture, modified Harvard architecture. Characteristics of instruction sets, CISC-RISC strategies.
Input-output peripherals (3 lectures). Dedicated I/O instructions and memory mapped peripheral management. Flow control. Processing peripheral signals: polling, interrupt, interrupt in multiprocessor environment, interrupt moderation. Processor offloading: DMA, I/O processor. Interconnections: bus, point-to-point, serial, parallel, timing, arbitration. Single-bus, multi-bus, bridge-based systems. PCI, PCI Express and USB interfaces.
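A minimal bare-metal style sketch of memory mapped, polled peripheral access (the UART register addresses and the TX_READY bit below are illustrative assumptions, not those of any real device):

```c
#include <stdint.h>

/* Hypothetical memory-mapped UART registers (illustrative addresses). */
#define UART_BASE    ((uintptr_t)0x40001000u)
#define UART_STATUS  (*(volatile uint32_t *)(UART_BASE + 0x00))
#define UART_DATA    (*(volatile uint32_t *)(UART_BASE + 0x04))
#define TX_READY     (1u << 0)   /* set by the device when it can accept a byte */

/* Polled (busy-wait) output: the CPU repeatedly reads the status register
 * until the peripheral is ready, then writes the data register. Every
 * iteration of the wait loop is CPU time spent on flow control;
 * interrupt-driven I/O avoids this by letting the device signal
 * readiness asynchronously. */
void uart_putc(char c)
{
    while ((UART_STATUS & TX_READY) == 0)
        ;                         /* poll: burn cycles until the device is ready */
    UART_DATA = (uint32_t)c;
}
```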
Mass storage devices (1 lecture). Operation of hard disks: concept and parts of a sector, zoned bit recording. Main components of data transfer service times. Queuing and reordering of commands. How SSD devices work: concept and role of pages, blocks. Management and side effects of read/write operations. Causes and importance of aging. Functions of SSD controllers.
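The service-time components above can be illustrated with a small worked calculation; the disk parameters used below (7200 RPM, 8 ms average seek, 200 MB/s transfer rate, 64 kB requests) are assumed values chosen only for the example:

```c
#include <stdio.h>

/* Average HDD service time = average seek + average rotational latency
 * (half a revolution) + transfer time of one request.
 * All parameters below are assumed, illustrative values. */
int main(void)
{
    double rpm           = 7200.0;   /* spindle speed         */
    double avg_seek_ms   = 8.0;      /* average seek time     */
    double transfer_MBps = 200.0;    /* media transfer rate   */
    double request_kB    = 64.0;     /* size of one request   */

    double rot_latency_ms = 0.5 * 60000.0 / rpm;                  /* half a turn */
    double transfer_ms    = request_kB / 1024.0 / transfer_MBps * 1000.0;
    double service_ms     = avg_seek_ms + rot_latency_ms + transfer_ms;

    printf("rotational latency: %.2f ms\n", rot_latency_ms);
    printf("transfer time:      %.3f ms\n", transfer_ms);
    printf("avg. service time:  %.2f ms  -> about %.0f IOPS\n",
           service_ms, 1000.0 / service_ms);
    return 0;
}
```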
Memory (4 lectures). Synchronous DRAM based memory systems: concept and operation of memory controller, module, rank, bank. DRAM commands and their timing, out-of-order execution of commands. Virtual memory management: address translation, TLB, page table implementations, single level and hierarchical page tables. Cache memory: role of locality principles, cache organization, interaction of cache organization with virtual memory management. Cache content management: cache pollution prevention, block prefetching, block replacement algorithms. Locality-aware programming techniques.
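As a preview of the locality-aware programming techniques, the sketch below sums the same matrix in two loop orders; the row-wise traversal follows the row-major layout of C arrays and uses each fetched cache block fully, while the column-wise one does not (the matrix size is an arbitrary choice for the example):

```c
#include <stdio.h>

#define N 1024

static double a[N][N];   /* stored row-major: a[i][0..N-1] are contiguous */

/* Row-wise traversal: consecutive accesses touch consecutive addresses,
 * so each cache block brought in is fully used (good spatial locality). */
double sum_row_major(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Column-wise traversal: consecutive accesses are N*sizeof(double) bytes
 * apart, so almost every access can miss in the cache. */
double sum_col_major(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

int main(void)
{
    printf("%f %f\n", sum_row_major(), sum_col_major());
    return 0;
}
```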
Processor (6 lectures). Pipeline instruction processing. Concepts and handling of hazards. Implementation of a simple 5 stage pipeline. Exception handling, precise interrupt handling. Handling arithmetic operations with different delays. Dynamic scheduling (out-of-order instruction execution). The concept of precedence graph and data flow-based instruction scheduling. The role and implementation of the instruction queue, register renaming, and the reorder buffer. The Tomasulo algorithm. Wide pipelines: superscalar, VLIW and EPIC architectures. Branch prediction: predicting the outcome of the jump condition and the jump target address. Branch prediction-aware programming. Attacks against speculative execution.
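The branch prediction-aware programming topic can be previewed with the classic sorted-versus-unsorted example below; the array size, threshold and repetition count are arbitrary choices, and the timing difference is visible when the compiler does not turn the branch into branchless or vectorized code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N    (1 << 20)
#define REPS 50

/* Counts elements above a threshold. The 'if' is a data-dependent branch:
 * on random data it is mispredicted roughly half the time, on sorted data
 * the predictor quickly learns its behaviour. */
static long count_above(const int *v, int n, int threshold)
{
    long cnt = 0;
    for (int i = 0; i < n; i++)
        if (v[i] > threshold)
            cnt++;
    return cnt;
}

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    int *v = malloc(N * sizeof *v);
    if (!v)
        return 1;
    for (int i = 0; i < N; i++)
        v[i] = rand() % 256;

    clock_t t0 = clock();
    long c1 = 0;
    for (int r = 0; r < REPS; r++)
        c1 += count_above(v, N, 128);     /* random order: hard to predict */
    clock_t t1 = clock();

    qsort(v, N, sizeof *v, cmp_int);      /* sorting makes the branch predictable */

    clock_t t2 = clock();
    long c2 = 0;
    for (int r = 0; r < REPS; r++)
        c2 += count_above(v, N, 128);     /* sorted order: well predicted */
    clock_t t3 = clock();

    printf("random: %ld matches in %.3f s, sorted: %ld matches in %.3f s\n",
           c1, (double)(t1 - t0) / CLOCKS_PER_SEC,
           c2, (double)(t3 - t2) / CLOCKS_PER_SEC);
    free(v);
    return 0;
}
```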
Parallel processing (4 lectures). Data parallelism: vector processors, SIMD instruction set extensions. Multiprocessor systems: notion of explicit parallelism, multi-threaded processors, classification of multiprocessor systems, interconnection networks. Shared memory management: cache coherence protocols, memory consistency problems and solutions.
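A small taste of the SIMD instruction set extensions, assuming an x86 processor and a compiler that provides <immintrin.h> (on other targets a plain scalar loop or compiler auto-vectorization would serve the same purpose):

```c
#include <stdio.h>
#include <immintrin.h>   /* x86 SSE/AVX intrinsics */

#define N 16             /* small and a multiple of 4 for the example */

/* Adds two float arrays four lanes at a time using 128-bit SSE registers.
 * One _mm_add_ps performs four additions in a single instruction:
 * the data-parallel (SIMD) execution model of the lecture. */
static void vec_add(const float *a, const float *b, float *c, int n)
{
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
    }
    for (; i < n; i++)            /* scalar tail for leftover elements */
        c[i] = a[i] + b[i];
}

int main(void)
{
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }
    vec_add(a, b, c, N);
    for (int i = 0; i < N; i++)
        printf("%g ", c[i]);
    printf("\n");
    return 0;
}
```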


Topics covered in the classroom practices:
- Review of digital design basics through a simple hardware-software design project
- Calculating the relative CPU load of peripheral handling with polling-based and interrupt-based I/O management (a worked sketch follows this list)
- HDD latency and throughput calculations, manual walkthrough of SSD write management algorithms
- Memory management: DRAM command scheduling, command latency and throughput calculations, exercises related to virtual memory management with and without TLB
- Cache memory: practicing cache organization, cache miss rate calculation and code optimization for simple C programs
- Pipeline scheduling: scheduling low-level programs with various instruction pipelines, determining optimal instruction order
- Advanced pipeline techniques: dependency analysis, elimination of anti-dependencies by register renaming, following the operation of branch predictors using simple C programs
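A minimal sketch of the polling versus interrupt CPU load comparison referenced above; all per-event costs and rates are assumed, illustrative values chosen only to show the structure of the calculation:

```c
#include <stdio.h>

/* Relative CPU load of peripheral handling (illustrative numbers).
 * Polling: the status register is checked at a fixed rate regardless of
 * traffic. Interrupt: the overhead is paid only per actual event. */
int main(void)
{
    double cpu_hz        = 2.0e9;   /* processor clock                             */
    double poll_rate_hz  = 100e3;   /* status checks per second                    */
    double poll_cycles   = 400.0;   /* cycles per status check                     */
    double event_rate_hz = 10e3;    /* peripheral events per second                */
    double irq_cycles    = 2000.0;  /* cycles per interrupt (entry, handler, exit) */

    double load_polling   = poll_rate_hz  * poll_cycles / cpu_hz;
    double load_interrupt = event_rate_hz * irq_cycles  / cpu_hz;

    printf("polling:   %.2f %% of the CPU\n", 100.0 * load_polling);
    printf("interrupt: %.2f %% of the CPU\n", 100.0 * load_interrupt);
    return 0;
}
```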