Scalable Systems and Development Processes
VIMIA021 | Computer Engineering BSc | Credit: 2
Objectives, learning outcomes and obtained knowledge
A key engineering and entrepreneurial challenge is not only to quickly deploy the initial version of a great product but also, upon successful adoption by the market, to scale it up.
A product or product family can be scaled both "horizontally", by adding more feature sets to it, and "vertically", by attracting more users on different platforms and in different markets, such as desktop and mobile, stand-alone and cloud, enterprise and consumer, local and international.
How and when to scale a product may be a business decision. The architecture of the code base and the engineering organization must be prepared for these requirements.
This course teaches both the software architectural and the engineering organizational aspects of building large-scale products. It emphasizes the dynamic, evolutionary nature of this process. Continuous innovation, scaling and adaptability are essential for successful companies, which should be prepared to build upon their existing products, engineering processes and organization.
The course teaches basic software architectural concepts, technologies and practices for architecting products that can be deployed quickly and then, as the need arises, scaled smoothly and incrementally.
The aim of the course is to introduce students to best practices and technologies for building products that can evolve and scale over time: products that grow from thousands of lines of code to millions, that are developed and maintained first by tens of software engineers and later by tens of thousands, and whose user base (supported load) grows from tens of thousands to tens of millions.
The course touches upon both engineering processes, such as source control, testing, bug tracking and monitoring, and applicable technologies, such as networking, load balancing, parallel computing and large-scale data repositories.
Lecturers

András Pataricza
professor emeritus
Course coordinator
Synopsis
Part 1: Building Scalable Products
L1: Aspects of scalability and how to measure it.
● Data storage. Capacity, throughput, load.
● Parallel computing. Speedup, efficiency.
● Compute server. Megaflops.
● Network. Throughput, load, capacity.
● Application load. QPS, users.
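As an illustration of the measures listed above, the following minimal Python sketch computes parallel speedup, parallel efficiency and queries per second (QPS) using their standard textbook definitions. The function names and the sample numbers are assumptions made for this example and are not part of the course material.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel for the same workload."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, workers: int) -> float:
    """Efficiency E = S / p, where p is the number of workers used."""
    return speedup(t_serial, t_parallel) / workers


def qps(requests: int, seconds: float) -> float:
    """Average application load in queries per second."""
    return requests / seconds


if __name__ == "__main__":
    # Hypothetical measurement: 120 s on one worker, 20 s on eight workers.
    print(speedup(120.0, 20.0))
    print(efficiency(120.0, 20.0, 8))
    # Hypothetical load: 3000 requests served in one minute.
    print(qps(3000, 60.0))
```

An efficiency well below 1 suggests that adding workers yields diminishing returns, which is one way to quantify how well a system scales.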
L2-L3: Scalable Storage Technologies
● SQL.
● Bigtable, Datastore.
● BigQuery, Cloud storage. Paxos.
● Parallel algorithms
● MapReduce, MillWheel
● Flume
● Messaging. Pub/sub
● Load balancing, DNS
● Logging, monitoring
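To make the load balancing topic above concrete, here is a minimal Python sketch of round-robin balancing, one common and simple strategy. The class name RoundRobinBalancer and the backend names are hypothetical; real load balancers also account for health checks, weights and connection state, which this sketch leaves out.

```python
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order (illustrative only)."""

    def __init__(self, backends: Iterable[str]) -> None:
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly in order.
        self._rotation = cycle(self._backends)

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


if __name__ == "__main__":
    # Hypothetical backend names, used only for demonstration.
    balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
    for _ in range(5):
        print(balancer.pick())
```

Round-robin spreads requests evenly only when requests cost roughly the same and all backends have similar capacity; otherwise a load-aware strategy is usually a better fit.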
L6: Computing in the Cloud, Case studies.
● Google App Engine
● Amazon Web Services
Part 2: Scaling the Engineering Processes
● Code. Complexity. LOC.
● Measuring and tracking code health.
● Size, number of bugs, and their trends.
● Code size/Organization size
● Code complexity/Organization complexity
● Using open source as a free and infinite resource.
● Releasing code into open source as a business decision
● Standards, extensions, libraries. Portability, platform independence.
● Agile programming. Pair programming.
● Source Control. Code review. Check-in process.
● Code reuse. Module replacement, rewrite.
● Build. Libraries, continuous build.
● Testing frameworks.
● Testing during development.
● Black box, white box, grey box testing.
● Unit test. System test. Integration test. Performance test. Regression test.
● Continuous testing.
L13-L14: Running the System
● Monitoring.
● Minor, major upgrade. Canary, roll-back.
● Alerts, alarms. Triggering, escalating.
● Crash recovery. Post mortem.