This course is intended for postgraduate students with a basic understanding of computer
organization and architecture.
This course discusses in detail the methodologies and trade-offs
involved in designing a shared memory parallel computer.
Contents:
Single-threaded execution, traditional microprocessors, data-level, instruction-level, and
thread-level parallelism (DLP, ILP, TLP), the memory wall, parallel programming and
performance issues, shared memory multiprocessors,
synchronization, small-scale symmetric multiprocessors on a snoopy bus, cache coherence
on snoopy buses.
Scalable multiprocessors, directory-based cache coherence,
interconnection networks, memory consistency models, software distributed shared
memory, multithreading in hardware, chip-multiprocessing, current research and future
trends.