Performance and Parallelism

This post originally appeared on the Software Carpentry website.

Some topics for a lecture on parallel programming:

  1. how to measure/compare performance (raw speed, weak scaling, strong scaling, Amdahl's Law, response time vs. throughput); Amdahl's Law is sketched just after this list
  2. the register/cache/RAM/virtual memory/local disk/remote storage hierarchy and the relative performance of each (order of magnitude)
  3. in-processor pipelining (or, why branches reduce performance, and why vectorized operations are a good thing); see the vectorization sketch below
  4. how that data-parallel model extends to distributed-memory systems, and what the limits of that model are
  5. the shared-memory (threads and locks) model, its performance limitations, deadlock, and race conditions (a race and its lock-based fix are sketched below)
  6. the pure task farm model, its map/reduce cousin, and their limitations (a minimal task farm is sketched below)
  7. the actors model (processes with their own state communicating only through messages, as in MPI); a two-process message-passing example is sketched below
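
To make a few of these concrete, here are minimal Python sketches for points 1, 3, 5, 6, and 7. First, the Amdahl's Law bound from point 1: if a fraction s of the work is serial, the speedup on p processors is at most 1 / (s + (1 - s) / p). The 10% serial fraction below is just an illustrative number.

```python
def amdahl_speedup(serial_fraction, num_procs):
    """Upper bound on speedup when serial_fraction of the work cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / num_procs)

# Even with only 10% serial work, 64 processors give less than a 9x speedup.
for p in (2, 8, 64, 1024):
    print(p, round(amdahl_speedup(0.1, p), 2))
```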
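
For point 3, a rough comparison of an element-by-element Python loop with the equivalent whole-array (vectorized) NumPy expression. This assumes NumPy is installed, and the exact timings vary from machine to machine; the point is the order-of-magnitude gap.

```python
import time

import numpy as np

values = np.random.rand(1_000_000)

start = time.perf_counter()
total = 0.0
for v in values:                # one interpreted iteration per element
    total += v * v
loop_time = time.perf_counter() - start

start = time.perf_counter()
total_vec = float(np.sum(values * values))   # a single whole-array, data-parallel operation
vec_time = time.perf_counter() - start

print(f"loop: {loop_time:.3f}s  vectorized: {vec_time:.3f}s")
```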
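
For point 5, a race condition and its lock-based fix: two threads increment a shared counter, and without the lock some updates are usually lost because the read-add-write sequence interleaves. Whether the race actually shows up on a given run depends on thread scheduling, so treat this as a demonstration sketch rather than a guaranteed failure.

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment(n):
    global counter
    for _ in range(n):
        counter += 1            # not atomic: load, add, store can interleave

def safe_increment(n):
    global counter
    for _ in range(n):
        with lock:              # serialize the read-modify-write
            counter += 1

def run(worker, n=1_000_000):
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print("without lock:", run(unsafe_increment))   # usually less than 2000000
print("with lock:   ", run(safe_increment))     # always 2000000
```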
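
For point 6, a pure task farm using Python's multiprocessing pool: independent tasks are handed out to workers and the results gathered at the end, with no communication between tasks. The work function is a stand-in for a real computation.

```python
from multiprocessing import Pool

def work(task):
    """Stand-in for an independent, side-effect-free unit of work."""
    return task * task

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        results = pool.map(work, range(20))   # farm tasks out, gather results in order
    print(results)
```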
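
And for point 7, two processes that share no memory and interact only by sending and receiving messages. This sketch assumes mpi4py and an MPI runtime are available, and would be launched with something like `mpirun -n 2 python actors.py` (the file name is arbitrary).

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()          # each process has its own rank and its own private state

if rank == 0:
    comm.send({"task": "square", "value": 7}, dest=1, tag=0)
    reply = comm.recv(source=1, tag=1)
    print("process 0 received:", reply)
elif rank == 1:
    message = comm.recv(source=0, tag=0)
    comm.send(message["value"] ** 2, dest=0, tag=1)
```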

It's too much (each point deserves an hour-long lecture in its own right, rather than 10-12 minutes of a larger one); what should we cut, and what's in there that doesn't need to be?
