Computational scientists should know that most of the time their CPUs are waiting for data to arrive. Knowing where the low-level bottlenecks are, and knowing what can be done to ameliorate them, may save hours of frustration when trying to understand why apparently well-written programs perform poorly.
I'll talk about why current CPUs are starving for data, and about the techniques that can be used on modern computers to alleviate this problem.
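As a quick illustration (not part of the course materials; it only assumes NumPy is installed), the snippet below performs the same reduction with two different access patterns. The strided version does an eighth of the arithmetic, yet is usually nowhere near eight times faster, because every cache line fetched from memory contributes only one useful element.

import numpy as np
from timeit import timeit

a = np.random.rand(10_000_000)  # ~80 MB, much larger than any cache level

contiguous = timeit(a.sum, number=20)              # reads every element
strided = timeit(lambda: a[::8].sum(), number=20)  # reads 1 element per 64-byte cache line

print(f"contiguous sum of all elements: {contiguous:.3f} s")
print(f"strided sum of 1/8 of elements: {strided:.3f} s")
# Both versions pull roughly the same amount of data from RAM, so their run
# times tend to be comparable: the CPU is starving for data, not short of
# arithmetic units.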
1. Motivation
2. The Data Access Issue
3. High Performance Libraries
Slides here. The multiprocessing script for NumPy is here (a rough sketch of the general multiprocessing-plus-NumPy pattern appears after the download links below).
Fetch the tarball with the guidelines and sources from here.
Fetch the solutions from here.
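For orientation only, here is a minimal sketch of the general pattern of combining multiprocessing with NumPy: split an array into chunks and map a worker function over them with a process pool. This is an assumption about the approach, not the linked course script.

import numpy as np
from multiprocessing import Pool

def partial_work(chunk):
    # Stand-in for a real per-chunk computation.
    return np.sqrt(chunk).sum()

if __name__ == "__main__":
    data = np.random.rand(8_000_000)
    chunks = np.array_split(data, 4)           # one chunk per worker process
    with Pool(processes=4) as pool:
        results = pool.map(partial_work, chunks)
    print(sum(results))

Keep in mind that pool.map pickles each chunk and copies it to the worker process, so for large arrays the communication cost can eat into the parallel speed-up; that is exactly the kind of data-movement issue this course is about.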
With a two-core Intel Core 2 Duo processor:
$ tail /sys/devices/system/cpu/cpu0/cache/*/size
==> /sys/devices/system/cpu/cpu0/cache/index0/size <==
32K     # -> Level 1 cache size
==> /sys/devices/system/cpu/cpu0/cache/index1/size <==
32K     # -> Level 1 cache size
==> /sys/devices/system/cpu/cpu0/cache/index2/size <==
3072K   # -> Level 2 cache size
With a four-core Intel Xeon E5520 processor:
$ tail /sys/devices/system/cpu/cpu0/cache/*/size
==> /sys/devices/system/cpu/cpu0/cache/index0/size <==
32K     # -> Level 1 cache size
==> /sys/devices/system/cpu/cpu0/cache/index1/size <==
32K     # -> Level 1 cache size
==> /sys/devices/system/cpu/cpu0/cache/index2/size <==
256K    # -> Level 2 cache size
==> /sys/devices/system/cpu/cpu0/cache/index3/size <==
8192K   # -> Level 3 cache size
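To see how these numbers translate into performance, here is a rough sketch of an experiment (it assumes only that NumPy is installed): repeatedly sum working sets of different sizes and compare the throughput of those that fit in cache with those that have to stream from main memory.

import numpy as np
from timeit import timeit

for size_mb in (1, 4, 32, 256):
    n = size_mb * 1024 * 1024 // 8        # number of float64 elements
    a = np.random.rand(n)
    reps = 256 // size_mb                 # keep the total work roughly constant
    t = timeit(a.sum, number=reps)
    print(f"{size_mb:4d} MB working set: {n * reps / t / 1e9:.2f} G elements/s")
# Throughput typically drops once the working set no longer fits in the
# last-level cache (3072K or 8192K on the machines above): past that point
# the CPU spends most of its time waiting for main memory.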
The exercises need to be run on a machine more powerful than a typical laptop.
Please save the details of the SSH connection to a config file (this only needs to be done once):
echo -e 'Host gnu\n\tHostName gnu.fuw.edu.pl\n\tPort 2005\n\tVisualHostkey yes' >> ~/.ssh/config
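This appends an entry equivalent to the following to ~/.ssh/config:

Host gnu
	HostName gnu.fuw.edu.pl
	Port 2005
	VisualHostkey yes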
Please log in:
ssh <login>@gnu
The machine identifies itself as 'debian' (that is its hostname).
The login is the same as the one used for git.