课前预习
Random Access Machine Model
standard theoretical model of computation
- infinite memory
- uniform access cost
simple model crucial for success of computer industry
Hierarchical Memory
modern machines have complicated memory hierarchy
- levels get larger and slower further away from CPU
- data moved between levels using large blocks
many issues influence performance
- cache misses, TLB misses, branch mis-predictions etc.
Disk access is \(10^6\) slower than main memory access
important to store/access data to take advantage of blocks
Scalability problems
most programs developed in RAM-model
- run on large dataset because OS moves blocks as needed
modern OS utilize sophisticated paging and pre-fetching strategies
- but if program makes scattered accesses even good OS can't take advantage of block access
I/O Model
\(N\) num of items in the problem instance
\(B\) num of items per disk block
\(M\) num of items that fit in main memory
\(T\) num of items in output
I/O: move block between memory and disk
assume that \(M\gt B^2\)
Fundamental bounds
internal | external | |
---|---|---|
scanning | \(N\) | \(\frac{N}{B}\) |
sorting | \(N\log N\) | \(\frac{N}{B}\log_{M/B}\frac{N}{B}\) |
permuting | \(N\) | \(\min(N,\frac{N}{B}\log_{M/B}\frac{N}{B})\) |
searching | \(\log_2N\) | \(\log_bN\) |
Note:
- Linear I/O \(O(N/B)\)
- Permuting not linear
- Permuting and sorting bounds are equal in all practical cases
- cannot sort optimally with search tree
Large difference between \(N\) and \(N/B\)!!!
Queue and Stacks
queue
- maintain push and pop blocks in main memory \(O(1/B)\)
stack
- maintain push and pop block in main memory \(O(1/B)\)
Sorting
- \(<M/B\) sorted lists and be merged in \(O(N/B)\) I/O
- unsorted list can be distributed using \(<M/B\) split elements in \(O(N/B)\) I/O
Merge sort
- create \(N/M\) memory sized sorted lists
- repeatedly merge lists together \(\Theta(M/B)\) at a time
Multi-way quick-sort
- compute \(\Theta(M/B)\) splitting elements
- distribute unsorted list into \(\Theta(M/B)\) unsorted lists of equal size
- recursively split lists until fit in memory
Computing splitting elements
in internal memory: linear time selection
selection algorithm: finding i-th element in sorted order
- select median of every group of 5 elements
- recursively select median of \(N/5\) selected elements
- distribute elements into 2 lists using computed median
- recursively select in one of two lists.
analysis
- step 1 and 3 performed in \(O(N/B)\) I/O
- step 4 recursion on at most \(\sim\frac{7}{10}N\)
\(T(N)=O(N/B)+T(N/5)+T(7N/10)=O(N/B)\) I/O
后面的分析一点也看不懂,头大……
课堂笔记
Intro and Motivation
2 trends:
- Parallelism
- Non-Uniform Memory Access, caches...
start with 2 level of memory
I/O Model
- slow but large mem
- fast but limited mem of size \(M\)
- block transfer of \(O(B)\) elements at a time
block transfer is a common architectural phenomenon
I/O algorithms borrowed from PRAM
I/O Model
aka external memory model
input size: \(N\)
memory size: \(M\)
blocked storage, block size: \(B\)
fixed cost to read/write one block
almost the same as pre-learning...
......
Searching
store N items (values)
search for a key (value)
ordinary:
-
\(O(\log N)\) search
-
\(O(\log N+K)\) report values in an interval
static, blocking:
- \(O(\log n)\) search
- \(O(\log n+K/B)\) report values in an interval
Dynamic Search Tree
like Splay tree.
rotation destroys blocking!
B-Tree
(a,b)-tree (a>=2, b>=2*a-1)
- completely balanced
- leaves store between a and b values
- except root, degrees between a and b
- root degree between 2 and b
- ordered like a search tree
insertion
- search and insert in a leaf v
- if v not overflowing done!
- else split v
split
- split v into two
- add a key to parent of v
- recurse into parent of v
deletion
- search and delete from a leaf v
- if v not underflowing done!
- else re-balance b
Buffer Trees
idea: to be lazy