1.5
---

Revised loop structure, improved local variable usage.
Added better optimization flags.
Ported to the T3E; single precision doesn't work there.
Changed data collection to doubles only, as that's all anyone
cares about.

1.4
---

Added -l flag to hold LDA constant.
Added -d flag to report rank instead of bytes.

1.3
---

Added MFlops/sec option.
Fixed iteration count bug.
Added REGISTERs where appropriate.

1.2
---

Streamlined graphing.
Added comparison and vendor BLAS graphs.
Fixed Linux build.
Fixed SGI PCA and O2K build.
Set default iteration count to 1.

1.1
---

Added build support for Solaris and HPUX with native BLAS.
Added output to tty.
Added new configure makefile structure.
Added -c option to keep iteration count the same.

1.0
---

First version.

