Changes
* update only sections
* small CPU core to execute short programs?
== Components ==
=== Memory ===
The external SRAM needs to be fast enough to serve all reads within one
cycle. Per connected matrix we need to read 6 bytes per cycle. With a
single 8-bit-wide chip and a 25MHz cycle rate, this would require us to
run the SRAM at 150MHz (6 x 25MHz), which is infeasible.
If we use a separate chip per matrix block (6 chips, or 3 chips with
16-bit width), all 6 bytes are read in parallel and we can still
operate at 25MHz.
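
The arithmetic behind these figures can be checked with a short sketch. The function name and parameters are hypothetical, and it assumes every chip delivers one full bus word per access and that the 6 bytes are spread evenly across the chips:

```python
def required_sram_clock_mhz(bytes_per_cycle, base_clock_mhz, chips, bus_width_bits):
    """Return the SRAM clock needed to fetch bytes_per_cycle in one base cycle.

    Assumption (not from the notes): each chip returns one full bus word
    per access, and all chips are accessed in parallel.
    """
    bytes_per_access = chips * (bus_width_bits // 8)
    # Ceiling division: number of sequential accesses per base cycle.
    accesses_per_cycle = -(-bytes_per_cycle // bytes_per_access)
    return accesses_per_cycle * base_clock_mhz


# One 8-bit chip: six sequential accesses per 25MHz cycle.
single_chip = required_sram_clock_mhz(6, 25, chips=1, bus_width_bits=8)
# Six 8-bit chips, or three 16-bit chips: one access per cycle.
six_chips = required_sram_clock_mhz(6, 25, chips=6, bus_width_bits=8)
three_wide_chips = required_sram_clock_mhz(6, 25, chips=3, bus_width_bits=16)
```

Under these assumptions the single-chip case comes out at 150MHz, while both multi-chip layouts stay at 25MHz.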
Concurrent writes need to be possible, but at a lower rate. Either
buffer reads and slip in writes when there is slack (complicated), or
alternate between read and write slots at a higher clock speed (DDR?).
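
The second option can be pictured as a fixed slot schedule. The following is only a sketch under assumptions not stated in the notes: one chip per matrix block (so a read takes a single access), and the SRAM clocked at twice the 25MHz rate so that each cycle has one read slot followed by one write slot. All names are hypothetical:

```python
def sram_clock_mhz(base_clock_mhz, slots_per_cycle=2):
    """SRAM clock needed when each base cycle is split into access slots."""
    return base_clock_mhz * slots_per_cycle


def slot_schedule(base_cycles, slots_per_cycle=2):
    """Return (cycle, slot, operation) tuples for a fixed interleaving.

    Assumption: slot 0 of every base cycle is reserved for the read,
    and the remaining slots are available for writes.
    """
    schedule = []
    for cycle in range(base_cycles):
        schedule.append((cycle, 0, "read"))
        for slot in range(1, slots_per_cycle):
            schedule.append((cycle, slot, "write"))
    return schedule


clock = sram_clock_mhz(25)   # SRAM clock for one read + one write slot
plan = slot_schedule(2)      # two base cycles worth of slots
```

A fixed interleaving like this avoids the buffering logic of the first option, at the cost of doubling the SRAM clock even when no write is pending.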