see http://lists.libre-riscv.org/pipermail/libre-riscv-dev/2020-March/005482.html

links recommended by staf:
https://en.wikipedia.org/wiki/Clock_gating
https://www.allaboutcircuits.com/technical-articles/use-of-clock-gating-to-reduce-power-consumption/
> The principle is that you save power by not clocking the parts of the circuit
> that don't have to do any computing. I think this could be a more
> general way to only enable the stages in your pipeline who actually
> are doing computation.

ok so if i understand this correctly:

* the clock still runs at 1600mhz
* the clock runs a cyclic shift-register of length equal to the
  number of stages, at 1600 mhz.
* only every *alternate* one of those elements in the shift register
  is enabled (or, if you want full speed, all of them).
* through EnableInserter each stage is clocked by a *different* bit
  in the shifted-register

> That said I think this feature does not fit in the MVP scope of the October
> prototype so that chip should IMO not use clock gating nor the pass-through
> register feature from the original discussion.

no, i agree, and, more to the point, we don't need it for the 180nm ASIC
(except perhaps to test the concept).

one thing that we have is, the use of OO python has the entirety of
the stages themselves *completely* separated firmly behind a
general-purpose API, where the construction of pipelines, from those
stages, using entirely different pipeline techniques, is *literally* a
one-line change.

so we could conceivably do the *entire* suite of pipelines - convert
them to use this clock gating technique - *literally* in well under a
day, after first experimenting with EnableInserter and a quick and
simple unit test.

re-running the IEEE754 FP unit tests on the other hand... *sigh* :)
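
for illustration only, the shift-register idea in the bullet list above can be modelled in a few lines of plain python. this is a toy behavioural sketch, not nmigen and not the project's pipeline code: the names enable_pattern, rotate and simulate are made up for this example, and it assumes an even number of stages so that the alternating pattern stays consistent around the cyclic wrap.

```python
# Toy model (hypothetical names, plain Python, no HDL library):
# a cyclic shift register with one enable bit per pipeline stage.

def enable_pattern(num_stages, alternate=True):
    """Initial shift-register contents, one enable bit per stage."""
    if alternate:
        # Only every other stage is enabled (assumes num_stages is even).
        return [i % 2 == 0 for i in range(num_stages)]
    # Full speed: every stage is enabled on every cycle.
    return [True] * num_stages

def rotate(bits):
    """One clock tick: bit i moves to position i+1, the last wraps to 0."""
    return [bits[-1]] + bits[:-1]

def simulate(num_stages, cycles, alternate=True):
    """Count how many cycles each stage is enabled over a run."""
    bits = enable_pattern(num_stages, alternate)
    active = [0] * num_stages
    for _ in range(cycles):
        for stage, enabled in enumerate(bits):
            if enabled:
                active[stage] += 1
        bits = rotate(bits)
    return active
```

because the enable bit moves from position i to i+1 on each tick, a stage that is enabled on one cycle hands over to a next stage that is enabled on the following cycle, so the enable travels along with the data. in the alternating case each stage is only active on half of the cycles.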
(In reply to Luke Kenneth Casson Leighton from comment #1)
> > The principle is that you save power by not clocking the parts of the circuit
> > that don't have to do any computing. I think this could be a more
> > general way to only enable the stages in your pipeline who actually
> > are doing computation.
>
> ok so if i understand this correctly:
>
> * the clock still runs at 1600mhz
> * the clock runs a cyclic shift-register of length equal to the
>   number of stages, at 1600 mhz.
> * only every *alternate* one of those elements in the shift register
>   is enabled (or, if you want full speed, all of them).
> * through EnableInserter each stage is clocked by a *different* bit
>   in the shifted-register

Correct, the clock is the pipeline clock. In theory other parts of the CPU could, for example, run at half the clock frequency. This would then naturally only commit a new operation every other cycle at most.

I did not test it, but EnableInserter should work in simulation and on FPGA. Depending on the FPGA you likely won't see the full power improvement, as I think the enabling is implemented as an enable input to each FF and not by gating parts of the clock tree. It will still guarantee that the outputs of the FFs don't change.

As said, implementing clock gating for ASICs will not be a simple task.
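
To make the distinction between an FF enable input and real clock-tree gating concrete, here is a small behavioural sketch in plain Python. It is only a toy model under stated assumptions: EnableFF and clock_edges_seen are hypothetical names invented for this example, and it says nothing about how any particular FPGA or ASIC flow actually implements the enable.

```python
# Toy model (hypothetical names): a flip-flop with an enable input,
# plus a count of the clock edges that actually reach the flip-flop.

class EnableFF:
    """D flip-flop with enable: q only updates when en is true."""

    def __init__(self, reset=0):
        self.q = reset

    def tick(self, d, en):
        # With enable low the stored value is held, so the output
        # does not change even though a clock edge arrived.
        if en:
            self.q = d
        return self.q

def clock_edges_seen(enables, gated):
    """Number of clock edges reaching the FF over a run.

    gated=False models an enable input: every edge still arrives.
    gated=True models a gated clock: only enabled edges arrive.
    """
    if gated:
        return sum(1 for en in enables if en)
    return len(enables)
```

In both cases the output is stable while the enable is low. The difference in this model is only in how many clock edges reach the FF, which is where the extra power saving of true clock gating would come from.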