OMAP2/3 clock: convert remaining MPU barriers into OCP barriers

Several parts of the OMAP2/3 clock code use wmb() to try to ensure
that a hardware write completes before continuing. This approach is
problematic: wmb() only ensures that the write leaves the ARM. It
does not ensure that the write actually reaches the endpoint device.
The endpoint device in this case (the PRM, CM, or SCM) is three
interconnects away from the ARM, and the final interconnect is
low-speed. The OCP interconnects also post the write, with no
guarantee of when it will complete. So wmb() is not what we want.
Worse, wmb() is indiscriminate: it causes the ARM to flush any other
unrelated buffered writes and wait for the local interconnect to
acknowledge them, which is potentially very expensive.

Fix this by converting the wmb()s into readbacks of the same PRM/CM/SCM
register. Since the PRM/CM/SCM devices use a single OCP thread, the
read causes the MPU to block until posted writes to that device have
completed.
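
As an illustrative sketch only (not the actual patch), the conversion
has roughly the shape below. The register is simulated with a plain
variable so that the snippet builds in user space; reg_write_flush()
and fake_cm_reg are hypothetical names, and kernel code would use the
platform's raw MMIO accessors instead of pointer dereferences:

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped PRM/CM/SCM register. */
static volatile uint32_t fake_cm_reg;

/*
 * Write a register, then read the same register back.  On the real
 * hardware the read is expected to stall until earlier posted writes
 * on the same OCP thread have reached the device, so the caller
 * blocks until the write has landed.  In this user-space simulation
 * it only demonstrates the write-then-readback pattern.
 */
static uint32_t reg_write_flush(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;	/* may be posted by the interconnect */
	return *reg;	/* OCP barrier: read back the same register */
}
```

Unlike wmb(), the readback targets only the device that was written,
so unrelated buffered writes are not forced out along with it.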

Signed-off-by: Paul Walmsley <paul@pwsan.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>