The doorbell writes may be seen out of order by the firmware if they
are in WC memory since the tx spin(un)lock does not flush WC writes.
Hence if the "stop" is written on a different CPU than the "go", it
is possible that the stop will arrive after the go unless we add an
explicit memory barrier (and mmiowb() is not enough).
It fixes transmit hangs in multi tx queue mode.
Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>