From: Alan Stern
Date: Wed, 30 Mar 2005 20:05:45 +0000 (-0500)
Subject: [SCSI] return success after retries in scsi_eh_tur
X-Git-Tag: v2.6.14-rc1~522^2^2~5
X-Git-Url: http://pilppa.com/gitweb/?a=commitdiff_plain;h=e47373ec1c9aab9ee134f4e2b8249957e9f4c7ef;p=linux-2.6-omap-h63xx.git

[SCSI] return success after retries in scsi_eh_tur

The problem lies in the way the error handler uses TEST UNIT READY to
tell whether error recovery has succeeded.  The scsi_eh_tur function
gives up after one round of retrying; after that it decides that more
error recovery is needed.

However TUR is liable to report sense data indicating a retry is needed
when in fact error recovery has succeeded.  A typical example might be
SK=2, ASC=4, ASCQ=1 (Logical unit in process of becoming ready).  The
mere fact that we were able to get a sensible reply to the TUR should
indicate that the device is working well enough to stop error recovery.

I ran across a case back in January where this happened.  A CD-ROM drive
timed out the INQUIRY command, and a device reset fixed the blockage.
But then the drive kept responding with 2/4/1 -- because it was spinning
up I suppose -- until the error handler gave up and placed it offline.
If the initial INQUIRY had received the 2/4/1 instead, everything would
have worked okay.  It doesn't seem reasonable for things to fail just
because the error handler had started running.

Signed-off-by: Alan Stern
Signed-off-by: James Bottomley
---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index e9c451ba71f..688bce74078 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -776,9 +776,11 @@ retry_tur:
 		__FUNCTION__, scmd, rtn));
 	if (rtn == SUCCESS)
 		return 0;
-	else if (rtn == NEEDS_RETRY)
+	else if (rtn == NEEDS_RETRY) {
 		if (retry_cnt--)
 			goto retry_tur;
+		return 0;
+	}
 	return 1;
 }
 
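
The behavioural change in the hunk above can be illustrated outside the kernel
with a minimal user-space sketch. It is not the kernel code: the enum values,
struct fake_device, send_tur(), and both eh_tur_* functions are hypothetical
stand-ins, and the single retry (retry_cnt = 1) is an assumption made for the
illustration. Only the control flow after the TUR result mirrors the diff.

```c
#include <stddef.h>

/* Illustrative stand-ins for the kernel's disposition codes. */
enum eh_disposition { EH_SUCCESS, EH_NEEDS_RETRY, EH_FAILED };

/* Hypothetical scripted device: replies[i] is the outcome of the i-th TUR. */
struct fake_device {
	const enum eh_disposition *replies;
	size_t nreplies;
	size_t next;
};

/* Pretend to send TEST UNIT READY; an exhausted script counts as a failure. */
static enum eh_disposition send_tur(struct fake_device *dev)
{
	if (dev->next < dev->nreplies)
		return dev->replies[dev->next++];
	return EH_FAILED;
}

/* Pre-patch flow: running out of retries falls through to "needs more recovery". */
static int eh_tur_original(struct fake_device *dev)
{
	int retry_cnt = 1;	/* assumed retry budget for this sketch */
	enum eh_disposition rtn;

retry_tur:
	rtn = send_tur(dev);
	if (rtn == EH_SUCCESS)
		return 0;
	else if (rtn == EH_NEEDS_RETRY)
		if (retry_cnt--)
			goto retry_tur;
	return 1;
}

/* Post-patch flow: a device that keeps answering "retry" is treated as recovered. */
static int eh_tur_patched(struct fake_device *dev)
{
	int retry_cnt = 1;	/* assumed retry budget for this sketch */
	enum eh_disposition rtn;

retry_tur:
	rtn = send_tur(dev);
	if (rtn == EH_SUCCESS)
		return 0;
	else if (rtn == EH_NEEDS_RETRY) {
		if (retry_cnt--)
			goto retry_tur;
		return 0;	/* retries exhausted, but the device did respond */
	}
	return 1;
}
```

With a device that answers every TUR with a retry indication (the 2/4/1
spinning-up case), the original flow reports that more recovery is needed
while the patched flow reports success; a device that fails outright still
yields 1 in both.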