From: Nathan Lynch Date: Thu, 2 Jun 2005 22:15:09 +0000 (-0500) Subject: [SCSI] fix slab corruption during ipr probe X-Git-Tag: v2.6.12-rc6~9^2 X-Git-Url: http://pilppa.com/gitweb/?a=commitdiff_plain;h=c92715b3c22e94105a8fd9e4a23047d05c5077e7;p=linux-2.6-omap-h63xx.git [SCSI] fix slab corruption during ipr probe With CONFIG_DEBUG_SLAB=y I see slab corruption messages during boot on pSeries machines with IPR adapters with any 2.6.12-rc kernel. The change which seems to have introduced the problem is "SCSI: revamp target scanning routines" and may be found at: http://marc.theaimsgroup.com/?l=bk-commits-head&m=111093946426333&w=2 In order to revert that in a 2.6.12-rc1 tree, I had to revert "target code updates to support scanned targets" first: http://marc.theaimsgroup.com/?l=bk-commits-head&m=111094132524649&w=2 With both patches reverted, the corruption messages go away. ipr: IBM Power RAID SCSI Device Driver version: 2.0.13 (February 21, 2005) ipr 0001:d0:01.0: Found IOA with IRQ: 167 ipr 0001:d0:01.0: Starting IOA initialization sequence. ipr 0001:d0:01.0: Adapter firmware version: 020A005C ipr 0001:d0:01.0: IOA initialized. scsi0 : IBM 570B Storage Adapter Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770 Type: Enclosure ANSI SCSI revision: 02 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM H0 Model: HUS103036FL3800 Rev: RPQF Type: Direct-Access ANSI SCSI revision: 04 Vendor: IBM Model: VSBPD4E1 U4SCSI Rev: 4770 Type: Enclosure ANSI SCSI revision: 02 Slab corruption: start=c0000001e8de5268, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [](.scsi_target_dev_release+0x28/0x50) 080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a Prev obj: start=c0000001e8de5050, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [<0000000000000000>](0x0) 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Next obj: start=c0000001e8de5480, len=512 Redzone: 0x170fc2a5/0x170fc2a5. Last user: [](.as_init_queue+0x5c/0x228) 000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00 010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98 Slab corruption: start=c0000001e8de5268, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [](.scsi_target_dev_release+0x28/0x50) 080: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6a Prev obj: start=c0000001e8de5050, len=512 Redzone: 0x5a2cf071/0x5a2cf071. Last user: [<0000000000000000>](0x0) 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b Next obj: start=c0000001e8de5480, len=512 Redzone: 0x170fc2a5/0x170fc2a5. Last user: [](.as_init_queue+0x5c/0x228) 000: c0 00 00 01 e8 83 26 08 00 00 00 00 00 00 00 00 010: 00 00 00 00 00 00 00 00 c0 00 00 01 e8 de 54 98 ... I did some digging and the problem seems to be a refcounting issue in __scsi_add_device. The target gets freed in scsi_target_reap, and then __scsi_add_device tries to do another device_put on it. Signed-off-by: Nathan Lynch Signed-off-by: James Bottomley --- diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c index cca772624ae..8d0d302844a 100644 --- a/drivers/scsi/scsi_scan.c +++ b/drivers/scsi/scsi_scan.c @@ -1197,6 +1197,7 @@ struct scsi_device *__scsi_add_device(struct Scsi_Host *shost, uint channel, if (!starget) return ERR_PTR(-ENOMEM); + get_device(&starget->dev); down(&shost->scan_mutex); res = scsi_probe_and_add_lun(starget, lun, NULL, &sdev, 1, hostdata); if (res != SCSI_SCAN_LUN_PRESENT)