xen: use spin_lock_nest_lock when pinning a pagetable
When pinning/unpinning a pagetable with split pte locks, we can end up
holding multiple pte locks at once (we need to hold the locks while
there's a pending batched hypercall affecting the pte page). Because
all the pte locks are in the same lock class, lockdep thinks that
we're potentially taking a lock recursively.
This warning is spurious because we always take the pte locks while
holding mm->page_table_lock. lockdep now has spin_lock_nest_lock to
express this kind of dominant lock use, so use it here so that lockdep
knows what's going on.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
The balloon driver doesn't have any externally callable functions at
the moment, so remove the (effectively empty) header. We can add it
back if we need to.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
There are a couple of Xen features which rely on directly accessing
per-cpu data via a segment register, which is not yet available on
x86-64. In the meantime, just disable direct access to the vcpu info
structure; this leaves some of the code as dead, but it will come to
life in time, and the warnings are suppressed.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
kernel/time/tick-common.c: In function ‘tick_setup_periodic’:
kernel/time/tick-common.c:113: error: implicit declaration of function ‘tick_broadcast_oneshot_active’
Thomas Gleixner [Mon, 22 Sep 2008 17:02:25 +0000 (19:02 +0200)]
x86: prevent C-states hang on AMD C1E enabled machines
Impact: System hang when AMD C1E machines switch into C2/C3
AMD C1E enabled systems do not work with normal ACPI C-states
even if the BIOS is advertising them. Limit the C-states to
C1 for the ACPI processor idle code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Mon, 22 Sep 2008 17:04:02 +0000 (19:04 +0200)]
clockevents: prevent mode mismatch on cpu online
Impact: timer hang on CPU online observed on AMD C1E systems
When a CPU is brought online then the broadcast machinery can
be in the one shot state already. Check this and setup the timer
device of the new CPU in one shot mode so the broadcast code
can pick up the next_event value correctly.
Another AMD C1E oddity, as we switch to broadcast immediately and
not after the full bring up via the ACPI cpu idle code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Mon, 22 Sep 2008 17:02:25 +0000 (19:02 +0200)]
clockevents: check broadcast device not tick device
Impact: Possible hang on CPU online observed on AMD C1E machines.
The broadcast setup code looks at the mode of the tick device to
determine whether it needs to be shut down or setup. This is wrong
when the broadcast mode is set to one shot already. This can happen
when a CPU is brought online as it goes through the periodic setup
first.
The problem went unnoticed as sane systems do not call into that code
before the switch to one shot for the clock event device happens.
The AMD C1E idle routine switches over immediately and thereby shuts
down the just setup device before the first interrupt happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Mon, 22 Sep 2008 16:56:01 +0000 (18:56 +0200)]
clockevents: prevent stale tick_next_period for onlining CPUs
Impact: possible hang on CPU onlining in timer one shot mode.
The tick_next_period variable is only used during boot on nohz/highres
enabled systems, but for CPU onlining it needs to be maintained when
the per cpu clock events device operates in one shot mode.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Mon, 22 Sep 2008 16:54:29 +0000 (18:54 +0200)]
x86: prevent stale state of c1e_mask across CPU offline/online
Impact: hang which happens across CPU offline/online on AMD C1E systems.
When a CPU goes offline then the corresponding bit in the broadcast
mask is cleared. For AMD C1E enabled CPUs we do not reenable the
broadcast when the CPU comes online again as we do not clear the
corresponding bit in the c1e_mask, which keeps track which CPUs
have been switched to broadcast already. So on those !$@#& machines
we never switch back to broadcasting after a CPU offline/online cycle.
Clear the bit when the CPU plays dead.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Thomas Gleixner [Mon, 22 Sep 2008 16:46:37 +0000 (18:46 +0200)]
clockevents: prevent cpu online to interfere with nohz
Impact: rare hang which can be triggered on CPU online.
tick_do_timer_cpu keeps track of the CPU which updates jiffies
via do_timer. The value -1 is used to signal, that currently no
CPU is doing this. There are two cases, where the variable can
have this state:
boot:
necessary for systems where the boot cpu id can be != 0
nohz long idle sleep:
When the CPU which did the jiffies update last goes into
a long idle sleep it drops the update jiffies duty so
another CPU which is not idle can pick it up and keep
jiffies going.
Using the same value for both situations is wrong, as the CPU online
code can see the -1 state when the timer of the newly onlined CPU is
setup. The setup for a newly onlined CPU goes through periodic mode
and can pick up the do_timer duty without being aware of the nohz /
highres mode of the already running system.
Use two separate states and make them constants to avoid magic
numbers confusion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Atsushi Nemoto [Tue, 5 Aug 2008 14:45:14 +0000 (23:45 +0900)]
[MIPS] vmlinux.lds.S: handle .text.*
The -ffunction-sections puts each text in .text.function_name section.
Without this patch, most functions are placed outside _text..._etext
area and it breaks show_stacktrace(), etc.
Akinobu Mita [Sat, 13 Sep 2008 10:03:32 +0000 (19:03 +0900)]
mmc_test: initialize mmc_test_lock statically
The mutex mmc_test_lock is initialized at every time mmc_test device
is probed. Probing another mmc_test device may break the mutex, if
the probe function is called while the mutex is locked.
This patch fixes it by statically initializing mmc_test_lock.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
We used to store a binary register snapshot in the "regs" file, so we
set the file size to be the size of this snapshot. This is no longer
valid since we switched to using seq_file.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
The debugfs hook atmci_regs_show allocates a temporary buffer for
storing a register snapshot, but it doesn't free it before returning.
Plug this leak.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Andrew Paprocki [Sat, 20 Sep 2008 08:25:19 +0000 (10:25 +0200)]
hwmon: (it87) Fix fan tachometer reading in IT8712F rev 0x7 (I)
The IT8712F v0.9.1 datasheet applies to revisions >= 0x8 (J).
The driver was incorrectly attempting to enable 16-bit fan
readings on rev 0x7 (I) which led to incorrect RPM values.
Signed-off-by: Andrew Paprocki <andrew@ishiboo.com> Tested-by: John Gumb <john.gumb@tandberg.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Commit 4611a77 ("[IA64] fix compile failure with non modular builds")
introduced struct fdesc into asm/elf.h, which duplicates KVM's definition.
Remove the latter to avoid the build error.
Signed-off-by: Jes Sorensen <jes@sgi.com> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* git://oss.sgi.com:8090/xfs/linux-2.6:
[XFS] Don't do I/O beyond eof when unreserving space
[XFS] Fix use-after-free with buffers
[XFS] Prevent lockdep false positives when locking two inodes.
[XFS] Fix barrier status change detection.
[XFS] Prevent direct I/O from mapping extents beyond eof
[XFS] Fix regression introduced by remount fixup
[XFS] Move memory allocations for log tracing out of the critical path
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop()
RDMA/nes: Fix client side QP destroy
IB/mlx4: Fix up fast register page list format
mlx4_core: Set RAE and init mtt_sz field in FRMR MPT entries
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fix deadlock in setting scheduler parameter to zero
sched: fix 2.6.27-rc5 couldn't boot on tulsa machine randomly
Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clockevents: make device shutdown robust
clocksource, acpi_pm.c: fix check for monotonicity
clockevents: remove WARN_ON which was used to gather information
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
Fix compile failure with non modular builds
powerpc: Holly board needs dtbImage target
powerpc: Fix interrupt values for DMA2 in MPC8610 HPCD device tree
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5255/1: Update jornada ssp to remove build errors/warnings
[ARM] omap: back out 'internal_clock' support
[ARM] 5249/1: davinci: remove redundant check in davinci_psc_config()
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix SMP bootup with CONFIG_STACK_DEBUG or ftrace.
sparc64: Fix OOPS in psycho_pcierr_intr_other().
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
e100: Use pci_pme_active to clear PME_Status and disable PME#
e1000: prevent corruption of EEPROM/NVM
forcedeth: call restore mac addr in nv_shutdown path
bnx2: Promote vector field in bnx2_irq structure from u16 to unsigned int
sctp: Fix oops when INIT-ACK indicates that peer doesn't support AUTH
sctp: do not enable peer features if we can't do them.
sctp: set the skb->ip_summed correctly when sending over loopback.
udp: Fix rcv socket locking
Manfred Spraul [Wed, 20 Aug 2008 13:39:59 +0000 (15:39 +0200)]
avr32: nmi_enter() without nmi_exit()
While updating the rcu code, I noticed that do_nmi() for AVR32 is odd:
There is an nmi_enter() call without an nmi_exit().
This can't be correct, it breaks rcu (at least the preempt version) and
lockdep.
[haavard.skinnemoen@atmel.com: fixed another case that returned directly] Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
On AVR32, all parameters beyond the 5th are passed on the stack. System
calls don't use the stack -- they borrow a callee-saved register
instead. This means that syscalls that take 6 parameters must be called
through a stub that pushes the last parameter on the stack.
This patch adds a stub for sync_file_range syscall on AVR32
architecture. Tested with uClibc snapshot.
md: Don't wait UNINTERRUPTIBLE for other resync to finish
When two md arrays share some block device (e.g each uses different
partitions on the one device), a resync of one array will wait for
the resync on the other to finish.
This can be a long time and as it currently waits TASK_UNINTERRUPTIBLE,
the softlockup code notices and complains.
So use TASK_INTERRUPTIBLE instead and make sure to flush signals
before calling schedule.
e100: Use pci_pme_active to clear PME_Status and disable PME#
Currently e100 uses pci_enable_wake() to clear pending wake-up events
and disable PME# during intitialization, but that function is not
suitable for this purpose, because it immediately returns error code
if device_may_wakeup() returns false for given device.
Make e100 use pci_pme_active(), which carries out exactly the
required operations, instead.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Christopher Li [Fri, 5 Sep 2008 21:04:05 +0000 (14:04 -0700)]
e1000: prevent corruption of EEPROM/NVM
Andrey reports e1000 corruption, and that a patch in vmware's ESX fixed
it.
The EEPROM corruption is triggered by concurrent access of the EEPROM
read/write. Putting a lock around it solve the problem.
[akpm@linux-foundation.org: use DEFINE_SPINLOCK to avoid confusing lockdep] Signed-off-by: Christopher Li <chrisl@vmware.com> Reported-by: Andrey Borzenkov <arvidjaar@mail.ru> Cc: Zach Amsden <zach@vmware.com> Cc: Pratap Subrahmanyam <pratap@vmware.com> Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Bruce Allan <bruce.w.allan@intel.com> Cc: PJ Waskiewicz <peter.p.waskiewicz.jr@intel.com> Cc: John Ronciak <john.ronciak@intel.com> Cc: Jeff Garzik <jeff@garzik.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Yinghai Lu [Sat, 13 Sep 2008 20:10:31 +0000 (13:10 -0700)]
forcedeth: call restore mac addr in nv_shutdown path
after
| commit f735a2a1a4f2a0f5cd823ce323e82675990469e2
| Author: Tobias Diedrich <ranma+kernel@tdiedrich.de>
| Date: Sun May 18 15:02:37 2008 +0200
|
| [netdrvr] forcedeth: setup wake-on-lan before shutting down
|
| When hibernating in 'shutdown' mode, after saving the image the suspend hook
| is not called again.
| However, if the device is in promiscous mode, wake-on-lan will not work.
| This adds a shutdown hook to setup wake-on-lan before the final shutdown.
|
| Signed-off-by: Tobias Diedrich <ranma+kernel@tdiedrich.de>
| Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
my servers with nvidia ck804 and mcp55 will reverse mac address with kexec.
it turns out that we need to restore the mac addr in nv_shutdown().
[akpm@linux-foundation.org: fix typo in printk] Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de> Cc: Ayaz Abdulla <aabdulla@nvidia.com> Cc: Jeff Garzik <jeff@garzik.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Benjamin Li [Thu, 18 Sep 2008 23:46:11 +0000 (16:46 -0700)]
bnx2: Promote vector field in bnx2_irq structure from u16 to unsigned int
The bnx2 driver stores/uses the irq value from the pci_dev internally.
But when it stores the irq value, it has been performing an
integer demotion. Because of the recent changes made to
arch/x86/kernel/io_apic.c, the new method in creating the irq value
(using build_irq_for_pci_dev()) has exposed this bug on x86 systems.
Because of this demotion when calling request_irq() from
bnx2_request_irq(), the driver would get a return code of -EINVAL.
This is because the kernel could not find the requested irq descriptor.
By storing the irq value properly, the kernel can find the correct
irq descriptor and the bnx2 driver can operate normally.
Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: Fix oops when INIT-ACK indicates that peer doesn't support AUTH
If INIT-ACK is received with SupportedExtensions parameter which
indicates that the peer does not support AUTH, the packet will be
silently ignore, and sctp_process_init() do cleanup all of the
transports in the association.
When T1-Init timer is expires, OOPS happen while we try to choose
a different init transport.
The solution is to only clean up the non-active transports, i.e
the ones that the peer added. However, that introduces a problem
with sctp_connectx(), because we don't mark the proper state for
the transports provided by the user. So, we'll simply mark
user-provided transports as ACTIVE. That will allow INIT
retransmissions to work properly in the sctp_connectx() context
and prevent the crash.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
sctp: do not enable peer features if we can't do them.
Do not enable peer features like addip and auth, if they
are administratively disabled localy. If the peer resports
that he supports something that we don't, neither end can
use it so enabling it is pointless. This solves a problem
when talking to a peer that has auth and addip enabled while
we do not. Found by Andrei Pelinescu-Onciul <andrei@iptel.org>.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[ARM] 5255/1: Update jornada ssp to remove build errors/warnings
* Adds ssp functions into header so we don't get
"implicit declaration" error at builtime.
* Converts jornada_ssp_start/end functions into voids with
proper declarations (to avoid "prototype..." warning).
* Sorts include files in alphabetical order
* Minor comment changes
Signed-off-by: Kristoffer Ericson <Kristoffer.Ericson@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
sctp: set the skb->ip_summed correctly when sending over loopback.
Loopback used to clobber the ip_summed filed which sctp then used
to figure out if it needed to do checksumming or not. Now that
loopback doesn't do that any more, sctp needs to set the ip_summed
field correctly.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Wed, 17 Sep 2008 19:58:11 +0000 (20:58 +0100)]
[ARM] omap: back out 'internal_clock' support
The structures weren't ready for this change:
arch/arm/plat-omap/devices.c:320: error: 'struct omap_mmc_conf' has no member named 'internal_clock'
arch/arm/plat-omap/devices.c:326: error: implicit declaration of function 'omap_ctrl_readl'
arch/arm/plat-omap/devices.c:326: error: 'OMAP2_CONTROL_DEVCONF0' undeclared (first use in this function)
arch/arm/plat-omap/devices.c:328: error: implicit declaration of function 'omap_ctrl_writel'
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
James Bottomley [Tue, 9 Sep 2008 14:04:18 +0000 (14:04 +0000)]
Fix compile failure with non modular builds
Commit deac93df26b20cf8438339b5935b5f5643bc30c9 ("lib: Correct printk
%pF to work on all architectures") broke the non modular builds by
moving an essential function into modules.c. Fix this by moving it
out again and into asm/sections.h as an inline. To do this, the
definition of struct ppc64_opd_entry has been lifted out of modules.c
and put in asm/elf.h where it belongs.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
One of the changes in the bootwrapper makefile introduced the dtbImage
targets for boards that need a simple zImage with a DTB embedded in
them (595be948cce574ff2d5dde5d0426a636a4363c70, "[POWERPC]
bootwrapper: Build multiple cuImages"). When this was done, it broke
booting on the Holly board as the zImage.holly wrapper did not get the
DTB embedded properly.
This changes the target for the Holly board to a dtbImage so that the
wrapper includes the vmlinux, wrapper bits, and DTB.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
[XFS] Don't do I/O beyond eof when unreserving space
When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.
SGI-PV: 983683
SGI-Modid: xfs-linux-melb:xfs-kern:32055a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.
SGI-PV: 985757
SGI-Modid: xfs-linux-melb:xfs-kern:32025a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
David Chinner [Wed, 17 Sep 2008 06:51:21 +0000 (16:51 +1000)]
[XFS] Prevent lockdep false positives when locking two inodes.
If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.
To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks
at once - force calers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.
SGI-PV: 986238
SGI-Modid: xfs-linux-melb:xfs-kern:31999a
Signed-off-by: David Chinner <david@fromorbit.com> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Peter Leckie <pleckie@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
[XFS] Prevent direct I/O from mapping extents beyond eof
With the help from some tracing I found that we try to map extents beyond
eof when doing a direct I/O read. It appears that the way to inform the
generic direct I/O path (ie do_direct_IO()) that we have breached eof is
to return an unmapped buffer from xfs_get_blocks_direct(). This will cause
do_direct_IO() to jump to the hole handling code where is will check for
eof and then abort.
This problem was found because a direct I/O read was trying to map beyond
eof and was encountering delayed allocations. The delayed allocations
beyond eof are speculative allocations and they didn't get converted when
the direct I/O flushed the file because there was only enough space in the
current AG to convert and write out the dirty pages within eof. Note that
xfs_iomap_write_allocate() wont necessarily convert all the delayed
allocation passed to it - it will return after allocating the first extent
- so if the delayed allocation extends beyond eof then it will stay that
way.
SGI-PV: 983683
SGI-Modid: xfs-linux-melb:xfs-kern:31929a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
Logically we would return an error in xfs_fs_remount code to prevent users
from believing they might have changed mount options using remount which
can't be changed.
But unfortunately mount(8) adds all options from mtab and fstab to the
mount arguments in some cases so we can't blindly reject options, but have
to check for each specified option if it actually differs from the
currently set option and only reject it if that's the case.
Until that is implemented we return success for every remount request, and
silently ignore all options that we can't actually change.
SGI-PV: 985710
SGI-Modid: xfs-linux-melb:xfs-kern:31908a
Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Tim Shimmin <tes@sgi.com> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
[XFS] Move memory allocations for log tracing out of the critical path
Memory allocations for log->l_grant_trace and iclog->ic_trace are done on
demand when the first event is logged. In xlog_state_get_iclog_space() we
call xlog_trace_iclog() under a spinlock and allocating memory here can
cause us to sleep with a spinlock held and deadlock the system.
For the log grant tracing we use KM_NOSLEEP but that means we can lose
trace entries. Since there is no locking to serialize the log grant
tracing we could race and have multiple allocations and leak memory.
So move the allocations to where we initialize the log/iclog structures.
Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace
since it's not even used.
SGI-PV: 983738
SGI-Modid: xfs-linux-melb:xfs-kern:31896a
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Signed-off-by: Christoph Hellwig <hch@infradead.org>
Arjan van de Ven [Mon, 15 Sep 2008 23:43:18 +0000 (16:43 -0700)]
warn: Turn the netdev timeout WARN_ON() into a WARN()
this patch turns the netdev timeout WARN_ON_ONCE() into a WARN_ONCE(),
so that the device and driver names are inside the warning message.
This helps automated tools like kerneloops.org to collect the data
and do statistics, as well as making it more likely that humans
cut-n-paste the important message as part of a bugreport.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Miller [Tue, 16 Sep 2008 22:00:11 +0000 (15:00 -0700)]
Fix PNP build failure, bugzilla #11276
This fill fix the following regression list entry:
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11276
Subject : build error: CONFIG_OPTIMIZE_INLINING=y causes gcc 4.2 to do stupid things
Submitter : Randy Dunlap <randy.dunlap@oracle.com>
Date : 2008-08-06 17:18 (38 days old)
References : http://marc.info/?l=linux-kernel&m=121804329014332&w=4
http://lkml.org/lkml/2008/7/22/353
Handled-By : Bjorn Helgaas <bjorn.helgaas@hp.com>
Patch : http://lkml.org/lkml/2008/7/22/364
with what I believe is a better fix than the one referenced
in the regression entry above.
These PNP header interfaces try to work in such a way that
you can reference some of them even if PNP is not enabled,
and the compiler was expected to optimize everything away.
Which is mostly fine, except that there was one interface
for which there was not provided an inline "NOP" implementation.
Once we add that, all of these compile failures cannot handle
any more.
pnp: Provide NOP inline implementation of pnp_get_resource() when !PNP
Fixes kernel bugzilla #11276.
Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This fixes kernel regression for 2.6.27-rc in
http://bugzilla.kernel.org/show_bug.cgi?id=11547
The change to split 8390 into old isa and non-isa versions
overlooked this driver.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit bc19d6e0b74ef03a3baf035412c95192b54dfc6f, which as
Larry Finger reports causes the radio LED on his system to no longer
respond to rfkill switch events.
Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Requested-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Timur Tabi [Tue, 9 Sep 2008 19:43:39 +0000 (14:43 -0500)]
powerpc: Fix interrupt values for DMA2 in MPC8610 HPCD device tree
For Freescale 8xxx devices that use an MPIC, the interrupt numbers in
the device tree must be 16 greater than the values documented in the
reference manual. In these chips, the MPIC is wired to use the first
16 numbers for external interrupts, but the documentation numbers
internal interrupts from 0.
In the MPC8610 HPCD device tree, the interrupt properties for the DMA
channels for DMA2 were not the adjusted values. This fixes that.
Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Thomas Gleixner [Tue, 16 Sep 2008 18:32:50 +0000 (11:32 -0700)]
clockevents: make device shutdown robust
The device shut down does not cleanup the next_event variable of the
clock event device. So when the device is reactivated the possible
stale next_event value can prevent the device to be reprogrammed as it
claims to wait on a event already.
This is the root cause of the resurfacing suspend/resume problem,
where systems need key press to come back to life.
Fix this by setting next_event to KTIME_MAX when the device is shut
down. Use a separate function for shutdown which takes care of that
and only keep the direct set mode call in the broadcast code, where we
can not touch the next_event value.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop()
Taking rtnl_lock in ipoib_mcast_join_complete() causes a deadlock with
ipoib_stop(). We avoid it by scheduling the piece of code that takes
the lock on ipoib_workqueue instead of executing it directly. This
works because we only flush the ipoib_workqueue with the RTNL not held.
The deadlock happens because ipoib_stop() calls ipoib_ib_dev_down()
which calls ipoib_mcast_dev_flush(), which calls ipoib_mcast_free(),
which calls ipoib_mcast_leave(). The latter calls
ib_sa_free_multicast(), and this waits until the multicast completion
handler finishes. This handler is ipoib_mcast_join_complete(), which
waits for the rtnl_lock(), which was already taken by ipoib_stop().
This bug was introduced in commit a77a57a1 ("IPoIB: Fix deadlock on
RTNL in ipoib_stop()").
Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Fix QP not being destroyed properly on the client, which leads to
userspace programs hanging on exit. This is a missing chunk from the
connection management rewrite in commit 6492cdf3 ("RDMA/nes: CM
connection setup/teardown rework").
Signed-off-by: Faisal Latif <flatif@neteffect.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Stefan Weinhuber [Tue, 16 Sep 2008 16:32:19 +0000 (09:32 -0700)]
[S390] cio: fix orb initialization in cio_start_key
The functions cio_tm_start_key and cio_start_key use the same private
orb structure of a subchannel, so the orb needs to be cleared of old
data before it is used again. A respective memset is missing from
cio_start_key and hereby added.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The problem is that dev->driver_data for a ccw device is NULL,
while it should point to the ccwgroup device it is a member of.
This happened due to incorrect cleanup if creating a ccwgroup
device failed because the ccw devices were already grouped.
Fix this by setting cdev[i] to NULL in the error handling of
ccwgroup_create_from_string() after we give up our reference and
by checking if the driver_data points to the ccwgroup device in
ccwgroup_release() just to be really sure.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Byte swap the addresses in the page list for fast register work requests
to big endian to match what the HCA expectx. Also, the addresses must
have the "present" bit set so that the HCA knows it can access them.
Otherwise the HCA will fault the first time it accesses the memory
region.
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Herbert Xu [Mon, 15 Sep 2008 18:48:46 +0000 (11:48 -0700)]
udp: Fix rcv socket locking
The previous patch in response to the recursive locking on IPsec
reception is broken as it tries to drop the BH socket lock while in
user context.
This patch fixes it by shrinking the section protected by the
socket lock to sock_queue_rcv_skb only. The only reason we added
the lock is for the accounting which happens in that function.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Kim Phillips [Sun, 14 Sep 2008 20:41:19 +0000 (13:41 -0700)]
crypto: talitos - Avoid consecutive packets going out with same IV
The SEC's h/w IV out implementation DMAs the trailing encrypted payload
block of the last encryption to ctx->iv. Since the last encryption may
still be pending completion, we can sufficiently prevent successive
packets from being transmitted with the same IV by xoring with sequence
number.
Also initialize alg_list earlier to prevent oopsing on a failed probe.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
PFN_PHYS() can truncate large addresses unless its passed a suitable
large type. This is fixed more generally in the patch series
introducing phys_addr_t, but we need a short-term fix to solve a
Xen regression reported by Roberto De Ioris.
Reported-by: Roberto De Ioris <roberto@unbit.it> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] Fix PCI_DMA_BUS_IS_PHYS for ARM
[ARM] 5247/1: tosa: SW_EAR_IN support
[ARM] 5246/1: tosa: add proper clock alias for tc6393xb clock
[ARM] 5245/1: Fix warning about unused return value in drivers/pcmcia
[ARM] OMAP: Fix MMC device data
imx serial: fix rts handling for non imx1 based hardware
imx serial: set RXD mux bit on i.MX27 and i.MX31
i.MX serial: fix init failure
pcm037: add rts/cts support for serial port