Michael Ellerman [Wed, 12 Mar 2008 07:03:24 +0000 (18:03 +1100)]
[POWERPC] Fix large hash table allocation on Cell blades
My recent hack to allocate the hash table under 1GB on cell was poorly
tested, *cough*. It turns out on blades with large amounts of memory we
fail to allocate the hash table at all. This is because RTAS has been
instantiated just below 768MB, and 0-x MB are used by the kernel,
leaving no areas that are both large enough and also naturally-aligned.
For the cell IOMMU hack the page tables must be under 2GB, so use that
as the limit instead. This has been tested on real hardware and boots
happily.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
The empty_zero_page symbol is exported by most other architectures
(s390, ia64, x86, um), and an upcoming ext4 patch needs it because
ZERO_PAGE() references empty_zero_page, and we need it to zero out an
unitialized extents in ext4 files.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Fix viodasd driver with scatterlist debug
The iSeries viodasd drivers does some very strange things with
scatterlists, one of these causing a BUG_ON to trigger when
scatterlist debugging is enabled due to initializing the
scatterlist with memset instead of sg_init_table().
This fixes it by using sg_init_table(). The rest of the stuff
it does to that poor list is still pretty awful but it will work.
I may look into fixing things in a nicer way some other time.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Tony Breeds [Tue, 11 Mar 2008 23:48:48 +0000 (10:48 +1100)]
[POWERPC] Fix arch/powerpc/platforms/powermac/pic.c when !CONFIG_ADB_PMU
When building arch/powerpc/platforms/powermac/pic.c when !CONFIG_ADB_PMU
we get the following warnings:
arch/powerpc/platforms/powermac/pic.c: In function 'pmacpic_find_viaint':
arch/powerpc/platforms/powermac/pic.c:623: warning: label 'not_found' defined but not used
This fixes it.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Tony Breeds [Tue, 11 Mar 2008 23:48:48 +0000 (10:48 +1100)]
[POWERPC] Fix drivers/macintosh/mediabay.c when !CONFIG_ADB_PMU
When building drivers/macintosh/mediabay.c if CONFIG_ADB_PMU isn't
defined we get:
drivers/built-in.o: In function `media_bay_step':
mediabay.c:(.text+0x92b84): undefined reference to `pmu_suspend'
mediabay.c:(.text+0x92c08): undefined reference to `pmu_resume'
Create empty place holders in that scenario.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
pmu_sys_suspended is declared extern when:
defined(CONFIG_PM_SLEEP) && defined(CONFIG_PPC32)
but only defined when:
defined(CONFIG_SUSPEND) && defined(CONFIG_PPC32)
which is wrong. Let's fix that.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
on PPC32. The variables aren't wrapped in '#if defined(CONFIG_SUSPEND)'
so we probably shouldn't wrap the exports either. This removes the
CONFIG_SUSPEND part of the export, which fixes compilation on ppc32.
Signed-off-by: Guido Guenther <agx@sigxcpu.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The PMU backlight code would kick in during sleep/resume even on
machines that use a different backlight method. This breaks
sleep on some PowerBooks.
This fixes it by adding a flag to indicate whether the backlight
is controlled by the PMU, and testing that before trying to use
the PMU to turn off the backlight during sleep.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[POWERPC] Fix bogus test for unassigned PCI resources
A bogus test for unassigned resources that came from our 32-bit
PCI code ended up being "merged" by my previous patch series,
breaking some 64-bit setups where devices have legal resources
ending at 0xffffffff.
This fixes it by completely changing the test. We now test for
res->start == 0, as the generic code expects, and we also only
do so on platforms that don't have the PPC_PCI_PROBE_ONLY flag
set, as there are cases of pSeries and iSeries where it could
be a valid value and those can't reassign devices.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Grant Likely [Thu, 21 Feb 2008 18:57:07 +0000 (05:57 +1100)]
[POWERPC] Fix zImage-dtb.initrd build error
The pattern substitution rules were failing when used with zImage-dtb
targets. If zImage-dtb.initrd was selected, the pattern substitution
would generate "zImage.initrd-dtb" instead of "zImage-dtb.initrd" which
caused the build to fail.
This renames zImage-dtb to dtbImage to avoid the problem entirely.
By not using the zImage prefix then is no potential for namespace
collisions.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Geoff Levand <geoffrey.levand@am.sony.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Wed, 12 Mar 2008 22:39:55 +0000 (09:39 +1100)]
[POWERPC] Add __ucmpdi2 for 64-bit comparisons in 32-bit kernels
Some drivers (such as V4L2) have code that causes gcc to generate
calls to __ucmpdi2 when compiling for 32-bit powerpc, which results
in either a link-time error or a module that can't be loaded, as
we don't currently have a __ucmpdi2. This adds one so these drivers
can be used.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
[SCTP]: Fix local_addr deletions during list traversals.
net: fix build with CONFIG_NET=n
[TCP]: Prevent sending past receiver window with TSO (at last skb)
rt2x00: Add new D-Link USB ID
rt2x00: never disable multicast because it disables broadcast too
libertas: fix the 'compare command with itself' properly
drivers/net/Kconfig: fix whitespace for GELIC_WIRELESS entry
[NETFILTER]: nf_queue: don't return error when unregistering a non-existant handler
[NETFILTER]: nfnetlink_queue: fix EPERM when binding/unbinding and instance 0 exists
[NETFILTER]: nfnetlink_log: fix EPERM when binding/unbinding and instance 0 exists
[NETFILTER]: nf_conntrack: replace horrible hack with ksize()
[NETFILTER]: nf_conntrack: add \n to "expectation table full" message
[NETFILTER]: xt_time: fix failure to match on Sundays
[NETFILTER]: nfnetlink_log: fix computation of netlink skb size
[NETFILTER]: nfnetlink_queue: fix computation of allocated size for netlink skb.
[NETFILTER]: nfnetlink: fix ifdef in nfnetlink_compat.h
[NET]: include <linux/types.h> into linux/ethtool.h for __u* typedef
[NET]: Make /proc/net a symlink on /proc/self/net (v3)
RxRPC: fix rxrpc_recvmsg()'s returning of msg_name
net/enc28j60: oops fix
...
Linus Torvalds [Wed, 12 Mar 2008 20:07:12 +0000 (13:07 -0700)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c: chips subdirectory is deprecated
i2c: Keep client->driver and client->dev.driver in sync
i2c-amd756: Fix off-by-one
Linus Torvalds [Wed, 12 Mar 2008 20:04:11 +0000 (13:04 -0700)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Clocksource: Only install r4k counter as clocksource if present.
[MIPS] Lasat: fix LASAT_CASCADE_IRQ
[MIPS] Delete leftovers of old pcspeaker support.
[MIPS] BCM1480: Init pci controller io_map_base
[MIPS] Yosemite: Fix a few more section reference bugs.
[MIPS] Fix yosemite build error
[MIPS] Fix loads of section missmatches
[MIPS] IP27: Tighten up CPU description to fix warnings.
[MIPS] Fix plat_ioremap for JMR3927
[MIPS] Export __ucmpdi2 to modules.
[MIPS] Fix typo in comment
[MIPS] Use KBUILD_DEFCONFIG
[MIPS] Allow 48Hz to be selected if CONFIG_SYS_SUPPORTS_ARBIT_HZ is set.
[MIPS] Added missing cases for rdhwr emulation
[MIPS] Alchemy: Fix ids in Alchemy db dma device table
Bjorn Helgaas [Tue, 11 Mar 2008 21:24:41 +0000 (15:24 -0600)]
PNP: disable PNP motherboard resources that overlap PCI BARs
Some BIOSes have PNP motherboard devices with resources that
partially overlap PCI BARs. The PNP system driver claims these
motherboard resources, which prevents the normal PCI driver from
requesting them later.
This patch disables the PNP resources that conflict with PCI BARs
so they won't be claimed by the PNP system driver.
Of course, this only works if PCI devices have already been enumerated.
Currently this is the case because PCI devices are discovered before
any PNP init via this path:
Bjorn Helgaas [Tue, 11 Mar 2008 21:24:40 +0000 (15:24 -0600)]
PNP: revert Supermicro H8DCE motherboard quirk
There are other systems with similar problems
(http://lkml.org/lkml/2008/1/27/168), so we need a more
generic quirk. Remove the Supermicro-specific one first.
Tom Tucker [Tue, 11 Mar 2008 18:31:40 +0000 (14:31 -0400)]
SVCRDMA: Fix erroneous BUG_ON in send_write
The assertion that checks for sge context overflow is
incorrectly hard-coded to 32. This causes a kernel bug
check when using big-data mounts. Changed the BUG_ON to
use the computed value RPCSVC_MAXPAGES.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tom Tucker [Tue, 11 Mar 2008 18:31:39 +0000 (14:31 -0400)]
SVCRDMA: Add xprt refs to fix close/unmount crash
RDMA connection shutdown on an SMP machine can cause a kernel crash due
to the transport close path racing with the I/O tasklet.
Additional transport references were added as follows:
- A reference when on the DTO Q to avoid having the transport
deleted while queued for I/O.
- A reference while there is a QP able to generate events.
- A reference until the DISCONNECTED event is received on the CM ID
Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Woodhouse [Wed, 12 Mar 2008 16:52:56 +0000 (17:52 +0100)]
Remove <linux/genhd.h> from user-visible headers.
It was all wrapped in '#ifdef CONFIG_BLOCK' anyway, so userspace was
getting nothing useful out of it. And the special #ifndef __KERNEL__
version of 'struct partition' makes me inclined to promote an attitude
of violence...
Stick some comments on some of the #endifs too, while we're at it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current is_vmalloc_addr() users fall in to two camps:
- Determining whether to use vfree()/kfree()
- Whether to do vmlist traversal (only /proc/kcore).
Since we don't support /proc/kcore on nommu, that leaves the
vfree()/kfree() determination use cases. nommu vfree() happens to be a
wrapper to kfree() anyways, so is_vmalloc_addr() can always return 0
and end up with the right behaviour.
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ralf Baechle [Mon, 25 Feb 2008 16:55:29 +0000 (16:55 +0000)]
[MIPS] Allow 48Hz to be selected if CONFIG_SYS_SUPPORTS_ARBIT_HZ is set.
This allows a 48Hz clock to be selected on Malta and other systems. Note
this not normally a sensible option as it results in rather high latencies
for some kernel stuff.
Chris Dearman [Mon, 8 May 2006 17:02:16 +0000 (18:02 +0100)]
[MIPS] Added missing cases for rdhwr emulation
Some of these are architecturally required for R2 processors so lets try
to be bit closer to the real thing. This also provides access to the
CPU cycle timer, even on multiprocessors. In that aspect its currently
bug compatible to what would happen on a R2-based SMP.
Signed-off-by: Chris Dearman <chris@mips.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Wolfgang Ocker [Sun, 10 Feb 2008 19:31:33 +0000 (20:31 +0100)]
[MIPS] Alchemy: Fix ids in Alchemy db dma device table
0 is a valid device id (DSCR_CMD0_UART0_TX), so we can't use it to mark
an empty entry in the device table. Use ~0 instead and search for id ~0
when looking for a free entry.
Signed-off-by: Wolfgang Ocker <weo@reccoware.de> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Hans Verkuil [Wed, 12 Mar 2008 13:15:00 +0000 (14:15 +0100)]
i2c: Keep client->driver and client->dev.driver in sync
Ensure that client->driver is set to NULL if the probe() returns an
error (this keeps client->driver and client->dev.driver in sync).
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
Bjorn Helgaas [Tue, 11 Mar 2008 20:45:15 +0000 (13:45 -0700)]
ACPI: add _PRT quirks to work around broken firmware
This patch works around incorrect _PRT (PCI interrupt routing)
information from firmware. This does not fix any regressions
and can wait for the next kernel release.
On the Medion MD9580-F laptop, the BIOS says the builtin RTL8139
NIC interrupt at 00:09.0[A] is connected to \_SB.PCI0.ISA.LNKA, but
it's really connected to \_SB.PCI0.ISA.LNKB. Before this patch,
the workaround was to use "pci=routeirq". More details at
http://bugzilla.kernel.org/show_bug.cgi?id=4773.
On the Dell OptiPlex GX1, the BIOS says the PCI slot interrupt
00:0d[A] is connected to LNKB, but it's really connected to LNKA.
Before this patch, the workaround was to use "pci=routeirq".
Pierre Ossman tested a previous version of this patch and confirmed
that it fixed the problem. More details at
http://bugzilla.kernel.org/show_bug.cgi?id=5044.
On the HP t5710 thin client, the BIOS says the builtin Radeon
video interrupt at 01:00[A] is connected to LNK1, but it's really
connected to LNK3. The previous workaround was to use a custom
DSDT. I tested this patch and verified that it fixes the problem.
More details at http://bugzilla.kernel.org/show_bug.cgi?id=10138.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com>
There is a problem in the hibernation code that triggers on some NUMA
systems on which pfn_valid() returns 'true' for some PFNs that don't
belong to any zone. Namely, there is a BUG_ON() in
memory_bm_find_bit() that triggers for PFNs not belonging to any
zone and passing the pfn_valid() test. On the affected systems it
triggers when we mark PFNs reported by the platform as not saveable,
because the PFNs in question belong to a region mapped directly using
iorepam() (i.e. the ACPI data area) and they pass the pfn_valid()
test.
Modify memory_bm_find_bit() so that it returns an error if given PFN
doesn't belong to any zone instead of crashing the kernel and ignore
the result returned by it in mark_nosave_pages(), while marking the
"nosave" memory regions.
This doesn't affect the hibernation functionality, as we won't touch
the PFNs in question anyway.
http://bugzilla.kernel.org/show_bug.cgi?id=9966 .
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Len Brown <len.brown@intel.com>
Zhao Yakui [Tue, 11 Mar 2008 08:56:47 +0000 (16:56 +0800)]
ACPI: Ignore _BQC object when registering backlight device
According to acpi spec , the objects of _BCL and _BCM are required if
integrated LCD is present and supports brightness level .The _BQC is
the optional object. So the _BQC object is ignored when the backlight device
is registered in ACPI video driver.
http://bugzilla.kernel.org/show_bug.cgi?id=10206
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
[SCTP]: Fix local_addr deletions during list traversals.
Since the lists are circular, we need to explicitely tag
the address to be deleted since we might end up freeing
the list head instead. This fixes some interesting SCTP
crashes.
Signed-off-by: Chidambar 'ilLogict' Zinnoury <illogict@online.fr> Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Morton [Wed, 12 Mar 2008 01:03:35 +0000 (18:03 -0700)]
net: fix build with CONFIG_NET=n
fs/built-in.o:(.rodata+0x1134): undefined reference to `proc_net_inode_operations'
fs/built-in.o:(.rodata+0x1138): undefined reference to `proc_net_operations'
Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Ilpo Järvinen [Wed, 12 Mar 2008 00:55:27 +0000 (17:55 -0700)]
[TCP]: Prevent sending past receiver window with TSO (at last skb)
With TSO it was possible to send past the receiver window when the skb
to be sent was the last in the write queue while the receiver window
is the limiting factor. One can notice that there's a loophole in the
tcp_mss_split_point that lacked a receiver window check for the
tcp_write_queue_tail() if also cwnd was smaller than the full skb.
Noticed by Thomas Gleixner <tglx@linutronix.de> in form of "Treason
uncloaked! Peer ... shrinks window .... Repaired." messages (the peer
didn't actually shrink its window as the message suggests, we had just
sent something past it without a permission to do so).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Tested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Adam Baker [Sun, 9 Mar 2008 21:40:40 +0000 (22:40 +0100)]
rt2x00: never disable multicast because it disables broadcast too
On rt73 and rt61 disabling reception of multicast packets also disables
broadcast traffic which we never want to do. Therefore we should never
disable multicast.
Signed-off-by: Adam Baker <linux@baker-net.org.uk> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
On some Acer systems, the HW fails to clear the GPE source,
causing an interrupt storm.
So in EC interrupt mode, we count how many interrupts we
receive when waiting. If we get more than 5, we give
up on interrupt mode and revert to polling mode.
Also, for polling mode to work on Acers, we need
to insert a delay.
Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de> Signed-off-by: Len Brown <len.brown@intel.com>
Linus Torvalds [Tue, 11 Mar 2008 16:18:56 +0000 (09:18 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
keep rd->online and cpu_online_map in sync
Revert "cpu hotplug: adjust root-domain->online span in response to hotplug event"
Thomas Gleixner [Sun, 9 Mar 2008 12:14:37 +0000 (13:14 +0100)]
x86: remove quicklists
quicklists cause a serious memory leak on 32-bit x86,
as documented at:
http://bugzilla.kernel.org/show_bug.cgi?id=9991
the reason is that the quicklist pool is a special-purpose
cache that grows out of proportion. It is not accounted for
anywhere and users have no way to even realize that it's
the quicklists that are causing RAM usage spikes. It was
supposed to be a relatively small pool, but as demonstrated
by KOSAKI Motohiro, they can grow as large as:
given how much trouble this code has caused historically,
and given that Andrew objected to its introduction on x86
(years ago), the best option at this point is to remove them.
[ any performance benefits of caching constructed pgds should
be implemented in a more generic way (possibly within the page
allocator), while still allowing constructed pages to be
allocated by other workloads. ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Fri, 29 Feb 2008 03:57:07 +0000 (19:57 -0800)]
x86: ia32 syscall restart fix
The code to restart syscalls after signals depends on checking for a
negative orig_ax, and for particular negative -ERESTART* values in ax.
These fields are 64 bits and for a 32-bit task they get zero-extended.
The syscall restart behavior is lost, a regression from a native 32-bit
kernel and from 64-bit tasks' behavior.
This patch fixes the problem by doing sign-extension where it matters.
For orig_ax, the only time the value should be -1 but winds up as
0x0ffffffff is via a 32-bit ptrace call. So the patch changes ptrace to
sign-extend the 32-bit orig_eax value when it's stored; it doesn't
change the checks on orig_ax, though it uses the new current_syscall()
inline to better document the subtle importance of the used of
signedness there.
The ax value is stored a lot of ways and it seems hard to get them all
sign-extended at their origins. So for that, we use the
current_syscall_ret() to sign-extend it only for 32-bit tasks at the
time of the -ERESTART* comparisons.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Gregory Haskins [Mon, 10 Mar 2008 21:59:11 +0000 (17:59 -0400)]
keep rd->online and cpu_online_map in sync
It is possible to allow the root-domain cache of online cpus to
become out of sync with the global cpu_online_map. This is because we
currently trigger removal of cpus too early in the notifier chain.
Other DOWN_PREPARE handlers may in fact run and reconfigure the
root-domain topology, thereby stomping on our own offline handling.
The end result is that rd->online may become out of sync with
cpu_online_map, which results in potential task misrouting.
So change the offline handling to be more tightly coupled with the
global offline process by triggering on CPU_DYING intead of
CPU_DOWN_PREPARE.
Signed-off-by: Gregory Haskins <ghaskins@novell.com> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Steve Wise [Tue, 4 Mar 2008 22:44:52 +0000 (16:44 -0600)]
RDMA/iwcm: Don't access a cm_id after dropping reference
cm_work_handler() can access cm_id_priv after it drops its reference
by calling iwch_deref_id(), which might cause it to be freed. The fix
is to look at whether IWCM_F_CALLBACK_DESTROY is set _before_ dropping
the reference. Then if it was set, free the cm_id on this thread.
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Arne Redlich [Tue, 4 Mar 2008 12:07:22 +0000 (14:07 +0200)]
IB/iser: Fix list iteration bug
The iteration through the list of "iser_device"s during device
lookup/creation is broken -- it might result in an infinite loop if
more than one HCA is used with iSER. Fix this by using
list_for_each_entry() instead of the open-coded flawed list iteration
code.
Signed-off-by: Arne Redlich <arne.redlich@xiranet.com> Signed-off-by: Erez Zilber <erezz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Jeremy Kerr [Tue, 11 Mar 2008 01:46:18 +0000 (12:46 +1100)]
[POWERPC] spufs: fix rescheduling of non-runnable contexts
At present, we can hit the BUG_ON in __spu_update_sched_info by reading
the regs file of a context between two calls to spu_run. The
spu_release_saved called by spufs_regs_read() is resulting in the (now
non-runnable) context being placed back on the run queue, so the next
call to spu_run ends up in the bug condition.
This change uses the SPU_SCHED_SPU_RUN flag to only reschedule a context
if it's still in spu_run().
Linus Torvalds [Tue, 11 Mar 2008 01:45:51 +0000 (18:45 -0700)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] Add support for the RB500 PATA CompactFlash
ahci: logical-bitwise and confusion in ahci_save_initial_config()
libata: don't allow sysfs read access to force param
ahci: add the Device IDs for nvidia MCP7B AHCI
libata-sff: handle controllers w/o ctl register
libata: allow LLDs w/o any reset method
ata: replace remaining __FUNCTION__ occurrences
I figured out another ACPI related regression today.
randconfig testing triggered an early boot-time hang on a laptop of mine
(32-bit x86, config attached) - the screen was scrolling ACPI AML
exceptions [with no serial port and no early debugging available].
v2.6.24 works fine on that laptop with the same .config, so after a few
hours of bisection (had to restart it 3 times - other regressions
interacted), it honed in on this commit:
| 10270d4838bdc493781f5a1cf2e90e9c34c9142f is first bad commit
|
| Author: Linus Torvalds <torvalds@woody.linux-foundation.org>
| Date: Wed Feb 13 09:56:14 2008 -0800
|
| acpi: fix acpi_os_read_pci_configuration() misuse of raw_pci_read()
reverting this commit ontop of -rc5 gave a correctly booting kernel.
But this commit fixes a real bug so the real question is, why did it
break the bootup?
After quite some head-scratching, the following change stood out:
- pci_id->bus = tu8;
+ pci_id->bus = val;
pci_id->bus is defined as u16:
struct acpi_pci_id {
u16 segment;
u16 bus;
...
and 'tu8' changed from u8 to u32. So previously we'd unconditionally
mask the return value of acpi_os_read_pci_configuration()
(raw_pci_read()) to 8 bits, but now we just trust whatever comes back
from the PCI access routines and only crop it to 16 bits.
But if the high 8 bits of that result contains any noise then we'll
write that into ACPI's PCI ID descriptor and confuse the heck out of the
rest of ACPI.
So lets check the PCI-BIOS code on that theory. We have this codepath
for 8-bit accesses (arch/x86/pci/pcbios.c:pci_bios_read()):
Aha! The "=a" output constraint puts the full 32 bits of EAX into
*value. But if the BIOS's routines set any of the high bits to nonzero,
we'll return a value with more set in it than intended.
The other, more common PCI access methods (v1 and v2 PCI reads) clear
out the high bits already, for example pci_conf1_read() does:
which explicitly converts the return byte up to 32 bits and zero-extends
it.
So zero-extending the result in the PCI-BIOS read routine fixes the
regression on my laptop. ( It might fix some other long-standing issues
we had with PCI-BIOS during the past decade ... ) Both 8-bit and 16-bit
accesses were buggy.
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6:
PCI Hotplug: Fix small mem leak in IBM Hot Plug Controller Driver
PCI: rename DECLARE_PCI_DEVICE_TABLE to DEFINE_PCI_DEVICE_TABLE
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6:
drivers: fix dma_get_required_mask
firmware: provide stubs for the FW_LOADER=n case
nozomi: fix initialization and early flow control access
sysdev: fix problem with sysdev_class being re-registered
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
lguest: Do not append space to guests kernel command line
lguest: Revert 1ce70c4fac3c3954bd48c035f448793867592bc0, fix real problem.
lguest: Sanitize the lguest clock.
lguest: fix __get_vm_area usage.
lguest: make sure cpu is initialized before accessing it
* git://git.kernel.org/pub/scm/linux/kernel/git/wim/linux-2.6-watchdog:
[WATCHDOG] make watchdog/hpwdt.c:asminline_call() static
[WATCHDOG] Remove volatiles from watchdog device structures
[WATCHDOG] replace remaining __FUNCTION__ occurrences
[WATCHDOG] hpwdt: Use dmi_walk() instead of own copy
[WATCHDOG] Fix return value warning in hpwdt
[WATCHDOG] Fix declaration of struct smbios_entry_point in hpwdt
[WATCHDOG] it8712f_wdt support for 16-bit timeout values, WDIOC_GETSTATUS
Krzysztof Helt [Mon, 10 Mar 2008 18:44:00 +0000 (11:44 -0700)]
mbxfb: fix incorrect argument type
Fix wrong pointer type passed into the dev_dbg() function.
Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Acked-by: Mike Rapoport <mike@compulab.co.il> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Mon, 10 Mar 2008 18:43:59 +0000 (11:43 -0700)]
iov_iter_advance() fix
iov_iter_advance() skips over zero-length iovecs, however it does not properly
terminate at the end of the iovec array. Fix this by checking against
i->count before we skip a zero-length iov.
The bug was reproduced with a test program that continually randomly creates
iovs to writev. The fix was also verified with the same program and also it
could verify that the correct data was contained in the file after each
writev.
Paul E. McKenney [Mon, 10 Mar 2008 18:43:57 +0000 (11:43 -0700)]
rcu: move PREEMPT_RCU config option back under PREEMPT
The original preemptible-RCU patch put the choice between classic and
preemptible RCU into kernel/Kconfig.preempt, which resulted in build failures
on machines not supporting CONFIG_PREEMPT. This choice was therefore moved to
init/Kconfig, which worked, but placed the choice between classic and
preemptible RCU at the top level, a very obtuse choice indeed.
This patch changes from the Kconfig "choice" mechanism to a pair of booleans,
only one of which (CONFIG_PREEMPT_RCU) is user-visible, and is located in
kernel/Kconfig.preempt, where one would expect it to be. The other
(CONFIG_CLASSIC_RCU) is in init/Kconfig so that it is available to all
architectures, hopefully avoiding build breakage. Thanks to Roman Zippel for
suggesting this approach.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: Josh Triplett <josh@freedesktop.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Mon, 10 Mar 2008 18:43:53 +0000 (11:43 -0700)]
modules: warn about suspicious return values from module's ->init() hook
Return value convention of module's init functions is 0/-E. Sometimes,
e.g. during forward-porting mistakes happen and buggy module created,
where result of comparison "workqueue != NULL" is propagated all the way up
to sys_init_module. What happens is that some other module created
workqueue in question, our module created it again and module was
successfully loaded.
Or it could be some other bug.
Let's make such mistakes much more visible. In retrospective, such
messages would noticeably shorten some of my head-scratching sessions.
Note, that dump_stack() is just a way to get attention from user. Sample
message:
sys_init_module: 'foo'->init suspiciously returned 1, it should follow 0/-E convention
sys_init_module: loading module anyway...
Pid: 4223, comm: modprobe Not tainted 2.6.24-25f666300625d894ebe04bac2b4b3aadb907c861 #5
Rusty Russell [Mon, 10 Mar 2008 18:43:52 +0000 (11:43 -0700)]
modules: fix module waiting for dependent modules' init
Commit c9a3ba55 (module: wait for dependent modules doing init.) didn't quite
work because the waiter holds the module lock, meaning that the state of the
module it's waiting for cannot change.
Fortunately, it's fairly simple to update the state outside the lock and do
the wakeup.
Thanks to Jan Glauber for tracking this down and testing (qdio and qeth).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Jan Glauber <jang@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adam Litke [Mon, 10 Mar 2008 18:43:50 +0000 (11:43 -0700)]
hugetlb: correct page count for surplus huge pages
Free pages in the hugetlb pool are free and as such have a reference count of
zero. Regular allocations into the pool from the buddy are "freed" into the
pool which results in their page_count dropping to zero. However, surplus
pages can be directly utilized by the caller without first being freed to the
pool. Therefore, a call to put_page_testzero() is in order so that such a
page will be handed to the caller with a correct count.
This has not affected end users because the bad page count is reset before the
page is handed off. However, under CONFIG_DEBUG_VM this triggers a BUG when
the page count is validated.
Thanks go to Mel for first spotting this issue and providing an initial fix.
Signed-off-by: Adam Litke <agl@us.ibm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Dave Hansen <haveblue@us.ibm.com> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnaud Patard [Mon, 10 Mar 2008 18:43:48 +0000 (11:43 -0700)]
gpio/pca953x bugfix: mark as can_sleep
The pca953x driver is an I2C driver so gpio_chip->can_sleep should be set.
This lets upper layers know they should use the gpio_*_cansleep() calls to
access values, and may not access them from nonsleeping contexts.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org> Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: "eric miao" <eric.y.miao@gmail.com> Cc: Jean Delvare <khali@linux-fr.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Mon, 10 Mar 2008 18:43:48 +0000 (11:43 -0700)]
md: reduce CPU wastage on idle md array with a write-intent bitmap
Recent patch titled
Reduce CPU wastage on idle md array with a write-intent bitmap.
would sometimes leave the array with dirty bitmap bits that stay dirty. A
subsequent write would sort things out so it isn't a big problem, but should
be fixed nonetheless.
We need to make sure that when the bitmap becomes not "allclean", the
daemon_sleep really does get set to a sensible value.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
NeilBrown [Mon, 10 Mar 2008 18:43:47 +0000 (11:43 -0700)]
md: fix formatting error in /proc/mdstat
If an md array is "auto-read-only", then this appears in /proc/mdstat as
/dev/md0: active(auto-read-only)
whereas if it is truely readonly, it appears as
/dev/md0: active (read-only)
The difference being a space.
One program known to parse this file expects the space and gets badly
confused. It will be fixed, but it would be best if what the kernel generates
is more consistent too.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lee Schermerhorn [Mon, 10 Mar 2008 18:43:45 +0000 (11:43 -0700)]
mempolicy: fix reference counting bugs
Address 3 known bugs in the current memory policy reference counting method.
I have a series of patches to rework the reference counting to reduce overhead
in the allocation path. However, that series will require testing in -mm once
I repost it.
1) alloc_page_vma() does not release the extra reference taken for
vma/shared mempolicy when the mode == MPOL_INTERLEAVE. This can result in
leaking mempolicy structures. This is probably occurring, but not being
noticed.
Fix: add the conditional release of the reference.
2) hugezonelist unconditionally releases a reference on the mempolicy when
mode == MPOL_INTERLEAVE. This can result in decrementing the reference
count for system default policy [should have no ill effect] or premature
freeing of task policy. If this occurred, the next allocation using task
mempolicy would use the freed structure and probably BUG out.
Fix: add the necessary check to the release.
3) The current reference counting method assumes that vma 'get_policy()'
methods automatically add an extra reference a non-NULL returned mempolicy.
This is true for shmem_get_policy() used by tmpfs mappings, including
regular page shm segments. However, SHM_HUGETLB shm's, backed by
hugetlbfs, just use the vma policy without the extra reference. This
results in freeing of the vma policy on the first allocation, with reuse of
the freed mempolicy structure on subsequent allocations.
Fix: Rather than add another condition to the conditional reference
release, which occur in the allocation path, just add a reference when
returning the vma policy in shm_get_policy() to match the assumptions.
Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Greg KH <greg@kroah.com> Cc: Andi Kleen <ak@suse.de> Cc: Christoph Lameter <clameter@sgi.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: David Rientjes <rientjes@google.com> Cc: <eric.whitney@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>