Ingo Molnar [Tue, 28 Aug 2007 10:53:24 +0000 (12:53 +0200)]
sched: fix wait_start_fair condition in update_stats_wait_end()
Peter Zijlstra noticed the following bug in SCHED_FEAT_SKIP_INITIAL (which
is disabled by default at the moment): it relies on se.wait_start_fair
being 0 while update_stats_wait_end() did not recognize a 0 value,
so instead of 'skipping' the initial interval we gave the new child
a maximum boost of +runtime-limit ...
(No impact on the default kernel, but nice to fix for completeness.)
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>
Ingo Molnar [Tue, 28 Aug 2007 10:53:24 +0000 (12:53 +0200)]
sched: make the scheduler converge to the ideal latency
de-HZ-ification of the granularity defaults unearthed a pre-existing
property of CFS: while it correctly converges to the granularity goal,
it does not prevent run-time fluctuations in the range of
[-gran ... 0 ... +gran].
With the increase of the granularity due to the removal of HZ
dependencies, this becomes visible in chew-max output (with 5 tasks
running):
average slice is the ideal 13 msecs and the period is picture-perfect 40
msecs. But the 'ran' field fluctuates around 13.33 msecs and there's no
mechanism in CFS to keep that from happening: it's a perfectly valid
solution that CFS finds.
to fix this we add a granularity/preemption rule that knows about
the "target latency", which makes tasks that run longer than the ideal
latency run a bit less. The simplest approach is to simply decrease the
preemption granularity when a task overruns its ideal latency. For this
we have to track how much the task executed since its last preemption.
( this adds a new field to task_struct, but we can eliminate that
overhead in 2.6.24 by putting all the scheduler timestamps into an
anonymous union. )
with this change in place, chew-max output is fluctuation-less all
around:
this patch has no impact on any fastpath or on any globally observable
scheduling property. (unless you have sharp enough eyes to see
millisecond-level ruckles in glxgears smoothness :-)
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Mike Galbraith <efault@gmx.de>
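The rule described above can be sketched in plain C as follows. This is only an illustration under stated assumptions, not the code from the patch: the helper names, the nanosecond units, the minimum-granularity floor, and the choice to drop the granularity all the way to zero (the simplest form of "decrease") are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: the share of the target latency that each runnable
 * task would ideally get, never smaller than a minimum granularity.
 */
static uint64_t ideal_runtime_ns(uint64_t latency_ns, unsigned int nr_running,
                                 uint64_t min_gran_ns)
{
    assert(nr_running > 0);

    uint64_t ideal = latency_ns / nr_running;
    return ideal < min_gran_ns ? min_gran_ns : ideal;
}

/*
 * Hypothetical rule: a task that has run longer than its ideal runtime
 * since its last preemption loses its granularity protection, so the
 * next preemption check will favour the waiting task sooner.
 */
static uint64_t preempt_granularity_ns(uint64_t ran_since_preempt_ns,
                                       uint64_t ideal_ns,
                                       uint64_t base_gran_ns)
{
    return ran_since_preempt_ns > ideal_ns ? 0 : base_gran_ns;
}
```

The second helper is why the commit needs to track how long the task has executed since its last preemption: without that value there is nothing to compare against the ideal runtime.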
Mike Galbraith [Tue, 28 Aug 2007 10:53:24 +0000 (12:53 +0200)]
sched: fix sleeper bonus limit
There is an Amarok song switch time increase (regression) under
hefty load.
What is happening is that sleeper_bonus is never consumed, and only
rarely goes below runtime_limit, so for the most part, Amarok isn't
getting any bonus at all. We're keeping sleeper_bonus right at
runtime_limit (sched_latency == sched_runtime_limit == 40ms) forever, ie
we don't consume if we're lower than that, and don't add if we're above
it. One Amarok thread waking (or anybody else) will push us past the
threshold, so the next thread waking gets nada, but will reap pain from
the previous thread waking until we drop back to runtime_limit. It
looks to me like under load, some random task gets a bonus, and
everybody else pays, whether deserving or not.
This diff fixed the regression for me at any load rate.
Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Andrew Vasquez [Mon, 27 Aug 2007 22:25:01 +0000 (15:25 -0700)]
dm-mpath-rdac: don't stomp on a request's transfer bit
Without this, we get qla2xxx complaining about "ISP System Error".
What's happening here is the firmware is detecting a Xfer-ready from the
storage when in fact the data-direction for a mode-select should be a
write (DATA_OUT).
The following patch fixes the problem (typo). Verified by Brian, as
well.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Verified-by: Brian De Wolf <bldewolf@csupomona.edu> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 27 Aug 2007 22:06:28 +0000 (15:06 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
[SPARC32]: Make flush_tlb_kernel_range() an inline function.
[SERIAL]: Fix 32-bit warnings in sunzilog.c and sunsu.c
[SPARC32]: Kill unused vars and macros from prom/console.c
[SPARC32]: Add __cmpdi2() libcall implementation ala. MIPS.
[VIDEO]: Do not prom_halt() in cg3 and bw2 device probe.
[SUNVDC]: Use slice 0xff on VD_DISK_TYPE_DISK.
Linus Torvalds [Mon, 27 Aug 2007 22:06:01 +0000 (15:06 -0700)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
* 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET]: Mark Paul Moore as maintainer of labelled networking.
[VLAN/BRIDGE]: Fix "skb_pull_rcsum - Fatal exception in interrupt"
[ISDN]: Get rid of some pointless allocation casts in common and bsd comp.
[NET]: Avoid pointless allocation casts in BSD compression module
[IRDA]: Do not do pointless kmalloc return value cast in KingSun driver
[NET]: Fix crash in dev_mc_sync()/dev_mc_unsync()
[PPPOL2TP]: Fix endianness annotations.
[IOAT]: ioatdma needs to play nice in a multi-dma-client world
[SLIP]: trivial sparse warning fix
[EQL]: sparse warning fix
[NET]: is_power_of_2 in net/core/neighbour.c
[TCP]: Describe tcp_init_cwnd() thoroughly in a comment.
[NET]: Fix IP_ADD/DROP_MEMBERSHIP to handle only connectionless
[KBUILD]: Sanitize tc_ematch headers.
[IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302.
Hugh Dickins [Mon, 27 Aug 2007 15:06:19 +0000 (16:06 +0100)]
fix bogus hotplug cpu warning
Fix bogus DEBUG_PREEMPT warning on x86_64, when cpu brought online after
bootup: current_is_keventd is right to note its use of smp_processor_id
is preempt-safe, but should use raw_smp_processor_id to avoid the warning.
Hugh Dickins [Mon, 27 Aug 2007 15:04:39 +0000 (16:04 +0100)]
reverse CONFIG_ACPI_PROC_EVENT default
Sigh. Again an ACPI assault on the Thinkpad's Fn+F4 to suspend to RAM.
The default and text for CONFIG_THINKPAD_ACPI_INPUT_ENABLED were fixed
in -rc3, but now commit 14e04fb34ffa82ee61ae69f98d8fca12d2e8e31c ("ACPI:
Schedule /proc/acpi/event for removal") introduces the ACPI_PROC_EVENT
config entry, and defaults it to 'n' to disable it again.
Change default to y, and add comment to make it clearer that n is for
future distros.
Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Len Brown <len.brown@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
maxcpus=N is now having no effect on x86_64, and freezing bootup on i386
(because of inconsistency with the separate maxcpus parsing down in
arch/i386, I guess). That's because early_param parsing is a little
different from __setup parsing, and needs the "=" omitted: then it seems
to work as the original commit intended (no mention of IO-APIC in
/proc/interrupts when maxcpus=0).
Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Len Brown <len.brown@intel.com> Cc: Andi Kleen <ak@suse.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 27 Aug 2007 16:42:21 +0000 (09:42 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] Fix SLB initialization at boot time
[POWERPC] Fix undefined reference to device_power_up/resume
[POWERPC] cell: Update cell_defconfig for 2.6.23
[POWERPC] axonram: Do not delete gendisks queue in error path
[POWERPC] axonram: Module modification for latest firmware API changes
[POWERPC] cell: Support pinhole-reset on IBM cell blades
[POWERPC] spu_manage: Use newer physical-id attribute
[POWERPC] pasemi: Another IOMMU bugfix for 64K PAGE_SIZE
Linus Torvalds [Mon, 27 Aug 2007 16:30:52 +0000 (09:30 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
[PARISC] Add NOTES section
[PARISC] Use compat_sys_getdents
[PARISC] Do not allow STI_CONSOLE to be modular
[PARISC] Clean up sti_flush
[PARISC] Add dummy isa_(bus|virt)_to_(virt|bus) inlines
[PARISC] Add empty <asm-parisc/vga.h>
David S. Miller [Sat, 25 Aug 2007 22:17:31 +0000 (15:17 -0700)]
[SERIAL]: Fix 32-bit warnings in sunzilog.c and sunsu.c
resource_size_t can be either a u64 or a u32, and we can't
really know for sure, so when printing such a value out
always use long-long printf formatting and cast the argument
to that type.
Signed-off-by: David S. Miller <davem@davemloft.net>
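A minimal userspace sketch of the formatting approach described above. The `resource_size_t` typedef here is a stand-in for the kernel type and `format_resource()` is a hypothetical helper; the point is only that the cast plus `%llx` stays correct whichever width the type has.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the kernel type: this could equally be uint32_t. */
typedef uint64_t resource_size_t;

/*
 * Hypothetical helper: format a resource address without knowing its width.
 * Casting to unsigned long long and using %llx works for both 32-bit and
 * 64-bit definitions of resource_size_t and avoids format warnings.
 */
static int format_resource(char *buf, size_t len, resource_size_t start)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%llx", (unsigned long long)start);
}
```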
David S. Miller [Sat, 25 Aug 2007 05:05:44 +0000 (22:05 -0700)]
[SUNVDC]: Use slice 0xff on VD_DISK_TYPE_DISK.
While debugging issues with the VDS server I made the
driver use partition 2 to get at the whole disk since
this is the "whole disk" partition in the Sun disk
label.
We really should use slice 0xff which really means
the whole physical disk in the VIO disk protocol.
Otherwise things won't work well on a disk image
that doesn't have a proper disk label on it.
Signed-off-by: David S. Miller <davem@davemloft.net>
Evgeniy Polyakov [Sat, 25 Aug 2007 06:36:29 +0000 (23:36 -0700)]
[VLAN/BRIDGE]: Fix "skb_pull_rcsum - Fatal exception in interrupt"
I tried to preserve bridging code as it was before, but logic is quite
strange - I think we should free skb on error, since it is already
unshared and thus will just leak.
Benjamin Thery [Sat, 25 Aug 2007 06:12:08 +0000 (23:12 -0700)]
[NET]: Fix crash in dev_mc_sync()/dev_mc_unsync()
This patch fixes a crash that may occur when the routine dev_mc_sync()
deletes an address from the list it is currently going through. It
saves the pointer to the next element before deleting the current one.
The problem may also exist in dev_mc_unsync().
Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
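The pattern the fix relies on, saving the next pointer before the current element is freed, can be shown with a generic singly linked list. This is a sketch with hypothetical types and names (`struct addr_node`, `remove_matching`, `push_front`), not the kernel's multicast list code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node, standing in for an address list entry. */
struct addr_node {
    int addr;
    struct addr_node *next;
};

/* Remove every node whose addr equals 'victim'; returns the number removed. */
static int remove_matching(struct addr_node **head, int victim)
{
    assert(head != NULL);

    int removed = 0;
    struct addr_node **link = head;   /* pointer that points at 'cur' */
    struct addr_node *cur = *head;

    while (cur != NULL) {
        /* Save the successor first: 'cur' may be freed below. */
        struct addr_node *next = cur->next;

        if (cur->addr == victim) {
            *link = next;             /* unlink, then free */
            free(cur);
            removed++;
        } else {
            link = &cur->next;
        }
        cur = next;                   /* never touches freed memory */
    }
    return removed;
}

/* Small helper for building a list in examples. */
static struct addr_node *push_front(struct addr_node *head, int addr)
{
    struct addr_node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->addr = addr;
    n->next = head;
    return n;
}
```

Advancing with `cur = cur->next` after the `free()` would read freed memory, which is the kind of crash the commit message describes.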
Al Viro [Sat, 25 Aug 2007 06:04:18 +0000 (23:04 -0700)]
[PPPOL2TP]: Fix endianness annotations.
{s,d}_{session,tunnel} in pppol2tp_addr are actually host-endian
everywhere. We might switch them to net-endian, of course, but
that structure is exposed to userland via getname...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 25 Aug 2007 06:02:53 +0000 (23:02 -0700)]
[IOAT]: ioatdma needs to play nice in a multi-dma-client world
Now that the DMA engine has a multi-client interface, fix the ioatdma
driver to play along. At the same time, remove a couple of unnecessary
reads and writes.
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 25 Aug 2007 05:21:50 +0000 (22:21 -0700)]
[TCP]: Describe tcp_init_cwnd() thoroughly in a comment.
People often get tripped up by this function and think that
it does not implement the prescribed algorithms from
RFC2414 and RFC3390, even though it does.
So add a comment to head off such misunderstandings in the
future.
Signed-off-by: David S. Miller <davem@davemloft.net>
Nick Bowler [Wed, 22 Aug 2007 19:33:51 +0000 (12:33 -0700)]
[IPSEC] AH4: Update IPv4 options handling to conform to RFC 4302.
In testing our ESP/AH offload hardware, I discovered an issue with how
AH handles mutable fields in IPv4. RFC 4302 (AH) states the following
on the subject:
For IPv4, the entire option is viewed as a unit; so even
though the type and length fields within most options are immutable
in transit, if an option is classified as mutable, the entire option
is zeroed for ICV computation purposes.
The current implementation does not zero the type and length fields,
resulting in authentication failures when communicating with hosts
that do (i.e. FreeBSD).
I have tested record route and timestamp options (ping -R and ping -T)
on a small network involving Windows XP, FreeBSD 6.2, and Linux hosts,
with one router. In the presence of these options, the FreeBSD and
Linux hosts (with the patch or with the hardware) can communicate.
The Windows XP host simply fails to accept these packets with or
without the patch.
I have also been trying to test source routing options (using
traceroute -g), but haven't had much luck getting this option to work
*without* AH, let alone with.
Signed-off-by: Nick Bowler <nbowler@ellipticsemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
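The behaviour quoted from RFC 4302 can be sketched as a walk over the IPv4 options area that zeroes each mutable option as a whole, type and length bytes included. This is a simplified illustration, not the kernel's AH code: the function names are hypothetical, and the classification only covers the two options tested above (record route and timestamp), whereas a real implementation would follow the RFC's full list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* IPv4 option type values used in this sketch. */
enum { OPT_END = 0, OPT_NOOP = 1, OPT_RR = 7, OPT_TIMESTAMP = 68 };

/* Illustrative classification only: record route and timestamp. */
static int opt_is_mutable(unsigned char type)
{
    return type == OPT_RR || type == OPT_TIMESTAMP;
}

/*
 * Zero every mutable option in place before ICV computation.
 * Returns 0 on success, -1 if the options area is malformed.
 */
static int zero_mutable_options(unsigned char *opts, size_t len)
{
    size_t i = 0;

    assert(opts != NULL || len == 0);

    while (i < len) {
        unsigned char type = opts[i];

        if (type == OPT_END)
            break;
        if (type == OPT_NOOP) {       /* single-byte option */
            i++;
            continue;
        }
        if (i + 1 >= len)
            return -1;

        /* Read the length before zeroing, since zeroing destroys it. */
        size_t optlen = opts[i + 1];
        if (optlen < 2 || optlen > len - i)
            return -1;

        if (opt_is_mutable(type))
            memset(&opts[i], 0, optlen);  /* whole option, incl. type/length */

        i += optlen;
    }
    return 0;
}
```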
shutdown_bridge_irq disconnects the irq so we need to connect the irq or
requesting the same irq a second time will fail. This used to make
things like ifconfig eth0 down; ifconfig eth0 up fail on IP27.
Ralf Baechle [Thu, 23 Aug 2007 13:17:14 +0000 (14:17 +0100)]
[MIPS] PCI: Remove __devinit attribute from pcibios_fixup_bus.
Since 96bde06a2df1b363206d3cdef53134b84ff37813 several callers of
pcibios_resource_to_bus are no longer marked __devinit resulting in a
pile of modpost warnings if PCI && !HOTPLUG:
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x15dde8): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_map_rom' and 'pci_map_rom_copy')
WARNING: vmlinux.o(.text+0x15e140): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_update_resource' and 'pci_claim_resource')
WARNING: vmlinux.o(.text+0x15f0cc): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_setup_cardbus' and 'pci_bus_assign_resources')
WARNING: vmlinux.o(.text+0x15f0f0): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_setup_cardbus' and 'pci_bus_assign_resources')
WARNING: vmlinux.o(.text+0x15f114): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_setup_cardbus' and 'pci_bus_assign_resources')
WARNING: vmlinux.o(.text+0x15f138): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_setup_cardbus' and 'pci_bus_assign_resources')
WARNING: vmlinux.o(.text+0x15f438): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_bus_assign_resources' and 'pbus_size_mem')
WARNING: vmlinux.o(.text+0x15f4f4): Section mismatch: reference to .init.text:pcibios_resource_to_bus (between 'pci_bus_assign_resources' and 'pbus_size_mem')
Removing __devinit from pcibios_resource_to_bus makes the same necessary
for pcibios_fixup_device_resources as well.
Ralf Baechle [Thu, 23 Aug 2007 13:12:56 +0000 (14:12 +0100)]
[MIPS] PCI: Remove __devinit attribute from pcibios_fixup_bus.
Since 96bde06a2df1b363206d3cdef53134b84ff37813 pcibios_fixup_bus's caller
pci_scan_child_bus is no longer marked __devinit resulting in this modpost
warning if PCI && !HOTPLUG:
MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x158b9c): Section mismatch: reference to .init.text:pcibios_fixup_bus (between 'pci_scan_child_bus' and 'pci_scan_bus_parented')
Atsushi Nemoto [Sun, 19 Aug 2007 13:32:10 +0000 (22:32 +0900)]
[PATCH] rtc: Make rtc-rs5c348 driver hotplug-aware
The rtc-rs5c348 SPI driver name doesn't match its module name, which
prevents it from properly hotplugging. There is only one in-tree user
of its driver, which is fixed by this patch too.
Ralf Baechle [Thu, 16 Aug 2007 11:10:16 +0000 (12:10 +0100)]
[MIPS] Fix gcc 3.3 warning.
CC arch/mips/kernel/cpu-bugs64.o
arch/mips/kernel/cpu-bugs64.c: In function 'align_mod':
arch/mips/kernel/cpu-bugs64.c:23: warning: asm operand 0 probably doesn't match constraints
arch/mips/kernel/cpu-bugs64.c:23: warning: asm operand 1 probably doesn't match constraints
Leaving these sections is useful to some tools that look at the image, and
none of them are loaded into memory. The .mdebug.abi64 section, in
particular, lets GDB recognize vmlinux.32 as an N64 program instead of
guessing that it is O32.
Signed-off-by: Daniel Jacobowitz <dan@codesourcery.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Ralf Baechle [Wed, 1 Aug 2007 14:25:28 +0000 (15:25 +0100)]
[MIPS] Fix computation of {PGD,PMD,PTE}_T_LOG2.
For the generation of asm-offset.h to work these need to be evaluatable
by gcc as a constant expression. This issue did exist for a while but
didn't bite because they're only in asm-offset.h for debugging purposes.
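To illustrate what "evaluatable as a constant expression" means here, the sketch below computes a log2 value purely from `sizeof` and conditional operators, so the compiler can use it where a constant is required (an enum initializer in this case). This is not necessarily how the patch computes the values: `CONST_LOG2`, `example_pte_t` and the other names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical macro: log2 of a small power of two, built only from
 * comparisons and conditionals, so it is an integer constant expression
 * whenever its argument is one. Yields -1 for unsupported values.
 */
#define CONST_LOG2(n)       \
    ((n) == 1  ? 0 :        \
     (n) == 2  ? 1 :        \
     (n) == 4  ? 2 :        \
     (n) == 8  ? 3 :        \
     (n) == 16 ? 4 : -1)

/* Stand-in for a page table entry type. */
typedef struct { uint64_t pte; } example_pte_t;

/* An enum initializer must be a constant expression; a function call would not compile here. */
enum { EXAMPLE_PTE_T_LOG2 = CONST_LOG2(sizeof(example_pte_t)) };

static int example_pte_log2(void)
{
    assert(EXAMPLE_PTE_T_LOG2 >= 0);
    return EXAMPLE_PTE_T_LOG2;
}
```

A value defined in terms of a runtime helper function would be rejected in the same position, which matches the kind of failure the commit message describes for generated offset headers.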