Peter Williams [Wed, 24 Oct 2007 16:23:51 +0000 (18:23 +0200)]
sched: reduce balance-tasks overhead
At the moment, balance_tasks() provides low level functionality for both
move_tasks() and move_one_task() (indirectly) via the load_balance()
function (in the sched_class interface) which also provides dual
functionality. This dual functionality complicates the interfaces and
internal mechanisms and makes the run time overhead of operations that
are called with two run queue locks held.
This patch addresses this issue and reduces the overhead of these
operations.
Signed-off-by: Peter Williams <pwil3058@bigpond.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
profile=sleep only works if CONFIG_SCHEDSTATS is set. This patch notes
the limitation in Documentation/kernel-parameters.txt and prints a
warning at boot-time if profile=sleep is used without CONFIG_SCHEDSTAT.
Satyam Sharma [Wed, 24 Oct 2007 16:23:50 +0000 (18:23 +0200)]
sched: use show_regs() to improve __schedule_bug() output
A full register dump along with stack backtrace would make the
"scheduling while atomic" message more helpful. Use show_regs() instead
of dump_stack() for this. We already know we're atomic in here (that is
why this function was called) so show_regs()'s atomicity expectations
are guaranteed.
Also, modify the output of the "BUG: scheduling while atomic:" header a
bit to keep task->comm and task->pid together and preempt_count() after
them.
Milton Miller [Wed, 24 Oct 2007 16:23:48 +0000 (18:23 +0200)]
sched: fix sched_domain sysctl registration again
commit 029190c515f15f512ac85de8fc686d4dbd0ae731 (cpuset
sched_load_balance flag) was not tested SCHED_DEBUG enabled as
committed as it dereferences NULL when used and it reordered
the sysctl registration to cause it to never show any domains
or their tunables.
Fixes:
1) restore arch_init_sched_domains ordering
we can't walk the domains before we build them
presently we register cpus with empty directories (no domain
directories or files).
2) make unregister_sched_domain_sysctl do nothing when already unregistered
detach_destroy_domains is now called one set of cpus at a time
unregister_syctl dereferences NULL if called with a null.
While the the function would always dereference null if called
twice, in the previous code it was always called once and then
was followed a register. So only the hidden bug of the
sysctl_root_table not being allocated followed by an attempt to
free it would have shown the error.
3) always call unregister and register in partition_sched_domains
The code is "smart" about unregistering only needed domains.
Since we aren't guaranteed any calls to unregister, always
unregister. Without calling register on the way out we
will not have a table or any sysctl tree.
4) warn if register is called without unregistering
The previous table memory is lost, leaving pointers to the
later freed memory in sysctl and leaking the memory of the
tables.
Before this patch on a 2-core 4-thread box compiled for SMT and NUMA,
the domains appear empty (there are actually 3 levels per cpu). And as
soon as two domains a null pointer is dereferenced (unreliable in this
case is stack garbage):
bu19a:~# ls -R /proc/sys/kernel/sched_domain/
/proc/sys/kernel/sched_domain/:
cpu0 cpu1 cpu2 cpu3
Linus Torvalds [Wed, 24 Oct 2007 03:50:57 +0000 (20:50 -0700)]
Linux 2.6.24-rc1
The patch is big. Really big. You just won't believe how vastly hugely
mindbogglingly big it is. I mean you may think it's a long way down the
road to the chemist, but that's just peanuts to how big the patch from
2.6.23 is.
Greg Ungerer [Wed, 24 Oct 2007 02:03:25 +0000 (12:03 +1000)]
m68knommu: cleanup 68360 startup code
Clean up 68360 timer support code. Removed header includes not needed.
Remove use of old m68knommu timer function pointers. Use common function
naming for 68328 timer functions.
Linus Torvalds [Wed, 24 Oct 2007 01:57:39 +0000 (18:57 -0700)]
Merge branch 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6
* 'irq-upstream' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
[SPARC, XEN, NET/CXGB3] use irq_handler_t where appropriate
drivers/char/riscom8: clean up irq handling
isdn/sc: irq handler clean
isdn/act2000: fix major bug. clean irq handler.
char/pcmcia/synclink_cs: trim trailing whitespace
drivers/char/ip2: separate polling and irq-driven work entry points
drivers/char/ip2: split out irq core logic into separate function
[NETDRVR] lib82596, netxen: delete pointless tests from irq handler
Eliminate pointless casts from void* in a few driver irq handlers.
[PARPORT] Remove unused 'irq' argument from parport irq functions
[PARPORT] Kill useful 'irq' arg from parport_{generic_irq,ieee1284_interrupt}
[PARPORT] Consolidate code copies into a single generic irq handler
Linus Torvalds [Wed, 24 Oct 2007 01:56:54 +0000 (18:56 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (39 commits)
Remove Andrew Morton from list of net driver maintainers.
bonding: Acquire correct locks in alb for promisc change
bonding: Convert more locks to _bh, acquire rtnl, for new locking
bonding: Convert locks to _bh, rework alb locking for new locking
bonding: Convert miimon to new locking
bonding: Convert balance-rr transmit to new locking
Convert bonding timers to workqueues
Update MAINTAINERS to reflect my (jgarzik's) current efforts.
pasemi_mac: fix typo
defxx.c: dfx_bus_init() is __devexit not __devinit
s390 MAINTAINERS
remove header_ops bug in qeth driver
sky2: crash on remove
MIPSnet: Delete all the useless debugging printks.
AR7 ethernet: small post-merge cleanups and fixes
mv643xx_eth: Hook up mv643xx_get_sset_count
mv643xx_eth: Remove obsolete checksum offload comment
mv643xx_eth: Merge drivers/net/mv643xx_eth.h into mv643xx_eth.c
mv643xx_eth: Remove unused register defines
mv643xx_eth: Clean up mv643xx_eth.h
...
Jay Vosburgh [Thu, 18 Oct 2007 00:37:50 +0000 (17:37 -0700)]
bonding: Convert more locks to _bh, acquire rtnl, for new locking
Convert more lock acquisitions to _bh flavor to avoid deadlock
with workqueue activity and add acquisition of RTNL in appropriate places.
Affects ALB mode, as well as core bonding functions and sysfs.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:49 +0000 (17:37 -0700)]
bonding: Convert locks to _bh, rework alb locking for new locking
Convert locking-related activity to new & improved system.
Convert some lock acquisitions to _bh and rework parts of ALB mode, both
to avoid deadlocks with workqueue activity.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:48 +0000 (17:37 -0700)]
bonding: Convert miimon to new locking
Convert mii (link state) monitor to acquire correct locks for
failover events. In particular, failovers generally require RTNL at a low
level (when manipulating device MAC addresses, for example) and no other
locks. The high level monitor is responsible for acquiring a known set
of locks, RTNL, the bond->lock for read and the slave_lock for write, and
the low level failover processing can then release appropriate locks as
needed. This patch provides the high level portion.
As it is undesirable to acquire RTNL for every monitor pass (which
may occur as often as every 10 ms), the miimon has been converted to
do conditional locking. A first pass inspects all slaves to determine
if any action is required, and if so, a second pass (after acquring RTNL)
is done to perform any actions (doing a complete rescan, as the situation
may have changed when all locks were released).
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:47 +0000 (17:37 -0700)]
bonding: Convert balance-rr transmit to new locking
Change locking in balance-rr transmit processing to use a free
running counter to determine which slave to transmit on. Instead, a
free-running counter is maintained, and modulo arithmetic used to select
a slave for transmit.
This removes lock operations from the TX path, and eliminates
a deadlock introduced by the conversion to work queues.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Jay Vosburgh [Thu, 18 Oct 2007 00:37:45 +0000 (17:37 -0700)]
Convert bonding timers to workqueues
Convert bonding timers to workqueues. This converts the various
monitor functions to run in periodic work queues instead of timers. This
patch introduces the framework and convers the calls, but does not resolve
various locking issues, and does not stand alone.
Signed-off-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Olof Johansson [Sat, 20 Oct 2007 19:10:03 +0000 (14:10 -0500)]
pasemi_mac: fix typo
Add missing &:
drivers/net/pasemi_mac.c: In function 'pasemi_mac_clean_rx':
drivers/net/pasemi_mac.c:553: warning: passing argument 1 of 'prefetch'
makes pointer from integer without a cast
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Ursula Braun [Mon, 22 Oct 2007 14:16:14 +0000 (16:16 +0200)]
remove header_ops bug in qeth driver
Remove qeth bug caused by commit:
[NET]: Move hardware header operations out of netdevice.
This is the second part of the qeth header_ops patch, since
first patch sent 10/19 has been insufficient.
Nevertheless first patch is still valid and should be kept.
Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Neil Brown [Tue, 23 Oct 2007 21:09:13 +0000 (17:09 -0400)]
NFS: Fix for bug in handling of errors for O_DIRECT writes
Commit eda3cef8dd2b83875affe82595db9d0c278879b2 ("NFS: Fix error
handling in nfs_direct_write_result()") ensured that if a WRITE returns
an error, then data->res.verf->committed is not tested (as it is not
initialised).
So move the test so that we never examine ->committed in an error case,
and fix a speeling error while we are there.
Cc: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 23 Oct 2007 23:32:11 +0000 (16:32 -0700)]
Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4630/1: Fix the vector stride of the double vector instruction.
[ARM] 4629/1: Fix VFP emulation code to clear all exception flags of FPEXC
[ARM] 4613/1: pxa300: MFP typo fix
Carlos Corbacho [Fri, 19 Oct 2007 17:51:27 +0000 (18:51 +0100)]
x86: Force enable HPET for CK804 (nForce 4) chipsets
This patch adds a quirk from LinuxBIOS to force enable HPET on
the nVidia CK804 (nForce 4) chipset.
This quirk can very likely support more than just nForce 4
(LinuxBIOS use the same code for nForce 5), and possibly nForce 3,
but I don't have those chipsets, so cannot add and test them.
H. Peter Anvin [Tue, 23 Oct 2007 20:37:25 +0000 (22:37 +0200)]
x86: clean up setup.h and the boot code
Make <asm/setup.h> usable by the boot code.
Clean up vestiges of the old command-line protocol from setup.h and
head_32.S (it is still supported from the boot loader point of
view, since it is converted to the new command-line protocol by the
boot code.)
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Dave Johnson [Tue, 23 Oct 2007 20:37:22 +0000 (22:37 +0200)]
x86: fix more TSC clock source calibration errors
The previous patch wasn't correctly handling the 'count' variable. If
a CPU gave bad results on the 1st or 2nd run but good results on the
3rd, it wouldn't do the correct thing. No idea if any such CPU
exists, but the patch below handles that case by discarding the bad
runs.
If a bad result (too quick, or too slow) occurs on any of the 3 runs
it will be discarded.
Also updated some comments to explain what's going on.
Signed-off-by: Dave Johnson <djohnson@sw.starentnetworks.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>