]> pilppa.com Git - linux-2.6-omap-h63xx.git/log
linux-2.6-omap-h63xx.git
16 years agosched: convert local_cpu_mask to cpumask_var_t, fix
Rusty Russell [Mon, 24 Nov 2008 23:28:41 +0000 (09:58 +1030)]
sched: convert local_cpu_mask to cpumask_var_t, fix

Impact: build fix for !CONFIG_SMP

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert nohz struct to cpumask_var_t, fix
Rusty Russell [Mon, 24 Nov 2008 23:27:51 +0000 (09:57 +1030)]
sched: convert nohz struct to cpumask_var_t, fix

Impact: build fix

Fix the !CONFIG_SMP case.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert remaining old-style cpumask operators
Rusty Russell [Mon, 24 Nov 2008 16:05:14 +0000 (02:35 +1030)]
sched: convert remaining old-style cpumask operators

Impact: Trivial API conversion

  NR_CPUS -> nr_cpu_ids
  cpumask_t -> struct cpumask
  sizeof(cpumask_t) -> cpumask_size()
  cpumask_a = cpumask_b -> cpumask_copy(&cpumask_a, &cpumask_b)

  cpu_set() -> cpumask_set_cpu()
  first_cpu() -> cpumask_first()
  cpumask_of_cpu() -> cpumask_of()
  cpus_* -> cpumask_*

There are some FIXMEs where we all archs to complete infrastructure
(patches have been sent):

  cpu_coregroup_map -> cpu_coregroup_mask
  node_to_cpumask* -> cpumask_of_node

There is also one FIXME where we pass an array of cpumasks to
partition_sched_domains(): this implies knowing the definition of
'struct cpumask' and the size of a cpumask.  This will be fixed in a
future patch.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert local_cpu_mask to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:13 +0000 (02:35 +1030)]
sched: convert local_cpu_mask to cpumask_var_t.

Impact: (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert check_preempt_equal_prio to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:13 +0000 (02:35 +1030)]
sched: convert check_preempt_equal_prio to cpumask_var_t.

Impact: stack reduction for large NR_CPUS

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
stack space.

We simply return if the allocation fails: since we don't use it we
could just pass NULL to cpupri_find and have it handle that.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert struct cpupri_vec cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:13 +0000 (02:35 +1030)]
sched: convert struct cpupri_vec cpumask_var_t.

Impact: stack usage reduction, (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.

The fact cpupro_init is called both before and after the slab is
available makes for an ugly parameter unfortunately.

We also use cpumask_any_and to get rid of a temporary in cpupri_find.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert falback_doms to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:12 +0000 (02:35 +1030)]
sched: convert falback_doms to cpumask_var_t.

Impact: (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert cpu_isolated_map to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:12 +0000 (02:35 +1030)]
sched: convert cpu_isolated_map to cpumask_var_t.

Impact: stack usage reduction, (future) size reduction, cleanup

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

We can also use cpulist_parse() instead of doing it manually in
isolated_cpu_setup.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert sched_domain_debug to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:12 +0000 (02:35 +1030)]
sched: convert sched_domain_debug to cpumask_var_t.

Impact: stack usage reduction

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
stack space.  cpumask_var_t is just a struct cpumask for
!CONFIG_CPUMASK_OFFSTACK.

In this case, we always alloced, but we don't need to any more.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert struct (sys_)sched_setaffinity() to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:11 +0000 (02:35 +1030)]
sched: convert struct (sys_)sched_setaffinity() to cpumask_var_t.

Impact: stack usage reduction

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space on the stack.  cpumask_var_t is just a struct cpumask for
!CONFIG_CPUMASK_OFFSTACK.

Note the removal of the initializer of new_mask: since the first thing
we did was "cpus_and(new_mask, new_mask, cpus_allowed)" I just changed
that to "cpumask_and(new_mask, in_mask, cpus_allowed);".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: avoid stack var in move_task_off_dead_cpu
Rusty Russell [Mon, 24 Nov 2008 16:05:11 +0000 (02:35 +1030)]
sched: avoid stack var in move_task_off_dead_cpu

Impact: stack usage reduction

With some care, we can avoid needing a temporary cpumask (we can't
really allocate here, since we can't fail).

This version calls cpuset_cpus_allowed_locked() with the task_rq_lock
held.  I'm fairly sure this works, but there might be a deadlock
hiding.

And of course, we can't get rid of the last cpumask on stack until we
can use cpumask_of_node instead of node_to_cpumask.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert sys_sched_getaffinity() to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:11 +0000 (02:35 +1030)]
sched: convert sys_sched_getaffinity() to cpumask_var_t.

Impact: stack usage reduction

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space in the stack.  cpumask_var_t is just a struct cpumask for
!CONFIG_CPUMASK_OFFSTACK.

Some jiggling here to make sure we always exit at the bottom (so we hit
the free_cpumask_var there).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert rebalance_domains() to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:11 +0000 (02:35 +1030)]
sched: convert rebalance_domains() to cpumask_var_t.

Impact: stack usage reduction

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space in the stack.  cpumask_var_t is just a struct cpumask for
!CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert idle_balance() to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:10 +0000 (02:35 +1030)]
sched: convert idle_balance() to cpumask_var_t.

Impact: stack usage reduction

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space in the stack.  cpumask_var_t is just a struct cpumask for
!CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert nohz struct to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:09 +0000 (02:35 +1030)]
sched: convert nohz struct to cpumask_var_t.

Impact: (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert struct root_domain to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:05 +0000 (02:35 +1030)]
sched: convert struct root_domain to cpumask_var_t.

Impact: (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

def_root_domain is static, and so its masks are initialized with
alloc_bootmem_cpumask_var.  After that, alloc_cpumask_var is used.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert nohz_cpu_mask to cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:04 +0000 (02:35 +1030)]
sched: convert nohz_cpu_mask to cpumask_var_t.

Impact: (future) size reduction for large NR_CPUS.

Dynamically allocating cpumasks (when CONFIG_CPUMASK_OFFSTACK) saves
space for small nr_cpu_ids but big CONFIG_NR_CPUS.  cpumask_var_t
is just a struct cpumask for !CONFIG_CPUMASK_OFFSTACK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert struct sched_group/sched_domain cpumask_ts to variable bitmaps
Rusty Russell [Mon, 24 Nov 2008 16:05:04 +0000 (02:35 +1030)]
sched: convert struct sched_group/sched_domain cpumask_ts to variable bitmaps

Impact: (future) size reduction for large NR_CPUS.

We move the 'cpumask' member of sched_group to the end, so when we
kmalloc it we can do a minimal allocation: saves space for small
nr_cpu_ids but big CONFIG_NR_CPUS.  Similar trick for 'span' in
sched_domain.

This isn't quite as good as converting to a cpumask_var_t, as some
sched_groups are actually static, but it's safer: we don't have to
figure out where to call alloc_cpumask_var/free_cpumask_var.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: wrap sched_group and sched_domain cpumask accesses.
Rusty Russell [Mon, 24 Nov 2008 16:05:04 +0000 (02:35 +1030)]
sched: wrap sched_group and sched_domain cpumask accesses.

Impact: trivial wrap of member accesses

This eases the transition in the next patch.

We also get rid of a temporary cpumask in find_idlest_cpu() thanks to
for_each_cpu_and, and sched_balance_self() due to getting weight before
setting sd to NULL.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: remove any_online_cpu()
Rusty Russell [Mon, 24 Nov 2008 16:05:03 +0000 (02:35 +1030)]
sched: remove any_online_cpu()

Impact: use new API

any_online_cpu() is a good name, but it takes a cpumask_t, not a
pointer.

There are several places where any_online_cpu() doesn't really want a
mask arg at all.  Replace all callers with cpumask_any() and
cpumask_any_and().

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: get rid of boutique sched.c allocations, use cpumask_var_t.
Rusty Russell [Mon, 24 Nov 2008 16:05:03 +0000 (02:35 +1030)]
sched: get rid of boutique sched.c allocations, use cpumask_var_t.

Impact: use new general API

Using lots of allocs rather than one big alloc is less efficient, but
who cares for this setup function?

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: convert sched.c from for_each_cpu_mask to for_each_cpu.
Rusty Russell [Mon, 24 Nov 2008 16:05:02 +0000 (02:35 +1030)]
sched: convert sched.c from for_each_cpu_mask to for_each_cpu.

Impact: trivial API conversion

This is a simple conversion, but note that for_each_cpu() terminates
with i >= nr_cpu_ids, not i == NR_CPUS like for_each_cpu_mask() did.

I don't convert all of them: sd->span changes in a later patch, so
change those iterators there rather than here.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: reduce stack size requirements in kernel/sched.c
Mike Travis [Mon, 24 Nov 2008 16:05:02 +0000 (02:35 +1030)]
sched: reduce stack size requirements in kernel/sched.c

Impact: cleanup

  * use node_to_cpumask_ptr in place of node_to_cpumask to reduce stack
    requirements in sched.c

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branches 'sched/core', 'core/core' and 'tracing/core' into cpus4096
Ingo Molnar [Mon, 24 Nov 2008 16:46:57 +0000 (17:46 +0100)]
Merge branches 'sched/core', 'core/core' and 'tracing/core' into cpus4096

16 years agoMerge branches 'tracing/branch-tracer', 'tracing/fastboot', 'tracing/ftrace', 'tracin...
Ingo Molnar [Mon, 24 Nov 2008 16:46:24 +0000 (17:46 +0100)]
Merge branches 'tracing/branch-tracer', 'tracing/fastboot', 'tracing/ftrace', 'tracing/function-return-tracer', 'tracing/power-tracer', 'tracing/powerpc', 'tracing/ring-buffer', 'tracing/stack-tracer' and 'tracing/urgent' into tracing/core

16 years agoMerge branches 'core/debug', 'core/futexes', 'core/locking', 'core/rcu', 'core/signal...
Ingo Molnar [Mon, 24 Nov 2008 16:44:55 +0000 (17:44 +0100)]
Merge branches 'core/debug', 'core/futexes', 'core/locking', 'core/rcu', 'core/signal', 'core/urgent' and 'core/xen' into core/core

16 years agoMerge branch 'sched/rt' into sched/core
Ingo Molnar [Mon, 24 Nov 2008 16:37:12 +0000 (17:37 +0100)]
Merge branch 'sched/rt' into sched/core

16 years agomutex: __used is needed for function referenced only from inline asm
Török Edwin [Mon, 24 Nov 2008 08:17:42 +0000 (10:17 +0200)]
mutex: __used is needed for function referenced only from inline asm

Impact: fix build failure on llvm-gcc-4.2

According to the gcc manual, the 'used' attribute should be applied to
functions referenced only from inline assembly.
This fixes a build failure with llvm-gcc-4.2, which deleted
__mutex_lock_slowpath, __mutex_unlock_slowpath.

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agovfs, seqfile: fix comment style on mangle_path
Török Edwin [Sun, 23 Nov 2008 21:24:53 +0000 (23:24 +0200)]
vfs, seqfile: fix comment style on mangle_path

Impact: use standard docbook tags

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/function-return-tracer: free the return stack on free_task()
Frederic Weisbecker [Sun, 23 Nov 2008 17:43:39 +0000 (18:43 +0100)]
tracing/function-return-tracer: free the return stack on free_task()

Impact: avoid losing some traces when a task is freed

do_exit() is not the last function called when a task finishes.
There are still some functions which are to be called such as
ree_task().  So we delay the freeing of the return stack to the
last moment.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing, doc: update mmiotrace documentation
Pekka Paalanen [Sun, 23 Nov 2008 19:24:59 +0000 (21:24 +0200)]
tracing, doc: update mmiotrace documentation

Impact: update documentation

Update to reflect the current state of the tracing framework:

 - "none" tracer has been replaced by "nop" tracer
 - tracing_enabled must be toggled when changing buffer size

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86, mmiotrace: fix buffer overrun detection
Pekka Paalanen [Sun, 23 Nov 2008 19:24:30 +0000 (21:24 +0200)]
x86, mmiotrace: fix buffer overrun detection

Impact: fix mmiotrace overrun tracing

When ftrace framework moved to use the ring buffer facility, the buffer
overrun detection was broken after 2.6.27 by commit

| commit 3928a8a2d98081d1bc3c0a84a2d70e29b90ecf1c
| Author: Steven Rostedt <rostedt@goodmis.org>
| Date:   Mon Sep 29 23:02:41 2008 -0400
|
|     ftrace: make work with new ring buffer
|
|     This patch ports ftrace over to the new ring buffer.

The detection is now fixed by using the ring buffer API.

When mmiotrace detects a buffer overrun, it will report the number of
lost events. People reading an mmiotrace log must know if something was
missed, otherwise the data may not make sense.

Signed-off-by: Pekka Paalanen <pq@iki.fi>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/function-return-tracer: don't trace kfree while it frees the return stack
Frederic Weisbecker [Sun, 23 Nov 2008 16:33:12 +0000 (17:33 +0100)]
tracing/function-return-tracer: don't trace kfree while it frees the return stack

Impact: fix a crash

While I killed the cat process, I got sometimes the following (but rare)
crash:

[   65.689027] Pid: 2969, comm: cat Not tainted (2.6.28-rc6-tip #83) AMILO Li 2727
[   65.689027] EIP: 0060:[<00000000>] EFLAGS: 00010082 CPU: 1
[   65.689027] EIP is at 0x0
[   65.689027] EAX: 00000000 EBX: f66cd780 ECX: c019a64a EDX: f66cd780
[   65.689027] ESI: 00000286 EDI: f66cd780 EBP: f630be2c ESP: f630be24
[   65.689027]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[   65.689027] Process cat (pid: 2969, ti=f630a000 task=f66cd780 task.ti=f630a000)
[   65.689027] Stack:
[   65.689027]  00000012 f630bd54 f630be7c c012c853 00000000 c0133cc9 f66cda54 f630be5c
[   65.689027]  f630be68 f66cda54 f66cd88c f66cd878 f7070000 00000001 f630be90 c0135dbc
[   65.689027]  f614a614 f630be68 f630be68 f65ba200 00000002 f630bf10 f630be90 c012cad6
[   65.689027] Call Trace:
[   65.689027]  [<c012c853>] ? do_exit+0x603/0x850
[   65.689027]  [<c0133cc9>] ? next_signal+0x9/0x40
[   65.689027]  [<c0135dbc>] ? dequeue_signal+0x8c/0x180
[   65.689027]  [<c012cad6>] ? do_group_exit+0x36/0x90
[   65.689027]  [<c013709c>] ? get_signal_to_deliver+0x20c/0x390
[   65.689027]  [<c0102b69>] ? do_notify_resume+0x99/0x8b0
[   65.689027]  [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[   65.689027]  [<c014db9b>] ? trace_hardirqs_on+0xb/0x10
[   65.689027]  [<c02e6d1a>] ? tty_ldisc_deref+0x5a/0x80
[   65.689027]  [<c02e39b0>] ? n_tty_write+0x0/0x340
[   65.689027]  [<c02e1812>] ? redirected_tty_write+0x82/0x90
[   65.689027]  [<c019ee99>] ? vfs_write+0x99/0xd0
[   65.689027]  [<c02e1790>] ? redirected_tty_write+0x0/0x90
[   65.689027]  [<c019f342>] ? sys_write+0x42/0x70
[   65.689027]  [<c01035ca>] ? work_notifysig+0x13/0x19
[   65.689027] Code:  Bad EIP value.
[   65.689027] EIP: [<00000000>] 0x0 SS:ESP 0068:f630be24

This is because on do_exit(), kfree is called to free the return addresses stack
but kfree is traced and stored its return address in this stack.
This patch fixes it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branch 'ppc/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Ingo Molnar [Sun, 23 Nov 2008 12:47:54 +0000 (13:47 +0100)]
Merge branch 'ppc/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/powerpc

16 years agotracing/stack-tracer: avoid races accessing file
Török Edwin [Sun, 23 Nov 2008 11:08:10 +0000 (13:08 +0200)]
tracing/stack-tracer: avoid races accessing file

Impact: fix race

vma->vm_file reference is only stable while holding the mmap_sem,
so move usage of it to within the critical section.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT
Török Edwin [Sun, 23 Nov 2008 10:39:08 +0000 (12:39 +0200)]
tracing/stack-tracer: introduce CONFIG_USER_STACKTRACE_SUPPORT

Impact: cleanup

User stack tracing is just implemented for x86, but it is not x86 specific.

Introduce a generic config flag, that is currently enabled only for x86.
When other arches implement it, they will have to
SELECT USER_STACKTRACE_SUPPORT.

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/stack-tracer: fix locking and refcounts
Török Edwin [Sun, 23 Nov 2008 10:39:07 +0000 (12:39 +0200)]
tracing/stack-tracer: fix locking and refcounts

Impact: fix refcounting/object-access bug

Hold mmap_sem while looking up/accessing vma.
Hold the RCU lock while using the task we looked up.

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/stack-tracer: fix style issues
Török Edwin [Sun, 23 Nov 2008 10:39:06 +0000 (12:39 +0200)]
tracing/stack-tracer: fix style issues

Impact: cleanup

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotrace: fix compiler warning in branch profiler
Steven Rostedt [Fri, 21 Nov 2008 19:44:57 +0000 (14:44 -0500)]
trace: fix compiler warning in branch profiler

Impact: fix compiler warning

The ftrace_pointers used in the branch profiler are constant values.
They should never change. But the compiler complains when they are
passed into the debugfs_create_file as a data pointer, because the
function discards the qualifier.

This patch typecasts the parameter to debugfs_create_file back to
a void pointer. To remind the callbacks that they are pointing to
a constant value, I also modified the callback local pointers to
be const struct ftrace_pointer * as well.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: add ftrace_off_permanent
Steven Rostedt [Fri, 21 Nov 2008 17:59:38 +0000 (12:59 -0500)]
ftrace: add ftrace_off_permanent

Impact: add new API to disable all of ftrace on anomalies

It case of a serious anomaly being detected (like something caught by
lockdep) it is a good idea to disable all tracing immediately, without
grabing any locks.

This patch adds ftrace_off_permanent that disables the tracers, function
tracing and ring buffers without a way to enable them again. This should
only be used when something serious has been detected.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoring-buffer: add tracing_off_permanent
Steven Rostedt [Fri, 21 Nov 2008 17:41:55 +0000 (12:41 -0500)]
ring-buffer: add tracing_off_permanent

Impact: feature to permanently disable ring buffer

This patch adds a API to the ring buffer code that will permanently
disable the ring buffer from ever recording. This should only be
called when some serious anomaly is detected, and the system
may be in an unstable state. When that happens, shutting down the
recording to the ring buffers may be appropriate.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: scripts/recordmcount.pl support for ARM
Jim Radford [Fri, 21 Nov 2008 03:48:39 +0000 (19:48 -0800)]
ftrace: scripts/recordmcount.pl support for ARM

Impact: extend scripts/recordmcount.pl to ARM

Arm uses %progbits instead of @progbits and requires only 4 byte alignment.

[ Thanks to Sam Ravnborg for mentioning that ARM uses %progbits ]

Signed-off-by: Jim Radford <radford@galvanix.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: specify $alignment for sh architecture
Matt Fleming [Thu, 20 Nov 2008 21:49:52 +0000 (21:49 +0000)]
ftrace: specify $alignment for sh architecture

Impact: extend scripts/recordmcount.pl with default alignment for SH

Set $alignment=2 for the sh architecture so that a ".align 2" directive
will be emitted for all __mcount_loc sections. Fix a whitspace error
while I'm here (converted spaces to tabs).

Signed-off-by: Matt Fleming <mjf@gentoo.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotrace: profile all if conditionals
Steven Rostedt [Fri, 21 Nov 2008 06:30:54 +0000 (01:30 -0500)]
trace: profile all if conditionals

Impact: feature to profile if statements

This patch adds a branch profiler for all if () statements.
The results will be found in:

  /debugfs/tracing/profile_branch

For example:

   miss      hit    %        Function                  File              Line
 ------- ---------  -        --------                  ----              ----
       0        1 100 x86_64_start_reservations      head64.c             127
       0        1 100 copy_bootdata                  head64.c             69
       1        0   0 x86_64_start_kernel            head64.c             111
      32        0   0 set_intr_gate                  desc.h               319
       1        0   0 reserve_ebda_region            head.c               51
       1        0   0 reserve_ebda_region            head.c               47
       0        1 100 reserve_ebda_region            head.c               42
       0        0   X maxcpus                        main.c               165

Miss means the branch was not taken. Hit means the branch was taken.
The percent is the percentage the branch was taken.

This adds a significant amount of overhead and should only be used
by those analyzing their system.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotrace: branch profiling should not print percent without data
Steven Rostedt [Fri, 21 Nov 2008 06:51:53 +0000 (01:51 -0500)]
trace: branch profiling should not print percent without data

Impact: cleanup on output of branch profiler

When a branch has not been taken, it does not make sense to show
a percentage incorrect or hit. This patch changes the behaviour
to print out a 'X' when the branch has not been executed yet.

For example:

 correct incorrect  %        Function                  File              Line
 ------- ---------  -        --------                  ----              ----
    2096        0   0 do_arch_prctl                  process_64.c         832
       0        0   X do_arch_prctl                  process_64.c         804
    2604        0   0 IS_ERR                         err.h                34
  130228     5765   4 __switch_to                    process_64.c         673
       0        0   X enable_TSC                     process_64.c         448
       0        0   X disable_TSC                    process_64.c         431

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotrace: consolidate unlikely and likely profiler
Steven Rostedt [Fri, 21 Nov 2008 05:40:40 +0000 (00:40 -0500)]
trace: consolidate unlikely and likely profiler

Impact: clean up to make one profiler of like and unlikely tracer

The likely and unlikely profiler prints out the file and line numbers
of the annotated branches that it is profiling. It shows the number
of times it was correct or incorrect in its guess. Having two
different files or sections for that matter to tell us if it was a
likely or unlikely is pretty pointless. We really only care if
it was correct or not.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotrace: remove extra assign in branch check
Steven Rostedt [Fri, 21 Nov 2008 04:57:47 +0000 (23:57 -0500)]
trace: remove extra assign in branch check

Impact: clean up of branch check

The unlikely/likely profiler does an extra assign of the f.line.
This is not needed since it is already calculated at compile time.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: create default variables for archs in recordmcount.pl
Steven Rostedt [Thu, 20 Nov 2008 20:07:34 +0000 (15:07 -0500)]
ftrace: create default variables for archs in recordmcount.pl

Impact: cleanup of recordmcount.pl

Now that more architectures are being ported to the MCOUNT_RECORD
method, there is no reason to have each declare their own arch
specific variable if most of them share the same value. This patch
creates a set of default values for the arch specific variables
based off of i386.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoftrace: add support for powerpc to recordmcount.pl script
Steven Rostedt [Thu, 20 Nov 2008 15:16:16 +0000 (07:16 -0800)]
ftrace: add support for powerpc to recordmcount.pl script

Impact: Add PowerPC port to recordmcount.pl script

This patch updates the recordmcount.pl script to process
PowerPC.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosh: dynamic ftrace support.
Matt Fleming [Wed, 12 Nov 2008 11:11:47 +0000 (20:11 +0900)]
sh: dynamic ftrace support.

First cut at dynamic ftrace support.

[
  Steven Rostedt - only updated the recordmcount.pl file.
    There are updates for PowerPC that will conflict with this,
    and we need to base off of these changes.
]

Signed-off-by: Matt Fleming <mjf@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoinit/main.c: use ktime accessor function in initcall_debug code
Will Newton [Fri, 21 Nov 2008 22:08:59 +0000 (14:08 -0800)]
init/main.c: use ktime accessor function in initcall_debug code

Impact: fix initcall debug output on non-scalar ktime platforms (32-bit embedded)

The initcall_debug code access the tv64 member of ktime.  This won't work
correctly for large deltas on platforms that don't use the scalar ktime
implementation.

Signed-off-by: Will Newton <will.newton@gmail.com>
Acked-by: Tim Bird <tim.bird@am.sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing: allow tracing of suspend/resume & hibernation code again
Ingo Molnar [Sun, 23 Nov 2008 09:37:12 +0000 (10:37 +0100)]
tracing: allow tracing of suspend/resume & hibernation code again

Impact: widen function-tracing to suspend+resume (and hibernation) sequences

Now that the ftrace kernel thread is gone, we can allow tracing
during suspend/resume again.

So revert these two commits:

  f42ac38c5 "ftrace: disable tracing for suspend to ram"
  41108eb10 "ftrace: disable tracing for hibernation"

This should be tested very carefully, as it could interact with
altneratives instruction patching, etc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing: identify which executable object the userspace address belongs to
Török Edwin [Sat, 22 Nov 2008 11:28:48 +0000 (13:28 +0200)]
tracing: identify which executable object the userspace address belongs to

Impact: modify+improve the userstacktrace tracing visualization feature

Store thread group leader id, and use it to lookup the address in the
process's map. We could have looked up the address on thread's map,
but the thread might not exist by the time we are called. The process
might not exist either, but if you are reading trace_pipe, that is
unlikely.

Example usage:

 mount -t debugfs nodev /sys/kernel/debug
 cd /sys/kernel/debug/tracing
 echo userstacktrace >iter_ctrl
 echo sym-userobj >iter_ctrl
 echo sched_switch >current_tracer
 echo 1 >tracing_enabled
 cat trace_pipe >/tmp/trace&
 .... run application ...
 echo 0 >tracing_enabled
 cat /tmp/trace

You'll see stack entries like:

   /lib/libpthread-2.7.so[+0xd370]

You can convert them to function/line using:

   addr2line -fie /lib/libpthread-2.7.so 0xd370

Or:

   addr2line -fie /usr/lib/debug/libpthread-2.7.so 0xd370

For non-PIC/PIE executables this won't work:

   a.out[+0x73b]

You need to run the following: addr2line -fie a.out 0x40073b
(where 0x400000 is the default load address of a.out)

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agovfs, seqfile: make mangle_path() global
Török Edwin [Sat, 22 Nov 2008 11:28:48 +0000 (13:28 +0200)]
vfs, seqfile: make mangle_path() global

Impact: expose new VFS API

make mangle_path() available, as per the suggestions of Christoph Hellwig
and Al Viro:

  http://lkml.org/lkml/2008/11/4/338

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing: add support for userspace stacktraces in tracing/iter_ctrl
Török Edwin [Sat, 22 Nov 2008 11:28:47 +0000 (13:28 +0200)]
tracing: add support for userspace stacktraces in tracing/iter_ctrl

Impact: add new (default-off) tracing visualization feature

Usage example:

 mount -t debugfs nodev /sys/kernel/debug
 cd /sys/kernel/debug/tracing
 echo userstacktrace >iter_ctrl
 echo sched_switch >current_tracer
 echo 1 >tracing_enabled
 .... run application ...
 echo 0 >tracing_enabled

Then read one of 'trace','latency_trace','trace_pipe'.

To get the best output you can compile your userspace programs with
frame pointers (at least glibc + the app you are tracing).

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/function-return-tracer: clean up task start/exit callbacks
Ingo Molnar [Sun, 23 Nov 2008 08:18:56 +0000 (09:18 +0100)]
tracing/function-return-tracer: clean up task start/exit callbacks

Impact: cleanup

Eliminate #ifdefs in core code by using empty inline functions.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agotracing/function-return-tracer: store return stack into task_struct and allocate...
Frederic Weisbecker [Sun, 23 Nov 2008 05:22:56 +0000 (06:22 +0100)]
tracing/function-return-tracer: store return stack into task_struct and allocate it dynamically

Impact: use deeper function tracing depth safely

Some tests showed that function return tracing needed a more deeper depth
of function calls. But it could be unsafe to store these return addresses
to the stack.

So these arrays will now be allocated dynamically into task_struct of current
only when the tracer is activated.

Typical scheme when tracer is activated:
- allocate a return stack for each task in global list.
- fork: allocate the return stack for the newly created task
- exit: free return stack of current
- idle init: same as fork

I chose a default depth of 50. I don't have overruns anymore.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' into traci...
Ingo Molnar [Sun, 23 Nov 2008 08:10:32 +0000 (09:10 +0100)]
Merge branches 'tracing/profiling', 'tracing/options' and 'tracing/urgent' into tracing/core

16 years agolockdep: consistent alignement for lockdep info
Li Zefan [Fri, 21 Nov 2008 07:57:32 +0000 (15:57 +0800)]
lockdep: consistent alignement for lockdep info

Impact: prettify /proc/lockdep_info

Just feel odd that not all lines of lockdep info are aligned.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agosched: update comment for move_task_off_dead_cpu
Vegard Nossum [Fri, 21 Nov 2008 00:30:36 +0000 (01:30 +0100)]
sched: update comment for move_task_off_dead_cpu

Impact: cleanup

This commit:

commit f7b4cddcc5aca03e80e357360c9424dfba5056c2
Author: Oleg Nesterov <oleg@tv-sign.ru>
Date:   Tue Oct 16 23:30:56 2007 -0700

    do CPU_DEAD migrating under read_lock(tasklist) instead of write_lock_irq(ta

    Currently move_task_off_dead_cpu() is called under
    write_lock_irq(tasklist).  This means it can't use task_lock() which is
    needed to improve migrating to take task's ->cpuset into account.

    Change the code to call move_task_off_dead_cpu() with irqs enabled, and
    change migrate_live_tasks() to use read_lock(tasklist).

...forgot to update the comment in front of move_task_off_dead_cpu.

Reference: http://lkml.org/lkml/2008/6/23/135

Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge commit 'v2.6.28-rc6' into sched/core
Ingo Molnar [Fri, 21 Nov 2008 07:57:04 +0000 (08:57 +0100)]
Merge commit 'v2.6.28-rc6' into sched/core

16 years agofunction tracing: fix wrong position computing of stack_trace
Liming Wang [Fri, 21 Nov 2008 03:00:18 +0000 (11:00 +0800)]
function tracing: fix wrong position computing of stack_trace

Impact: make output of stack_trace complete if buffer overruns

When read buffer overruns, the output of stack_trace isn't complete.

When printing records with seq_printf in t_show, if the read buffer
has overruned by the current record, then this record won't be
printed to user space through read buffer, it will just be dropped in
this printing.

When next printing, t_start should return the "*pos"th record, which
is the one dropped by previous printing, but it just returns
(m->private + *pos)th record.

Here we use a more sane method to implement seq_operations which can
be found in kernel code. Thus we needn't initialize m->private.

About testing, it's not easy to overrun read buffer, but we can use
seq_printf to print more padding bytes in t_show, then it's easy to
check whether or not records are lost.

This commit has been tested on both condition of overrun and non
overrun.

Signed-off-by: Liming Wang <liming.wang@windriver.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Fri, 21 Nov 2008 02:08:09 +0000 (18:08 -0800)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5330/1: mach-pxa: Fixup reset for systems using reboot=cold or other strings
  [ARM] pxa: fix incorrect PCMCIA PSKTSEL pin configuration for spitz
  [ARM] pxa: fix I2C controller device being registered twice on Akita
  pxafb: only initialize the smart panel thread when dealing with a smartpanel
  pxafb: introduce LCD_TYPE_MASK and use it.

16 years agoLinux 2.6.28-rc6 v2.6.28-rc6
Linus Torvalds [Thu, 20 Nov 2008 23:19:22 +0000 (15:19 -0800)]
Linux 2.6.28-rc6

16 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Thu, 20 Nov 2008 23:07:40 +0000 (15:07 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] xen: fix xen_get_eflags.
  [IA64] ia64/pv_ops/pv_cpu_ops: fix _IA64_REG_IP case.
  [IA64] remove duplicate include iommu.h
  [IA64] use mprintk instead of printk, in ia64_mca_modify_original_stack
  [IA64] Rationalize kernel mode alignment checking

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
Linus Torvalds [Thu, 20 Nov 2008 21:53:21 +0000 (13:53 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: ACE1001 patch for cp2101.c
  USB: usbmon: fix read(2)
  USB: gadget rndis: send notifications
  USB: gadget rndis: stop windows self-immolation
  USB: storage: update unusual_devs entries for Nokia 5300 and 5310
  USB: storage: updates unusual_devs entry for the Nokia 6300
  usb: musb: fix bug in musb_schedule
  USB: fix SB700 usb subsystem hang bug

16 years ago[IA64] xen: fix xen_get_eflags.
Isaku Yamahata [Tue, 18 Nov 2008 10:20:51 +0000 (19:20 +0900)]
[IA64] xen: fix xen_get_eflags.

fix xen_get_eflags. It doesn't take any argument.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years ago[IA64] ia64/pv_ops/pv_cpu_ops: fix _IA64_REG_IP case.
Isaku Yamahata [Tue, 18 Nov 2008 10:19:50 +0000 (19:19 +0900)]
[IA64] ia64/pv_ops/pv_cpu_ops: fix _IA64_REG_IP case.

pv_cpu_ops.getreg(_IA64_REG_IP) returned constant.
But the returned ip valued should be the one in the caller, not of the callee.
This patch fixes that.

Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years ago[IA64] remove duplicate include iommu.h
Huang Weiyi [Thu, 20 Nov 2008 21:38:16 +0000 (13:38 -0800)]
[IA64] remove duplicate include iommu.h

arch/ia64/kernel/pci-dma.c only needs to include iommu once.

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years ago[IA64] use mprintk instead of printk, in ia64_mca_modify_original_stack
Hidetoshi Seto [Mon, 17 Nov 2008 01:18:08 +0000 (10:18 +0900)]
[IA64] use mprintk instead of printk, in ia64_mca_modify_original_stack

Using printk from MCA/INIT context is unsafe since it can cause deadlock.
The ia64_mca_modify_original_stack is called from both of mca handler and
init handler, so it should use mprintk instead of printk.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years ago[IA64] Rationalize kernel mode alignment checking
Tony Luck [Thu, 20 Nov 2008 21:27:12 +0000 (13:27 -0800)]
[IA64] Rationalize kernel mode alignment checking

Itanium processors can handle some misaligned data accesses. They
also provide a mode where all such accesses are forced to trap. The
kernel was schizophrenic about use of this mode:

* Base kernel code ran in permissive mode where the only traps
  generated were from those cases that the h/w could not handle.
* Interrupt, syscall and trap code ran in strict mode where all
  unaligned accesses caused traps to the 0x5a00 unaligned reference
  vector.

Use strict alignment checking throughout the kernel, but make
sure that we continue to let user mode use more relaxed mode
as the default.

Signed-off-by: Tony Luck <tony.luck@intel.com>
16 years agox86: Fix interrupt leak due to migration
Matthew Wilcox [Thu, 20 Nov 2008 21:09:33 +0000 (14:09 -0700)]
x86: Fix interrupt leak due to migration

When we migrate an interrupt from one CPU to another, we set the
move_in_progress flag and clean up the vectors later once they're not
being used.  If you're unlucky and call destroy_irq() before the vectors
become un-used, the move_in_progress flag is never cleared, which causes
the interrupt to become unusable.

This was discovered by Jesse Brandeburg for whom it manifested as an
MSI-X device refusing to use MSI-X mode when the driver was unloaded
and reloaded repeatedly.

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoSUNRPC: Fix a performance regression in the RPC authentication code
Trond Myklebust [Thu, 20 Nov 2008 21:06:21 +0000 (16:06 -0500)]
SUNRPC: Fix a performance regression in the RPC authentication code

Fix a regression reported by Max Kellermann whereby kernel profiling
showed that his clients were spending 45% of their time in
rpcauth_lookup_credcache.

It turns out that although his processes had identical uid/gid/groups,
generic_match() was failing to detect this, because the task->group_info
pointers were not shared. This again lead to the creation of a huge number
of identical credentials at the RPC layer.

The regression is fixed by comparing the contents of task->group_info
if the actual pointers are not identical.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Thu, 20 Nov 2008 21:14:16 +0000 (13:14 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  [CIFS] Do not attempt to close invalidated file handles
  [CIFS] fix check for dead tcon in smb_init

16 years agoMerge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus
Linus Torvalds [Thu, 20 Nov 2008 21:13:48 +0000 (13:13 -0800)]
Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.
  MIPS: csrc-r4k: Fix spelling mistake.
  MIPS: RB532: Provide functions for gpio configuration
  MIPS: IP22: Make indy_sc_ops variable static
  MIPS: RB532: GPIO register offsets are relative to GPIOBASE
  MIPS: Malta: Fix include paths in malta-amon.c

16 years agoMerge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 20 Nov 2008 21:13:03 +0000 (13:13 -0800)]
Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  intel-iommu: fix compile warnings

16 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Thu, 20 Nov 2008 21:12:14 +0000 (13:12 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (23 commits)
  net: fix tiny output corruption of /proc/net/snmp6
  atl2: don't request irq on resume if netif running
  ipv6: use seq_release_private for ip6mr.c /proc entries
  pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
  smc911x: Fix printf format typo in smc911x driver.
  asix: Fix asix-based cards connecting to 10/100Mbs LAN.
  mv643xx_eth: fix recycle check bound
  mv643xx_eth: fix the order of mdiobus_{unregister, free}() calls
  sh: sh_eth: Update to change of mii_bus
  TPROXY: supply a struct flowi->flags argument in inet_sk_rebuild_header()
  TPROXY: fill struct flowi->flags in udp_sendmsg()
  net: ipg.c fix bracing on endian swapping
  phylib: Fix auto-negotiation restart avoidance
  net: jme.c rxdesc.flags is __le16, other missing endian swaps
  phylib: fix phy name example in documentation
  net: Do not fire linkwatch events until the device is registered.
  phonet: fix compilation with gcc-3.4
  ixgbe: fix compilation with gcc-3.4
  pktgen: fix multiple queue warning
  net: fix ip_mr_init() error path
  ...

16 years agoMerge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 20 Nov 2008 21:11:21 +0000 (13:11 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ftrace: fix dyn ftrace filter selection
  ftrace: make filtered functions effective on setting
  ftrace: fix set_ftrace_filter
  trace: introduce missing mutex_unlock()
  tracing: kernel/trace/trace.c: introduce missing kfree()

16 years agoMerge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 20 Nov 2008 21:09:32 +0000 (13:09 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: uaccess_64: fix return value in __copy_from_user()
  x86: quirk for reboot stalls on a Dell Optiplex 330

16 years agoparisc: fix bug in compat_arch_ptrace
Helge Deller [Thu, 20 Nov 2008 09:54:09 +0000 (10:54 +0100)]
parisc: fix bug in compat_arch_ptrace

Commit 81e192d6ce303b6792aa38ff35f41a1a7357f23a ("parisc: convert to
generic compat_sys_ptrace") introduced a bug which segfaults the parisc
64bit kernel when stracing 32bit applications:

  Kernel Fault: Code=15 regs=00000000bafa42b0 (Addr=00000001baf5ab57)
       YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
  PSW: 00001000000001101111111100001011 Tainted: G        W
  r00-03  000000ff0806ff0b 000000004068edc0 00000000401203f8 00000000fb3e2508
  r04-07  0000000040686dc0 00000000baf5a800 fffffffffffffffc fffffffffb3e2508
  r08-11  00000000baf5a800 000000000004b068 00000000000402b0 0000000000040d68
  r12-15  0000000000042a9c 0000000000040a9c 0000000000040d60 0000000000042e9c
  r16-19  000000000004b060 000000000004b058 0000000000042d9c ffffffffffffffff
  r20-23  000000000800000b 0000000000000000 000000000800000b fffffffffb3e2508
  r24-27  00000000fffffffc 0000000000000003 00000000fffffffc 0000000040686dc0
  r28-31  00000001baf5a7ff 00000000bafa4280 00000000bafa42b0 00000000000001d7
  sr00-03  0000000000fca000 0000000000000000 0000000000000000 0000000000fca000
  sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

  IASQ: 0000000000000000 0000000000000000 IAOQ: 0000000040120400 0000000040120404
   IIR: 4b9a06b0    ISR: 0000000000000000  IOR: 00000001baf5ab57
   CPU:        0   CR30: 00000000bafa4000 CR31: 00000000d22344e0
   ORIG_R28: 00000000fb3e2248
   IAOQ[0]: compat_arch_ptrace+0xb8/0x160
   IAOQ[1]: compat_arch_ptrace+0xbc/0x160
   RP(r2): compat_arch_ptrace+0xb0/0x160
  Backtrace:
   [<00000000401612ac>] compat_sys_ptrace+0x15c/0x180
   [<0000000040104ef8>] syscall_exit+0x0/0x14

The problem is that compat_arch_ptrace() enters with an addr value of
type compat_ulong_t and calls translate_usr_offset() to translate the
address offset into a struct pt_regs offset like this:

addr = translate_usr_offset(addr)

this means that any return value of translate_usr_offset() is stored
back as compat_ulong_t type into the addr variable.

But since translate_usr_offset() returns -1 for invalid offsets, addr
can now get the value 0xffffffff which then fails the next return-value
sanity check and thus the kernel tries to access invalid memory:

if (addr < 0)
break;

Fix this bug by modifying translate_usr_offset() to take and return
values of type compat_ulong_t, and by returning the value
"sizeof(struct pt_regs)" as an error indicator.

Additionally change the sanity check to check for return values
for >= sizeof(struct pt_regs).

This patch survived my compile and run-tests.

Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
16 years ago[CIFS] Do not attempt to close invalidated file handles
Steve French [Thu, 20 Nov 2008 20:00:44 +0000 (20:00 +0000)]
[CIFS] Do not attempt to close invalidated file handles

If a connection with open file handles has gone down
and come back up and reconnected without reopening
the file handle yet, do not attempt to send an SMB close
request for this handle in cifs_close.  We were
checking for the connection being invalid in cifs_close
but since the connection may have been reconnected
we also need to check whether the file handle
was marked invalid (otherwise we could close the
wrong file handle by accident).

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
16 years agoMIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.
Ralf Baechle [Mon, 3 Nov 2008 11:32:34 +0000 (11:32 +0000)]
MIPS: csrc-r4k: Fix declaration depending on the wrong CONFIG_ symbol.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: csrc-r4k: Fix spelling mistake.
Ralf Baechle [Mon, 3 Nov 2008 11:31:54 +0000 (11:31 +0000)]
MIPS: csrc-r4k: Fix spelling mistake.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: RB532: Provide functions for gpio configuration
Phil Sutter [Sat, 1 Nov 2008 14:13:21 +0000 (15:13 +0100)]
MIPS: RB532: Provide functions for gpio configuration

As gpiolib doesn't support pin multiplexing, it provides no way to
access the GPIOFUNC register. Also there is no support for setting
interrupt status and level. These functions provide access to them and
are needed by the CompactFlash driver.

Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: IP22: Make indy_sc_ops variable static
Dmitri Vorobiev [Fri, 31 Oct 2008 17:54:11 +0000 (19:54 +0200)]
MIPS: IP22: Make indy_sc_ops variable static

The indy_sc_ops variable in arch/mips/mm/sc-ip22.c is needlessly defined
global, and this patch makes it static.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
---

16 years agoMIPS: RB532: GPIO register offsets are relative to GPIOBASE
Florian Fainelli [Fri, 31 Oct 2008 13:24:29 +0000 (14:24 +0100)]
MIPS: RB532: GPIO register offsets are relative to GPIOBASE

This patch fixes the wrong use of GPIO register offsets
in devices.c. To avoid further problems, use gpio_get_value
to return the NAND status instead of our own expanded code.

Also define the zero offset of the alternate function register to allow
consistent access.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Phil Sutter <n0-1@freewrt.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agoMIPS: Malta: Fix include paths in malta-amon.c
David Daney [Sat, 20 Sep 2008 17:16:36 +0000 (10:16 -0700)]
MIPS: Malta: Fix include paths in malta-amon.c

On linux-queue, malta doesn't build after the include file relocation.
This should fix it.

There some occurrences of 'asm-mips' in the comments of quite a few
files, but this is the only place I found it in any code.

Signed-off-by: David Daney <ddaney@avtrex.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
16 years agopowerpc/ppc32: ftrace, dynamic ftrace to handle modules
Steven Rostedt [Sat, 15 Nov 2008 07:39:05 +0000 (02:39 -0500)]
powerpc/ppc32: ftrace, dynamic ftrace to handle modules

Impact: add ability to trace modules on 32 bit PowerPC

This patch performs the necessary trampoline calls to handle
modules with dynamic ftrace on 32 bit PowerPC.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
16 years agopowerpc/ppc64: ftrace, handle module trampolines for dyn ftrace
Steven Rostedt [Sat, 15 Nov 2008 04:47:03 +0000 (20:47 -0800)]
powerpc/ppc64: ftrace, handle module trampolines for dyn ftrace

Impact: Allow 64 bit PowerPC to trace modules with dynamic ftrace

This adds code to handle the PPC64 module trampolines, and allows for
PPC64 to use dynamic ftrace.

Thanks to Paul Mackerras for these updates:

  - fix the mod and rec->arch.mod NULL checks.
  - fix to is_bl_op compare.

Thanks to Milton Miller for:

  - finding the nasty race with using two nops, and recommending
    instead that I use a branch 8 forward.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
16 years agopowerpc: ftrace, use probe_kernel API to modify code
Steven Rostedt [Sat, 15 Nov 2008 00:21:20 +0000 (16:21 -0800)]
powerpc: ftrace, use probe_kernel API to modify code

Impact: use cleaner probe_kernel API over assembly

Using probe_kernel_read/write interface is a much cleaner approach
than the current assembly version.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
16 years agopowerpc: ftrace, convert to new dynamic ftrace arch API
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)]
powerpc: ftrace, convert to new dynamic ftrace arch API

Impact: update to PowerPC ftrace arch API

This patch converts PowerPC to use the new dynamic ftrace arch API.

Thanks to Paul Mackennas for pointing out the mistakes of my original
test_24bit_addr function.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
16 years agopowerpc: ftrace, do not latency trace idle
Steven Rostedt [Sat, 15 Nov 2008 00:21:19 +0000 (16:21 -0800)]
powerpc: ftrace, do not latency trace idle

Impact: fix for irq off latency tracer

When idle is called, interrupts are disabled, but the idle function
will still wake up on an interrupt. The problem is that the interrupt
disabled latency tracer will take this call to idle as a latency.

This patch disables the latency tracing when going into idle.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
16 years agonet: fix tiny output corruption of /proc/net/snmp6
Alexey Dobriyan [Thu, 20 Nov 2008 12:20:10 +0000 (04:20 -0800)]
net: fix tiny output corruption of /proc/net/snmp6

Because "name" is static, it can be occasionally be filled with
somewhat garbage if two processes read /proc/net/snmp6.

Also, remove useless casts and "-1" -- snprintf() correctly terminates it's
output.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoatl2: don't request irq on resume if netif running
Alan Jenkins [Thu, 20 Nov 2008 12:18:25 +0000 (04:18 -0800)]
atl2: don't request irq on resume if netif running

If the device is suspended with the cable disconnected, then
resumed with the cable connected, dev->open is called before
resume. During resume, we request an IRQ, but the IRQ was
already assigned during dev->open, resulting in the warning
shown below.

Don't request an IRQ if the device is running.

Call Trace:
 [<c011b89a>] warn_on_slowpath+0x40/0x59
 [<c023df15>] raw_pci_read+0x4d/0x55
 [<c023dff3>] pci_read+0x1c/0x21
 [<c01bcd81>] __pci_find_next_cap_ttl+0x44/0x70
 [<c01bce86>] __pci_find_next_cap+0x1a/0x1f
 [<c01bcef9>] pci_find_capability+0x28/0x2c
 [<c01c4144>] pci_msi_check_device+0x53/0x62
 [<c01c49c2>] pci_enable_msi+0x3a/0x1cd
 [<e019f17b>] atl2_write_phy_reg+0x40/0x5f [atl2]
 [<c01061b1>] dma_generic_alloc_coherent+0x0/0xd7
 [<e019f107>] atl2_request_irq+0x15/0x49 [atl2]
 [<e01a1481>] atl2_open+0x20b/0x297 [atl2]
 [<c024a35c>] dev_open+0x62/0x91
 [<c0248b9a>] dev_change_flags+0x93/0x141
 [<c024f308>] do_setlink+0x238/0x2d5
 [<c02501b2>] rtnl_setlink+0xa9/0xbf
 [<c0297f0c>] mutex_lock+0xb/0x19
 [<c024ffa7>] rtnl_dump_ifinfo+0x0/0x69
 [<c0250109>] rtnl_setlink+0x0/0xbf
 [<c024fe42>] rtnetlink_rcv_msg+0x185/0x19f
 [<c0240fd1>] sock_rmalloc+0x23/0x57
 [<c024fcbd>] rtnetlink_rcv_msg+0x0/0x19f
 [<c0259457>] netlink_rcv_skb+0x2d/0x71
 [<c024fcb7>] rtnetlink_rcv+0x14/0x1a
 [<c025929e>] netlink_unicast+0x184/0x1e4
 [<c025992a>] netlink_sendmsg+0x233/0x240
 [<c023f405>] sock_sendmsg+0xb7/0xd0
 [<c0129131>] autoremove_wake_function+0x0/0x2b
 [<c0129131>] autoremove_wake_function+0x0/0x2b
 [<c0147796>] mempool_alloc+0x2d/0x9e
 [<c020c923>] scsi_pool_alloc_command+0x35/0x4f
 [<c0297f0c>] mutex_lock+0xb/0x19
 [<c028e867>] unix_stream_recvmsg+0x357/0x3e2
 [<c01b81c9>] copy_from_user+0x23/0x4f
 [<c02452ea>] verify_iovec+0x3e/0x6c
 [<c023f5ab>] sys_sendmsg+0x18d/0x1f0
 [<c023ffa8>] sys_recvmsg+0x146/0x1c8
 [<c0240016>] sys_recvmsg+0x1b4/0x1c8
 [<c0118f48>] __wake_up+0xf/0x15
 [<c02586cd>] netlink_table_ungrab+0x17/0x19
 [<c01b83ba>] copy_to_user+0x25/0x3b
 [<c023fe4a>] move_addr_to_user+0x50/0x68
 [<c0240266>] sys_getsockname+0x6f/0x9a
 [<c0240280>] sys_getsockname+0x89/0x9a
 [<c015046a>] do_wp_page+0x3ae/0x41a
 [<c0151525>] handle_mm_fault+0x4c5/0x540
 [<c02405d0>] sys_socketcall+0x176/0x1b0
 [<c010376d>] sysenter_do_call+0x12/0x21

Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Signed-off-by: Jay Cliburn <jcliburn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoipv6: use seq_release_private for ip6mr.c /proc entries
Benjamin Thery [Thu, 20 Nov 2008 12:16:12 +0000 (04:16 -0800)]
ipv6: use seq_release_private for ip6mr.c /proc entries

In ip6mr.c, /proc entries /proc/net/ip6_mr_cache and /proc/net/ip6_mr_vif
are opened with seq_open_private(), thus seq_release_private() should be
used to release them.
Should fix a small memory leak.

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agopkt_sched: fix missing check for packet overrun in qdisc_dump_stab()
Patrick McHardy [Thu, 20 Nov 2008 12:07:14 +0000 (04:07 -0800)]
pkt_sched: fix missing check for packet overrun in qdisc_dump_stab()

nla_nest_start() might return NULL, causing a NULL pointer dereference.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Thu, 20 Nov 2008 12:01:29 +0000 (04:01 -0800)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6

16 years agosmc911x: Fix printf format typo in smc911x driver.
Vernon Sauder [Thu, 20 Nov 2008 09:56:08 +0000 (01:56 -0800)]
smc911x: Fix printf format typo in smc911x driver.

Signed-off-by: Vernon Sauder <VernonInHand@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agoasix: Fix asix-based cards connecting to 10/100Mbs LAN.
Pantelis Koukousoulas [Thu, 20 Nov 2008 09:48:46 +0000 (01:48 -0800)]
asix: Fix asix-based cards connecting to 10/100Mbs LAN.

Add AX_MEDIUM_ENCK also when speed = 10/100Mbps. This allows my belkin
f5d5055 to work with my 100Mbps switch and with an old 10Mbps ISA card.
Without this patch, the card is recognized and the interface is brought
up fine, but no packets actually flow through the interface.

Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com>
Acked-by: David Hollis <dhollis@davehollis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
16 years agomv643xx_eth: fix recycle check bound
Lennert Buytenhek [Tue, 18 Nov 2008 04:28:58 +0000 (04:28 +0000)]
mv643xx_eth: fix recycle check bound

When mv643xx_eth allocates skbuffs, it adds
'dma_get_cache_alignment() - 1' to the length it needs, so that it can
align the skb's ->data pointer to a cache boundary.  When checking
whether a transmitted skbuff can be reused as a receive buffer, these
bytes needs to be included into the minimum bound for the recycle check.

Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>