Eric Dumazet [Tue, 13 Dec 2005 06:17:14 +0000 (22:17 -0800)]
[PATCH] x86_64: Bug correction in populate_memnodemap()
As reported by Keith Mannthey, there are problems in populate_memnodemap()
The bug was that the compute_hash_shift() was returning 31, with incorrect
initialization of memnodemap[]
To correct the bug, we must use (1UL << shift) instead of (1 << shift) to
avoid an integer overflow, and we must check that shift < 64 to avoid an
infinite loop.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
john stultz [Tue, 13 Dec 2005 06:17:13 +0000 (22:17 -0800)]
[PATCH] x86_64: Fix collision between pmtimer and pit/hpet
On systems that do not support the HPET legacy functions (basically the IBM
x460, but there could be others), in time_init() we accidentally fall into a
PM timer conditional and set the vxtime_hz value to the PM timer's frequency.
We then use this value with the HPET for timekeeping.
This patch (which mimics the behavior in time_init_gtod) corrects the
collision.
Andi Kleen [Tue, 13 Dec 2005 06:17:11 +0000 (22:17 -0800)]
[PATCH] i386/x86-64 Correct for broken MCFG tables on K8 systems
They report all busses as MMCONFIG capable, but it never works for the
internal devices in the CPU's builtin northbridge.
It just probes all func 0 devices on bus 0 (the internal northbridge is
currently always on bus 0) and if they are not accessible using MCFG they are
put into a special fallback bitmap.
On systems where it isn't we assume the BIOS vendor supplied correct MCFG.
Requires the earlier patch for mmconfig type1 fallback
Andi Kleen [Tue, 13 Dec 2005 06:17:10 +0000 (22:17 -0800)]
[PATCH] i386/x86-64 Fall back to type 1 access when no entry found
When there is no entry for a bus in MCFG fall back to type1. This is
especially important on K8 systems where always some devices can't be accessed
using mmconfig (in particular the builtin northbridge doesn't support it for
its own devices)
Andi Kleen [Tue, 13 Dec 2005 06:17:09 +0000 (22:17 -0800)]
[PATCH] i386/x86-64: Don't call change_page_attr with a spinlock held
It's illegal because it can sleep.
Use a two step lookup scheme instead. First look up the vm_struct, then
change the direct mapping, then finally unmap it. That's ok because nobody
can change the particular virtual address range as long as the vm_struct is
still in the global list.
Also added some LinuxDoc documentation to iounmap.
Shaohua Li [Tue, 13 Dec 2005 06:17:08 +0000 (22:17 -0800)]
[PATCH] i386/x86-64 disable LAPIC completely for offline CPU
Disabling LAPIC timer isn't sufficient. In some situations, such as we
enabled NMI watchdog, there is still unexpected interrupt (such as NMI)
invoked in offline CPU. This also avoids offline CPU receives spurious
interrupt and anything similar.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Andi Kleen <ak@suse.de> Acked-by: "Seth, Rohit" <rohit.seth@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dave Airlie [Tue, 13 Dec 2005 04:18:41 +0000 (04:18 +0000)]
[drm] fix radeon aperture issue
Ben noticed that on certain cards we've landed the AGP space on top of
the second aperture instead of after it.. Which messes things up a lot
on those machines.
This just moves the gart further out, a more correct fix is in the works
from Ben for after 2.6.15.
Signed-off-by: Dave Airlie <airlied@linux.ie> CC: Ben Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Get rid of warning in case of race with ring full and lockless
tx on the skge driver. It is possible to be in the transmit
routine with no available slots and already stopped.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Mark Lord [Tue, 13 Dec 2005 04:19:28 +0000 (23:19 -0500)]
[PATCH] libata-core.c: fix parameter bug on kunmap_atomic() calls
Fix incorrect pointer usage on two calls to kunmap_atomic().
This seems to happen a lot, because kunmap() wants the struct page *,
whereas kunmap_atomic() instead wants the mapped virtual address.
Signed-off-by: Mark Lord <liml@rtr.ca> Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
Linus Torvalds [Tue, 13 Dec 2005 00:24:33 +0000 (16:24 -0800)]
get_user_pages: don't try to follow PFNMAP pages
Nick Piggin points out that a few drivers play games with VM_IO (why?
who knows..) and thus a pfn-remapped area may not have that bit set even
if remap_pfn_range() set it originally.
So make it explicit in get_user_pages() that we don't follow VM_PFNMAP
pages, since pretty much by definition they do not have a "struct page"
associated with them.
Marcus Sundberg [Mon, 12 Dec 2005 23:02:48 +0000 (15:02 -0800)]
[NETFILTER]: ip_nat_tftp: Fix expectation NAT
When a TFTP client is SNATed so that the port is also changed, the
port is never changed back for the expected connection.
Signed-off-by: Marcus Sundberg <marcus@ingate.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Jackson [Mon, 12 Dec 2005 22:42:31 +0000 (14:42 -0800)]
[SPARC]: atomic_clear_mask build fix
This fixes one build error introduced in sparc with the patch of Oct 30,
resent Nov 4 "[patch 3/5] atomic: atomic_inc_not_zero" I still can't get
sparc to build, but at least it gets further after I remove this line.
Apparently, this change was agreed to by Andrew and Nick on Nov 14, but
everyone thought someone else was doing it.
Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Brian King [Mon, 12 Dec 2005 19:05:08 +0000 (13:05 -0600)]
[PATCH] Fix SCSI scanning slab corruption
There is a double free in the scsi scan code if a LLDD's slave_alloc()
call fails. There is a direct call to scsi_free_queue and then the
following put_device calls the release function, which also frees the
queue.
Remove the redundant scsi_free_queue.
Signed-off-by: Brian King <brking@us.ibm.com> Tested-by: Nathan Lynch <ntl@pobox.com>
[ Also removed some strange whitespace artifacts in that area ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Olaf Hering [Fri, 9 Dec 2005 18:12:10 +0000 (19:12 +0100)]
[PATCH] pcnet32: use MAC address from prom also on powerpc
The CSR contains garbage after a coldboot on RS/6000.
One some systems (like my 44p 270) the MAC address is all FF,
on others (like my B50) it is ff:ff:ff:fd:ff:6b.
It can eventually be fixed by loading pcnet32, set the interface
into the UP state, rmmod pcnet32 and load it again. But this worked
only on the 270.
Only netbooting after a cold start provides the correct MAC address
via prom and CSR. This makes it very unreliable.
I dont know why the MAC is stored in two different places. Remove
the special case for powerpc, which was added in early 2.4 development.
Signed-off-by: Olaf Hering <olh@suse.de>
drivers/net/pcnet32.c | 5 -----
1 files changed, 5 deletions(-) Signed-off-by: Jeff Garzik <jgarzik@pobox.com>
"All it's doing is deferring the device_put() from the
scsi_put_command() to after the scsi_run_queue(), which doesn't fix
the sleep while atomic problem of the device release method. In both
cases we still get the semaphore in atomic context problem which is
caused by scsi_reap_target() doing a device_del(), which I assumed
(wrongly) was valid from atomic context."
NeilBrown [Mon, 12 Dec 2005 10:39:17 +0000 (02:39 -0800)]
[PATCH] md: use correct size of raid5 stripe cache when measuring how full it is
The raid5 stripe cache was recently changed from fixed size (NR_STRIPES) to
variable size (conf->max_nr_stripes). However there are two places that still
use the constant and as a result, reducing the size of the stripe cache can
result in a deadlock.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Dave Jones [Mon, 12 Dec 2005 08:37:40 +0000 (00:37 -0800)]
[PATCH] ACPI: fix sleeping whilst atomic warnings on resume
This has been broken for months. On resume, we call acpi_pci_link_set()
with interrupts off, so we get a warning when we try to do a kmalloc of non
atomic memory. The actual allocation is just 2 long's (plus extra byte for
some reason I can't fathom), so a simple conversion to GFP_ATOMIC is
probably the safest way to fix this.
The error looks like this..
Debug: sleeping function called from invalid context at mm/slab.c:2486
in_atomic():0, irqs_disabled():1
[<c0143f6c>] kmem_cache_alloc+0x40/0x56
[<c0206a2e>] acpi_pci_link_set+0x3f/0x17f
[<c0206f96>] irqrouter_resume+0x1e/0x3c
[<c0239bca>] __sysdev_resume+0x11/0x6b
[<c0239e88>] sysdev_resume+0x34/0x52
[<c023de21>] device_power_up+0x5/0xa
Signed-off-by: Dave Jones <davej@redhat.com> Cc: "Brown, Len" <len.brown@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Haren Myneni [Mon, 12 Dec 2005 08:37:39 +0000 (00:37 -0800)]
[PATCH] fix in __alloc_bootmem_core() when there is no free page in first node's memory
Hitting BUG_ON() in __alloc_bootmem_core() when there is no free page
available in the first node's memory. For the case of kdump on PPC64
(Power 4 machine), the captured kernel is used two memory regions - memory
for TCE tables (tce-base and tce-size at top of RAM and reserved) and
captured kernel memory region (crashk_base and crashk_size). Since we
reserve the memory for the first node, we should be returning from
__alloc_bootmem_core() to search for the next node (pg_dat).
Currently, find_next_zero_bit() is returning the n^th bit (eidx) when there
is no free page. Then, test_bit() is failed since we set 0xff only for the
actual size initially (init_bootmem_core()) even though rounded up to one
page for bdata->node_bootmem_map. We are hitting the BUG_ON after failing
to enter second "for" loop.
Signed-off-by: Haren Myneni <haren@us.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Nicolas Pitre [Mon, 12 Dec 2005 08:37:36 +0000 (00:37 -0800)]
[PATCH] input: fix ucb1x00-ts breakage after conversion to dynamic input_dev allocation
The bd622663192e8ebebb27dc1d9397f352a82d2495 commit broke the UCB1x00
touchscreen driver since the idev structure was assumed to be into the ts
structure, simply casting the former to the later in a couple places.
This patch fixes those, and also cache the idev pointer between multiple
calls to input_report_abs() to avoid growing the compiled code needlessly.
Signed-off-by: Nicolas Pitre <nico@cam.org> Cc: Dmitry Torokhov <dtor_core@ameritech.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] kprobes: increment kprobe missed count for multiprobes
When multiple probes are registered at the same address and if due to some
recursion (probe getting triggered within a probe handler), we skip calling
pre_handlers and just increment nmissed field.
The below patch make sure it walks the list for multiple probes case.
Without the below patch we get incorrect results of nmissed count for
multiple probe case.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For Kprobes critical path is the path from debug break exception handler
till the control reaches kprobes exception code. No probes can be
supported in this path as we will end up in recursion.
This patch prevents this by moving the below function to safe __kprobes
section onto which no probes can be inserted.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matt Domsch [Mon, 12 Dec 2005 08:37:32 +0000 (00:37 -0800)]
[PATCH] ipmi: fix panic generator ID
The IPMI specifcation says the generator ID is 0x20, but that is for bits
7-1. Bit 0 is set to specify it is a software event. The correct value is
0x41. Without this fix, panic events written into the System Event Log
appear to come from an "unknown" generator, rather than from the kernel.
Signed-off-by: Jordan Hargrave <Jordan_Hargrave@dell.com> Signed-off-by: Matt Domsch <Matt_Domsch@dell.com> Acked-by: Corey Minyard <minyard@acm.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Hugh Dickins [Mon, 12 Dec 2005 08:37:23 +0000 (00:37 -0800)]
[PATCH] mips: setup_zero_pages count 1
Page count should be initialized to 1 on each of the MIPS empty zero pages,
to avoid a bad_page warning whenever one of them is freed from all mappings.
Pierre Ossman [Mon, 12 Dec 2005 08:37:22 +0000 (00:37 -0800)]
[PATCH] Add try_to_freeze to kauditd
kauditd was causing suspends to fail because it refused to freeze. Adding
a try_to_freeze() to its sleep loop solves the issue.
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Acked-by: Pavel Machek <pavel@suse.cz> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pekka J Enberg [Mon, 12 Dec 2005 08:37:16 +0000 (00:37 -0800)]
[PATCH] uml: fix compile error for tt
arch/um/kernel/tt/uaccess.c: In function `copy_from_user_tt':
arch/um/kernel/tt/uaccess.c:11: error: `FIXADDR_USER_START' undeclared (first use in this function)
arch/um/kernel/tt/uaccess.c:11: error: (Each undeclared identifier is reported only once
arch/um/kernel/tt/uaccess.c:11: error: for each function it appears in.)
I get the compile error when I disable CONFIG_MODE_SKAS.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Paolo Giarrusso <blaisorblade@yahoo.it> Cc: Jeff Dike <jdike@addtoit.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
John McCutchan [Mon, 12 Dec 2005 08:37:14 +0000 (00:37 -0800)]
[PATCH] inotify: add two inotify_add_watch flags
The below patch lets userspace have more control over the inodes that
inotify will watch. It introduces two new flags.
IN_ONLYDIR -- only watch the inode if it is a directory.
This is needed to avoid the race that can occur when we want to be
sure that we are watching a directory.
IN_DONT_FOLLOW -- don't follow a symlink. In combination
with IN_ONLYDIR we can make sure that we don't watch the target of
symlinks.
The issues the flags fix came up when writing the gnome-vfs inotify
backend. Default behaviour is unchanged.
Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Acked-by: Robert Love <rml@novell.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Jens Axboe [Mon, 12 Dec 2005 08:37:13 +0000 (00:37 -0800)]
[PATCH] cciss: double put_disk()
This undoes the put_disk patch I sent in before.
If I had been paying attention I would have seen that we call put_disk
from free_hba during driver unload. That's the only time we want to
call it. If it's called from deregister disk we may remove the
controller (cNd0) unintentionally.
Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
[PATCH] kprobes: fix race in aggregate kprobe registration
When registering multiple kprobes at the same address, we leave a small
window where the kprobe hlist will not contain a reference to the
registered kprobe, leading to potentially, a system crash if the breakpoint
is hit on another processor.
Patch below now automically relpace the old kprobe with the new
kprobe from the hash list.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matt Helsley [Mon, 12 Dec 2005 08:37:10 +0000 (00:37 -0800)]
[PATCH] Add timestamp field to process events
This adds a timestamp field to the events sent via the process event
connector. The timestamp allows listeners to accurately account the
duration(s) between a process' events and offers strong means with which
to determine the order of events with respect to a given task while also
avoiding the addition of per-task data.
This alters the size and layout of the event structure and hence would
break compatibility if process events connector as it stands in 2.6.15-rc2
were released as a mainline kernel.
Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Each has problems with combinations of SMP-safety, low resolution, and
monotonicity. This patch adds a new function that returns a monotonic SMP-safe
timestamp with nanosecond resolution where available.
Changes:
Split timestamp into separate patch
Moved to kernel/time.c
Renamed to getnstimestamp
Fixed unintended-pointer-arithmetic bug
Signed-off-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Gentoo users with kernels with selinux compiled in, and coreutils compiled
with acl support, noticed that they could not copy files on tmpfs using
'cp'.
cp (compiled with acl support) copies the file, lists the extended
attributes on the old file, copies them all to the new file, and then
exits. However the listxattr() calls were failing with this odd behaviour:
llistxattr("a.out", (nil), 0) = 17
llistxattr("a.out", 0x7fffff8c6cb0, 17) = -1 ERANGE (Numerical result out of
range)
I believe this is a simple problem in the logic used to check the buffer
sizes; if the user sends a buffer the exact size of the data, then its ok
:)
This change solves the problem.
More info can be found at http://bugs.gentoo.org/113138
Signed-off-by: Daniel Drake <dsd@gentoo.org> Acked-by: James Morris <jmorris@namei.org> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Accessing nohz_cpu_mask before incrementing rcp->cur is racy. It can cause
tickless idle CPUs to be included in rsp->cpumask, which will extend
graceperiods unnecessarily.
Fix this race. It has been tested using extensions to RCU torture module
that forces various CPUs to become idle.
While doing some test of RCU torture module, I hit a OOPS in rcu_do_batch,
which was trying to processes callback of a module that was just removed.
This is because we weren't waiting long enough for all callbacks to fire.
Dipankar Sarma [Mon, 12 Dec 2005 08:37:05 +0000 (00:37 -0800)]
[PATCH] add rcu_barrier() synchronization point
This introduces a new interface - rcu_barrier() which waits until all
the RCUs queued until this call have been completed.
Reiser4 needs this, because we do more than just freeing memory object
in our RCU callback: we also remove it from the list hanging off
super-block. This means, that before freeing reiser4-specific portion
of super-block (during umount) we have to wait until all pending RCU
callbacks are executed.
The only change of reiser4 made to the original patch, is exporting of
rcu_barrier().
Cc: Hans Reiser <reiser@namesys.com> Cc: Vladimir V. Saveliev <vs@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andreas Schwab [Mon, 12 Dec 2005 08:37:03 +0000 (00:37 -0800)]
[PATCH] KERNELRELEASE depends on CONFIG_LOCALVERSION
Sam Ravnborg <sam@ravnborg.org> writes:
> Author: Uwe Zeisberger <zeisberg@informatik.uni-freiburg.de>
>
> [PATCH] kbuild: make kernelrelease in unconfigured kernel prints an error
>
> Do not include .config for target kernelrelease
This is wrong. KERNELRELEASE depends on CONFIG_LOCALVERSION, thus you
need .config.
Signed-off-by: Andreas Schwab <schwab@suse.de> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Shaohua Li [Mon, 12 Dec 2005 08:37:02 +0000 (00:37 -0800)]
[PATCH] x86: fix NMI with CPU hotplug
With CPU hotplug enabled, NMI watchdog stoped working. It appears the
violation is the cpu_online check in nmi handler. local ACPI based NMI
watchdog is initialized before we set CPU online for APs. It's quite
possible a NMI is fired before we set CPU online, and that's what happens
here.
Signed-off-by: Shaohua Li <shaohua.li@intel.com> Acked-by: Zwane Mwaikambo <zwane@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Mao, Bibo [Mon, 12 Dec 2005 08:37:00 +0000 (00:37 -0800)]
[PATCH] Kprobes: Reference count the modules when probed on it
When a Kprobes are inserted/removed on a modules, the modules must be ref
counted so as not to allow to unload while probes are registered on that
module.
Without this patch, the probed module is free to unload, and when the
probing module unregister the probe, the kpobes code while trying to
replace the original instruction might crash.
Signed-off-by: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Mao Bibo <bibo.mao@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Keith Owens [Sat, 10 Dec 2005 03:24:28 +0000 (14:24 +1100)]
[IA64] Define an ia64 version of __raw_read_trylock
IA64 is using the generic version of __raw_read_trylock, which always
waits for the lock to be free instead of returning when the lock is in
use. Define an ia64 version of __raw_read_trylock which behaves
correctly, and drop the generic one.
Signed-off-by: Keith Owens <kaos@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Linus Torvalds [Mon, 12 Dec 2005 04:38:17 +0000 (20:38 -0800)]
Allow arbitrary read-only shared pfn-remapping too
The VM layer (for historical reasons) turns a read-only shared mmap into
a private-like mapping with the VM_MAYWRITE bit clear. Thus checking
just VM_SHARED isn't actually sufficient.
So use a trivial helper function for the cases where we wanted to inquire
if a mapping was COW-like or not.
Linus Torvalds [Mon, 12 Dec 2005 03:57:52 +0000 (19:57 -0800)]
Remove (at least temporarily) the "incomplete PFN mapping" support
With the previous commit, we can handle arbitrary shared re-mappings
even without this complexity, and since the only known private mappings
are for strange users of /dev/mem (which never create an incomplete one),
there seems to be no reason to support it.
Linus Torvalds [Mon, 12 Dec 2005 03:46:02 +0000 (19:46 -0800)]
Allow arbitrary shared PFNMAP's
A shared mapping doesn't cause COW-pages, so we don't need to worry
about the whole vm_pgoff logic to decide if a PFN-remapped page has
gone through COW or not.
This makes it possible to entirely avoid the special "partial remapping"
logic for the common case.
Johannes Berg [Sun, 11 Dec 2005 02:41:50 +0000 (18:41 -0800)]
[PATCH] ppc32: set smp_tb_synchronized on UP with SMP kernel
ppc32 kernel, when built with CONFIG_SMP and booted on a single CPU
machine, will not properly set smp_tb_synchronized, thus causing
gettimeofday() to not use the HW timebase and to be limited to jiffy
resolution. This, among others, causes unacceptable pauses when launching
X.org.
Signed-Off-By: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Using ptrace() functionality, run to main(), and start singlestepping.
Singlestep over \"BX LR\" instruction won\'t transfer the control back
to main, but run the code to completion.
This problems seems to be in the function get_branch_address() in
arch/arm/kernel/ptrace.c. The function doesn\'t seem to recognize BX
and BLX instructions as branches. BX and BLX instructions can be used
to convert from ARM to Thumb mode if the target address has the low
bit set. However, they are also perfectly legal in the ARM only mode.
Although other things in the kernel seem to indicate that only ARM
mode is accepted (and not Thumb), many compilers will generate BX
and BLX instructions even when generating ARM only code.
Signed-off-by: Nikola Valerjev <nikola@ghs.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
David Gibson [Fri, 9 Dec 2005 05:45:17 +0000 (16:45 +1100)]
[PATCH] powerpc: Fix SLB flushing path in hugepage
On ppc64, when opening a new hugepage region, we need to make sure any
old normal-page SLBs for the area are flushed on all CPUs. There was
a bug in this logic - after putting the new hugepage area masks into
the thread structure, we copied it into the paca (read by the SLB miss
handler) only on one CPU, not on all. This could cause incorrect SLB
entries to be loaded when a multithreaded program was running
simultaneously on several CPUs. This patch corrects the error,
copying the context information into the PACA on all CPUs using the mm
in question before flushing any existing SLB entries.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
David Gibson [Fri, 9 Dec 2005 03:20:52 +0000 (14:20 +1100)]
[PATCH] powerpc: Add missing icache flushes for hugepages
On most powerpc CPUs, the dcache and icache are not coherent so
between writing and executing a page, the caches must be flushed.
Userspace programs assume pages given to them by the kernel are icache
clean, so we must do this flush between the kernel clearing a page and
it being mapped into userspace for execute. We were not doing this
for hugepages, this patch corrects the situation.
We use the same lazy mechanism as we use for normal pages, delaying
the flush until userspace actually attempts to execute from the page
in question.
Tested on G5.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org>
Olof Johansson [Fri, 9 Dec 2005 01:40:17 +0000 (19:40 -0600)]
[PATCH] powerpc: Set cache info defaults
Cache info is setup by walking the device tree in initialize_cache_info().
However, icache_flush_range might be called before that, in
slb_initialize()->patch_slb_encoding, which modifies the load immediate
instructions used with SLB fault code.
Not only that, but depending on memory layout, we might take SLB faults
during unflatten_device_tree. So that fault will load an SLB entry that
might not contain the right LLP flags for the segment.
Either we can walk the flattened device tree to setup cache info, or
we can pick the known defaults that are known to work. Doing it in the
flattened device tree is hairier since we need to know the machine type
to know what property to look for, etc, etc.
For now, it's just easier to go with the defaults. Worst thing that
happens from it is that we might waste a few cycles doing too small
dcbst/icbi increments.
Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>
model_id fields of wf_smu_sys_all_params should match the model ID
they are supposed to represent (as commented). Fixes windfarm on some
iMac 8,1 models.
Signed-off-by: Michal Ostrowski <mostrows at watson ibm com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
[NET]: Fix NULL pointer deref in checksum debugging.
The problem I was seeing turned out to be that skb->dev is NULL when
the checksum is being completed in user context. This happens because
the reference to the device is dropped (to allow it to be released
when packets are in the queue).
Because skb->dev was NULL, the netdev_rx_csum_fault was panicing on
deref of dev->name. How about this?
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Some debug code wasn't properly removed from the initial 64k pages
patch, and while it's harmless, it's also slowing down significantly a
very hot code path, thus it should really be removed.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
The 64k pages patch changed the meaning of one argument passed to the
low level hash functions (from "large" it became "psize" or page size
index), but one of the call sites wasn't properly updated, causing
potential random weird problems with huge pages. This fixes it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Mike Kravetz [Wed, 7 Dec 2005 21:07:23 +0000 (13:07 -0800)]
[PATCH] powerpc/pseries: boot failures on numa if no memory on node
This bug exists in the current code and prevents machines from booting
with numa enabled if there is a node that does not contain memory.
Workaround is to boot with 'numa=off'. Looks like a simple typo.
Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
Jack Steiner [Tue, 6 Dec 2005 14:05:24 +0000 (08:05 -0600)]
[IA64-SGI] Fix SN PTC deadlock recovery
The patch that added support for a new platform chipset (shub2) broke
PTC deadlock recovery on older versions of the chipset. (PTCs are the
SN platform-specific method for doing a global TLB purge). This
patch fixes deadlock recovery so that it works on both the old & new
chipsets.
Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Robin Holt [Tue, 6 Dec 2005 02:02:31 +0000 (20:02 -0600)]
[IA64] Change SET_PERSONALITY to comply with comment in binfmt_elf.c.
We have a customer application which trips a bug. The problem arises
when a driver attempts to call do_munmap on an area which is mapped, but
because current->thread.task_size has been set to 0xC0000000, the call
to do_munmap fails thinking it is an unmap beyond the user's address
space.
The comment in fs/binfmt_elf.c in load_elf_library() before the call
to SET_PERSONALITY() indicates that task_size must not be changed for
the running application until flush_thread, but is for ia64 executing
ia32 binaries.
This patch moves the setting of task_size from SET_PERSONALITY() to
flush_thread() as indicated. The customer application no longer is able
to trip the bug.
Signed-off-by: Robin Holt <holt@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Jack Steiner [Mon, 5 Dec 2005 19:56:50 +0000 (13:56 -0600)]
[IA64] Limit the maximum NODEDATA_ALIGN() offset
The per-node data structures are allocated with strided offsets that are a
function of the node number. This prevents excessive cache-aliasing from
occurring.
On systems with a large number of nodes, the strided offset becomes
too large. This patch restricts the maximum offset to 32MB. This is far larger
than the size of any current L3 cache.
Signed-off-by: Jack Steiner <steiner@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
John Keller [Tue, 29 Nov 2005 22:36:32 +0000 (16:36 -0600)]
[IA64-SGI] altix: pci_window fixup
Altix only patch to add fixup code that sets up
pci_controller->window. This code is a temporary
fix until ACPI support on Altix is added.
Also, corrects the usage of pci_dev->sysdata,
which had previously been used to reference
platform specific device info, to now point to
a pci_controller struct.
Signed-off-by: John Keller <jpk@sgi.com> Signed-off-by: Tony Luck <tony.luck@intel.com>