H. Peter Anvin [Thu, 25 Dec 2008 18:39:01 +0000 (10:39 -0800)]
x86: unify the implementation of FPU traps
On 32 bits, we may suffer IRQ 13, or supposedly we might have a buggy
implementation which gives spurious trap 16. We did not check for
this on 64 bits, but there is no reason we can't make the code the
same in both cases. Furthermore, this is presumably rare, so do the
spurious check last, instead of first.
and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.
H. Peter Anvin [Tue, 23 Dec 2008 18:10:40 +0000 (10:10 -0800)]
x86: PAT: fix address types in track_pfn_vma_new()
Impact: cleanup, fix warning
This warning:
arch/x86/mm/pat.c: In function track_pfn_vma_copy:
arch/x86/mm/pat.c:701: warning: passing argument 5 of follow_phys from incompatible pointer type
Triggers because physical addresses are resource_size_t, not u64.
This really matters when calling an interface like follow_phys() which
takes a pointer to a physical address -- although on x86, being
littleendian, it would generally work anyway as long as the memory region
wasn't completely uninitialized.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Kyle McMartin [Tue, 23 Dec 2008 13:44:30 +0000 (08:44 -0500)]
parisc: disable UP-optimized flush_tlb_mm
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.
Harry Ciao [Tue, 23 Dec 2008 21:57:16 +0000 (13:57 -0800)]
edac: fix edac core deadlock when removing a device
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure. Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed. This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier. The edac_poller will wake up sleepers when
it is found.
EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices. They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device. This is exactly
where edac_poller and rmmod would have a great chance to deadlock.
In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,
device_ctls_mutex would have already been grabbed by
edac_device_del_device(). So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.
edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex. Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.
Moreover, an edac_dev.work should check first if it is being removed. If
this is the case, then it should bail out immediately. Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed. The current edac_dev.op_state can
be used to serve this purpose.
The original deadlock problem and the solution have been witnessed and
tested on actual hardware. Without the solution, rmmod an edac driver
would result in below deadlock:
Evgeniy Polyakov [Tue, 23 Dec 2008 21:57:12 +0000 (13:57 -0800)]
w1: fix slave selection on big-endian systems
During test of the w1-gpio driver i found that in "w1.c:679
w1_slave_found()" the device id is converted to little-endian with
"cpu_to_le64()", but its not converted back to cpu format in "w1_io.c:293
w1_reset_select_slave()".
Based on a patch created by Andreas Hummel.
[akpm@linux-foundation.org: remove unneeded cast] Reported-by: Andreas Hummel <andi_hummel@gmx.de> Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
V4L/DVB (9920): em28xx: fix NULL pointer dereference in call to VIDIOC_INT_RESET command
Fix a NULL pointer dereference that would occur if the video decoder tied to
the em28xx supports the VIDIOC_INT_RESET call (for example: the cx25840 driver)
H. Peter Anvin [Tue, 23 Dec 2008 01:56:05 +0000 (17:56 -0800)]
x86: prioritize the FPU traps for the error code
In the case of multiple FPU errors, prioritize the error codes,
instead of returning __SI_FAULT, which ends up pushing a 0 as the
error code to userspace, a POSIX violation.
For i386, we will simply return if there are no errors at all; for
x86-64 this is probably a "can't happen" (and the code should be
unified), but for this patch, return __SI_FAULT|SI_KERNEL if this ever
happens.
Thomas Gleixner [Sat, 20 Dec 2008 20:27:34 +0000 (21:27 +0100)]
Null pointer deref with hrtimer_try_to_cancel()
Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW
commit 2d42244ae71d6c7b0884b5664cf2eda30fb2ae68 (clocksource:
introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
available to read out the raw not NTP adjusted system time.
The above commit did not prevent that a posix timer can be created
with that clockid. The timer_create() syscall succeeds and initializes
the timer to a non existing hrtimer base. When the timer is deleted
either by timer_delete() or by the exit() cleanup the kernel crashes.
Prevent the creation of timers for CLOCK_MONOTONIC_RAW by setting the
posix clock function to no_timer_create which returns an error code.
Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 20 Dec 2008 19:07:31 +0000 (11:07 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
fs/9p: change simple_strtol to simple_strtoul
9p: convert d_iname references to d_name.name
9p: Remove potentially bad parameter from function entry debug print.
Linus Torvalds [Sat, 20 Dec 2008 19:07:18 +0000 (11:07 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix resume (S2R) broken by Intel microcode module, on A110L
x86 gart: don't complain if no AMD GART found
AMD IOMMU: panic if completion wait loop fails
AMD IOMMU: set cmd buffer pointers to zero manually
x86: re-enable MCE on secondary CPUS after suspend/resume
AMD IOMMU: allocate rlookup_table with __GFP_ZERO
might hang upon resuming, OTOH it should have likely hanged each and every time.
(1) possible deadlock in microcode_resume_cpu() if either 'if' section is
taken;
(2) now, I don't see it in spec. and can't experimentally verify it (newer
ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
be back to its original one (we need to reload ucode anyway so it doesn't seem
logical if a cpu doesn't drop the version)... if so, the comparison with
memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
the aforementioned 'if' sections might have been triggered - leading to a
deadlock.
Obviously, in my tests I simulated loading/resuming with the ucode of the same
version (just to see that the file is loaded/re-loaded upon resuming) so this
issue has never popped up.
I'd appreciate if someone with an appropriate system might give a try to the
2nd patch (titled "fix a comparison && deadlock...").
In any case, the deadlock situation is a must-have fix.
x86: PAT: remove follow_pfnmap_pte in favor of follow_phys
Impact: Cleanup - removes a new function in favor of a recently modified older one.
Replace follow_pfnmap_pte in pat code with follow_phys. follow_phys lso
returns protection eliminating the need of pte_pgprot call. Using follow_phys
also eliminates the need for pte_pa.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
x86: PAT: modify follow_phys to return phys_addr prot and return value
Impact: Changes and globalizes an existing static interface.
Follow_phys does similar things as follow_pfnmap_pte. Make a minor change
to follow_phys so that it can be used in place of follow_pfnmap_pte.
Physical address return value with 0 as error return does not work in
follow_phys as the actual physical address 0 mapping may exist in pte.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It is used by x86 PAT code for performance reasons. Identifying pfnmap
as linear over entire vma helps speedup reserve and free of memtype
for the region.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Jaswinder Singh [Fri, 19 Dec 2008 17:03:52 +0000 (22:33 +0530)]
x86: common.c boot_cpu_stack and boot_exception_stacks should be static
Impact: cleanup, avoid sparse warnings, reduce kernel size a bit
Fixes these sparse warnings:
arch/x86/kernel/cpu/common.c:869:6: warning: symbol 'boot_cpu_stack' was not declared. Should it be static?
arch/x86/kernel/cpu/common.c:910:6: warning: symbol 'boot_exception_stacks' was not declared. Should it be static?
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] mpt fusion: clear list of outstanding commands on host reset
[SCSI] scsi_lib: only call scsi_unprep_request() under queue lock
[SCSI] ibmvstgt: move crq_queue_create to the end of initialization
[SCSI] libiscsi REGRESSION: fix passthrough support with older iscsi tools
[SCSI] aacraid: disable Dell Percraid quirk on Adaptec 2200S and 2120S
Linus Torvalds [Fri, 19 Dec 2008 19:36:04 +0000 (11:36 -0800)]
Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: GEM on PAE has problems - disable it for now.
drm/i915: Don't return busy for buffers left on the flushing list.
Stanley Miao [Fri, 19 Dec 2008 14:08:22 +0000 (22:08 +0800)]
ALSA: Fix a Oops bug in omap soc driver.
There will be a Oops or frequent underrun messages when playing music with
omap soc driver, this is because a data region is incorretly sized, other data
region will be overwriten when writing to this data region.
Signed-off-by: Stanley Miao <stanley.miao@windriver.com> Acked-by: Jarkko Nikula <jarkko.nikula@nokia.com> Cc: stable@kernel.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Fri, 19 Dec 2008 13:02:32 +0000 (14:02 +0100)]
ALSA: hda - Remove non-working headphone control for Dell laptops
The previous commit re-enabled hp_nid setup for IDT92HD73*, but
it's unneeded indeed for Dell laptops that have multiple headphones.
Setting the extra hp_nid results in a non-working "Headpohne" mixer
control. Thus hp_nid should be 0 for these dell models.
Also, the automatic addition of hp_nid should check whether it's
a dual-HP model or not. For dual-HPs, the pins are already checked
by the early workaround.
Wu Fengguang [Wed, 26 Nov 2008 06:35:22 +0000 (14:35 +0800)]
ACPI: don't cond_resched() when irqs_disabled()
The ACPI interpreter usually runs with irqs enabled.
However, during suspend/resume it runs with
irqs disabled to evaluate _GTS/_BFS, as well as
by irqrouter_resume() which evaluates _CRS, _PRS, _SRS.
http://bugzilla.kernel.org/show_bug.cgi?id=12252
Signed-off-by: Wu Fengguang <wfg@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Bjorn Helgaas [Thu, 13 Nov 2008 23:30:13 +0000 (17:30 -0600)]
ACPI: fix 2.6.28 acpi.debug_level regression
acpi_early_init() was changed to over-write the cmdline param,
making it really inconvenient to set debug flags at boot-time.
Also,
This sets the default level to "info", which is what all the ACPI
drivers use. So to enable messages from drivers, you only have to
supply the "layer" (a.k.a. "component"). For non-"info" ACPI core
and ACPI interpreter messages, you have to supply both level and
layer masks, as before.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com> Signed-off-by: Len Brown <len.brown@intel.com>
Impact: fix wrong cache sharing detection on platforms supporting > 8 bit apicid's
In the presence of extended topology eumeration leaf 0xb provided
by cpuid, 32bit extended initial_apicid in cpuinfo_x86 struct will be
updated by detect_extended_topology(). At this instance, we should also
reinit the apicid (which could also potentially be extended to 32bit).
With out this there will potentially be duplicate apicid's populated in the
per cpu's cpuinfo_x86 struct, resulting in wrong cache sharing topology etc
detected by init_intel_cacheinfo().
Takashi Iwai [Wed, 17 Dec 2008 13:51:01 +0000 (14:51 +0100)]
ALSA: hda - Add no-jd model for IDT 92HD73xx
Added the model without the jack-detection for some desktops that
have really no jack-detection. The recent driver caused regressions
regarding the sound output on such machines.
cciss: fix problem that deleting multiple logical drives could cause a panic
Fix problem that deleting multiple logical drives could cause a panic.
It fixes a panic which can be easily reproduced in the following way: Just
create several "arrays," each with multiple logical drives via hpacucli,
then delete the first array, and it will blow up in deregister_disk(), in
the call to get_host() when it tries to dig the hba pointer out of a NULL
queue pointer.
The problem has been present since my code to make rebuild_lun_table
behave better went in.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Dave Airlie [Fri, 19 Dec 2008 05:38:34 +0000 (15:38 +1000)]
drm/i915: GEM on PAE has problems - disable it for now.
On PAE systems, GEM allocates pages using shmem, and passes these
pages to be bound into AGP, however the AGP interfaces + the x86
set_memory interfaces all take unsigned long not dma_addr_t.
The initial fix for this was a mess, so we need to do this correctly
for 2.6.29.
Eric Anholt [Mon, 15 Dec 2008 03:05:04 +0000 (19:05 -0800)]
drm/i915: Don't return busy for buffers left on the flushing list.
These buffers don't have active rendering still occurring to them, they just
need either a flush to be emitted or a retire_requests to occur so that we
notice they're done. Return unbusy so that one of the two occurs. The two
expected consumers of this interface (OpenGL and libdrm_intel BO cache) both
want this behavior.
Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Keith Packard <keithp@keithp.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
NeilBrown [Fri, 19 Dec 2008 05:25:01 +0000 (16:25 +1100)]
md: Don't read past end of bitmap when reading bitmap.
When we read the write-intent-bitmap off the device, we currently
read a whole number of pages.
When PAGE_SIZE is 4K, this works due to the alignment we enforce
on the superblock and bitmap.
When PAGE_SIZE is 64K, this case read past the end-of-device
which causes an error.
When we write the superblock, we ensure to clip the last page
to just be the required size. Copy that code into the read path
to just read the required number of sectors.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: stable@kernel.org
James Chapman [Wed, 17 Dec 2008 12:02:16 +0000 (12:02 +0000)]
ppp: fix segfaults introduced by netdev_priv changes
This patch fixes a segfault in ppp_shutdown_interface() and
ppp_destroy_interface() when a PPP connection is closed. I bisected
the problem to the following commit:
netdevice ppp: Convert directly reference of netdev->priv
1. Use netdev_priv(dev) to replace dev->priv.
2. Alloc netdev's private data by alloc_netdev().
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The original ppp_generic code treated the netdev and struct ppp as
independent data structures which were freed separately. In moving the
ppp struct into the netdev, it is now possible for the private data to
be freed before the call to ppp_shutdown_interface(), which is bad.
The kfree(ppp) in ppp_destroy_interface() is also wrong; presumably
ppp hasn't worked since the above commit.
The following patch fixes both problems.
Signed-off-by: James Chapman <jchapman@katalix.com> Reviewed-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Fri, 19 Dec 2008 03:35:10 +0000 (19:35 -0800)]
net: Fix module refcount leak in kernel_accept()
The kernel_accept() does not hold the module refcount of newsock->ops->owner,
so we need __module_get(newsock->ops->owner) code after call kernel_accept()
by hand.
In sunrpc, the module refcount is missing to hold. So this cause kernel panic.
Used following script to reproduct:
while [ 1 ];
do
mount -t nfs4 192.168.0.19:/ /mnt
touch /mnt/file
umount /mnt
lsmod | grep ipv6
done
This patch fixed the problem by add __module_get(newsock->ops->owner) to
kernel_accept(). So we do not need to used __module_get(newsock->ops->owner)
in every place when used kernel_accept().
Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Fri, 19 Dec 2008 00:36:14 +0000 (01:36 +0100)]
x86: fix warning in arch/x86/kernel/microcode_amd.c
this warning:
arch/x86/kernel/microcode_amd.c: In function ‘apply_microcode_amd’:
arch/x86/kernel/microcode_amd.c:163: warning: cast from pointer to integer of different size
arch/x86/kernel/microcode_amd.c:163: warning: cast from pointer to integer of different size
triggers because we want to pass the address to the microcode MSR,
which is 64-bit even on 32-bit. Cast it explicitly to express this.
Jaswinder Singh [Thu, 18 Dec 2008 18:33:56 +0000 (00:03 +0530)]
x86: traps.c declare functions before they get used
Impact: cleanup
In asm/traps.h :-
do_double_fault : added under X86_64
sync_regs : added under X86_64
math_error : moved out from X86_32 as it is common for both 32 and 64 bit
math_emulate : moved from X86_32 as it is common for both 32 and 64 bit
smp_thermal_interrupt : added under X86_64
mce_threshold_interrupt : added under X86_64
x86: PAT: change pgprot_noncached to uc_minus instead of strong uc - v3
Impact: mm behavior change.
Make pgprot_noncached uc_minus instead of strong UC. This will make
pgprot_noncached to be in line with ioremap_nocache() and all the other
APIs that map page uc_minus on uc request.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
x86: PAT: implement track/untrack of pfnmap regions for x86 - v3
Impact: New mm functionality.
Hookup remap_pfn_range and vm_insert_pfn and corresponding copy and free
routines with reserve and free tracking.
reserve and free here only takes care of non RAM region mapping. For RAM
region, driver should use set_memory_[uc|wc|wb] to set the cache type and
then setup the mapping for user pte. We can bypass below
reserve/free in that case.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
x86: PAT: add follow_pfnmp_pte routine to help tracking pfnmap pages - v3
Impact: New currently unused interface.
Add a generic interface to follow pfn in a pfnmap vma range. This is used by
one of the subsequent x86 PAT related patch to keep track of memory types
for vma regions across vma copy and free.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
x86: PAT: store vm_pgoff for all linear_over_vma_region mappings - v3
Impact: Code transformation, new functions added should have no effect.
Drivers use mmap followed by pgprot_* and remap_pfn_range or vm_insert_pfn,
in order to export reserved memory to userspace. Currently, such mappings are
not tracked and hence not kept consistent with other mappings (/dev/mem,
pci resource, ioremap) for the sme memory, that may exist in the system.
The following patchset adds x86 PAT attribute tracking and untracking for
pfnmap related APIs.
First three patches in the patchset are changing the generic mm code to fit
in this tracking. Last four patches are x86 specific to make things work
with x86 PAT code. The patchset aso introduces pgprot_writecombine interface,
which gives writecombine mapping when enabled, falling back to
pgprot_noncached otherwise.
This patch:
While working on x86 PAT, we faced some hurdles with trackking
remap_pfn_range() regions, as we do not have any information to say
whether that PFNMAP mapping is linear for the entire vma range or
it is smaller granularity regions within the vma.
A simple solution to this is to use vm_pgoff as an indicator for
linear mapping over the vma region. Currently, remap_pfn_range
only sets vm_pgoff for COW mappings. Below patch changes the
logic and sets the vm_pgoff irrespective of COW. This will still not
be enough for the case where pfn is zero (vma region mapped to
physical address zero). But, for all the other cases, we can look at
pfnmap VMAs and say whether the mappng is for the entire vma region
or not.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
John McCutchan [Thu, 18 Dec 2008 01:43:02 +0000 (17:43 -0800)]
Maintainer email fixes for inotify
Update John McCutchan and Robert Love's email addresses for
maintenance of inotify
Signed-off-by: John McCutchan <john@johnmccutchan.com> Acked-by: Robert Love <rlove@rlove.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
changed CONFIG_RELOCATABLE from n to y, which may lead to a mismatch
between the vmlinux debug information and the runtime location of the
kernel, even when the bootloader does not relocate the kernel.
Revert the specific change. Works for me with GRUB and qemu.