Hannes Reinecke [Wed, 18 Feb 2009 09:30:15 +0000 (10:30 +0100)]
block: fix deadlock in blk_abort_queue() for drivers that readd to timeout list
blk_abort_queue() iterates the timeout list and aborts each request on the
list, but if the driver error handling readds a request to the timeout list
during this processing, we could be looping forever. Fix this by splicing
current entries to a local list and run over that list instead.
broke a particular case for booting from partitioned md/raid devices.
That is the second time this has been broken recently. The previous
time was fixed by
Because the data isn't available when an md device is first created
(we add disks and set it up after creation), the initial partition
scan finds nothing. It is not until the device is opened that
another partition scan happens and finds something.
So at the point where the kernel parameter "root=/dev/md_d0p1" is
being parsed, md_d0 exists, but md_d0p1 does not.
However if we let blk_lookup_devt return the correct device number
even though the device doesn't exist, then the attempt to mount it
will successfully find the partition.
I have tried in the past to find a way to get the partition table to
be read as soon as the array is assembled but that proved impossible
(at the time). I don't remember the details, and could possibly
revisit it. However it would be really nice if blk_lookup_devt
could be adjusted to again accept non existant partitions.
The above commit added WRITE_SYNC and switched various places to using
that for committing writes that will be waited upon immediately after
submission. However, this causes a performance regression with AS and CFQ
for ext3 at least, since sync_dirty_buffer() will submit some writes with
WRITE_SYNC while ext3 has sumitted others dependent writes without the sync
flag set. This causes excessive anticipation/idling in the IO scheduler
because sync and async writes get interleaved, causing a big performance
regression for the below test case (which is meant to simulate sqlite
like behaviour).
Chip Coldwell [Mon, 16 Feb 2009 12:11:56 +0000 (13:11 +0100)]
cciss: PCI power management reset for kexec
The kexec kernel resets the CCISS hardware in three steps:
1. Use PCI power management states to reset the controller in the
kexec kernel.
2. Clear the MSI/MSI-X bits in PCI configuration space so that MSI
initialization in the kexec kernel doesn't fail.
3. Use the CCISS "No-op" message to determine when the controller
firmware has recovered from the PCI PM reset.
[akpm@linux-foundation.org: cleanups] Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Jens Axboe [Mon, 16 Feb 2009 09:25:40 +0000 (10:25 +0100)]
block: fix bad definition of BIO_RW_SYNC
We can't OR shift values, so get rid of BIO_RW_SYNC and use BIO_RW_SYNCIO
and BIO_RW_UNPLUG explicitly. This brings back the behaviour from before 213d9417fec62ef4c3675621b9364a667954d4dd.
Boaz Harrosh [Tue, 3 Feb 2009 06:47:29 +0000 (07:47 +0100)]
bsg: Fix sense buffer bug in SG_IO
When submitting requests via SG_IO, which does a sync io, a
bsg_command is not allocated. So an in-Kernel sense_buffer was not
set. However when calling blk_execute_rq() with no sense buffer
one is provided from the stack. Now bsg at blk_complete_sgv4_hdr_rq()
would check if rq->sense_len and a sense was requested by sg_io_v4
the rq->sense was copy_user() back, but by now it is already mangled
stack memory.
I have fixed that by forcing a sense_buffer when calling bsg_map_hdr().
The bsg_command->sense is provided in the write/read path like before,
and on-the-stack buffer is provided when doing SG_IO.
I have also fixed a dprintk message to print rq->errors in hex because
of the scsi bit-field use of this member. For other block devices it
does not matter anyway.
USB/PCI: Fix resume breakage of controllers behind cardbus bridges
If a USB PCI controller is behind a cardbus bridge, we are trying to
restore its configuration registers too early, before the cardbus
bridge is operational. To fix this, call pci_restore_state() from
usb_hcd_pci_resume() and remove usb_hcd_pci_resume_early() which is
no longer necessary (the configuration spaces of USB controllers that
are not behind cardbus bridges will be restored by the PCI PM core
with interrupts disabled anyway).
This patch fixes the regression from 2.6.28 tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12659
[ Side note: the proper long-term fix is probably to just force the
unplug event at suspend time instead of doing a plug/unplug at resume
time, but this patch is fine regardless - Linus ]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Miles Lane <miles.lane@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 17 Feb 2009 22:29:15 +0000 (14:29 -0800)]
Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
doc: mmiotrace.txt, buffer size control change
trace: mmiotrace to the tracer menu in Kconfig
mmiotrace: count events lost due to not recording
Linus Torvalds [Tue, 17 Feb 2009 22:27:39 +0000 (14:27 -0800)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, vm86: fix preemption bug
x86, olpc: fix model detection without OFW
x86, hpet: fix for LS21 + HPET = boot hang
x86: CPA avoid repeated lazy mmu flush
x86: warn if arch_flush_lazy_mmu_cpu is called in preemptible context
x86/paravirt: make arch_flush_lazy_mmu/cpu disable preemption
x86, pat: fix warn_on_once() while mapping 0-1MB range with /dev/mem
x86/cpa: make sure cpa is safe to call in lazy mmu mode
x86, ptrace, mm: fix double-free on race
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: hold trans_mutex when using btrfs_record_root_in_trans
Btrfs: make a lockdep class for the extent buffer locks
Btrfs: fs/btrfs/volumes.c: remove useless kzalloc
Btrfs: remove unused code in split_state()
Btrfs: remove btrfs_init_path
Btrfs: balance_level checks !child after access
Btrfs: Avoid using __GFP_HIGHMEM with slab allocator
Btrfs: don't clean old snapshots on sync(1)
Btrfs: use larger metadata clusters in ssd mode
Btrfs: process mount options on mount -o remount,
Btrfs: make sure all pending extent operations are complete
Linus Torvalds [Tue, 17 Feb 2009 22:16:02 +0000 (14:16 -0800)]
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
sata_nv: give up hardreset on nf2
libata-sff: fix 32-bit PIO ATAPI regression
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes:
kbuild: create the source symlink earlier in the objdir
scripts: add x86 64 bit support to the markup_oops.pl script
scripts: add x86 register parser to markup_oops.pl
kbuild: add sys_* entries for syscalls in tags
kbuild: fix tags generation of config symbols
bootgraph: fix for use with dot symbols
kbuild: add vmlinux to kernel rpm
kbuild,setlocalversion: shorten the make time when using svn
Linus Torvalds [Tue, 17 Feb 2009 22:08:26 +0000 (14:08 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: fix bus endianity in file2alias
HID: move tmff and zpff devices from ignore_list to blacklist
HID: unlock properly on error paths in hidraw_ioctl()
HID: blacklist Powercom USB UPS
Linus Torvalds [Tue, 17 Feb 2009 22:08:03 +0000 (14:08 -0800)]
Merge branch 'for-linus' of git://git.o-hand.com/linux-mfd
* 'for-linus' of git://git.o-hand.com/linux-mfd:
mfd: Fix sm501_register_gpio section mismatch
mfd: fix sm501 section mismatches
mfd: terminate pcf50633 i2c_device_id list
mfd: Ensure all WM8350 IRQs are masked at startup
mfd: fix htc-egpio iomem resource handling using resource_size
mfd: Fix TWL4030 build on some ARM variants
mfd: wm8350 tries reaches -1
mfd: Mark WM835x USB_SLV_500MA bit as accessible
mfd: Improve diagnostics for WM8350 ID register probe
mfd: Initialise WM8350 interrupts earlier
mfd: Fix egpio kzalloc return test
Linus Torvalds [Tue, 17 Feb 2009 22:05:05 +0000 (14:05 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling
ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages
ext4: Initialize preallocation list_head's properly
ext4: Fix lockdep warning
ext4: Fix to read empty directory blocks correctly in 64k
jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()
Revert "ext4: wait on all pending commits in ext4_sync_fs()"
jbd2: Fix return value of jbd2_journal_start_commit()
David Woodhouse [Fri, 13 Feb 2009 23:18:03 +0000 (23:18 +0000)]
Fix Intel IOMMU write-buffer flushing
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.
Override that bit on the affected chipsets, and everything is happy
again.
Thanks to Chris and Bhavesh and others for helping to debug.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Tested-by: Chris Wright <chrisw@sous-sol.org> Reviewed-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 16 Feb 2009 02:38:12 +0000 (02:38 +0000)]
Fix incomplete __mntput locking
Getting this wrong caused
WARNING: at fs/namespace.c:636 mntput_no_expire+0xac/0xf2()
due to optimistically checking cpu_writer->mnt outside the spinlock.
Here's what we really want:
* we know that nobody will set cpu_writer->mnt to mnt from now on
* all changes to that sucker are done under cpu_writer->lock
* we want the laziest equivalent of
spin_lock(&cpu_writer->lock);
if (likely(cpu_writer->mnt != mnt)) {
spin_unlock(&cpu_writer->lock);
continue;
}
/* do stuff */
that would make sure we won't miss earlier setting of ->mnt done by
another CPU.
Anyway, for now we just move the spin_lock() earlier and move the test
into the properly locked region.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Reported-and-tested-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hans de Goede [Tue, 17 Feb 2009 18:59:54 +0000 (19:59 +0100)]
hwmon: Fix ACPI resource check error handling
This patch fixes a number of cases where things were not properly
cleaned up when acpi_check_resource_conflict() returned an error,
causing oopses such as the one reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=483208
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
The video_ioctl2 conversion of ivtv in kernel 2.6.27 introduced a bug
causing decoder commands to crash. The decoder commands should have been
handled from the video_ioctl2 default handler, ensuring correct mapping
of the argument between user and kernel space. Unfortunately they ended
up before the video_ioctl2 call, causing random crashes.
Thanks to hannes@linus.priv.at for testing and helping me track down the
cause!
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Adam Baker [Wed, 4 Feb 2009 18:33:21 +0000 (15:33 -0300)]
V4L/DVB (10619): gspca - main: Destroy the URBs at disconnection time.
If a device using the gspca framework is unplugged while it is still streaming
then the call that is used to free the URBs that have been allocated occurs
after the pointer it uses becomes invalid at the end of gspca_disconnect.
Make another cleanup call in gspca_disconnect while the pointer is still
valid (multiple calls are OK as destroy_urbs checks for pointers already
being NULL.
This change set is wrong. The affected functions cannot be called from
an interrupt context, because they may process large buffers. In this
case, interrupts are disabled for a long time. Functions, like
dvb_dmx_swfilter_packets(), could be called only from a tasklet.
This change set does hide some strong design bugs in dm1105.c and
au0828-dvb.c.
Please revert this change set and do fix the bugs in dm1105.c and
au0828-dvb.c (and other files).
On Sun, 15 Feb 2009, Oliver Endriss wrote:
This changeset _must_ be reverted! It breaks all kernels since 2.6.27
for applications which use DVB and require a low interrupt latency.
It is a very bad idea to call the demuxer to process data buffers with
interrupts disabled!
On Mon, 16 Feb 2009, Trent Piepho wrote:
I agree, this is bad. The demuxer is far too much work to be done with
IRQs off. IMHO, even doing it under a spin-lock is excessive. It should
be a mutex. Drivers should use a work-queue to feed the demuxer.
Thank you for testing this changeset and discovering the issues on it.
Cc: Trent Piepho <xyzzy@speakeasy.org> Cc: Hartmut <e9hack@googlemail.com> Cc: Oliver Endriss <o.endriss@gmx.de> Cc: Andreas Oberritter <obi@linuxtv.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Tobias Lorenz [Thu, 12 Feb 2009 17:56:19 +0000 (14:56 -0300)]
V4L/DVB (10533): fix LED status output
This patch closes one of my todos that was since long on my list.
Some people reported clicks and glitches in the audio stream,
correlated to the LED color changing cycle.
Thanks to Rick Bronson <rick@efn.org>.
As reported by David Engel <david@istwok.net>, ATSC115 doesn't work
fine with mythtv. This software opens both analog and dvb interfaces of
saa7134.
What happens is that some tuner commands are going to the wrong place,
as shown at the logs:
Feb 12 20:37:48 opus kernel: tuner-simple 1-0061: using tuner params #0 (ntsc)
Feb 12 20:37:48 opus kernel: tuner-simple 1-0061: freq = 67.25 (1076), range = 0, config = 0xce, cb = 0x01
Feb 12 20:37:48 opus kernel: tuner-simple 1-0061: Freq= 67.25 MHz, V_IF=45.75 MHz, Offset=0.00 MHz, div=1808
Feb 12 20:37:48 opus kernel: tuner 1-0061: tv freq set to 67.25
Feb 12 20:37:48 opus kernel: tuner-simple 1-000a: using tuner params #0 (ntsc)
Feb 12 20:37:48 opus kernel: tuner-simple 1-000a: freq = 67.25 (1076), range = 0, config = 0xce, cb = 0x01
Feb 12 20:37:48 opus kernel: tuner-simple 1-000a: Freq= 67.25 MHz, V_IF=45.75 MHz, Offset=0.00 MHz, div=1808
Feb 12 20:37:48 opus kernel: tuner-simple 1-000a: tv 0x07 0x10 0xce 0x01
Feb 12 20:37:48 opus kernel: tuner-simple 1-0061: tv 0x07 0x10 0xce 0x01
This happens due to a hack at TUV1236D analog setup, where it replaces
tuner address, at 0x61 for 0x0a, in order to save a few memory bytes.
The code assumes that nobody else would try to access the tuner during
that setup, but the point is that there's no lock to protect such
access. So, this opens the possibility of race conditions to happen.
Instead of hacking tuner address, this patch uses a temporary var with
the proper tuner value to be used during the setup. This should save
the issue, although we should consider to write some analog/digital
lock at saa7134 driver.
Anssi Hannula [Sat, 14 Feb 2009 09:45:05 +0000 (11:45 +0200)]
HID: move tmff and zpff devices from ignore_list to blacklist
The devices handled by hid-tmff and hid-zpff were added in the
hid_ignore_list[] instead of hid_blacklist[] in hid-core.c, thus
disabling them completely.
hid_ignore_list[] causes hid layer to skip the device, while
hid_blacklist[] indicates there is a specific driver in hid bus.
Re-enable the devices by moving them to the correct list.
Michael Tokarev [Sun, 1 Feb 2009 15:11:04 +0000 (16:11 +0100)]
HID: blacklist Powercom USB UPS
For quite some time users with various UPSes from Powercom were forced to play
magic with bind/unbind in /sys in order to be able to see the UPSes. The
beasts does not work as HID devices, even if claims to do so. cypress_m8
driver works with the devices instead, creating a normal serial port with which
normal UPS controlling software works.
The manufacturer confirmed the upcoming models with proper HID support will
have different device IDs. In any way, it's wrong to have two completely
different modules for one device in kernel.
Blacklist the device in HID (add it to hid_ignore_list) to stop this mess,
finally.
Signed-off-By: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Herbert Xu [Tue, 17 Feb 2009 12:00:11 +0000 (20:00 +0800)]
crypto: lrw - Fix big endian support
It turns out that LRW has never worked properly on big endian.
This was never discussed because nobody actually used it that
way. In fact, it was only discovered when Geert Uytterhoeven
loaded it through tcrypt which failed the test on it.
The fix is straightforward, on big endian the to find the nth
bit we should be grouping them by words instead of bytes. So
setbit128_bbe should xor with 128 - BITS_PER_LONG instead of
128 - BITS_PER_BYTE == 0x78.
Tested-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rakib Mullick [Tue, 17 Feb 2009 08:21:52 +0000 (09:21 +0100)]
mfd: Fix sm501_register_gpio section mismatch
WARNING: drivers/mfd/built-in.o(.text+0x1706): Section mismatch in
reference from the function sm501_register_gpio() to the function
.devinit.text:sm501_gpio_register_chip()
The function sm501_register_gpio() references
the function __devinit sm501_gpio_register_chip().
This is often because sm501_register_gpio lacks a __devinit
annotation or the annotation of sm501_gpio_register_chip is wrong.
Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Mark Brown [Wed, 4 Feb 2009 20:26:07 +0000 (21:26 +0100)]
mfd: Fix TWL4030 build on some ARM variants
Many ARM platforms do not provide a mach/cpu.h so rather than guarding
the use of that header with CONFIG_ARM guard it with the guards used
when testing for the OMAP variants in the body of the code.
Signed-off-by: Mark Brown <broonie@sirena.org.uk> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Roel Kluin [Wed, 4 Feb 2009 20:23:22 +0000 (21:23 +0100)]
mfd: wm8350 tries reaches -1
With a postfix decrement tries will reach -1 rather than 0,
so the warning will not be issued even upon timeout.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Mark Brown [Wed, 4 Feb 2009 20:09:38 +0000 (21:09 +0100)]
mfd: Improve diagnostics for WM8350 ID register probe
Check the return value of the device I/O functions when reading the
ID registers so we can provide a more useful diagnostic when we're
having trouble talking to the device.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Mark Brown [Wed, 4 Feb 2009 19:49:52 +0000 (20:49 +0100)]
mfd: Initialise WM8350 interrupts earlier
Ensure that the interrupt handling is configured before we do platform
specific init. This allows the platform specific initialisation to
configure things which use interrupts safely.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
Sergei Shtylyov [Sun, 15 Feb 2009 19:24:24 +0000 (23:24 +0400)]
libata-sff: fix 32-bit PIO ATAPI regression
Commit 871af1210f13966ab911ed2166e4ab2ce775b99d (libata: Add 32bit
PIO support) has caused all kinds of errors on the ATAPI devices, so
it has been empirically proven that one shouldn't try to read/write
an extra data word when a device is not expecting it already. "Don't
do it then"; however, still use a chance to do 32-bit read/write one
last time when there are exactly 3 trailing bytes.
Oh, and stop pointlessly swapping the bytes to and fro on big-endian
machines by using io*_rep() accessors which shouldn't byte-swap.
This patch should fix the kernel.org bug #12609.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Impact: fix powernow-k8 when acpi=off (or other error).
There was a spurious change introduced into powernow-k8 in this patch:
so that we try to "restore" the cpus_allowed we never saved. We revert
that file.
See lkml "[PATCH] x86/powernow: fix cpus_allowed brokage when
acpi=off" from Yinghai for the bug report.
Cc: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Acked-by: Ingo Molnar <mingo@elte.hu>
Dan Carpenter [Mon, 16 Feb 2009 01:02:19 +0000 (20:02 -0500)]
ext4: Fix NULL dereference in ext4_ext_migrate()'s error handling
This was found through a code checker (http://repo.or.cz/w/smatch.git/).
It looks like you might be able to trigger the error by trying to migrate
a readonly file system.
Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Pekka Paalanen [Sat, 3 Jan 2009 19:09:27 +0000 (21:09 +0200)]
doc: mmiotrace.txt, buffer size control change
Impact: prevents confusing the user when buffer size is inadequate
The tracing framework offers a resizeable buffer, which mmiotrace uses
to record events. If the buffer is full, the following events will be
lost. Events should not be lost, so the documentation instructs the user
to increase the buffer size. The buffer size is set via a debugfs file.
Mmiotrace documentation was not updated the same time the debugfs file
was changed. The old file was tracing/trace_entries and first contained
the number of entries the buffer had space for, per cpu. Nowadays this
file is replaced with the file tracing/buffer_size_kb, which tells the
amount of memory reserved for the buffer, per cpu, in kilobytes.
Previously, a flag had to be toggled via the debugfs file
tracing/tracing_enabled when the buffer size was changed. This is no
longer necessary.
The mmiotrace documentation is updated to reflect the current state of
the tracing framework.
Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Paalanen [Sat, 3 Jan 2009 19:23:51 +0000 (21:23 +0200)]
trace: mmiotrace to the tracer menu in Kconfig
Impact: cosmetic change in Kconfig menu layout
This patch was originally suggested by Peter Zijlstra, but seems it
was forgotten.
CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable
directly under the Kernel hacking / debugging menu in the kernel
configuration system. They were present only for x86 and x86_64.
Other tracers that use the ftrace tracing framework are in their own
sub-menu. This patch moves the mmiotrace configuration options there.
Since the Kconfig file, where the tracer menu is, is not architecture
specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by
x86/x86_64. CONFIG_MMIOTRACE now depends on it.
Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Pekka Paalanen [Tue, 6 Jan 2009 11:57:11 +0000 (13:57 +0200)]
mmiotrace: count events lost due to not recording
Impact: enhances lost events counting in mmiotrace
The tracing framework, or the ring buffer facility it uses, has a switch
to stop recording data. When recording is off, the trace events will be
lost. The framework does not count these, so mmiotrace has to count them
itself.
Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Arjan van de Ven [Sun, 15 Feb 2009 10:30:55 +0000 (11:30 +0100)]
scripts: add x86 64 bit support to the markup_oops.pl script
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Arjan van de Ven [Sun, 15 Feb 2009 10:30:52 +0000 (11:30 +0100)]
scripts: add x86 register parser to markup_oops.pl
An oops dump also contains the register values.
This patch parses these for (32 bit) x86, and then annotates the
disassembly with these values; this helps in analysis of the oops by the
developer, for example, NULL pointer or other pointer bugs show up clearly
this way.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Rabin Vincent [Sun, 25 Jan 2009 13:09:12 +0000 (18:39 +0530)]
kbuild: add sys_* entries for syscalls in tags
Currently, it is no longer possible to use the tags file to jump to
system call function definitions with sys_foo, because the definitions
are obscured by use of the SYSCALL_DEFINE* macros.
This patch adds the appropriate option to ctags to make it see through
the macro. Also, it adds the ENTRY() work already done for Exuberant
to Emacs too.
Signed-off-by: Rabin Vincent <rabin@rab.in> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Michael Neuling [Sun, 15 Feb 2009 09:20:30 +0000 (10:20 +0100)]
bootgraph: fix for use with dot symbols
powerpc has dot symbols, so the dmesg output looks like:
<4>[ 0.327310] calling .migration_init+0x0/0x9c @ 1
<4>[ 0.327595] initcall .migration_init+0x0/0x9c returned 1 after 0 usecs
The below fixes bootgraph.pl so it handles this correctly.
Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Josh Hunt [Thu, 12 Feb 2009 05:10:57 +0000 (21:10 -0800)]
kbuild: add vmlinux to kernel rpm
We are building an automated system to test kernels weekly and need to
provide an rpm to our QA dept. We would like to use the ability to create
kernel rpms already in the kernel's Makefile, but need the vmlinux file
included in the rpm for later debugging.
This patch adds a compressed vmlinux to the kernel rpm when doing a
make rpm-pkg or binrpm-pkg and upon install places the vmlinux file in /boot.
Signed-off-by: Josh Hunt <josh@scalex86.org> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Thomas Gleixner [Tue, 13 Jan 2009 22:36:34 +0000 (23:36 +0100)]
x86, vm86: fix preemption bug
Commit 3d2a71a596bd9c761c8487a2178e95f8a61da083 ("x86, traps: converge
do_debug handlers") changed the preemption disable logic of do_debug()
so vm86_handle_trap() is called with preemption disabled resulting in:
BUG: sleeping function called from invalid context at include/linux/kernel.h:155
in_atomic(): 1, irqs_disabled(): 0, pid: 3005, name: dosemu.bin
Pid: 3005, comm: dosemu.bin Tainted: G W 2.6.29-rc1 #51
Call Trace:
[<c050d669>] copy_to_user+0x33/0x108
[<c04181f4>] save_v86_state+0x65/0x149
[<c0418531>] handle_vm86_trap+0x20/0x8f
[<c064e345>] do_debug+0x15b/0x1a4
[<c064df1f>] debug_stack_correct+0x27/0x2c
[<c040365b>] sysenter_do_call+0x12/0x2f
BUG: scheduling while atomic: dosemu.bin/3005/0x10000001
Restore the original calling convention and reenable preemption before
calling handle_vm86_trap().
Reported-by: Michal Suchanek <hramrach@centrum.cz> Cc: stable@kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
kvm->slots_lock is outer to kvm->lock, so take slots_lock
in kvm_vm_ioctl_assign_device() before taking kvm->lock,
rather than taking it in kvm_iommu_map_memslots().
Cc: stable@kernel.org Signed-off-by: Mark McLoughlin <markmc@redhat.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Wed, 21 Jan 2009 08:52:16 +0000 (16:52 +0800)]
KVM: MMU: Map device MMIO as UC in EPT
Software are not allow to access device MMIO using cacheable memory type, the
patch limit MMIO region with UC and WC(guest can select WC using PAT and
PCD/PWT).
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 6 Jan 2009 02:03:03 +0000 (10:03 +0800)]
KVM: Fix racy in kvm_free_assigned_irq
In the past, kvm_get_kvm() and kvm_put_kvm() was called in assigned device irq
handler and interrupt_work, in order to prevent cancel_work_sync() in
kvm_free_assigned_irq got a illegal state when waiting for interrupt_work done.
But it's tricky and still got two problems:
1. A bug ignored two conditions that cancel_work_sync() would return true result
in a additional kvm_put_kvm().
2. If interrupt type is MSI, we would got a window between cancel_work_sync()
and free_irq(), which interrupt would be injected again...
This patch discard the reference count used for irq handler and interrupt_work,
and ensure the legal state by moving the free function at the very beginning of
kvm_destroy_vm(). And the patch fix the second bug by disable irq before
cancel_work_sync(), which may result in nested disable of irq but OK for we are
going to free it.
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Sheng Yang [Tue, 6 Jan 2009 02:03:02 +0000 (10:03 +0800)]
KVM: Add kvm_arch_sync_events to sync with asynchronize events
kvm_arch_sync_events is introduced to quiet down all other events may happen
contemporary with VM destroy process, like IRQ handler and work struct for
assigned device.
For kvm_arch_sync_events is called at the very beginning of kvm_destroy_vm(), so
the state of KVM here is legal and can provide a environment to quiet down other
events.
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Marcelo Tosatti [Wed, 10 Dec 2008 20:23:26 +0000 (21:23 +0100)]
KVM: mmu_notifiers release method
The destructor for huge pages uses the backing inode for adjusting
hugetlbfs accounting.
Hugepage mappings are destroyed by exit_mmap, after
mmu_notifier_release, so there are no notifications through
unmap_hugepage_range at this point.
The hugetlbfs inode can be freed with pages backed by it referenced
by the shadow. When the shadow releases its reference, the huge page
destructor will access a now freed inode.
Implement the release operation for kvm mmu notifiers to release page
refs before the hugetlbfs inode is gone.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Avi Kivity [Mon, 19 Jan 2009 12:57:52 +0000 (14:57 +0200)]
KVM: Avoid using CONFIG_ in userspace visible headers
Kconfig symbols are not available in userspace, and are not stripped by
headers-install. Avoid their use by adding #defines in <asm/kvm.h> to
suit each architecture.
Yang Zhang [Thu, 8 Jan 2009 07:13:31 +0000 (15:13 +0800)]
KVM: ia64: fix fp fault/trap handler
The floating-point registers f6-f11 is used by vmm and
saved in kvm-pt-regs, so should set the correct bit mask
and the pointer in fp_state, otherwise, fpswa may touch
vmm's fp registers instead of guests'.
In addition, for fp trap handling, since the instruction
which leads to fp trap is completely executed, so can't
use retry machanism to re-execute it, because it may
pollute some registers.
Signed-off-by: Yang Zhang <yang.zhang@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
Chris Ball [Sat, 14 Feb 2009 01:56:18 +0000 (20:56 -0500)]
x86, olpc: fix model detection without OFW
Impact: fix "garbled display, laptop is unusable" bug
Commit e51a1ac2dfca9ad869471e88f828281db7e810c0 ("x86, olpc: fix endian
bug in openfirmware workaround") breaks model comparison on OLPC; the value
0xc2 needs to be scaled up by olpc_board().
The pre-patch version was wrong, but accidentally worked anyway
(big-endian 0xc2 is big enough to satisfy all other board revisions,
but little endian 0xc2 is not).
Signed-off-by: Chris Ball <cjb@laptop.org> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andres Salomon <dilinger@queued.net> Cc: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
David Woodhouse [Fri, 13 Feb 2009 23:18:03 +0000 (23:18 +0000)]
iommu: fix Intel IOMMU write-buffer flushing
This is the cause of the DMA faults and disk corruption that people have
been seeing. Some chipsets neglect to report the RWBF "capability" --
the flag which says that we need to flush the chipset write-buffer when
changing the DMA page tables, to ensure that the change is visible to
the IOMMU.
Override that bit on the affected chipsets, and everything is happy
again.
Thanks to Chris and Bhavesh and others for helping to debug.
ext4: Implement range_cyclic in ext4_da_writepages instead of write_cache_pages
With delayed allocation we lock the page in write_cache_pages() and
try to build an in memory extent of contiguous blocks. This is needed
so that we can get large contiguous blocks request. If range_cyclic
mode is enabled, write_cache_pages() will loop back to the 0 index if
no I/O has been done yet, and try to start writing from the beginning
of the range. That causes an attempt to take the page lock of lower
index page while holding the page lock of higher index page, which can
cause a dead lock with another writeback thread.
The solution is to implement the range_cyclic behavior in
ext4_da_writepages() instead.
When creating a new ext4_prealloc_space structure, we have to
initialize its list_head pointers before we add them to any prealloc
lists. Otherwise, with list debug enabled, we will get list
corruption warnings.
Wim Van Sebroeck [Wed, 28 Jan 2009 20:51:04 +0000 (20:51 +0000)]
[WATCHDOG] iTCO_wdt: fix SMI_EN regression 2
bugzilla: #12363
commit 7cd5b08be3c489df11b559fef210b81133764ad4 added a second regression:
some Dell's and Compaq's lockup on boot. So we revert most of the code.
The ICH9 reboot issue remains in place and will need some more fixing... :-(