pilppa.com Git - linux-2.6-omap-h63xx.git/log

x86: fix resume (S2R) broken by Intel microcode module, on A110L

Impact: fix deadlock

This is in response to the following bug report:

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12100
Subject         : resume (S2R) broken by Intel microcode module, on A110L
Submitter       : Andreas Mohr <andi@lisas.de>
Date            : 2008-11-25 08:48 (19 days old)
Handled-By      : Dmitry Adamushko <dmitry.adamushko@gmail.com>

[ The deadlock scenario has been discovered by Andreas Mohr ]

I think I might have a logical explanation why the system:

  (http://bugzilla.kernel.org/show_bug.cgi?id=12100)

might hang upon resuming, OTOH it should have likely hanged each and every time.

(1) possible deadlock in microcode_resume_cpu() if either 'if' section is
taken;

(2) now, I don't see it in spec. and can't experimentally verify it (newer
ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
be back to its original one (we need to reload ucode anyway so it doesn't seem
logical if a cpu doesn't drop the version)... if so, the comparison with
memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
the aforementioned 'if' sections might have been triggered - leading to a
deadlock.

Obviously, in my tests I simulated loading/resuming with the ucode of the same
version (just to see that the file is loaded/re-loaded upon resuming) so this
issue has never popped up.

I'd appreciate if someone with an appropriate system might give a try to the
2nd patch (titled "fix a comparison && deadlock...").

In any case, the deadlock situation is a must-have fix.

Reported-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Tested-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

Merge branch 'iommu-fixes-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent

x86 gart: don't complain if no AMD GART found

Impact: remove annoying bootup printk

It's perfectly normal for no AMD GART to be present, e.g., if you have
Intel CPUs. None of the other iommu_init() functions makes noise when
it finds nothing.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Acked-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

AMD IOMMU: panic if completion wait loop fails

Impact: prevents data corruption after a failed completion wait loop

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

AMD IOMMU: set cmd buffer pointers to zero manually

Impact: set cmd buffer head and tail pointers to zero in case nobody else did

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

x86: re-enable MCE on secondary CPUS after suspend/resume

Impact: fix disabled MCE after resume

Don't prevent multiple initialization of MCEs.

Back from early prehistory mcheck_init() has a reentry check. Presumably
that was needed in very old kernels to prevent it entering twice.

But as Andreas points out this prevents CPU hotplug (and therefore resume)
to correctly reinitialize MCEs when a AP boots again after being
offlined.

Just drop the check.

Reported-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

AMD IOMMU: allocate rlookup_table with __GFP_ZERO

Impact: fix bug which can lead to panic in prealloc_protection_domains()

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: add quirk fix for Freecom HDD

Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  pata_hpt366: no ATAPI DMA
  pata_hpt366: fix cable detection,
  libata: fix Seagate NCQ+FLUSH blacklist

Merge branch 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6

* 'sh/for-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Disable GENERIC_HARDIRQS_NO__DO_IRQ for unconverted platforms.
sh: maple: Do not pass SLAB_POISON to kmem_cache_create()

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
  powerpc/cell/axon-msi: Fix MSI after kexec
  powerpc: Fix bootmem reservation on uninitialized node
  powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area

mm: Don't touch uninitialized variable in do_pages_stat_array()

Commit 80bba1290ab5122c60cdb73332b26d288dc8aedd removed one necessary
variable initialization.  As a result following warning happened:

    CC      mm/migrate.o
  mm/migrate.c: In function 'sys_move_pages':
  mm/migrate.c:1001: warning: 'err' may be used uninitialized in this function

More unfortunately, if find_vma() failed, kernel read uninitialized
memory.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
CC: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pata_hpt366: no ATAPI DMA

IDE hpt366 driver doesn't allow DMA for ATAPI devices and MWDMA2 on
ATAPI device locks up pata_hpt366. Follow the suit.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

pata_hpt366: fix cable detection,

pata_hpt366 is strange in that its two channels occupy two PCI
functions and both are primary channels and bit1 of PCI configuration
register 0x5A indicates cable for both channels.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

libata: fix Seagate NCQ+FLUSH blacklist

Due to miscommunication, P/N was mistaken as firmware revision
strings. Update it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>

sh: Disable GENERIC_HARDIRQS_NO__DO_IRQ for unconverted platforms.

Presently limited to Cayman, Dreamcast, Microdev, and SystemH 7751.
Re-enable it for everyone once these have been fixed up.

Signed-off-by: Paul Mundt <lethal@linux-sh.org>

sh: maple: Do not pass SLAB_POISON to kmem_cache_create()

SLAB_POISON is not a valid flag for kmem_create_cache() unless
CONFIG_DEBUG_SLAB is set, so remove it from the flags argument.

Acked-by: Adrian McMenamin <adrian@newgolddream.dyndns.info>
Signed-off-by: Matt Fleming <mjf@gentoo.org>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>

powerpc/cell/axon-msi: Fix MSI after kexec

Commit d015fe995 'powerpc/cell/axon-msi: Retry on missing interrupt'
has turned a rare failure to kexec on QS22 into a reproducible
error, which we have now analysed.

The problem is that after a kexec, the MSIC hardware still points
into the middle of the old ring buffer.  We set up the ring buffer
during reboot, but not the offset into it.  On older kernels, this
would cause a storm of thousands of spurious interrupts after a
kexec, which would most of the time get dropped silently.

With the new code, we time out on each interrupt, waiting for
it to become valid.  If more interrupts come in that we time
out on, this goes on indefinitely, which eventually leads to
a hard crash.

The solution in this commit is to read the current offset from
the MSIC when reinitializing it.  This now works correctly, as
expected.

Reported-by: Dirk Herrendoerfer <d.herrendoerfer@de.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc: Fix bootmem reservation on uninitialized node

careful_allocation() was calling into the bootmem allocator for
nodes which had not been fully initialized and caused a previous
bug:  http://patchwork.ozlabs.org/patch/10528/  So, I merged a
few broken out loops in do_init_bootmem() to fix it.  That changed
the code ordering.

I think this bug is triggered by having reserved areas for a node
which are spanned by another node's contents.  In the
mark_reserved_regions_for_nid() code, we attempt to reserve the
area for a node before we have allocated the NODE_DATA() for that
nid.  We do this since I reordered that loop.  I suck.

This is causing crashes at bootup on some systems, as reported
by Jon Tollefson.

This may only present on some systems that have 16GB pages
reserved.  But, it can probably happen on any system that is
trying to reserve large swaths of memory that happen to span other
nodes' contents.

This commit ensures that we do not touch bootmem for any node which
has not been initialized, and also removes a compile warning about
an unused variable.

Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

powerpc: Check for valid hugepage size in hugetlb_get_unmapped_area

It looks like most of the hugetlb code is doing the correct thing if
hugepages are not supported, but the mmap code is not.  If we get into
the mmap code when hugepages are not supported, such as in an LPAR
which is running Active Memory Sharing, we can oops the kernel.  This
fixes the oops being seen in this path.

oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: nfs(N) lockd(N) nfs_acl(N) sunrpc(N) ipv6(N) fuse(N) loop(N)
dm_mod(N) sg(N) ibmveth(N) sd_mod(N) crc_t10dif(N) ibmvscsic(N)
scsi_transport_srp(N) scsi_tgt(N) scsi_mod(N)
Supported: No
NIP: c000000000038d60 LR: c00000000003945c CTR: c0000000000393f0
REGS: c000000077e7b830 TRAP: 0300   Tainted: G
(2.6.27.5-bz50170-2-ppc64)
MSR: 8000000000009032 <EE,ME,IR,DR>  CR: 44000448  XER: 20000001
DAR: c000002000af90a8, DSISR: 0000000040000000
TASK = c00000007c1b8600[4019] 'hugemmap01' THREAD: c000000077e78000 CPU: 6
GPR00: 0000001fffffffe0 c000000077e7bab0 c0000000009a4e78 0000000000000000
GPR04: 0000000000010000 0000000000000001 00000000ffffffff 0000000000000001
GPR08: 0000000000000000 c000000000af90c8 0000000000000001 0000000000000000
GPR12: 000000000000003f c000000000a73880 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000010000
GPR20: 0000000000000000 0000000000000003 0000000000010000 0000000000000001
GPR24: 0000000000000003 0000000000000000 0000000000000001 ffffffffffffffb5
GPR28: c000000077ca2e80 0000000000000000 c00000000092af78 0000000000010000
NIP [c000000000038d60] .slice_get_unmapped_area+0x6c/0x4e0
LR [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
Call Trace:
[c000000077e7bbc0] [c00000000003945c] .hugetlb_get_unmapped_area+0x6c/0x80
[c000000077e7bc30] [c000000000107e30] .get_unmapped_area+0x64/0xd8
[c000000077e7bcb0] [c00000000010b140] .do_mmap_pgoff+0x140/0x420
[c000000077e7bd80] [c00000000000bf5c] .sys_mmap+0xc4/0x140
[c000000077e7be30] [c0000000000086b4] syscall_exit+0x0/0x40
Instruction dump:
fac1ffb0 fae1ffb8 fb01ffc0 fb21ffc8 fb41ffd0 fb61ffd8 fb81ffe0 fbc1fff0
fbe1fff8 f821fef1 f8c10158 f8e10160 <7d49002e> f9010168 e92d01b0 eb4902b0

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>

Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 5348/1: fix documentation wrt location of the alignment trap interface
  [ARM] Ensure linux/hardirqs.h is included where required
  [ARM] fix kernel-doc syntax
  [ARM] arch/arm/common/sa1111.c: Correct error handling code
  [ARM] 5341/2: there is no copy_page on nommu ARM

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  Phonet: keep TX queue disabled when the device is off
  SCHED: netem: Correct documentation comment in code.
  netfilter: update rwlock initialization for nat_table
  netlabel: Compiler warning and NULL pointer dereference fix
  e1000e: fix double release of mutex
  IA64: HP_SIMETH needs to depend upon NET
  netpoll: fix race on poll_list resulting in garbage entry
  ipv6: silence log messages for locally generated multicast
  sungem: improve ethtool output with internal pcs and serdes
  tcp: tcp_vegas cong avoid fix
  sungem: Make PCS PHY support partially work again.

Define smp_call_function_many for UP

Otherwise those using it in transition patches (eg. kvm) can't compile
with CONFIG_SMP=n:

arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function 'make_all_cpus_request':
arch/x86/kvm/../../../virt/kvm/kvm_main.c:380: error: implicit declaration of function 'smp_call_function_many'

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroups: fix a race between rmdir and remount

When a cgroup is removed, it's unlinked from its parent's children list,
but not actually freed until the last dentry on it is released (at which
point cgrp->root->number_of_cgroups is decremented).

Currently rebind_subsystems checks for the top cgroup's child list being
empty in order to rebind subsystems into or out of a hierarchy - this can
result in the set of subsystems bound to a hierarchy being
removed-but-not-freed cgroup.

The simplest fix for this is to forbid remounts that change the set of
subsystems on a hierarchy that has removed-but-not-freed cgroups. This
bug can be reproduced via:

mkdir /mnt/cg
mount -t cgroup -o ns,freezer cgroup /mnt/cg
mkdir /mnt/cg/foo
sleep 1h < /mnt/cg/foo &
rmdir /mnt/cg/foo
mount -t cgroup -o remount,ns,devices,freezer cgroup /mnt/cg
kill $!

Though the above will cause oops in -mm only but not mainline, but the bug
can cause memory leak in mainline (and even oops)

Signed-off-by: Paul Menage <menage@google.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ACPI toshiba: only register rfkill if bt is enabled

Part of the rfkill initialization was done whenever BT was on or not.  The
following patch checks for BT presence before registering the rfkill to
the input layer.  Some minor cleanups (> 80 char lines) were also added in
the process.

On Tue, Oct 28, 2008 at 10:10:37PM +0300, Andrey Borzenkov wrote:
[...]
> [   66.633036] toshiba_acpi: Toshiba Laptop ACPI Extras version 0.19
> [   66.633054] toshiba_acpi:     HCI method: \_SB_.VALD.GHCI
> [   66.637764] input: Toshiba RFKill Switch as /devices/virtual/input/input3
[...]
> [  113.920753] ------------[ cut here ]------------
> [  113.920828] kernel BUG at /home/bor/src/linux-git/net/rfkill/rfkill.c:347!
> [  113.920845] invalid opcode: 0000 [#1]
> [  113.920877] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/host0/target0:0:0/0:0:0:0/block/sda/size
> [  113.920900] Dumping ftrace buffer:
> [  113.920919]    (ftrace buffer empty)
> [  113.920933] Modules linked in: af_packet irnet ppp_generic slhc ircomm_tty ircomm binfmt_misc loop dm_mirror dm_region_hash dm_log dm_round_robin dm_multipath dm_mod alim15x3 ide_core nvram toshiba cryptomgr aead crypto_blkcipher michael_mic crypto_algapi orinoco_cs orinoco hermes_dld hermes pcmcia firmware_class snd_ali5451 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device smsc_ircc2 snd_pcm_oss snd_pcm rtc_cmos irda snd_timer snd_mixer_oss rtc_core snd crc_ccitt yenta_socket rtc_lib rsrc_nonstatic i2c_ali1535 pcmcia_core pcspkr psmouse soundcore i2c_core evdev sr_mod snd_page_alloc alim1535_wdt cdrom fan sg video output toshiba_acpi rfkill thermal backlight ali_agp processor ac button input_polldev battery agpgart ohci_hcd usbcore reiserfs pata_ali libata sd_mod scsi_mod [last unloaded: scsi_wait_scan]
> [  113.921765]
> [  113.921785] Pid: 3272, comm: ipolldevd Not tainted (2.6.28-rc2-1avb #3) PORTEGE 4000
> [  113.921801] EIP: 0060:[<dfaa4683>] EFLAGS: 00010246 CPU: 0
> [  113.921854] EIP is at rfkill_force_state+0x53/0x90 [rfkill]
> [  113.921870] EAX: 00000000 EBX: 00000000 ECX: 00000003 EDX: 00000000
> [  113.921885] ESI: 00000000 EDI: ddd50300 EBP: d8d7af40 ESP: d8d7af24
> [  113.921900]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> [  113.921918] Process ipolldevd (pid: 3272, ti=d8d7a000 task=d8d93c90 task.ti=d8d7a000)
> [  113.921933] Stack:
> [  113.921945]  d8d7af38 00000246 dfb029d8 dfb029c0 dfb029d8 dfb029c0 ddd50300 d8d7af5c
> [  113.922014]  dfb018e2 01000246 01000000 ddd50300 ddd50314 ddabb8a0 d8d7af68 dfb381c1
> [  113.922098]  00000000 d8d7afa4 c012ec0a 00000000 00000002 00000000 c012eba8 ddabb8c0
> [  113.922240] Call Trace:
> [  113.922240]  [<dfb018e2>] ? bt_poll_rfkill+0x5c/0x82 [toshiba_acpi]
> [  113.922240]  [<dfb381c1>] ? input_polled_device_work+0x11/0x40 [input_polldev]
> [  113.922240]  [<c012ec0a>] ? run_workqueue+0xea/0x1f0
> [  113.922240]  [<c012eba8>] ? run_workqueue+0x88/0x1f0
> [  113.922240]  [<dfb381b0>] ? input_polled_device_work+0x0/0x40 [input_polldev]
> [  113.922240]  [<c012f047>] ? worker_thread+0x87/0xf0
> [  113.922240]  [<c0132b00>] ? autoremove_wake_function+0x0/0x50
> [  113.922240]  [<c012efc0>] ? worker_thread+0x0/0xf0
> [  113.922240]  [<c013280f>] ? kthread+0x3f/0x80
> [  113.922240]  [<c01327d0>] ? kthread+0x0/0x80
> [  113.922240]  [<c01040d7>] ? kernel_thread_helper+0x7/0x10
> [  113.922240] Code: 43 54 89 73 54 39 c6 74 11 89 d9 ba 01 00 00 00 b8 40 68 aa df e8 3e 35 69 e0 89 f8 e8 77 fd 85 e0 31 c0 83 c4 10 5b 5e 5f 5d c3 <0f> 0b eb fe 89 f6 8d bc 27 00 00 00 00 be f4 4d aa df bb 5f 01
> [  113.922240] EIP: [<dfaa4683>] rfkill_force_state+0x53/0x90 [rfkill] SS:ESP 0068:d8d7af24
> [  113.924700] ---[ end trace 0e404eb40cadd5f0 ]---

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@gmail.com>
Tested-by: Andrey Borzenkov <arvidjaar@mail.ru>
Acked-by: Len Brown <len.brown@intel.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Acked-by: Philip Langdale <philipl@overt.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

eCryptfs: Update maintainers

Tyler Hicks and Dustin Kirkland are now the primary contact points for
eCryptfs issues that may arise from this point forward.

Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
Acked-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
Acked-by: Dustin Kirkland <kirkland@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

slob: do not pass the SLAB flags as GFP in kmem_cache_create()

The kmem_cache_create() function in the slob allocator passes the SLAB
flags as GFP flags to the slob_alloc() function. The patch changes this
call to pass GFP_KERNEL as the other allocators seem to do.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pcmcia: blackfin: fix bug - add missing ; to MODULE macro

Cc: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[ARM] 5348/1: fix documentation wrt location of the alignment trap interface

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[ARM] Ensure linux/hardirqs.h is included where required

... for the removal of it from asm-generic/local.h

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Phonet: keep TX queue disabled when the device is off

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

SCHED: netem: Correct documentation comment in code.

The netem simulator is no longer limited by Linux timer resolution HZ.
Not since Patrick McHardy changed the QoS system to use hrtimer.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>

netfilter: update rwlock initialization for nat_table

The commit e099a173573ce1ba171092aee7bb3c72ea686e59
(netfilter: netns nat: per-netns NAT table) renamed the
nat_table from __nat_table to nat_table without updating the
__RW_LOCK_UNLOCKED(__nat_table.lock).

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/galak/powerpc:
powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted

x86 Fix VMI crash on boot in 2.6.28-rc8

VMI initialiation can relocate the fixmap, causing early_ioremap to
malfunction if it is initialized before the relocation. To fix this,
VMI activation is split into two phases; the detection, which must
happen before setting up ioremap, and the activation, which must happen
after parsing early boot parameters.

This fixes a crash on boot when VMI is enabled under VMware.

Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "sched_clock: prevent scd->clock from moving backwards"

This reverts commit 5b7dba4ff834259a5623e03a565748704a8fe449, which
caused a regression in hibernate, reported by and bisected by Fabio
Comolli.

This revert fixes

http://bugzilla.kernel.org/show_bug.cgi?id=12155
http://bugzilla.kernel.org/show_bug.cgi?id=12149

Bisected-by: Fabio Comolli <fabio.comolli@gmail.com>
Requested-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[ARM] fix kernel-doc syntax

Fix kernel-doc notation to use correct syntax. Even though this should be
moved to where the function is actually implemented...

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

[ARM] arch/arm/common/sa1111.c: Correct error handling code

If it is reasonable to apply PTR_ERR to the result of calling clk_get, then
that result should first be tested with IS_ERR, not with !.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression E,E1;
@@

if (
-   E == NULL
+   IS_ERR(E)
   ) { <+... when != E = E1
        PTR_ERR(E)
       ...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

ieee1394: add quirk fix for Freecom HDD

According to http://bugzilla.kernel.org/show_bug.cgi?id=12206, Freecom
FireWire Hard Drive 1TB reports max_rom=2 but returns garbage if block
read requests are used to read the config ROM. Force max_rom=0 to limit
them to quadlet read requests.

Reported-by: Christian Mueller <cm1@mumac.de>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

powerpc/fsl-booke: Fix problem with _tlbil_va being interrupted

An example calling sequence which we did see:

copy_user_highpage -> kmap_atomic -> flush_tlb_page -> _tlbil_va

We got interrupted after setting up the MAS registers before the
tlbwe and the interrupt handler that caused the interrupt also did
a kmap_atomic (ide code) and thus on returning from the interrupt
the MAS registers no longer contained the proper values.

Since we dont save/restore MAS registers for normal interrupts we
need to disable interrupts in _tlbil_va to ensure atomicity.

Signed-off-by: Kumar Gala <galak@kernel.crashing.org>

Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jwboyer/powerpc-4xx:
powerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia

Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6

* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c-highlander: Trivial endian casting fixes
i2c-pmcmsp: Fix endianness misannotation

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
Commands needing to be retried require a complete re-initialization.

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
  MIPS: IP32: Update defconfig
  MIPS: Add missing calls to plat_unmap_dma_mem.
  MIPS: Kconfig: Fix the arch-specific header path
  MIPS: Use EI/DI for MIPS R2.

console ASCII glyph 1:1 mapping

For the console, there is a 1:1 mapping of glyphs which cannot be found
in the current font.  This seems to be meant as a kind of 'emergency
fallback' for fonts without unicode mapping which otherwise would
display nothing readable on the screen.

At the moment it affects all chars for which no substitution character
is defined.  In particular this means that for all chars (>= 128) where
there is no iso88591-1/unicode character (e.g.  control character area)
you'll get the very strange 1:1 mapping of the (cp437) graphics card
glyphs.

I'm pretty sure that the 1:1 mapping should only affect strict ASCII
code characters, i.e.  chars < 128.

The patch limits the mapping as it probably was meant anyway.

Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Egmont Koblinger <egmont@uhulinux.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

unicode table for cp437

There is a major bug in the cp437 to unicode translation table.  Char
0x7c is mapped to U+00a5 which is the Yen sign and wrong.  The right
mapping is U+00a6 (broken bar).

Furthermore, a mapping for U+00b4 (a widely used character) is missing
even though easily possible.

The patch fixes these, as well as it provides a few other useful
mappings.

The changes are as follows:

  0x0f (enhancement) enables a sort of currency symbol
  0x27 (bug) enables a sort of acute accent which is a widely used character
  0x44 (enhancement) enables a sort of icelandic capital letter eth
  0x7c (major bug) corrects mapping
  0xeb (enhancement) enables a sort of icelandic small letter eth
  0xee (enhancement) enables a sort of math 'element of'

Signed-off-by: Ingo Brueckl <ib@wupperonline.de>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MIPS: IP32: Update defconfig

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Add missing calls to plat_unmap_dma_mem.

dma_free_noncoherent() and dma_free_coherent() are missing calls to
plat_unmap_dma_mem(). This patch adds them.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Kconfig: Fix the arch-specific header path

The header path in the help text for the RUNTIME_DEBUG config option is
obsolete and needs to be updated to match the new location of
architecture-specific header files. While at it, fix the spelling mistake.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@movial.fi>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Use EI/DI for MIPS R2.

For MIPS R2, use the EI and DI instructions to enable and disable
interrupts.

Signed-off-by: Tomaso Paoletti <tpaoletti@caviumnetworks.com>
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Commands needing to be retried require a complete re-initialization.

The test-unit-ready portion of this patch was causing boots to fail on
my test machine (as in http://lkml.org/lkml/2008/12/5/161). With this
patch in place, the system is booting reliably.

Mike Anderson found the same problem in the hp_hw_start_stop code,
and I applied the same solution in cdrom_read_cdda_bpc.

Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

netlabel: Compiler warning and NULL pointer dereference fix

Fix the two compiler warnings show below.  Thanks to Geert Uytterhoeven for
finding and reporting the problem.

net/netlabel/netlabel_unlabeled.c:567: warning: 'entry' may be used
   uninitialized in this function
net/netlabel/netlabel_unlabeled.c:629: warning: 'entry' may be used
   uninitialized in this function

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e1000e: fix double release of mutex

During a reset, releasing the swflag after it failed to be acquired would
cause a double unlock of the mutex. Instead, test whether acquisition of
the swflag was successful and if not, do not release the swflag. The reset
must still be done to bring the device to a quiescent state.

This resolves [BUG 12200] BUG: bad unlock balance detected! e1000e
http://bugzilla.kernel.org/show_bug.cgi?id=12200

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

powerpc/40x: Add proper BOOTCFLAGS for cuboot-acadia

The cuboot-acadia.c wrapper can cause assembler errors on some
toolchains due to the lack of the proper BOOTCFLAGS. This adds
the proper flags for the file.

Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>

i2c-highlander: Trivial endian casting fixes

Fixes sparse warnings:
drivers/i2c/busses/i2c-highlander.c:95:26: warning: incorrect type in argument 1 (different base types)
drivers/i2c/busses/i2c-highlander.c:95:26:    expected restricted __be16 const [usertype] *p
drivers/i2c/busses/i2c-highlander.c:95:26:    got unsigned short [usertype] *<noident>
drivers/i2c/busses/i2c-highlander.c:106:15: warning: incorrect type in assignment (different base types)
drivers/i2c/busses/i2c-highlander.c:106:15:    expected unsigned short [unsigned] [short] [usertype] <noident>
drivers/i2c/busses/i2c-highlander.c:106:15:    got restricted __be16

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

i2c-pmcmsp: Fix endianness misannotation

tmp is used as host-endian and is loaded from a be64, fix the cast and the
endian accessor used.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Jean Delvare <khali@linux-fr.org>

[ARM] 5341/2: there is no copy_page on nommu ARM

... as it is defined with memcpy, therefore no copy_page symbol to
export.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

Revert "radeonfb: accelerate imageblit and other improvements"

This reverts commit b1ee26bab14886350ba12a5c10cbc0696ac679bf, along with
the "fixes" for it that all just caused problems:

- c4c6fa9891f3d1bcaae4f39fb751d5302965b566 "radeonfb: fix problem with
color expansion & alignment"

- f3179748a157c21d44d929fd3779421ebfbeaa93 "radeonfb: Disable new color
expand acceleration unless explicitely enabled"

because even when disabled, it breaks for people. See

http://bugzilla.kernel.org/show_bug.cgi?id=12191

for the latest example.

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Krzysztof Halasa <khc@pm.waw.pl>
Cc: James Cloos <cloos@jhcloos.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Jean-Luc Coulon <jean.luc.coulon@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

IA64: HP_SIMETH needs to depend upon NET

From: Alexander Beregalov <a.beregalov@gmail.com>

Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 2.6.28-rc8

Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: CPU remove deadlock fix

fix mapping_writably_mapped()

Lee Schermerhorn noticed yesterday that I broke the mapping_writably_mapped
test in 2.6.7! Bad bad bug, good good find.

The i_mmap_writable count must be incremented for VM_SHARED (just as
i_writecount is for VM_DENYWRITE, but while holding the i_mmap_lock)
when dup_mmap() copies the vma for fork: it has its own more optimal
version of __vma_link_file(), and I missed this out. So the count
was later going down to 0 (dangerous) when one end unmapped, then
wrapping negative (inefficient) when the other end unmapped.

The only impact on x86 would have been that setting a mandatory lock on
a file which has at some time been opened O_RDWR and mapped MAP_SHARED
(but not necessarily PROT_WRITE) across a fork, might fail with -EAGAIN
when it should succeed, or succeed when it should fail.

But those architectures which rely on flush_dcache_page() to flush
userspace modifications back into the page before the kernel reads it,
may in some cases have skipped the flush after such a fork - though any
repetitive test will soon wrap the count negative, in which case it will
flush_dcache_page() unnecessarily.

Fix would be a two-liner, but mapping variable added, and comment moved.

Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland

* 'to-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-roland:
tracehook: exec double-reporting fix

lib/idr.c: Fix bug introduced by RCU fix

The last patch to lib/idr.c caused a bug if idr_get_new_above() was
called on an empty idr.

Usually, nodes stay on the same layer. New layers are added to the top
of the tree.

The exception is idr_get_new_above() on an empty tree: In this case, the
new root node is first added on layer 0, then moved upwards. p->layer
was not updated.

As usual: You shall never rely on the source code comments, they will
only mislead you.

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MN10300: Give correct size when reserving interrupt vector table

Give the correct size when reserving the interrupt vector table. It should be
a page not a single byte.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MN10300: Fix __put_user_asm8()

Fix __put_user_asm8() by jumping to the end label (3:) from the exception
handler, rather than jumping back to retry the second store instruction (label
2:).

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MN10300: Fix the preemption resume_kernel() routine

Fix the preemption resume_kernel() routine by inverting the test to see
whether interrupts are off (IM7 is all enabled, not all disabled).

Furthermore, interrupts should be disabled on entry to resume_kernel() so that
they're correctly set for jumping to restore_all() and doing the need
reschedule test.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MN10300: Discard low-priority Tx interrupts when closing an on-chip serial port

Discard low-prioriy Tx interrupts when closing an MN10300 on-chip serial port.

The MN10300 on-chip serial port uses three interrupts to manage its serial
ports:

(1) A very high priority interrupt that drives virtual DMA for Rx.

(2) A very high priority interrupt that drives virtual DMA for Tx.

(3) A normal priority virtual interrupt that does the normal UART interrupt
     stuff and is shared between Rx and Tx.

mn10300_serial_stop_tx() only disables the high priority Tx interrupt.  It
doesn't also disable the normal priority one because it is shared with Rx.

However, the high priority interrupt may interrupt local_irq_disabled()
sections, and so may have queued up a low priority virtual interrupt whilst the
UART driver is asking for the Tx interrupt to be disabled.

The result of this can be an oops when we try to process the interrupt in
mn10300_serial_transmit_interrupt() as port->uart.info and port->uart.info->tty
may have gone away.

To deal with this, if either of those pointers is NULL, we make sure the
high-priority Tx interrupt is disabled and discard the interrupt.  The low
priority interrupt is disabled by the mn10300_serial_pic irq_chip table.

Signed-off-by: Akira Takeuchi <takeuchi.akr@jp.panasonic.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

MN10300: vmlinux.lds.S cleanup - use PAGE_SIZE, PERCPU macros

Include the linux/page.h header into the MN10300 kernel linker script thus
allowing us to use PAGE_SIZE macro instead of a numeric constant.

Also use the PERCPU macro instead of an explicit section definition.

Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: api - Disallow cryptomgr as a module if algorithms are built-in

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch
PCI: stop leaking 'slot_name' in pci_create_slot

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6:
  [IA64] SN: prevent IRQ retargetting in request_irq()
  [IA64] Fix section mismatch ioc3uart_init()/ioc3uart_submodule
  [IA64] Clear up section mismatch for ioc4_ide_attach_one.
  [IA64] Clear up section mismatch with arch_unregister_cpu()
  [IA64] Clear up section mismatch for sn_check_wars.
  [IA64] Updated the generic_defconfig to work with the 2.6.28-rc7 kernel.
  [IA64] Fix GRU compile error w/o CONFIG_HUGETLB_PAGE
  [IA64] eliminate NULL test and memset after alloc_bootmem
  [IA64] remove BUILD_BUG_ON from paravirt_getreg()

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: Better than nothing implementation of PCI mmap to fix X.

pktcdvd: remove broken dev_t export of class devices

The pktcdvd created class devices only export some sysfs files,
but have no char dev_t registered in the driver.

At class device creation time they copy the dev_t value of the
block device to the char device, wich will register a new char
device in the driver core and userspace, with a conflicting dev_t
value.

In many cases the class devices dev_t just points to a random
USB device. This fixes the sysfs "duplicate entry" errors.

Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Acked-by: Peter Osterlund <petero2@telia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fw-ohci: fix IOMMU resource exhaustion
ieee1394: node manager causes up to ~3.25s delay in freezing tasks

drivers/video/mb862xx/mb862xxfb.c: fix printk

sparc64:

drivers/video/mb862xx/mb862xxfb.c:929: warning: long long unsigned int format, resource_size_t arg (arg 4)
drivers/video/mb862xx/mb862xxfb.c:931: warning: long long unsigned int format, resource_size_t arg (arg 4)

We don't know what type the architecture uses to implement u64, hence they
cannot be printed.

Cc: Anatolij Gustschin <agust@denx.de>
Cc: Dmitry Baryshkov <dbaryshkov@gmail.com>
Cc: Anton Vorontsov <avorontsov@ru.mvista.com>
Cc: Matteo Fortini <m.fortini@selcomgroup.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

KSYM_SYMBOL_LEN fixes

Miles Lane tailing /sys files hit a BUG which Pekka Enberg has tracked
to my 966c8c12dc9e77f931e2281ba25d2f0244b06949 sprint_symbol(): use
less stack exposing a bug in slub's list_locations() -
kallsyms_lookup() writes a 0 to namebuf[KSYM_NAME_LEN-1], but that was
beyond the end of page provided.

The 100 slop which list_locations() allows at end of page looks roughly
enough for all the other stuff it might print after the symbol before
it checks again: break out KSYM_SYMBOL_LEN earlier than before.

Latencytop and ftrace and are using KSYM_NAME_LEN buffers where they
need KSYM_SYMBOL_LEN buffers, and vmallocinfo a 2*KSYM_NAME_LEN buffer
where it wants a KSYM_SYMBOL_LEN buffer: fix those before anyone copies
them.

[akpm@linux-foundation.org: ftrace.h needs module.h]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc Miles Lane <miles.lane@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

inotify: fix IN_ONESHOT unmount event watcher

On umount two event will be dispatched to watcher:

1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
    ->inotify_dev_queue_event(.., IN_IGNORED, ..)

But if watcher has IN_ONESHOT bit set then the watcher will be released
inside first event.  Which result in accessing invalid object later.  IMHO
it is not pure regression.  This bug wasn't triggered while initial
inotify interface testing phase because of another bug in IN_ONESHOT
handling logic :)

  commit ac74c00e499ed276a965e5b5600667d5dc04a84a
  Author: Ulisses Furquim <ulissesf@gmail.com>
  Date:   Fri Feb 8 04:18:16 2008 -0800
    inotify: fix check for one-shot watches before destroying them
    As the IN_ONESHOT bit is never set when an event is sent we must check it
    in the watch's mask and not in the event's mask.

TESTCASE:
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
./inotify mnt/d&
umount mnt ## << lockup or crash here

TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>

int main(int argc, char **argv)
{
        char buf[1024];
        struct inotify_event *ie;
        char *p;
        int i;
        ssize_t l;

        p = argv[1];
        i = inotify_init();
        inotify_add_watch(i, p, ~0);

        l = read(i, buf, sizeof(buf));
        printf("read %d bytes\n", l);
        ie = (struct inotify_event *) buf;
        printf("event mask: %d\n", ie->mask);
return 0;
}

Signed-off-by: Dmitri Monakhov <dmonakhov@openvz.org>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Robert Love <rlove@google.com>
Cc: Ulisses Furquim <ulissesf@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

atomic: fix a typo in atomic_long_xchg()

atomic_long_xchg() is not correctly defined for 32bit arches.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: no get_user/put_user while holding mmap_sem in do_pages_stat?

Since commit 2f007e74bb85b9fc4eab28524052161703300f1a, do_pages_stat()
gets the page address from user-space and puts the corresponding status
back while holding the mmap_sem for read. There is no need to hold
mmap_sem there while some page-faults may occur.

This patch adds a temporary address and status buffer so as to only
hold mmap_sem while working on these kernel buffers. This is
implemented by extracting do_pages_stat_array() out of do_pages_stat().

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

drivers/serial/s3c2440.c: fix typo in MODULE_LICENSE

Signed-off-by: Balaji Rao <balajirrao@gmail.com>
Acked-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

pagemap: fix 32-bit pagemap regression

The large pages fix from bcf8039ed45 broke 32-bit pagemap by pulling the
pagemap entry code out into a function with the wrong return type.
Pagemap entries are 64 bits on all systems and unsigned long is only 32
bits on 32-bit systems.

Signed-off-by: Matt Mackall <mpm@selenic.com>
Reported-by: Doug Graham <dgraham@nortel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

page_cgroup should ignore empty nodes

Fix a total bootup freeze on ia64.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Tested-by: Li Zefan <lizf@cn.fujitsu.com>
Reported-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc twl4030: rename ioctl function when RTC_INTF_DEV=n

Fix build error when RTC_INTF_DEV=n:

drivers/rtc/rtc-twl4030.c:402: error: 'twl4030_rtc_ioctl' undeclared here (not in a function)
make[3]: *** [drivers/rtc/rtc-twl4030.o] Error 1

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Samuel Ortiz <sameo@openedhand.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fbcon: fix workqueue shutdown

Add a call to cancel_work_sync() in fbcon_exit() to cancel any pending
work in the fbcon workqueue.

The current implementation of fbcon_exit() sets the fbcon workqueue
function info->queue.func to NULL, but does not assure that there is no
work pending when it does so.  On occasion, depending on system timing,
there will still be pending work in the queue when fbcon_exit() is
called.  This results in a null pointer deference when run_workqueue()
tries to call the queue's work function.

Fixes errors on shutdown similar to these:

  Console: switching to colour dummy device 80x25
  Unable to handle kernel paging request for data at address 0x00000000

Signed-off-by: Geoff Levand <geoffrey.levand@am.sony.com>
Cc: Krzysztof Helt <krzysztof.h1@poczta.fm>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: remove UP version of lru_add_drain_all()

Currently, lru_add_drain_all() has two version.
  (1) use schedule_on_each_cpu()
  (2) don't use schedule_on_each_cpu()

Gerald Schaefer reported it doesn't work well on SMP (not NUMA) S390
machine.

  offline_pages() calls lru_add_drain_all() followed by drain_all_pages().
  While drain_all_pages() works on each cpu, lru_add_drain_all() only runs
  on the current cpu for architectures w/o CONFIG_NUMA. This let us run
  into the BUG_ON(!PageBuddy(page)) in __offline_isolated_pages() during
  memory hotplug stress test on s390. The page in question was still on the
  pcp list, because of a race with lru_add_drain_all() and drain_all_pages()
  on different cpus.

Actually, Almost machine has CONFIG_UNEVICTABLE_LRU=y. Then almost machine use
(1) version lru_add_drain_all although the machine is UP.

Then this ifdef is not valueable.
simple removing is better.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "percpu_counter: new function percpu_counter_sum_and_set"

Revert

    commit e8ced39d5e8911c662d4d69a342b9d053eaaac4e
    Author: Mingming Cao <cmm@us.ibm.com>
    Date:   Fri Jul 11 19:27:31 2008 -0400

        percpu_counter: new function percpu_counter_sum_and_set

As described in

revert "percpu counter: clean up percpu_counter_sum_and_set()"

the new percpu_counter_sum_and_set() is racy against updates to the
cpu-local accumulators on other CPUs.  Revert that change.

This means that ext4 will be slow again.  But correct.

Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Cc: <stable@kernel.org> [2.6.27.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "percpu counter: clean up percpu_counter_sum_and_set()"

Revert

    commit 1f7c14c62ce63805f9574664a6c6de3633d4a354
    Author: Mingming Cao <cmm@us.ibm.com>
    Date:   Thu Oct 9 12:50:59 2008 -0400

        percpu counter: clean up percpu_counter_sum_and_set()

Before this patch we had the following:

percpu_counter_sum(): return the percpu_counter's value

percpu_counter_sum_and_set(): return the percpu_counter's value, copying
that value into the central value and zeroing the per-cpu counters before
returning.

After this patch, percpu_counter_sum_and_set() has gone, and
percpu_counter_sum() gets the old percpu_counter_sum_and_set()
functionality.

Problem is, as Eric points out, the old percpu_counter_sum_and_set()
functionality was racy and wrong.  It zeroes out counters on "other" cpus,
without holding any locks which will prevent races agaist updates from
those other CPUS.

This patch reverts 1f7c14c62ce63805f9574664a6c6de3633d4a354.  This means
that percpu_counter_sum_and_set() still has the race, but
percpu_counter_sum() does not.

Note that this is not a simple revert - ext4 has since started using
percpu_counter_sum() for its dirty_blocks counter as well.

Note that this revert patch changes percpu_counter_sum() semantics.

Before the patch, a call to percpu_counter_sum() will bring the counter's
central counter mostly up-to-date, so a following percpu_counter_read()
will return a close value.

After this patch, a call to percpu_counter_sum() will leave the counter's
central accumulator unaltered, so a subsequent call to
percpu_counter_read() can now return a significantly inaccurate result.

If there is any code in the tree which was introduced after
e8ced39d5e8911c662d4d69a342b9d053eaaac4e was merged, and which depends
upon the new percpu_counter_sum() semantics, that code will break.

Reported-by: Eric Dumazet <dada1@cosmosbay.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Cc: <linux-ext4@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

percpu_counter: fix CPU unplug race in percpu_counter_destroy()

We should first delete the counter from percpu_counters list
before freeing memory, or a percpu_counter_hotcpu_callback()
could dereference a NULL pointer.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc: fix missing id_table in rtc-ds1672 and rtc-max6900 drivers

Add missing id_table to the drivers in subject. Patch is against the
latest git. It should go in with 2.6.28 if possible, the drivers won't
work without the id_table bits.

Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Reported-by: Imre Kaloz <kaloz@openwrt.org>
Tested-by: Imre Kaloz <kaloz@openwrt.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

relayfs: fix infinite loop with splice()

Running kmemtraced, which uses splice() on relayfs, causes a hard lock on
x86-64 SMP.  As described by Tom Zanussi:

  It looks like you hit the same problem as described here:

  commit 8191ecd1d14c6914c660dfa007154860a7908857

      splice: fix infinite loop in generic_file_splice_read()

  relay uses the same loop but it never got noticed or fixed.

Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Tested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

uml: boot broken due to buffer overrun

mconsole_init() passed 256 bytes as length in os_create_unix_socket, while
the sizeof UNIX_PATH_MAX is 108. This patch fixes that problem and avoids
a big overrun bug reported on UML bootup.

sockaddr_un.sun_path is UNIX_PATH_MAX long which causes the problem.
Reported-by: Vikas K Managutte <vikki.km@gmail.com>
Reported-by: Sarvesh Kumar Lal Das <skldas@gmail.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: WANG Cong <wangcong@zeuux.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: <stable@kernel.org> [please check with Jeff]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/backing-dev.c: remove recently-added WARN_ON()

On second thoughts, this is just going to disturb people while telling us
things which we already knew.

Cc: Peter Korsgaard <jacmet@sunsite.dk>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

crypto: api - Disallow cryptomgr as a module if algorithms are built-in

If we have at least one algorithm built-in then it no longer makes
sense to have the testing framework, and hence cryptomgr to be a
module. It should be either on or off, i.e., built-in or disabled.

This just happens to stop a potential runaway modprobe loop that
seems to trigger on at least one distro.

With fixes from Evgeniy Polyakov.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

firewire: fw-ohci: fix IOMMU resource exhaustion

There is a DMA map/ unmap imbalance whenever a block write request
packet is sent and then dequeued with ohci_cancel_packet. The latter
may happen frequently if the AR resp tasklet is executed before the AT
req tasklet for the same transaction.

Add the missing dma_unmap_single. This fixes
https://bugzilla.redhat.com/show_bug.cgi?id=475156

Reported-by: Emmanuel Kowalski
Tested-by: Emmanuel Kowalski
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>

netpoll: fix race on poll_list resulting in garbage entry

A few months back a race was discused between the netpoll napi service
path, and the fast path through net_rx_action:
http://kerneltrap.org/mailarchive/linux-netdev/2007/10/16/345470

A patch was submitted for that bug, but I think we missed a case.

Consider the following scenario:

INITIAL STATE
CPU0 has one napi_struct A on its poll_list
CPU1 is calling netpoll_send_skb and needs to call poll_napi on the same
napi_struct A that CPU0 has on its list

CPU0 CPU1
net_rx_action poll_napi
!list_empty (returns true) locks poll_lock for A
poll_one_napi
  napi->poll
   netif_rx_complete
    __napi_complete
    (removes A from poll_list)
list_entry(list->next)

In the above scenario, net_rx_action assumes that the per-cpu poll_list is
exclusive to that cpu.  netpoll of course violates that, and because the netpoll
path can dequeue from the poll list, its possible for CPU0 to detect a non-empty
list at the top of the while loop in net_rx_action, but have it become empty by
the time it calls list_entry.  Since the poll_list isn't surrounded by any other
structure, the returned data from that list_entry call in this situation is
garbage, and any number of crashes can result based on what exactly that garbage
is.

Given that its not fasible for performance reasons to place exclusive locks
arround each cpus poll list to provide that mutal exclusion, I think the best
solution is modify the netpoll path in such a way that we continue to guarantee
that the poll_list for a cpu is in fact exclusive to that cpu.  To do this I've
implemented the patch below.  It adds an additional bit to the state field in
the napi_struct.  When executing napi->poll from the netpoll_path, this bit will
be set. When a driver calls netif_rx_complete, if that bit is set, it will not
remove the napi_struct from the poll_list.  That work will be saved for the next
iteration of net_rx_action.

I've tested this and it seems to work well.  About the biggest drawback I can
see to it is the fact that it might result in an extra loop through
net_rx_action in the event that the device is actually contended for (i.e. the
netpoll path actually preforms all the needed work no the device, and the call
to net_rx_action winds up doing nothing, except removing the napi_struct from
the poll_list.  However I think this is probably a small price to pay, given
that the alternative is a crash.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tracehook: exec double-reporting fix

The patch 6341c39 "tracehook: exec" introduced a small regression in
2.6.27 regarding binfmt_misc exec event reporting.  Since the reporting
is now done in the common search_binary_handler() function, an exec
of a misc binary will result in two (or possibly multiple) exec events
being reported, instead of just a single one, because the misc handler
contains a recursive call to search_binary_handler.

To add to the confusion, if PTRACE_O_TRACEEXEC is not active, the multiple
SIGTRAP signals will in fact cause only a single ptrace intercept, as the
signals are not queued.  However, if PTRACE_O_TRACEEXEC is on, the debugger
will actually see multiple ptrace intercepts (PTRACE_EVENT_EXEC).

The test program included below demonstrates the problem.

This change fixes the bug by calling tracehook_report_exec() only in the
outermost search_binary_handler() call (bprm->recursion_depth == 0).

The additional change to restore bprm->recursion_depth after each binfmt
load_binary call is actually superfluous for this bug, since we test the
value saved on entry to search_binary_handler().  But it keeps the use of
of the depth count to its most obvious expected meaning.  Depending on what
binfmt handlers do in certain cases, there could have been false-positive
tests for recursion limits before this change.

    /* Test program using PTRACE_O_TRACEEXEC.
       This forks and exec's the first argument with the rest of the arguments,
       while ptrace'ing.  It expects to see one PTRACE_EVENT_EXEC stop and
       then a successful exit, with no other signals or events in between.

       Test for kernel doing two PTRACE_EVENT_EXEC stops for a binfmt_misc exec:

       $ gcc -g traceexec.c -o traceexec
       $ sudo sh -c 'echo :test:M::foobar::/bin/cat: > /proc/sys/fs/binfmt_misc/register'
       $ echo 'foobar test' > ./foobar
       $ chmod +x ./foobar
       $ ./traceexec ./foobar; echo $?
       ==> good <==
       foobar test
       0
       $
       ==> bad <==
       foobar test
       unexpected status 0x4057f != 0
       3
       $

    */

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <sys/ptrace.h>
    #include <unistd.h>
    #include <signal.h>
    #include <stdlib.h>

    static void
    wait_for (pid_t child, int expect)
    {
      int status;
      pid_t p = wait (&status);
      if (p != child)
{
  perror ("wait");
  exit (2);
}
      if (status != expect)
{
  fprintf (stderr, "unexpected status %#x != %#x\n", status, expect);
  exit (3);
}
    }

    int
    main (int argc, char **argv)
    {
      pid_t child = fork ();

      if (child < 0)
{
  perror ("fork");
  return 127;
}
      else if (child == 0)
{
  ptrace (PTRACE_TRACEME);
  raise (SIGUSR1);
  execv (argv[1], &argv[1]);
  perror ("execve");
  _exit (127);
}

      wait_for (child, W_STOPCODE (SIGUSR1));

      if (ptrace (PTRACE_SETOPTIONS, child,
  0L, (void *) (long) PTRACE_O_TRACEEXEC) != 0)
{
  perror ("PTRACE_SETOPTIONS");
  return 4;
}

      if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
  perror ("PTRACE_CONT");
  return 5;
}

      wait_for (child, W_STOPCODE (SIGTRAP | (PTRACE_EVENT_EXEC << 8)));

      if (ptrace (PTRACE_CONT, child, 0L, 0L) != 0)
{
  perror ("PTRACE_CONT");
  return 6;
}

      wait_for (child, W_EXITCODE (0, 0));

      return 0;
    }

Reported-by: Arnd Bergmann <arnd@arndb.de>
CC: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Roland McGrath <roland@redhat.com>

ipv6: silence log messages for locally generated multicast

This patch fixes minor annoyance during transmission of unsolicited
neighbor advertisements from userspace to multicast addresses (as
far as I can see in RFC, this is allowed and the similar functionality
for IPv4 has been in arping for a long time).

Outgoing multicast packets get reinserted into local processing as if they
are received from the network. The machine thus sees its own NA and fills
the logs with error messages. This patch removes the message if NA has been
generated locally.

Signed-off-by: Jan Sembera <jsembera@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>

sungem: improve ethtool output with internal pcs and serdes

From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>

Attached is a patch which improves the output of ethtool (see below)
to some sensefull values with a sungem fibre card which uses the
sungem interal pcs connected to a serdes chip. The seriallink case in
the driver is untouched.

Most values are hardcoded, because gigabit fibre autoneg is anyways
limited and the driver don't really support much at the moment with
that hardware.

Signed-off-by: David S. Miller <davem@davemloft.net>

PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch

Makes a Compaq 6735s boot reliably again. It used to hang in the loop
on some boots. Give the link one second to train, otherwise break out
of the loop and reset the previously set clock bits.

Cc: stable@vger.kernel.org
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>