Rusty's patch to change our sysfs access to various registers
to use smp_call_function_single() introduced a whole bunch of
warnings. This fixes them. This version also fixes an actual
bug in here where it did mtspr instead of mfspr when reading
the files
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Josh Boyer [Wed, 25 Mar 2009 06:23:59 +0000 (06:23 +0000)]
powerpc: Sanitize stack pointer in signal handling code
On powerpc64 machines running 32-bit userspace, we can get garbage bits in the
stack pointer passed into the kernel. Most places handle this correctly, but
the signal handling code uses the passed value directly for allocating signal
stack frames.
This fixes the issue by introducing a get_clean_sp function that returns a
sanitized stack pointer. For 32-bit tasks on a 64-bit kernel, the stack
pointer is masked correctly. In all other cases, the stack pointer is simply
returned.
Additionally, we pass an 'is_32' parameter to get_sigframe now in order to
get the properly sanitized stack. The callers are know to be 32 or 64-bit
statically.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jeremy Kerr [Mon, 23 Mar 2009 16:55:08 +0000 (16:55 +0000)]
powerpc: Add write barrier before enabling DTL flags
Currently, we don't enforce any ordering for updates to the lppaca
when enabling dtl logging, so we may end up enabling logging before the
index fields have been established.
This change adds a smp_wmb() before doing the actual enable.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Tue, 24 Mar 2009 13:23:13 +0000 (08:23 -0500)]
powerpc/83xx: Update ranges in gianfar node to match other dts
The gianfar@25000 node was missing its ranges prop for the mdio bus
and provided an explicit ranges property on gianfar@24000 to match
change from commit:
Anton Vorontsov [Thu, 19 Mar 2009 18:01:51 +0000 (21:01 +0300)]
powerpc/86xx: Move gianfar mdio nodes under the ethernet nodes
Currently it doesn't matter where the mdio nodes are placed, but with
power management support (i.e. when sleep = <> properties will take
effect), mdio nodes placement will become important: mdio controller
is a part of the ethernet block, so the mdio nodes should be placed
correctly. Otherwise we may wrongly assume that MDIO controllers are
available during sleep.
Suggested-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anton Vorontsov [Thu, 19 Mar 2009 18:01:48 +0000 (21:01 +0300)]
powerpc/85xx: Move gianfar mdio nodes under the ethernet nodes
Currently it doesn't matter where the mdio nodes are placed, but with
power management support (i.e. when sleep = <> properties will take
effect), mdio nodes placement will become important: mdio controller
is a part of the ethernet block, so the mdio nodes should be placed
correctly. Otherwise we may wrongly assume that MDIO controllers are
available during sleep.
Suggested-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anton Vorontsov [Thu, 19 Mar 2009 18:01:45 +0000 (21:01 +0300)]
powerpc/83xx: Move gianfar mdio nodes under the ethernet nodes
Currently it doesn't matter where the mdio nodes are placed, but with
power management support (i.e. when sleep = <> properties will take
effect), mdio nodes placement will become important: mdio controller
is a part of the ethernet block, so the mdio nodes should be placed
correctly. Otherwise we may wrongly assume that MDIO controllers are
available during sleep.
Suggested-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Anton Vorontsov [Thu, 19 Mar 2009 18:01:42 +0000 (21:01 +0300)]
powerpc/83xx: Add power management support for MPC837x boards
This patch adds pmc nodes to the device tree files so that the boards
will able to use standby capability of MPC837x processors. The MPC837x
PMC controllers are compatible with MPC8349 ones (i.e. no deep sleep).
sleep = <> properties are used to specify SCCR masks as described
in "Specifying Device Power Management Information (sleep property)"
chapter in Documentation/powerpc/booting-without-of.txt.
Since I2C1 and eSDHC controllers share the same clock source, they
are now placed under sleep-nexus nodes.
A processor is able to wakeup the boards on LAN events (Wake-On-Lan),
console events (with no_console_suspend kernel command line), GPIO
events and external IRQs (IRQ1 and IRQ2).
The processor can also wakeup the boards by the fourth general purpose
timer in GTM1 block, but the GTM wakeup support isn't yet implemented
(it's tested to work, but it's unclear how can we use the quite short
GTM timers, and how do we want to expose the GTM to userspace).
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
This moves some MMU related init code out of setup_64.c into hash_utils_64.c
and calls it early_init_mmu() and early_init_mmu_secondary(). This will
make it easier to plug in a new MMU type.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/mm: Merge various PTE bits and accessors definitions
Now that they are almost identical, we can merge some of the definitions
related to the PTE format into common files.
This creates a new pte-common.h which is included by both 32 and 64-bit
right after the CPU specific pte-*.h file, and which defines some
bits to "default" values if they haven't been defined already, and
then provides a generic definition of most of the bit combinations
based on these and exposed to the rest of the kernel.
I also moved to the common pgtable.h most of the "small" accessors to the
PTE bits and modification helpers (pte_mk*). The actual accessors remain
in their separate files.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch tweaks the way some PTE bit combinations are defined, in such a
way that the 32 and 64-bit variant become almost identical and that will
make it easier to bring in a new common pte-* file for the new variant
of the Book3-E support.
The combination of bits defining access to kernel pages are now clearly
separated from the combination used by userspace and the core VM. The
resulting generated code should remain identical unless I made a mistake.
Note: While at it, I removed a non-sensical statement related to CONFIG_KGDB
in ppc_mmu_32.c which could cause kernel mappings to be user accessible when
that option is enabled. Probably something that bitrot.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:55:41 +0000 (03:55 +0000)]
powerpc/mm: e300c2/c3/c4 TLB errata workaround
Complete workaround for DTLB errata in e300c2/c3/c4 processors.
Due to the bug, the hardware-implemented LRU algorythm always goes to way
1 of the TLB. This fix implements the proposed software workaround in
form of a LRW table for chosing the TLB-way.
Based on patch from David Jander <david@protonic.nl>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:55:40 +0000 (03:55 +0000)]
powerpc/mm: Used free register to save a few cycles in SW TLB miss handling
Now that r0 is free we can keep the value of I/DMISS in r3 and not reload
it before doing the tlbli/d. This saves us a few cycles in the fast path
case.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:55:39 +0000 (03:55 +0000)]
powerpc/mm: Remove unused register usage in SW TLB miss handling
Long ago we had some code that actually used the CTR in the SW TLB
miss handlers (603/e300). Since we don't use it no reason to waste
cycles saving it off and restoring it (we actually didn't restore it
in the fast path case).
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:40:52 +0000 (03:40 +0000)]
powerpc: expect all devices calling dma ops to have archdata set
Now that we set archdata for of_platform and platform devices via
platform_notify() we no longer need to special case having a NULL device
pointer or NULL archdata. It should be a driver error if this condition
shows up and the driver should be fixed.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:40:51 +0000 (03:40 +0000)]
powerpc: setup default archdata for {of_}platform via bus_register_notifier
Since a number of powerpc chips are SoCs we end up having dma-able
devices that are registered as platform or of_platform devices. We need
to hook the archdata to setup proper dma_ops for these devices.
Rather than having to add a bus_notify to each platform we add a default
one at the highest priority (called first) to set the default dma_ops for
of_platform and platform devices to dma_direct_ops. This allows platform
code to override the ops by providing their own notifier call back.
In the future to enable >4G DMA support on ppc32 we can hook swiotlb ops.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kumar Gala [Thu, 19 Mar 2009 03:40:50 +0000 (03:40 +0000)]
powerpc/pci: Default to dma_direct_ops for pci dma_ops
This will allow us to remove the ppc32 specific checks in get_dma_ops()
that defaults to dma_direct_ops if the archdata is NULL. We really
should always have archdata set to something going forward.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch/powerpc/sysdev/pmi.c: In function 'pmi_of_probe':
arch/powerpc/sysdev/pmi.c:166: warning: passing argument 2 of 'request_irq' from incompatible pointer type
Change the return type of the handler from "int" to "irqreturn_t".
Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jeremy Kerr [Wed, 11 Mar 2009 17:55:52 +0000 (17:55 +0000)]
powerpc: Add virtual processor dispatch trace log
pseries SPLPAR machines are able to retrieve a log of dispatch and
preempt events from the hypervisor. With this information, we can
see when and why each dispatch & preempt is occuring.
This change adds a set of debugfs files allowing userspace to read this
dispatch log.
Based on initial patches from Nishanth Aravamudan <nacc@us.ibm.com>.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jeremy Kerr [Wed, 11 Mar 2009 17:55:52 +0000 (17:55 +0000)]
powerpc: Add dispatch trace log fields to lppaca
PAPR v2.3 defines fields in the virtual processor area for a dispatch
trace log (DLT). Since we'd like to use the DLT, add the necessary
fields to struct lppaca.
Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Rusty Russell [Wed, 11 Mar 2009 12:20:05 +0000 (12:20 +0000)]
powerpc: Make sysfs code use smp_call_function_single
Impact: performance improvement
This fixes 'powerpc: avoid cpumask games in arch/powerpc/kernel/sysfs.c'
which talked about using smp_call_function_single, but actually used
work_on_cpu (an older version of the patch).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The return code from invoking the notifier chain when updating the
ibm,dynamic-memory property is not handled properly. In failure
cases (rc == NOTIFY_BAD) we should be restoring the original value
of the property. In success (rc == NOTIFY_OK) we should be returning
zero from the calling routine.
Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit e7943fbbfdb6eef03c003b374de1f802cc14f02a broke ppc32 using
Open Firmware client interface due to using the wrong relocation
macro when accessing the variable "linux_banner".
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Kyle McMartin [Mon, 23 Mar 2009 19:25:49 +0000 (15:25 -0400)]
Build with -fno-dwarf2-cfi-asm
With a sufficiently new compiler and binutils, code which wasn't
previously generating .eh_frame sections has begun to. Certain
architectures (powerpc, in this case) may generate unexpected relocation
formats in response to this, preventing modules from loading.
While the new relocation types should probably be handled, revert to the
previous behaviour with regards to generation of .eh_frame sections.
(This was reported against Fedora, which appears to be the only distro
doing any building against gcc-4.4 at present: RH bz#486545.)
Signed-off-by: Kyle McMartin <kyle@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: Alexandre Oliva <aoliva@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits)
ucc_geth: Fix oops when using fixed-link support
dm9000: locking bugfix
net: update dnet.c for bus_id removal
dnet: DNET should depend on HAS_IOMEM
dca: add missing copyright/license headers
nl80211: Check that function pointer != NULL before using it
sungem: missing net_device_ops
be2net: fix to restore vlan ids into BE2 during a IF DOWN->UP cycle
be2net: replenish when posting to rx-queue is starved in out of mem conditions
bas_gigaset: correctly allocate USB interrupt transfer buffer
smsc911x: reset last known duplex and carrier on open
sh_eth: Fix mistake of the address of SH7763
sh_eth: Change handling of IRQ
netns: oops in ip[6]_frag_reasm incrementing stats
net: kfree(napi->skb) => kfree_skb
net: fix sctp breakage
ipv6: fix display of local and remote sit endpoints
net: Document /proc/sys/net/core/netdev_budget
tulip: fix crash on iface up with shirq debug
virtio_net: Make virtio_net support carrier detection
...
Miklos Szeredi [Mon, 23 Mar 2009 15:07:24 +0000 (16:07 +0100)]
fix ptrace slowness
This patch fixes bug #12208:
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12208
Subject : uml is very slow on 2.6.28 host
This turned out to be not a scheduler regression, but an already
existing problem in ptrace being triggered by subtle scheduler
changes.
The problem is this:
- task A is ptracing task B
- task B stops on a trace event
- task A is woken up and preempts task B
- task A calls ptrace on task B, which does ptrace_check_attach()
- this calls wait_task_inactive(), which sees that task B is still on the runq
- task A goes to sleep for a jiffy
- ...
Since UML does lots of the above sequences, those jiffies quickly add
up to make it slow as hell.
This patch solves this by not rescheduling in read_unlock() after
ptrace_stop() has woken up the tracer.
Thanks to Oleg Nesterov and Ingo Molnar for the feedback.
Anton Vorontsov [Mon, 23 Mar 2009 04:30:52 +0000 (21:30 -0700)]
ucc_geth: Fix oops when using fixed-link support
commit b1c4a9dddf09fe99b8f88252718ac5b357363dc4 ("ucc_geth: Change
uec phy id to the same format as gianfar's") introduced a regression
in the ucc_geth driver that causes this oops when fixed-link is used:
This patch fixes the issue by removing offending (and somewhat
duplicate) code from init_phy() routine, and changes _probe()
function to use uec_mdio_bus_name().
Also, since we fully construct phy_bus_id in the _probe() routine,
we no longer need ->phy_address and ->mdio_bus fields in
ucc_geth_info structure.
I wish the patch would be a bit shorter, but it seems like the only
way to fix the issue in a sane way. Luckily, the patch has been
tested with real PHYs and fixed-link, so no further regressions
expected.
Reported-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Tested-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se> Signed-off-by: David S. Miller <davem@davemloft.net>
David Brownell [Mon, 23 Mar 2009 04:28:39 +0000 (21:28 -0700)]
dm9000: locking bugfix
This fixes a locking bug in the dm9000 driver. It calls
request_irq() without setting IRQF_DISABLED ... which is
correct for handlers that support IRQ sharing, since that
behavior is not guaranteed for shared IRQs. However, its
IRQ handler then wrongly assumes that IRQs are blocked.
So the fix just uses the right spinlock primitives in the
IRQ handler.
NOTE: this is a classic example of the type of bug which
lockdep currently masks by forcibly setting IRQF_DISABLED
on IRQ handlers that did not request that flag.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 22 Mar 2009 18:38:57 +0000 (11:38 -0700)]
Merge branch 'fix-includes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
* 'fix-includes' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: merge the non-MMU and MMU versions of siginfo.h
m68k: use the MMU version of unistd.h for all m68k platforms
m68k: merge the non-MMU and MMU versions of signal.h
m68k: merge the non-MMU and MMU versions of ptrace.h
m68k: use MMU version of setup.h for both MMU and non-MMU
m68k: merge the non-MMU and MMU versions of sigcontext.h
m68k: merge the non-MMU and MMU versions of swab.h
m68k: merge the non-MMU and MMU versions of param.h
Tyler Hicks [Fri, 20 Mar 2009 07:23:57 +0000 (02:23 -0500)]
eCryptfs: NULL crypt_stat dereference during lookup
If ecryptfs_encrypted_view or ecryptfs_xattr_metadata were being
specified as mount options, a NULL pointer dereference of crypt_stat
was possible during lookup.
This patch moves the crypt_stat assignment into
ecryptfs_lookup_and_interpose_lower(), ensuring that crypt_stat
will not be NULL before we attempt to dereference it.
Thanks to Dan Carpenter and his static analysis tool, smatch, for
finding this bug.
Tyler Hicks [Fri, 20 Mar 2009 06:25:09 +0000 (01:25 -0500)]
eCryptfs: Allocate a variable number of pages for file headers
When allocating the memory used to store the eCryptfs header contents, a
single, zeroed page was being allocated with get_zeroed_page().
However, the size of an eCryptfs header is either PAGE_CACHE_SIZE or
ECRYPTFS_MINIMUM_HEADER_EXTENT_SIZE (8192), whichever is larger, and is
stored in the file's private_data->crypt_stat->num_header_bytes_at_front
field.
ecryptfs_write_metadata_to_contents() was using
num_header_bytes_at_front to decide how many bytes should be written to
the lower filesystem for the file header. Unfortunately, at least 8K
was being written from the page, despite the chance of the single,
zeroed page being smaller than 8K. This resulted in random areas of
kernel memory being written between the 0x1000 and 0x1FFF bytes offsets
in the eCryptfs file headers if PAGE_SIZE was 4K.
This patch allocates a variable number of pages, calculated with
num_header_bytes_at_front, and passes the number of allocated pages
along to ecryptfs_write_metadata_to_contents().
Thanks to Florian Streibelt for reporting the data leak and working with
me to find the problem. 2.6.28 is the only kernel release with this
vulnerability. Corresponds to CVE-2009-0787
radeonfb: Whack the PCI PM register until it sticks
This fixes a regression introduced when we switched to using the core
pci_set_power_state(). The chip seems to need the state to be written
over and over again until it sticks, so we do that.
Note that the code is a bit blunt, without timeout, etc... but that's
pretty much because I put back in there the code exactly as it used to
be before the regression. I still add a call to pci_set_power_state()
at the end so that ACPI gets called appropriately on x86.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Tested-by: Raymond Wooninck <tittiatcoke@gmail.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jouni Malinen [Fri, 20 Mar 2009 15:57:36 +0000 (17:57 +0200)]
nl80211: Check that function pointer != NULL before using it
NL80211_CMD_GET_MESH_PARAMS and NL80211_CMD_SET_MESH_PARAMS handlers
did not verify whether a function pointer is NULL (not supported by
the driver) before trying to call the function. The former nl80211
command is available for unprivileged users, too, so this can
potentially allow normal users to kill networking (or worse..) if
mac80211 is built without CONFIG_MAC80211_MESH=y.
Signed-off-by: Jouni Malinen <jouni.malinen@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
powerpc/mm: Unify PTE_RPN_SHIFT and _PAGE_CHG_MASK definitions
This updates the 32-bit headers to use the same definitions for the RPN
shift inside the PTE as 64-bit, and thus updates _PAGE_CHG_MASK to
become identical.
This does introduce a runtime visible difference, which is that now,
_PAGE_HASHPTE will be part of _PAGE_CHG_MASK and thus preserved. However
this should have no practical effect as it should have been preserved in
the first place and we got away with not having it there due to our
PTE access functions preserving it anyway.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
powerpc/mm: Split the various pgtable-* headers based on MMU type
This patch moves the definition of the PTE format for each MMU type
to separate files instead of all in one file. This improves overall
maintainability and will make it easier to add new types.
On 64-bit, additionally, I've separated the headers relative to the
format of the page table tree (3 vs. 4 levels for 64K vs 4K pages)
from the headers specific to the PTE format for hash based processors,
this will make it easier to add support for Book3 "E" 64-bit
implementations.
There are still some type-related ifdef's in the generic headers,
we might remove them in the long run, but this patch shouldn't result
in any code change, -hopefully- just definitions being moved around.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Jeff Moyer [Thu, 19 Mar 2009 00:04:21 +0000 (17:04 -0700)]
aio: lookup_ioctx can return the wrong value when looking up a bogus context
The libaio test harness turned up a problem whereby lookup_ioctx on a
bogus io context was returning the 1 valid io context from the list
(harness/cases/3.p).
Because of that, an extra put_iocontext was done, and when the process
exited, it hit a BUG_ON in the put_iocontext macro called from exit_aio
(since we expect a users count of 1 and instead get 0).
Thanks to Zach for pointing out that hlist_for_each_entry_rcu will not
return with a NULL tpos at the end of the loop, even if the entry was
not found.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Acked-by: Zach Brown <zach.brown@oracle.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Davide Libenzi [Thu, 19 Mar 2009 00:04:19 +0000 (17:04 -0700)]
eventfd: remove fput() call from possible IRQ context
Remove a source of fput() call from inside IRQ context. Myself, like Eric,
wasn't able to reproduce an fput() call from IRQ context, but Jeff said he was
able to, with the attached test program. Independently from this, the bug is
conceptually there, so we might be better off fixing it. This patch adds an
optimization similar to the one we already do on ->ki_filp, on ->ki_eventfd.
Playing with ->f_count directly is not pretty in general, but the alternative
here would be to add a brand new delayed fput() infrastructure, that I'm not
sure is worth it.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Thu, 19 Mar 2009 21:56:35 +0000 (14:56 -0700)]
Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
[S390] make page table upgrade work again
[S390] make page table walking more robust
[S390] Dont check for pfn_valid() in uaccess_pt.c
[S390] ftrace/mcount: fix kernel stack backchain
[S390] topology: define SD_MC_INIT to fix performance regression
[S390] __div64_31 broken for CONFIG_MARCH_G5
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: Clear space_info full when adding new devices
Btrfs: Fix locking around adding new space_info
Linus Torvalds [Thu, 19 Mar 2009 18:32:05 +0000 (11:32 -0700)]
Fix race in create_empty_buffers() vs __set_page_dirty_buffers()
Nick Piggin noticed this (very unlikely) race between setting a page
dirty and creating the buffers for it - we need to hold the mapping
private_lock until we've set the page dirty bit in order to make sure
that create_empty_buffers() might not build up a set of buffers without
the dirty bits set when the page is dirty.
I doubt anybody has ever hit this race (and it didn't solve the issue
Nick was looking at), but as Nick says: "Still, it does appear to solve
a real race, which we should close."
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Thu, 19 Mar 2009 06:53:16 +0000 (23:53 -0700)]
sparc64: Fix crash with /proc/iomem
When you compile kernel on Sparc64 with heap memory checking and type
"cat /proc/iomem", you get a crash, because pointers in struct
resource are uninitialized.
Most code fills struct resource with zeros, so I assume that it is
responsibility of the caller of request_resource to initialized it,
not the responsibility of request_resource functuion.
After 2.6.29 is out, there could be a check for uninitialized fields
added to request_resource to avoid crashes like this.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt [Thu, 19 Mar 2009 06:44:23 +0000 (23:44 -0700)]
bas_gigaset: correctly allocate USB interrupt transfer buffer
Every USB transfer buffer has to be allocated individually by kmalloc.
Impact: bugfix, no functional change
Signed-off-by: Tilman Schmidt <tilman@imap.cc> Tested-by: Kolja Waschk <kawk@users.sourceforge.net> Signed-off-by: David S. Miller <davem@davemloft.net>
smsc911x: reset last known duplex and carrier on open
smsc911x_phy_adjust_link is called periodically by the phy layer (as
it's run in polling mode), and it only updates the hardware when it sees
a change in duplex or carrier. This patch clears the last known values
every time the interface is brought up, instead of only when the module
is loaded.
Without this patch the adjust_link function never updates the hardware
after an ifconfig down; ifconfig up. On a full duplex link this causes
the tx error counter to increment, even though packets are correctly
transmitted, as the default MAC_CR register setting is for half duplex.
The tx errors are "no carrier" errors, which should be ignored in
full-duplex mode. When MAC_CR is set to "full duplex" mode they are
correctly ignored by the hardware.
Note that even with this patch the tx error counter can increment if
packets are transmitted between "ifconfig up" and the first phy poll
interval. An improved solution would use the phy interrupt with phylib,
but I haven't managed to make this work 100% robustly yet.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Aced-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The NAPI poll parameter netdev_budget is not documented in
kernel-docs. Since it may have a substantial effect on at least some
network loads, it should be.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kyle McMartin [Thu, 19 Mar 2009 01:49:01 +0000 (18:49 -0700)]
tulip: fix crash on iface up with shirq debug
Tulip is currently doing request_irq before it has done its
initialization. This is usually not a problem because it hasn't
enable interrupts yet, but with DEBUG_SHIRQ on, we call the irq handler
when registering the interrupt as a sanity check.
This can result in a NULL ptr dereference, so call tulip_init_ring
before request_irq, and add a free_ring function to do the freeing
now shared with tulip_close.
Tested with a shell loop running ifup, ifdown in a loop a few hundred
times with DEBUG_SHIRQ on.
Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
virtio_net: Make virtio_net support carrier detection
Impact: Make NetworkManager work with virtio_net
For now the semantics are simple: There is always carrier.
This allows a seamless experience with e.g., qemu/kvm
where NetworkManager just configures and sets up
everything automagically.
If/when a generally agreed-upon way to control
carrier on/off in the emulator/hypervisor level
emerges, it will be trivial to extend the driver
to support that too, but for now even this 2-liner
makes user experience that much better.
Signed-off-by: Pantelis Koukousoulas <pktoss@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
The un-refactored code checked the link speed and duplex of
every slave on every pass; the refactored code did not do so.
The 802.3ad and balance-alb/tlb modes utilize the speed and
duplex information, and require it to be kept up to date. This patch
adds a notifier check to perform the appropriate updating when the slave
device speed changes.
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Thu, 19 Mar 2009 01:11:51 +0000 (18:11 -0700)]
bnx2: Fix problem of using wrong IRQ handler.
The MSI-X handler was chosen before the call to pci_enable_msix().
If MSI-X was not available, the wrong MSI-X handler would be used in
INTA mode. This would cause a screaming interrupt problem because
INTA would not be cleared by the MSI-X handler.
Fixed by assigning MSI-X handler after pci_enable_msix() returns
successfully. Also update version to 1.9.3.
Thomas Chenault <thomas_chenault@dell.com> helped us find this problem.
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 18 Mar 2009 14:39:11 +0000 (07:39 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: Fix vunmap and free order in snd_free_sgbuf_pages()
ALSA: mixart, fix lock imbalance
ALSA: pcm_oss, fix locking typo
ALSA: oss-mixer - Fixes recording gain control
ALSA: hda - Workaround for buggy DMA position on ATI controllers
ALSA: hda - Fix DMA mask for ATI controllers
ALSA: opl3sa2 - Fix NULL dereference when suspending snd_opl3sa2
After TASK_SIZE now gives the current size of the address space the
upgrade of a 64 bit process from 3 to 4 levels of page table needs
to use the arch_mmap_check hook to catch large mmap lengths. The
get_unmapped_area* functions need to check for -ENOMEM from the
arch_get_unmapped_area*, upgrade the page table and retry.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>