Neil Brown [Thu, 8 May 2008 03:03:09 +0000 (13:03 +1000)]
nfsd: fix spurious EACCESS in reconnect_path()
Thanks to Frank Van Maarseveen for the original problem report: "A
privileged process on an NFS client which drops privileges after using
them to change the current working directory, will experience incorrect
EACCES after an NFS server reboot. This problem can also occur after
memory pressure on the server, particularly when the client side is
quiet for some time."
This occurs because the filehandle points to a directory whose parents
are no longer in the dentry cache, and we're attempting to reconnect the
directory to its parents without adequate permissions to perform lookups
in the parent directories.
We can therefore fix the problem by acquiring the necessary capabilities
before attempting the reconnection. We do this only in the
no_subtree_check case, since the documented behavior of the
subtree_check export option requires the server to check that the user
has lookup permissions on all parents.
The subtree_check case still has a problem, since reconnect_path()
unnecessarily requires both read and lookup permissions on all parent
directories. However, a fix in that case would be more delicate, and
use of subtree_check is already discouraged for other reasons.
Signed-off-by: Neil Brown <neilb@suse.de> Cc: Frank van Maarseveen <frankvm@frankvm.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Kevin Coffman [Wed, 30 Apr 2008 16:46:08 +0000 (12:46 -0400)]
gss_krb5: Use random value to initialize confounder
Initialize the value used for the confounder to a random value
rather than starting from zero.
Allow for confounders of length 8 or 16 (which will be needed for AES).
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Kevin Coffman [Wed, 30 Apr 2008 16:45:53 +0000 (12:45 -0400)]
gss_krb5: create a define for token header size and clean up ptr location
cleanup:
Document token header size with a #define instead of open-coding it.
Don't needlessly increment "ptr" past the beginning of the header
which makes the values passed to functions more understandable and
eliminates the need for extra "krb5_hdr" pointer.
Clean up some intersecting white-space issues flagged by checkpatch.pl.
This leaves the checksum length hard-coded at 8 for DES. A later patch
cleans that up.
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Miklos Szeredi [Mon, 16 Jun 2008 11:20:29 +0000 (13:20 +0200)]
nfsd: rename MAY_ flags
Rename nfsd_permission() specific MAY_* flags to NFSD_MAY_* to make it
clear, that these are not used outside nfsd, and to avoid name and
number space conflicts with the VFS.
[comment from hch: rename MAY_READ, MAY_WRITE and MAY_EXEC as well]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
NeilBrown [Thu, 19 Jun 2008 00:11:09 +0000 (10:11 +1000)]
knfsd: nfsd: Handle ERESTARTSYS from syscalls.
OCFS2 can return -ERESTARTSYS from write requests (and possibly
elsewhere) if there is a signal pending.
If nfsd is shutdown (by sending a signal to each thread) while there
is still an IO load from the client, each thread could handle one last
request with a signal pending. This can result in -ERESTARTSYS
which is not understood by nfserrno() and so is reflected back to
the client as nfserr_io aka -EIO. This is wrong.
Instead, interpret ERESTARTSYS to mean "try again later" by returning
nfserr_jukebox. The client will resend and - if the server is
restarted - the write will (hopefully) be successful and everyone will
be happy.
The symptom that I narrowed down to this was:
copy a large file via NFS to an OCFS2 filesystem, and restart
the nfs server during the copy.
The 'cp' might get an -EIO, and the file will be corrupted -
presumably holes in the middle where writes appeared to fail.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Wed, 11 Jun 2008 14:03:12 +0000 (10:03 -0400)]
lockd: close potential race with rapid lockd_up/lockd_down cycle
If lockd_down is called very rapidly after lockd_up returns, then
there is a slim chance that lockd() will never be called. kthread()
will return before calling the function, so we'll end up never
actually calling the cleanup functions for the thread.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Tue, 10 Jun 2008 12:40:39 +0000 (08:40 -0400)]
sunrpc: remove sv_kill_signal field from svc_serv struct
Since we no longer make any distinction between shutdown signals with
nfsd, then it becomes easier to just standardize on a particular signal
to use to bring it down (SIGINT, in this case).
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Tue, 10 Jun 2008 12:40:38 +0000 (08:40 -0400)]
knfsd: convert knfsd to kthread API
This patch is rather large, but I couldn't figure out a way to break it
up that would remain bisectable. It does several things:
- change svc_thread_fn typedef to better match what kthread_create expects
- change svc_pool_map_set_cpumask to be more kthread friendly. Make it
take a task arg and and get rid of the "oldmask"
- have svc_set_num_threads call kthread_create directly
- eliminate __svc_create_thread
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Tue, 10 Jun 2008 12:40:37 +0000 (08:40 -0400)]
knfsd: remove special handling for SIGHUP
The special handling for SIGHUP in knfsd is a holdover from much
earlier versions of Linux where reloading the export table was
more expensive. That facility is not really needed anymore and
to my knowledge, is seldom-used.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Tue, 10 Jun 2008 12:40:36 +0000 (08:40 -0400)]
knfsd: clean up nfsd filesystem interfaces
Several of the nfsd filesystem interfaces allow changes to parameters
that don't have any effect on a running nfsd service. They are only ever
checked when nfsd is started. This patch fixes it so that changes to
those procfiles return -EBUSY if nfsd is already running to make it
clear that changes on the fly don't work.
The patch should also close some relatively harmless races between
changing the info in those interfaces and starting nfsd, since these
variables are being moved under the protection of the nfsd_mutex.
Finally, the nfsv4recoverydir file always returns -EINVAL if read. This
patch fixes it to return the recoverydir path as expected.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Neil Brown [Tue, 10 Jun 2008 12:40:35 +0000 (08:40 -0400)]
knfsd: Replace lock_kernel with a mutex for nfsd thread startup/shutdown locking.
This removes the BKL from the RPC service creation codepath. The BKL
really isn't adequate for this job since some of this info needs
protection across sleeps.
Also, add some comments to try and clarify how the locking should work
and to make it clear that the BKL isn't necessary as long as there is
adequate locking between tasks when touching the svc_serv fields.
Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Benny Halevy [Thu, 29 May 2008 09:56:08 +0000 (12:56 +0300)]
nfsd: make nfs4xdr WRITEMEM safe against zero count
WRITEMEM zeroes the last word in the destination buffer
for padding purposes, but this must not be done if
no bytes are to be copied, as it would result
in zeroing of the word right before the array.
The current implementation works since it's always called
with non zero nbytes or it follows an encoding of the
string (or opaque) length which, if equal to zero,
can be overwritten with zero.
Nevertheless, it seems safer to check for this case.
Signed-off-by: Benny Halevy <bhalevy@panasas.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Linus Torvalds [Sun, 18 May 2008 20:56:54 +0000 (13:56 -0700)]
Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6
* 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
i2c/max6875: Really prevent 24RF08 corruption
i2c-amd756: Fix functionality flags
i2c: Kill the old driver matching scheme
i2c: Convert remaining new-style drivers to use module aliasing
i2c: Switch pasemi to the new device/driver matching scheme
i2c: Clean up Blackfin BF527 I2C device declarations
i2c-nforce2: Disable the second SMBus channel on the DFI Lanparty NF4 Expert
i2c: New co-maintainer
Add multi_defconfig, to build a kernel for all supported m68k platforms,
excluding Sun 3 (Sun 3 kernels are incompatible with all other m68k platforms)
The *_ISA type defines are quite generic and cause namespace conflicts
(e.g. with `AMIGAHW_DECLARE(GG2_ISA)' in <asm/amigahw.h>) for some kernel
configurations. Use ISA_TYPE_* to avoid such conflicts.
Use `__builtin_trap()' instead of `asm volatile("illegal")' in the m68k BUG()
macros (as suggested by Andrew Pinski), to kill warnings in code that assumes
BUG() does not return.
m68k: FB_HP300 depends on DIO and doesnt need FB_CFB_FILLRECT
Correct FB_HP300 dependencies:
- FB_HP300 doesn't depend only on HP300, but also on DIO (which depends on
HP300)
- FB_HP300 does not need FB_CFB_FILLRECT
Jean Delvare [Sun, 18 May 2008 18:49:41 +0000 (20:49 +0200)]
i2c/max6875: Really prevent 24RF08 corruption
i2c-core takes care of the possible corruption of 24RF08 chips for
quite some times, so device devices no longer need to do it. And they
really should not, as applying the prevention twice voids it.
I thought that I had fixed all drivers long ago but apparently I had
missed that one.
Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Ben Gardner <bgardner@wabtec.com>
Jean Delvare [Sun, 18 May 2008 18:49:40 +0000 (20:49 +0200)]
i2c: Convert remaining new-style drivers to use module aliasing
Update all the remaining new-style i2c drivers to use standard module
aliasing instead of the old driver_name/type driver matching scheme.
Note that the tuner driver is a bit quirky at the moment, as it
overwrites i2c_client.name with arbitrary strings. We write "tuner"
back on remove, to make sure that driver cycling will work properly,
but there may still be troublesome corner cases.
Jean Delvare [Sun, 18 May 2008 18:49:40 +0000 (20:49 +0200)]
i2c-nforce2: Disable the second SMBus channel on the DFI Lanparty NF4 Expert
There is a strange chip at 0x2e on the second SMBus channel of the
DFI Lanparty NF4 Expert motherboard. Accessing the chip reboots the
system. As there's nothing interesting on this SMBus channel, the
easiest and safest thing to do is to disable it on that board.
This is a better fix to bug #5889 than the it87 driver update that was
done originally:
http://bugzilla.kernel.org/show_bug.cgi?id=5889
Jean Delvare [Sun, 18 May 2008 18:49:40 +0000 (20:49 +0200)]
i2c: New co-maintainer
Ben Dooks agreed to become my co-maintainer for the i2c subsystem. In
particular, Ben will help with drivers for embedded systems, of which
my experience is inexistent. Thanks Ben and welcome on board!
Signed-off-by: Jean Delvare <khali@linux-fr.org> Acked-by: Ben Dooks <ben-linux@fluff.org>
[ARM] fix parenthesis in include/asm-arm/arch-omap/control.h
Parenthesis fix in include/asm-arm/arch-omap/control.h
Signed-off-by: Mariusz Kozlowski <m.kozlowski@tuxland.pl> Cc: Paul Walmsley <paul@pwsan.com> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Michael Abbott [Wed, 14 May 2008 23:29:24 +0000 (16:29 -0700)]
[ARM] colibri: fix support for DM9000 ethernet device
Two changes are necessary to enable proper operation of the DM9000 device with
the Colibri PXA 270 board: firstly, the IRQ type needs to be configured for
rising edge interrupts, and secondly this configuration needs to be
communicated through to the DM9000.
[akpm@linux-foundation.org: remove set_irq_type() call as per ben-linux request] Signed-off-by: Michael Abbott <michael.abbott@diamond.ac.uk> Cc: Daniel Mack <daniel@caiaq.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Ben Dooks <ben-linux@fluff.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[ARM] 5037/1: Orion: fix DNS323/Kurobox Pro PCI initialisation
Whereas most Orion 5x machine support code would initialise the PCI
subsystem with nr_controllers in their struct hw_pci set to 2, the
DNS323 and Kurobox Pro machine support code had nr_controllers set
to 1.
This was presumably done because on those two machines, the PCI(-X)
controller (nr == 1) isn't used, requiring initialisation of only
the PCIe controller (nr == 0.) However, not initialising the PCI(-X)
controller on boards that don't use it leads to a situation where
both the PCIe and the PCI(-X) controller think that their root bus is
zero, and it messes up IRQ assignment.
This patch changes the DNS323 and Kurobox Pro support code to always
use nr_controllers == 2.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
[ARM] 5034/1: fix arm{925,926,940,946} dma_flush_range() in WT mode
The CPU's dma_flush_range() operation needs to clean+invalidate the
given memory area if the cache is in writeback mode, or do just the
invalidate part if the cache is in writethrough mode, but the current
proc-arm{925,926,940,946} (incorrectly) do a cache clean in the
latter case. This patch fixes that.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus Torvalds [Sat, 17 May 2008 21:21:43 +0000 (14:21 -0700)]
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: disable mwait for AMD family 10H/11H CPUs
x86: fix crash on cpu hotplug on pat-incapable machines
x86: remove mwait capability C-state check
It depends on the CPU. For AMD CPUs that support MWAIT this is wrong.
Family 0x10 and 0x11 CPUs will enter C1 on HLT. Powersavings then
depend on a clock divisor and current Pstate of the core.
If all cores of a processor are in halt state (C1) the processor can
enter the C1E (C1 enhanced) state. If mwait is used this will never
happen.
Thus HLT saves more power than MWAIT here.
It might be best to switch off the mwait flag for these AMD CPU
families like it was introduced with commit f039b754714a422959027cb18bb33760eb8153f0 (x86: Don't use MWAIT on AMD
Family 10)
Re-add the AMD families 10H/11H check and disable the mwait usage for
those.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Avi Kivity [Wed, 14 May 2008 09:20:32 +0000 (12:20 +0300)]
x86: fix crash on cpu hotplug on pat-incapable machines
pat_disable() is __init, which means it goes away after booting is complete.
Unfortunately it is used by the hotplug code if the machine is not
pat-capable, causing a crash.
Fix by marking pat_disable() as __cpuinit.
Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Ingo Molnar [Wed, 14 May 2008 06:47:40 +0000 (08:47 +0200)]
x86: remove mwait capability C-state check
Vegard Nossum reports:
| powertop shows between 200-400 wakeups/second with the description
| "<kernel IPI>: Rescheduling interrupts" when all processors have load (e.g.
| I need to run two busy-loops on my 2-CPU system for this to show up).
|
| The bisect resulted in this commit:
|
| commit 0c07ee38c9d4eb081758f5ad14bbffa7197e1aec
| Date: Wed Jan 30 13:33:16 2008 +0100
|
| x86: use the correct cpuid method to detect MWAIT support for C states
remove the functional effects of this patch and make mwait unconditional.
A future patch will turn off mwait on specific CPUs where that causes
power to be wasted.
Linus Torvalds [Fri, 16 May 2008 01:28:46 +0000 (18:28 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
[POWERPC] macintosh: Replace deprecated __initcall with device_initcall
[POWERPC] cell: Fix section mismatches in io-workarounds code
[POWERPC] spufs: Fix compile error
[POWERPC] Fix uninitialized variable bug in copy_{to|from}_user
[POWERPC] Add null pointer check to of_find_property
[POWERPC] vmemmap fixes to use smaller pages
[POWERPC] spufs: Fix pointer reference in find_victim
[POWERPC] 85xx: SBC8548 - Add flash support and HW Rev reporting
[POWERPC] 85xx: Fix some sparse warnings for 85xx MDS
[POWERPC] 83xx: Enable DMA engine on the MPC8377 MDS board.
[POWERPC] 86xx: mpc8610_hpcd: fix second serial port
[POWERPC] 86xx: mpc8610_hpcd: add support for NOR and NAND flashes
[POWERPC] 85xx: Add 8568 PHY workarounds to board code
[POWERPC] 86xx: mpc8610_hpcd: use ULI526X driver for on-board ethernet
Linus Torvalds [Fri, 16 May 2008 01:28:28 +0000 (18:28 -0700)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
jbd2: update transaction t_state to T_COMMIT fix
ext4: Retry block allocation if new blocks are allocated from system zone.
ext4: mballoc fix mb_normalize_request algorithm for 1KB block size filesystems
ext4: fix typos in messages and comments (journalled -> journaled)
ext4: fix synchronization of quota files in journal=data mode
ext4: Fix mount messages when quota disabled
ext4: correct mount option parsing to detect when quota options can be changed
Linus Torvalds [Fri, 16 May 2008 00:50:37 +0000 (17:50 -0700)]
Clean up 'print_fn_descriptor_symbol()' types
Everybody wants to pass it a function pointer, and in fact, that is what
you _must_ pass it for it to make sense (since it knows that ia64 and
ppc64 use descriptors for function pointers and fetches the actual
address from there).
So don't make the argument be a 'unsigned long' and force everybody to
add a cast.
Mingming Cao [Thu, 15 May 2008 18:46:17 +0000 (14:46 -0400)]
jbd2: update transaction t_state to T_COMMIT fix
Updating the current transaction's t_state is protected by j_state_lock. We
need to do the same when updating the t_state to T_COMMIT.
Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
ext4: Retry block allocation if new blocks are allocated from system zone.
If the block allocator gets blocks out of system zone ext4 calls
ext4_error. But if the file system is mounted with errors=continue
retry block allocation. We need to mark the system zone blocks as
in use to make sure retry don't pick them again
System zone is the block range mapping block bitmap, inode bitmap and inode
table.
Ingo Molnar [Wed, 14 May 2008 15:11:46 +0000 (17:11 +0200)]
tty: fix BKL related leak and crash
Enabling the BKL to be lockdep tracked uncovered the following
upstream kernel bug in the tty code, which caused a BKL
reference leak:
================================================
[ BUG: lock held when returning to user space! ]
------------------------------------------------
dmesg/3121 is leaving the kernel with locks still held!
1 lock held by dmesg/3121:
#0: (kernel_mutex){--..}, at: [<c02f34d9>] opost+0x24/0x194
this might explain some of the atomicity warnings and crashes
that -tip tree testing has been experiencing since the BKL
was converted back to a spinlock.
The patch aims to fix a performance issue for the syscall
personality(PER_LINUX32).
On IA-64 box, the syscall personality (PER_LINUX32) has poor performance
because it failed to find the Linux/x86 execution domain. Then it tried
to load the kernel module however it failed always and it used the default
execution domain PER_LINUX instead. Requesting kernel modules is very
expensive. It caused the performance issue. (see the function
lookup_exec_domain in kernel/exec_domain.c).
To resolve the issue, execution domain Linux/x86 is always registered in
initialization time for IA-64 architecture.
Signed-off-by: Xiaolan Huang <xiaolan.huang@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
Linus Torvalds [Thu, 15 May 2008 16:10:13 +0000 (09:10 -0700)]
Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
[S390] show_interrupts: prevent cpu hotplug when walking cpu_online_map.
[S390] smp: __smp_call_function_map vs cpu_online_map fix.
[S390] tape: Use ccw_dev_id to build cdev_id.
[S390] dasd: fix timeout handling in interrupt handler
[S390] s390dbf: Use const char * for dbf name.
[S390] dasd: Use const in busid functions.
[S390] blacklist.c: removed duplicated include
[S390] vmlogrdr: module initialization function should return negative errors
[S390] sparsemem vmemmap: initialize memmap.
[S390] Remove last traces of cio_msg=.
[S390] cio: Remove CCW_CMD_SUSPEND_RECONN in front of CCW_CMD_SET_PGID.
Heiko Carstens [Thu, 15 May 2008 14:52:38 +0000 (16:52 +0200)]
[S390] smp: __smp_call_function_map vs cpu_online_map fix.
Both smp_call_function() and __smp_call_function_map() access
cpu_online_map. Both functions run with preemption disabled which
protects for cpus going offline. However new cpus can be added and
therefore the cpu_online_map can change unexpectedly.
So use the call_lock to protect against changes to the cpu_online_map
in start_secondary() and all smp_call_* functions.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Weinhuber [Thu, 15 May 2008 14:52:36 +0000 (16:52 +0200)]
[S390] dasd: fix timeout handling in interrupt handler
When the dasd_int_handler is called with an error code instead of
an irb, the associated request should be restarted. This handling
was missing from the -ETIMEDOUT case. In fact it should be done in
any case.
Signed-off-by: Stefan Weinhuber <wein@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Helge Deller [Fri, 2 May 2008 20:02:48 +0000 (22:02 +0200)]
parisc: fix trivial section name warnings
This trivial patch fixes the following section warnings on PARISC:
> WARNING: vmlinux.o (.text.1): unexpected section name.
>The (.[number]+) following section name are ld generated and not expected.
> Did you forget to use "ax"/"aw" in a .S file?
> Note that for example <linux/init.h> contains
> section definitions for use in .S files.