Chuck Lever [Wed, 22 Oct 2008 17:12:36 +0000 (13:12 -0400)]
NFSD: Fix BUG during NFSD shutdown processing
The Linux NFS server can be started via a user-space write to
/proc/fs/nfs/threads or to /proc/fs/nfs/portlist. In the first case,
all default listeners are started (both UDP and TCP). In the second,
a listener is started only for one specified transport.
The NFS server has to make sure lockd stays up until the last listener
transport goes away. To support both start-up interfaces, it should
do one lockd_up() for each NFSD listener.
The nfsd_init_socks() function used to do one lockd_up() call for each
svc_create_xprt(). Recently commit 26a414092353590ceaa5955bcb53f863d6ea7549 mistakenly changed
nfsd_init_socks() to do only one lockd_up() call even though it still
does two svc_create_xprt() calls.
The end result is a lockd_down() BUG during NFSD shutdown processing
because nfsd_last_threads() does a lockd_down() call for each entry
on the sv_permsocks list, but the start-up code doesn't do a matching
number of lockd_up() calls.
Add a second lockd_up() in nfsd_init_socks() to make sure the number
of lockd_up() calls matches the number of entries on the NFS server's
sv_permsocks list.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
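The accounting rule in the NFSD shutdown fix above (one lockd reference per listener, one drop per sv_permsocks entry) can be illustrated with a small user-space model. This is only a sketch: struct model_server, model_add_listener() and model_shutdown() are hypothetical stand-ins and do not reproduce the kernel's lockd_up()/lockd_down() code.

```c
#include <assert.h>

/* Hypothetical user-space model; these are not the kernel's structures. */
struct model_server {
    int lockd_users;  /* outstanding lockd references */
    int permsocks;    /* listeners on the permanent-socket list */
};

/* Start one listener and, if take_ref is nonzero, pin lockd for it. */
void model_add_listener(struct model_server *s, int take_ref)
{
    s->permsocks++;
    if (take_ref)
        s->lockd_users++;
}

/*
 * Tear down every listener, dropping one lockd reference for each.
 * Returns how many drops had no matching reference (the BUG case).
 */
int model_shutdown(struct model_server *s)
{
    int unbalanced = 0;

    assert(s->permsocks >= 0);
    while (s->permsocks > 0) {
        s->permsocks--;
        if (s->lockd_users == 0)
            unbalanced++;
        else
            s->lockd_users--;
    }
    return unbalanced;
}
```

Starting two listeners but taking only one reference makes model_shutdown() report one unbalanced drop; taking a reference for each listener makes it report none.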
Tom Tucker [Fri, 3 Oct 2008 17:41:14 +0000 (12:41 -0500)]
svcrdma: Fix IRD/ORD polarity
The initiator/responder resources in the event have been swapped. They
now represent what the local peer would set their values to in order to
match the peer. Note that iWARP does not exchange these on the wire and
the provider is simply putting in the local device max.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tom Tucker [Fri, 3 Oct 2008 20:45:03 +0000 (15:45 -0500)]
svcrdma: Modify the RPC reply path to use FRMR when available
Use FRMR to map local RPC reply data. This allows RDMA_WRITE to send reply
data using a single WR. The FRMR is invalidated by linking the LOCAL_INV WR
to the RDMA_SEND message used to complete the reply.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tom Tucker [Tue, 12 Aug 2008 20:12:10 +0000 (15:12 -0500)]
svcrdma: Modify the RPC recv path to use FRMR when available
RPCRDMA requests that specify a read-list are fetched with RDMA_READ. Using
an FRMR to map the data sink improves NFSRDMA security on transports that
place the RDMA_READ data sink LKEY on the wire because the valid lifetime
of the MR is only the duration of the RDMA_READ. The LKEY is invalidated
when the last RDMA_READ WR completes.
Mapping the data sink also allows for very large amounts of data to be
fetched with a single WR, so if the client is also using FRMR, the entire
RPC read-list can be fetched with a single WR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tom Tucker [Mon, 11 Aug 2008 19:10:19 +0000 (14:10 -0500)]
svcrdma: Add support to svc_rdma_send to handle chained WR
WR can be submitted as linked lists of WR. Update the svc_rdma_send
routine to handle WR chains. This will be used to submit a WR that
uses an FRMR with another WR that invalidates the FRMR.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
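The idea of submitting a chain of WRs as one unit can be sketched in user-space C. Everything here is an assumption made for illustration: struct model_wr, struct model_sq and model_send() are hypothetical and are not the svcrdma or verbs API. The sketch only shows the accounting a chained send needs: walk the list, count its entries, and reserve send-queue room for the whole chain or for none of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work-request opcodes, for illustration only. */
enum model_opcode { MODEL_FAST_REG, MODEL_RDMA_WRITE, MODEL_SEND, MODEL_LOCAL_INV };

struct model_wr {
    enum model_opcode opcode;
    struct model_wr *next;  /* NULL terminates the chain */
};

struct model_sq {
    int depth;  /* total send-queue slots */
    int used;   /* slots currently occupied */
};

/* Count the WRs in a chain; an empty chain has length 0. */
int model_chain_len(const struct model_wr *wr)
{
    int n = 0;

    for (; wr != NULL; wr = wr->next)
        n++;
    return n;
}

/*
 * Post the whole chain or nothing.
 * Returns the number of WRs posted, or -1 if the queue lacks room.
 */
int model_send(struct model_sq *sq, const struct model_wr *wr)
{
    int n = model_chain_len(wr);

    assert(sq->used <= sq->depth);
    if (sq->depth - sq->used < n)
        return -1;
    sq->used += n;
    return n;
}
```

A two-entry chain such as a send followed by a local invalidate consumes two slots in a single call, which is the property the chained path relies on.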
Tom Tucker [Fri, 3 Oct 2008 20:22:18 +0000 (15:22 -0500)]
svcrdma: Add a service to register a Fast Reg MR with the device
Fast Reg MR introduces a new WR type. Add a service to register the
region with the adapter and update the completion handling to support
completions with a NULL WR context.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tom Tucker [Tue, 30 Sep 2008 18:46:13 +0000 (13:46 -0500)]
svcrdma: Query device for Fast Reg support during connection setup
Query the device capabilities in the svc_rdma_accept function to determine
what advanced memory management capabilities are supported by the device.
Based on the query, select the most secure model available given the
requirements of the transport and capabilities of the adapter.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Tom Tucker [Mon, 6 Oct 2008 19:45:18 +0000 (14:45 -0500)]
svcrdma: Add FRMR get/put services
Add services for allocating, freeing, and unmapping Fast Reg MRs. These
services will be used by the transport connection setup, send, and receive
routines.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
Chuck Lever [Fri, 3 Oct 2008 21:15:23 +0000 (17:15 -0400)]
NLM: Always start both UDP and TCP listeners
Commit 24e36663, which first appeared in 2.6.19, changed lockd so that
the client side starts a UDP listener only if there is a UDP NFSv2/v3
mount. Its description notes:
This... means that lockd will *not* listen on UDP if the only
mounts are TCP mount (and nfsd hasn't started).
The latter is the only one that concerns me at all - I don't know
if this might be a problem with some servers.
Unfortunately it is a problem for Linux itself. The rpc.statd daemon
on Linux uses UDP for contacting the local lockd, no matter which
protocol is used for NFS mounts. Without a local lockd UDP listener,
NFSv2/v3 lock recovery from Linux NFS clients always fails.
Revert parts of commit 24e36663 so lockd_up() always starts both
listeners.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Neil Brown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Fri, 3 Oct 2008 16:50:51 +0000 (12:50 -0400)]
lockd: Remove unused fields in the nlm_reboot structure
The nlm_reboot structure is used to store information provided by the
NSM_NOTIFY procedure. This procedure is not specified by the NLM or NSM
protocols, other than to say that the procedure can be used to transmit
information private to a particular NLM/NSM implementation.
For Linux, the callback arguments include the name of the monitored host,
the new NSM state of the host, and a 16-byte private opaque.
As a clean up, remove the unused fields and the server-side XDR logic that
decodes them.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Fri, 3 Oct 2008 16:50:44 +0000 (12:50 -0400)]
lockd: Add helper to sanity check incoming NOTIFY requests
lockd accepts SM_NOTIFY calls only from a privileged process on the
local system. If lockd uses an AF_INET6 listener, the sender's address
(i.e., the local rpc.statd) will be the IPv6 loopback address, not the
IPv4 loopback address.
Make sure the privilege test in nlmsvc_proc_sm_notify() and
nlm4svc_proc_sm_notify() works for both AF_INET and AF_INET6 family
addresses by refactoring the test into a helper and adding support for
IPv6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
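A family-aware privilege test of the kind described for NOTIFY requests can be sketched in user-space C. The helper name model_is_privileged_local() and the MODEL_PROT_SOCK constant are assumptions made for illustration; this is not the helper the patch adds, only the shape of the check: the sender must be the loopback address of its own address family and must use a reserved port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MODEL_PROT_SOCK 1024  /* assumed bound: ports below this are reserved */

/* True if the sender is a reserved-port process on the local host. */
bool model_is_privileged_local(const struct sockaddr *sap)
{
    assert(sap != NULL);

    switch (sap->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;

        return sin->sin_addr.s_addr == htonl(INADDR_LOOPBACK) &&
               ntohs(sin->sin_port) < MODEL_PROT_SOCK;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;

        return IN6_IS_ADDR_LOOPBACK(&sin6->sin6_addr) &&
               ntohs(sin6->sin6_port) < MODEL_PROT_SOCK;
    }
    default:
        return false;  /* unknown families are never trusted */
    }
}
```

Keeping the test in one helper means both the NLM v1/v3 and v4 NOTIFY handlers can share it, so the two code paths cannot drift apart.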
Chuck Lever [Fri, 3 Oct 2008 16:50:36 +0000 (12:50 -0400)]
lockd: change nlmclnt_grant() to take a "struct sockaddr *"
Adjust the signature and callers of nlmclnt_grant() to pass a "struct
sockaddr *" instead of a "struct sockaddr_in *" in order to support IPv6
addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Fri, 3 Oct 2008 16:50:07 +0000 (12:50 -0400)]
NLM: Convert nlm_lookup_host() to use a single argument
The nlm_lookup_host() function already has a large number of arguments,
and I'm about to add a few more. As a clean up, convert the function
to use a single data structure argument.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Tom Tucker [Tue, 30 Sep 2008 18:06:13 +0000 (13:06 -0500)]
svcrdma: Add Fast Reg MR Data Types
Add data types to track Fast Reg Memory Regions. The core data type is
svc_rdma_fastreg_mr that associates a device MR with a host kva and page
list. A field is added to the WR context to keep track of the FRMR
used to map the local memory for an RPC.
An FRMR list and spin lock are added to the transport instance to keep
track of all FRMR allocated for the transport. Also added are device
capability flags to indicate what the memory registration
capabilities are for the underlying device and whether or not fast
memory registration is supported.
Signed-off-by: Tom Tucker <tom@opengridcomputing.com>
J. Bruce Fields [Wed, 6 Feb 2008 20:05:12 +0000 (15:05 -0500)]
lockd: reject reclaims outside the grace period
The current lockd does not reject reclaims that arrive outside of the
grace period.
Accepting a reclaim means promising to the client that no conflicting
locks were granted since last it held the lock. We can meet that
promise if we assume the only lockers are nfs clients, and that they are
sufficiently well-behaved to reclaim only locks that they held before,
and that only reclaim locks have been permitted so far. Once we leave
the grace period (and start permitting non-reclaims), we can no longer
keep that promise. So we must start rejecting reclaims at that point.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Thu, 6 Sep 2007 16:34:25 +0000 (12:34 -0400)]
nfsd: common grace period control
Rewrite grace period code to unify management of grace period across
lockd and nfsd. The current code has lockd and nfsd cooperate to
compute a grace period which is satisfactory to them both, and then
individually enforce it. This creates a slight race condition, since
the enforcement is not coordinated. It's also more complicated than
necessary.
Here instead we have lockd and nfsd each inform common code when they
enter the grace period, and when they're ready to leave the grace
period, and allow normal locking only after both of them are ready to
leave.
We also expect the locks_start_grace()/locks_end_grace() interface here
to be simpler to build on for future cluster/high-availability work,
which may require (for example) putting individual filesystems into
grace, or enforcing grace periods across multiple cluster nodes.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
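The rule that normal locking resumes only after every lock manager has left its grace period can be modelled in a few lines of user-space C. The names below (struct model_lock_manager, model_start_grace(), model_end_grace(), model_in_grace()) are hypothetical and only mirror the idea behind the locks_start_grace()/locks_end_grace() interface; the kernel's actual bookkeeping may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one lock manager (for example lockd or nfsd). */
struct model_lock_manager {
    bool in_grace;
};

/* Number of managers currently in their grace period. */
int model_managers_in_grace;

/* A manager announces that it has entered its grace period. */
void model_start_grace(struct model_lock_manager *lm)
{
    if (!lm->in_grace) {
        lm->in_grace = true;
        model_managers_in_grace++;
    }
}

/* A manager announces that it is ready to leave; repeated calls are harmless. */
void model_end_grace(struct model_lock_manager *lm)
{
    if (lm->in_grace) {
        lm->in_grace = false;
        model_managers_in_grace--;
    }
    assert(model_managers_in_grace >= 0);
}

/* Normal (non-reclaim) locking is allowed only when this returns false. */
bool model_in_grace(void)
{
    return model_managers_in_grace > 0;
}
```

Because both managers consult the same predicate, neither can start granting ordinary locks while the other is still accepting reclaims.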
Since commit ff7d9756b501744540be65e172d27ee321d86103
("nfsd: use static memory for callback program and stats"),
do_probe_callback uses a static callback program
(NFS4_CALLBACK) rather than the one set in clp->cl_callback.cb_prog,
as passed in by the client in setclientid (4.0)
or create_session (4.1).
This patch introduces rpc_create_args.prognumber, which allows
overriding program->number when creating an rpc_clnt.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Initially I thought it might make sense to do
that on every callback probe, but since the stats
are per-program and may be shared among
several client callback instances, zeroing them out
seems like the wrong thing to do.
Note that that commit also introduced a bug
since stats.program is also being cleared in the process
and it is not restored after the memset as it used to be.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Thu, 25 Sep 2008 15:57:05 +0000 (11:57 -0400)]
SUNRPC: Clean up debug messages in rpcb_clnt.c
The RPCB XDR functions are used for multiple procedures. For instance,
rpcb_encode_getaddr() is used for RPCB_GETADDR, RPCB_SET, and
RPCB_UNSET. Make the XDR debug messages more generic so they are less
confusing.
And, unlike in other RPC consumers in the kernel, a single debug flag
enables all levels of debug messages in the RPC bind client, including
XDR debug messages. Since the XDR decoders already report success or
failure in this case, remove redundant debug messages in the mid-level
rpcb_register_call() function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Thu, 25 Sep 2008 15:56:57 +0000 (11:56 -0400)]
SUNRPC: Fix up svc_unregister()
With the new rpcbind code, a PMAP_UNSET will not have any effect on
services registered via rpcbind v3 or v4.
Implement a version of svc_unregister() that uses an RPCB_UNSET with
an empty netid string to make sure we have cleared *all* entries for
a kernel RPC service when shutting down, or before starting a fresh
instance of the service.
Use the new version only when CONFIG_SUNRPC_REGISTER_V4 is enabled;
otherwise, the legacy PMAP version is used to ensure complete
backwards-compatibility with the Linux portmapper daemon.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Mon, 15 Sep 2008 21:27:23 +0000 (16:27 -0500)]
SUNRPC: Register both netids for AF_INET6 servers
TI-RPC is a user-space library of RPC functions that replaces ONC RPC
and allows RPC to operate in the new world of IPv6.
TI-RPC combines the concept of a transport protocol (UDP and TCP)
and a protocol family (PF_INET and PF_INET6) into a single identifier
called a "netid." For example, "udp" means UDP over IPv4, and "udp6"
means UDP over IPv6.
For rpcbind, then, the RPC service tuple that is registered and
advertised is:
[RPC program, RPC version, service address and port, netid]
instead of
[RPC program, RPC version, port, protocol]
Service address is typically ANYADDR, but can be a specific address
of one of the interfaces on a multi-homed host. The third item in
the new tuple is expressed as a universal address.
The current Linux rpcbind implementation registers a netid for both
protocol families when RPCB_SET is done for just the PF_INET6 version
of the netid (i.e., udp6 or tcp6). So registering "udp6" causes a
registration for "udp" to appear automatically as well.
We've recently determined that this is incorrect behavior. In the
TI-RPC world, "udp6" is not meant to imply that the registered RPC
service handles requests from AF_INET as well, even if the listener
socket does address mapping. "udp" and "udp6" are entirely separate
capabilities, and must be registered separately.
The Linux kernel, unlike TI-RPC, leverages address mapping to allow a
single listener socket to handle requests for both AF_INET and AF_INET6.
This is still OK, but the kernel currently assumes registering "udp6"
will cover "udp" as well. It registers only "udp6" for its AF_INET6
services, even though they handle both AF_INET and AF_INET6 on the same
port.
So svc_register() actually needs to register both "udp" and "udp6"
explicitly (and likewise for TCP). Until rpcbind is fixed, the
kernel can ignore the return code for the second RPCB_SET call.
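The double registration described above can be sketched in user-space C. All names here are hypothetical: model_rpcb_set() merely records the netid it is given and stands in for an RPCB_SET call, and model_svc_register() is not the kernel's svc_register(). The sketch also assumes the IPv6 netid is registered first, so that the IPv4 netid is the second call whose result is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MODEL_MAX_REGS 8

/* Log of netids handed to the stand-in registration call. */
const char *model_registered[MODEL_MAX_REGS];
int model_nregistered;

/* Stand-in for an RPCB_SET call: it only records the netid. */
int model_rpcb_set(const char *netid, unsigned short port)
{
    (void)port;
    if (model_nregistered == MODEL_MAX_REGS)
        return -1;
    model_registered[model_nregistered++] = netid;
    return 0;
}

/* Map (address family, transport protocol) to a TI-RPC netid. */
const char *model_netid(int family, int protocol)
{
    if (protocol == IPPROTO_UDP)
        return family == AF_INET6 ? "udp6" : "udp";
    if (protocol == IPPROTO_TCP)
        return family == AF_INET6 ? "tcp6" : "tcp";
    return NULL;
}

/* Register a listener; AF_INET6 listeners advertise both netids. */
int model_svc_register(int family, int protocol, unsigned short port)
{
    const char *netid = model_netid(family, protocol);
    int err;

    if (netid == NULL)
        return -1;
    err = model_rpcb_set(netid, port);
    if (err < 0 || family != AF_INET6)
        return err;

    /* Assumed order: the IPv4 netid comes second and its result is ignored. */
    (void)model_rpcb_set(model_netid(AF_INET, protocol), port);
    return 0;
}
```

Registering an AF_INET6 UDP listener through this sketch records "udp6" followed by "udp", while an AF_INET listener records only its own netid.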
Chuck Lever [Wed, 3 Sep 2008 18:36:08 +0000 (14:36 -0400)]
lockd: Support AF_INET6 when hashing addresses in nlm_lookup_host
Adopt an approach similar to the RPC server's auth cache (from Aurelien
Charbon and Brian Haley).
Note nlm_lookup_host()'s existing IP address hash function has the same
issue with correctness on little-endian systems as the original IPv4 auth
cache hash function, so I've also updated it with a hash function similar
to the new auth cache hash function.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
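One way to hash both address families into the same table is to fold each 32-bit word of the address and XOR the results. The sketch below is an assumption about the general approach, not the function the patch adds: model_hash32(), model_hash_addr() and the table size MODEL_NRHASH are hypothetical names and values.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

#define MODEL_NRHASH 32u  /* assumed power-of-two hash table size */

/* Fold the upper bits of a 32-bit word down into its low byte. */
uint32_t model_hash32(uint32_t n)
{
    uint32_t hash = n ^ (n >> 16);

    return hash ^ (hash >> 8);
}

/* Hash an AF_INET or AF_INET6 address to a table index. */
unsigned int model_hash_addr(const struct sockaddr *sap)
{
    uint32_t hash = 0;
    uint32_t word;
    int i;

    assert(sap != NULL);
    switch (sap->sa_family) {
    case AF_INET:
        hash = model_hash32(((const struct sockaddr_in *)sap)->sin_addr.s_addr);
        break;
    case AF_INET6: {
        const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;

        /* Mix all four 32-bit words so every byte affects the result. */
        for (i = 0; i < 4; i++) {
            memcpy(&word, &sin6->sin6_addr.s6_addr[4 * i], sizeof(word));
            hash ^= model_hash32(word);
        }
        break;
    }
    default:
        break;  /* unknown families all land in bucket 0 */
    }
    return hash & (MODEL_NRHASH - 1);
}
```

Folding before masking matters because the low-order bits of the raw 32-bit value depend on byte order; mixing all bytes first keeps the bucket choice from being decided by a single, often constant, octet.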
Chuck Lever [Wed, 3 Sep 2008 18:36:01 +0000 (14:36 -0400)]
lockd: Teach nlm_cmp_addr() to support AF_INET6 addresses
Update the nlm_cmp_addr() helper to support AF_INET6 as well as AF_INET
addresses. New version takes two "struct sockaddr *" arguments instead of
"struct sockaddr_in *" arguments.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
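A comparison helper that takes two "struct sockaddr *" arguments might look roughly like the following user-space sketch. model_cmp_addr() is a hypothetical name, and the sketch assumes that only the address, not the port, takes part in the comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* True if both sockaddrs carry the same address in the same family. */
bool model_cmp_addr(const struct sockaddr *a, const struct sockaddr *b)
{
    assert(a != NULL && b != NULL);
    if (a->sa_family != b->sa_family)
        return false;

    switch (a->sa_family) {
    case AF_INET: {
        const struct sockaddr_in *a4 = (const struct sockaddr_in *)a;
        const struct sockaddr_in *b4 = (const struct sockaddr_in *)b;

        return a4->sin_addr.s_addr == b4->sin_addr.s_addr;
    }
    case AF_INET6: {
        const struct sockaddr_in6 *a6 = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *b6 = (const struct sockaddr_in6 *)b;

        return memcmp(&a6->sin6_addr, &b6->sin6_addr,
                      sizeof(a6->sin6_addr)) == 0;
    }
    default:
        return false;  /* unsupported family: never a match */
    }
}
```

Checking the family first keeps an IPv4 address from ever matching an IPv6 one by accident, even if their leading bytes happen to coincide.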
Chuck Lever [Wed, 3 Sep 2008 18:35:46 +0000 (14:35 -0400)]
lockd: Use sockaddr_storage for h_saddr field
To store larger addresses in the nlm_host structure, make h_saddr a
sockaddr_storage. And let's call it something more self-explanatory:
"saddr" could easily be mistaken for "server address".
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Wed, 27 Aug 2008 20:57:31 +0000 (16:57 -0400)]
lockd: Specify address family for source address
Make sure an address family is specified for source addresses passed to
nlm_lookup_host(). nlm_lookup_host() will need this when it becomes
capable of dealing with AF_INET6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Knowing which source address is used for communicating with remote NLM
services can be helpful for debugging configuration problems on hosts
with multiple addresses.
Keep the dprintk debugging here, but adapt it so it displays AF_INET6
addresses properly. There are also a couple of dprintk clean-ups.
At some point we will aggregate the helpers that display presentation
format addresses into a single set of shared helpers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Wed, 27 Aug 2008 20:57:15 +0000 (16:57 -0400)]
NLM: Clean up before introducing new debugging messages
We're about to introduce some extra debugging messages in nlm_lookup_host().
Bring the coding style up to date first so we can cleanly introduce the new
debugging messages.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Mon, 18 Aug 2008 23:34:16 +0000 (19:34 -0400)]
SUNRPC: Support IPv6 when registering kernel RPC services
In order to advertise NFS-related services on IPv6 interfaces via
rpcbind, the kernel RPC server implementation must use
rpcb_v4_register() instead of rpcb_register().
A new kernel build option allows distributions to use the legacy
v2 call until they integrate an appropriate user-space rpcbind
daemon that can support IPv6 RPC services.
I tried adding some automatic logic to fall back if registering
with a v4 protocol request failed, but there are too many corner
cases. So I just made it a compile-time switch that distributions
can throw when they've replaced portmapper with rpcbind.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Mon, 18 Aug 2008 23:34:00 +0000 (19:34 -0400)]
SUNRPC: Simplify rpcb_register() API
Bruce suggested there's no need to expose the difference between an error
sending the PMAP_SET request and an error reply from the portmapper to
rpcb_register's callers. The user space equivalent of rpcb_register() is
pmap_set(3), which returns a bool_t: either the PMAP set worked, or it
didn't. Simple.
So let's remove the "*okay" argument from rpcb_register() and
rpcb_v4_register(), and simply return an error if any part of the call
didn't work.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Chuck Lever [Mon, 18 Aug 2008 23:33:44 +0000 (19:33 -0400)]
SUNRPC: Set V6ONLY socket option for RPC listener sockets
My plan is to use an AF_INET listener on systems that support only IPv4,
and an AF_INET6 listener on systems that can support IPv6. Incoming
IPv4 packets will be posted to an AF_INET6 listener with a mapped IPv4
address.
Max Matveev <makc@sgi.com> says:
Creating a single listener can be dangerous - if net.ipv6.bindv6only
is enabled then it's possible to create another listener in v4
namespace on the same port and steal the traffic from the "unified"
listener. You need to disable V6ONLY explicitly via a sockopt to stop
that.
Set appropriate socket option on RPC server listener sockets to prevent
this.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
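From user space, the socket option discussed above is controlled with setsockopt() and IPV6_V6ONLY, which is the closest analogue to what a kernel listener has to do. The wrappers model_set_v6only() and model_get_v6only() are hypothetical names; the sketch makes no claim about which value the RPC server code ends up choosing, only that the choice is made explicitly per socket instead of being inherited from net.ipv6.bindv6only.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Explicitly set IPV6_V6ONLY on an AF_INET6 socket.
 * Returns 0 on success or -1 with errno set, as setsockopt() does.
 */
int model_set_v6only(int fd, int enabled)
{
    int val = enabled ? 1 : 0;

    return setsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, sizeof(val));
}

/* Read the current setting back; returns 0, 1, or -1 on error. */
int model_get_v6only(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(fd, IPPROTO_IPV6, IPV6_V6ONLY, &val, &len) < 0)
        return -1;
    assert(len == sizeof(val));
    return val ? 1 : 0;
}
```

The option should be set before bind(), since it determines whether the socket also claims the IPv4 side of the port.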
J. Bruce Fields [Tue, 18 Mar 2008 23:00:19 +0000 (19:00 -0400)]
lockd: don't depend on lockd main loop to end grace
End lockd's grace period using schedule_delayed_work() instead of a
check on every pass through the main loop.
After a later patch, we'll depend on lockd to end its grace period even
if it's not currently handling requests; so it shouldn't depend on being
woken up from the main loop to do so.
Also, Nakano Hiroaki (who independently produced a similar patch)
noticed that the current behavior is buggy in the face of jiffies
wraparound:
"lockd uses time_before() to determine whether the grace period
has expired. This would seem to be enough to avoid timer
wrap-around issues, but, unfortunately, that is not the case.
The time_* family of comparison functions can be safely used to
compare jiffies relatively close in time, but they stop working
after approximately LONG_MAX/2 ticks. nfsd can suffer this
problem because the time_before() comparison in lockd() is not
performed until the first request comes in, which means that if
there is no lockd traffic for more than LONG_MAX/2 ticks we are
screwed.
"The implication of this is that once time_before() starts
misbehaving any attempt from a NFS client to execute fcntl()
will be received with a NLM_LCK_DENIED_GRACE_PERIOD message for
25 days (assuming HZ=1000). In other words, the 50 seconds grace
period could turn into a grace period of 50 days or more.
"Note: This bug was analyzed independently by Oda-san
<oda@valinux.co.jp> and myself."
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Nakano Hiroaki <nakano.hiroaki@oss.ntt.co.jp>
Cc: Itsuro Oda <oda@valinux.co.jp>
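The wraparound hazard quoted above can be reproduced with a 32-bit model of the comparison. The sketch assumes a 32-bit tick counter and uses hypothetical names (model_time_before(), model_still_in_grace()); it follows the usual signed-difference idiom behind the time_*() macros and is not copied from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit model of time_before(a, b): true if tick a is earlier than tick b. */
bool model_time_before(uint32_t a, uint32_t b)
{
    /* The signed difference is only meaningful within 2^31 ticks. */
    return (int32_t)(a - b) < 0;
}

/* Lazy check, evaluated only when a request arrives. */
bool model_still_in_grace(uint32_t now, uint32_t grace_end)
{
    return model_time_before(now, grace_end);
}
```

With grace_end at tick 50000, a request at tick 60000 correctly sees the grace period as over. A first request arriving more than 2^31 ticks after grace_end makes the signed difference negative again, so the lazy check wrongly reports that the grace period is still running. Ending the grace period from a timer or delayed work avoids the comparison altogether.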
J. Bruce Fields [Thu, 24 Jan 2008 16:11:34 +0000 (11:11 -0500)]
locks: allow lockd to process blocked locks during grace period
The check here is currently harmless but unnecessary, since, as the
comment notes, there aren't any blocked-lock callbacks to process
during the grace period anyway.
And eventually we want to allow multiple grace periods that come and go
for different filesystems over the course of the lifetime of lockd, at
which point this check is just going to get in the way.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Thu, 14 Aug 2008 02:03:27 +0000 (22:03 -0400)]
knfsd: allocate readahead cache in individual chunks
I had a report from someone building a large NFS server that they were
unable to start more than 585 nfsd threads. It was reported against an
older kernel using the slab allocator, and I tracked it down to the
large allocation in nfsd_racache_init failing.
It appears that the slub allocator handles large allocations better,
but large contiguous allocations can often be problematic. There
doesn't seem to be any reason that the racache has to be allocated as a
single large chunk. This patch breaks this up so that the racache is
built up from separate allocations.
(Thanks also to Takashi Iwai for a bugfix.)
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Takashi Iwai <tiwai@suse.de>
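Building a cache from many small allocations instead of one large contiguous block can be sketched in user-space C as follows. The structure and function names (struct model_entry, struct model_cache, model_cache_init(), model_cache_free()) and the bucket count are assumptions made for illustration and do not reproduce the nfsd readahead cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_HASH_SIZE 16  /* assumed number of hash buckets */

struct model_entry {
    struct model_entry *next;
    unsigned long payload;
};

struct model_cache {
    struct model_entry *bucket[MODEL_HASH_SIZE];
    size_t count;
};

/* Release every entry; safe to call on a partially built cache. */
void model_cache_free(struct model_cache *c)
{
    size_t i;

    for (i = 0; i < MODEL_HASH_SIZE; i++) {
        while (c->bucket[i] != NULL) {
            struct model_entry *e = c->bucket[i];

            c->bucket[i] = e->next;
            free(e);
        }
    }
    c->count = 0;
}

/*
 * Build the cache from `total` separate small allocations.
 * Returns 0 on success; on failure frees everything and returns -1.
 */
int model_cache_init(struct model_cache *c, size_t total)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < MODEL_HASH_SIZE; i++)
        c->bucket[i] = NULL;
    c->count = 0;

    for (i = 0; i < total; i++) {
        struct model_entry *e = calloc(1, sizeof(*e));

        if (e == NULL) {
            model_cache_free(c);
            return -1;
        }
        e->next = c->bucket[i % MODEL_HASH_SIZE];
        c->bucket[i % MODEL_HASH_SIZE] = e;
        c->count++;
    }
    return 0;
}
```

Each request to the allocator is now the size of a single entry, so the total cache size is limited by available memory and no longer by the largest contiguous block the allocator can supply.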
Benny Halevy [Tue, 12 Aug 2008 17:45:28 +0000 (20:45 +0300)]
nfsd: don't declare p in ENCODE_SEQID_OP_HEAD
After using the encode_stateid helper the "p" pointer declared
by ENCODE_SEQID_OP_HEAD is warned as unused.
In the single site where it is still needed it can be declared
separately using the ENCODE_HEAD macro.
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Benny Halevy [Tue, 12 Aug 2008 17:44:41 +0000 (20:44 +0300)]
nfsd: fix nfsd4_encode_open buffer space reservation
The first reservation in nfsd4_encode_open is currently for
36 + sizeof(stateid_t) bytes, but after the stateid it writes a cinfo
(20 bytes) and 5 more 4-byte words, for a total of 40 + sizeof(stateid_t).
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
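The size arithmetic in the reservation fix above can be written out as a small check. The macro and function names are hypothetical, and the 16-byte stateid used in the usage note is an assumption for illustration; only the 20-byte cinfo and the five 4-byte words come from the description.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_XDR_WORD    4u   /* bytes per XDR word */
#define MODEL_CINFO_BYTES 20u  /* change info written after the stateid */
#define MODEL_EXTRA_WORDS 5u   /* further 4-byte words in the first chunk */

/* Bytes the first chunk of the OPEN reply actually needs. */
size_t model_open_reserve(size_t stateid_bytes)
{
    size_t fixed = MODEL_CINFO_BYTES + MODEL_EXTRA_WORDS * MODEL_XDR_WORD;

    assert(fixed == 40);  /* 20 + 5 * 4 */
    return stateid_bytes + fixed;
}
```

With an assumed 16-byte stateid the chunk needs 56 bytes, 4 more than the 36 + 16 = 52 that the old reservation provided.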
This patch adds the CONFIG_FILE_LOCKING option, which allows support for
advisory locks to be removed. With the option disabled, the flock()
system call, the F_GETLK, F_SETLK and F_SETLKW operations of fcntl(),
and NFS support are disabled. These features are not necessarily needed
on embedded systems. Disabling them saves ~11 KB of kernel code and data:
text     data    bss     dec      hex     filename
1125436  118764  212992  1457192  163c28  vmlinux.old
1114299  118564  212992  1445855  160fdf  vmlinux
-11137   -200    0       -11337   -2C49   +/-
This patch has originally been written by Matt Mackall
<mpm@selenic.com>, and is part of the Linux Tiny project.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Matt Mackall <mpm@selenic.com>
Cc: matthew@wil.cx
Cc: linux-fsdevel@vger.kernel.org
Cc: mpm@selenic.com
Cc: akpm@linux-foundation.org
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Thu, 7 Aug 2008 17:00:20 +0000 (13:00 -0400)]
nfsd: permit unauthenticated stat of export root
RFC 2623 section 2.3.2 permits the server to bypass gss authentication
checks for certain operations that a client may perform when mounting.
In the case of a client that doesn't have some form of credentials
available to it on boot, this allows it to perform the mount unattended.
(Presumably real file access won't be needed until a user with
credentials logs in.)
Being slightly more lenient allows lots of old clients to access
krb5-only exports, with the only loss being a small amount of
information leaked about the root directory of the export.
This affects only v2 and v3; v4 still requires authentication for all
access.
Thanks to Peter Staubach for testing against a Solaris client, which
suggested adding v3 getattr to the list, and to Trond for noting
that doing so exposes no additional information.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Cc: Peter Staubach <staubach@redhat.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Atsushi Nemoto [Tue, 5 Aug 2008 14:45:14 +0000 (23:45 +0900)]
[MIPS] vmlinux.lds.S: handle .text.*
The -ffunction-sections option puts each function in its own
.text.function_name section. Without this patch, most functions are placed
outside the _text..._etext area, which breaks show_stacktrace(), etc.
Akinobu Mita [Sat, 13 Sep 2008 10:03:32 +0000 (19:03 +0900)]
mmc_test: initialize mmc_test_lock statically
The mutex mmc_test_lock is initialized every time an mmc_test device
is probed. Probing another mmc_test device may break the mutex if
the probe function is called while the mutex is locked.
This patch fixes it by statically initializing mmc_test_lock.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
We used to store a binary register snapshot in the "regs" file, so we
set the file size to be the size of this snapshot. This is no longer
valid since we switched to using seq_file.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
The debugfs hook atmci_regs_show allocates a temporary buffer for
storing a register snapshot, but it doesn't free it before returning.
Plug this leak.
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Pierre Ossman <drzeus@drzeus.cx>
Andrew Paprocki [Sat, 20 Sep 2008 08:25:19 +0000 (10:25 +0200)]
hwmon: (it87) Fix fan tachometer reading in IT8712F rev 0x7 (I)
The IT8712F v0.9.1 datasheet applies to revisions >= 0x8 (J).
The driver was incorrectly attempting to enable 16-bit fan
readings on rev 0x7 (I) which led to incorrect RPM values.
Signed-off-by: Andrew Paprocki <andrew@ishiboo.com>
Tested-by: John Gumb <john.gumb@tandberg.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Commit 4611a77 ("[IA64] fix compile failure with non modular builds")
introduced struct fdesc into asm/elf.h, which duplicates KVM's definition.
Remove the latter to avoid the build error.
Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
* git://oss.sgi.com:8090/xfs/linux-2.6:
[XFS] Don't do I/O beyond eof when unreserving space
[XFS] Fix use-after-free with buffers
[XFS] Prevent lockdep false positives when locking two inodes.
[XFS] Fix barrier status change detection.
[XFS] Prevent direct I/O from mapping extents beyond eof
[XFS] Fix regression introduced by remount fixup
[XFS] Move memory allocations for log tracing out of the critical path
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop()
RDMA/nes: Fix client side QP destroy
IB/mlx4: Fix up fast register page list format
mlx4_core: Set RAE and init mtt_sz field in FRMR MPT entries
Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: fix deadlock in setting scheduler parameter to zero
sched: fix 2.6.27-rc5 couldn't boot on tulsa machine randomly
Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
clockevents: make device shutdown robust
clocksource, acpi_pm.c: fix check for monotonicity
clockevents: remove WARN_ON which was used to gather information
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc:
Fix compile failure with non modular builds
powerpc: Holly board needs dtbImage target
powerpc: Fix interrupt values for DMA2 in MPC8610 HPCD device tree
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 5255/1: Update jornada ssp to remove build errors/warnings
[ARM] omap: back out 'internal_clock' support
[ARM] 5249/1: davinci: remove redundant check in davinci_psc_config()
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc64: Fix SMP bootup with CONFIG_STACK_DEBUG or ftrace.
sparc64: Fix OOPS in psycho_pcierr_intr_other().
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
e100: Use pci_pme_active to clear PME_Status and disable PME#
e1000: prevent corruption of EEPROM/NVM
forcedeth: call restore mac addr in nv_shutdown path
bnx2: Promote vector field in bnx2_irq structure from u16 to unsigned int
sctp: Fix oops when INIT-ACK indicates that peer doesn't support AUTH
sctp: do not enable peer features if we can't do them.
sctp: set the skb->ip_summed correctly when sending over loopback.
udp: Fix rcv socket locking
Manfred Spraul [Wed, 20 Aug 2008 13:39:59 +0000 (15:39 +0200)]
avr32: nmi_enter() without nmi_exit()
While updating the rcu code, I noticed that do_nmi() for AVR32 is odd:
There is an nmi_enter() call without an nmi_exit().
This can't be correct, it breaks rcu (at least the preempt version) and
lockdep.
[haavard.skinnemoen@atmel.com: fixed another case that returned directly]
Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
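The pairing problem in the AVR32 NMI fix above comes down to every return path leaving the enter/exit count balanced. The following user-space sketch uses hypothetical names (model_nmi_enter(), model_nmi_exit(), model_do_nmi()) and a plain counter in place of the kernel's real bookkeeping; it only illustrates the single-exit-path pattern that avoids a direct return between the two calls.

```c
#include <assert.h>

/* Nesting depth that enter/exit must leave balanced. */
int model_nmi_depth;

void model_nmi_enter(void)
{
    model_nmi_depth++;
}

void model_nmi_exit(void)
{
    model_nmi_depth--;
    assert(model_nmi_depth >= 0);
}

/*
 * Handler with one exit path. Returns 0 if a notifier handled the
 * event and 1 if default handling ran.
 */
int model_do_nmi(int handled_by_notifier)
{
    int ret;

    model_nmi_enter();
    if (handled_by_notifier) {
        ret = 0;
        goto out;  /* no direct return: still passes through exit */
    }
    ret = 1;       /* default handling would go here */
out:
    model_nmi_exit();
    return ret;
}
```

Routing early exits through a common label keeps the counter at zero after every call, whichever branch was taken.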
On AVR32, all parameters beyond the 5th are passed on the stack. System
calls don't use the stack -- they borrow a callee-saved register
instead. This means that syscalls that take 6 parameters must be called
through a stub that pushes the last parameter on the stack.
This patch adds a stub for the sync_file_range syscall on the AVR32
architecture. Tested with a uClibc snapshot.