]> pilppa.com Git - linux-2.6-omap-h63xx.git/log
linux-2.6-omap-h63xx.git
18 years ago[IA64-SGI] XPC remove unnecessary GFP_DMA flag
Jes Sorensen [Tue, 24 Jan 2006 09:23:16 +0000 (04:23 -0500)]
[IA64-SGI] XPC remove unnecessary GFP_DMA flag

Remove the GFP_DMA flag from XPC kmalloc() calls.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Acked-by: Dean Nelson <dcn@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64] sn2 maintainer update (Jes Sorensen)
Greg Edwards [Wed, 18 Jan 2006 16:21:59 +0000 (10:21 -0600)]
[IA64] sn2 maintainer update (Jes Sorensen)

We lured Jes to the dark side, and he's going to take over as the sn2
maintainer.  His experience and thoroughness will serve him well here.

Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64-SGI] add sn_feature_sets bit
Dean Roe [Tue, 24 Jan 2006 22:49:43 +0000 (14:49 -0800)]
[IA64-SGI] add sn_feature_sets bit

SGI's prom has added a new feature which avoids an Altix-specific
MCA that can occur with excessive use of ia64_pal_cache_flush.  This
patch adds the #define to the sn_feature_sets.h to reflect that bit
is taken.

Signed-off-by: Dean Roe <roe@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64] Scaling fix for simultaneous unaligned accesses
Jack Steiner [Tue, 24 Jan 2006 22:32:11 +0000 (16:32 -0600)]
[IA64] Scaling fix for simultaneous unaligned accesses

Eliminate a hot shared cacheline that occurs if multiple cpus are
taking unaligned exceptions.

Signed-off-by: Jack Steiner <steiner@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64-SGI] sn_dma_alloc_coherent should use gfp flags
Takashi Iwai [Tue, 24 Jan 2006 22:30:56 +0000 (14:30 -0800)]
[IA64-SGI] sn_dma_alloc_coherent should use gfp flags

Takashi helped us track down a bad page state bug we thought was coming
from alsa.  It turns out we weren't paying attention to the gfp flags
that were passed in to sn_dma_alloc_coherent().

From: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Edwards <edwardsg@sgi.com>
Signed-off-by: Mark Maule <maule@sgi.com>
Signed-off-by: Jes Sorensen <jes@sgi.com>
18 years ago[IA64] Set the correct default OS status in the MCA handler
Keith Owens [Tue, 24 Jan 2006 01:31:26 +0000 (12:31 +1100)]
[IA64] Set the correct default OS status in the MCA handler

sos->os_status is set to a default value of IA64_MCA_COLD_BOOT for an
MCA, but then is incorrectly overwritten with IA64_MCA_SAME_CONTEXT (0).
This makes SAL think that all MCAs have been recovered.

Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[SPARC64]: Use compat_sys_futimesat in 32-bit syscall table.
David S. Miller [Fri, 20 Jan 2006 09:49:15 +0000 (01:49 -0800)]
[SPARC64]: Use compat_sys_futimesat in 32-bit syscall table.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Linus Torvalds [Fri, 20 Jan 2006 06:19:26 +0000 (22:19 -0800)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
Linus Torvalds [Fri, 20 Jan 2006 06:17:38 +0000 (22:17 -0800)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6

18 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6
Linus Torvalds [Fri, 20 Jan 2006 06:16:58 +0000 (22:16 -0800)]
Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6

18 years ago[PATCH] Fix regression added by ppoll/pselect code.
David S. Miller [Fri, 20 Jan 2006 00:40:42 +0000 (16:40 -0800)]
[PATCH] Fix regression added by ppoll/pselect code.

The compat layer timeout handling changes in:

9f72949f679df06021c9e43886c9191494fdb007

are busted.  This is most easily seen with an X application
that uses sub-second select/poll timeout such as emacs.  You
hit a key and it takes a second or so before the app responds.

The two ROUND_UP() calls upon entry are using {tv,ts}_sec where it
should instead be using {tv_usec,ts_nsec}, which perfectly explains
the observed incorrect behavior.

Another bug shot down with git bisect.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[NETFILTER] x_tables: Make XT_ALIGN align as strictly as necessary.
David S. Miller [Fri, 20 Jan 2006 00:58:37 +0000 (16:58 -0800)]
[NETFILTER] x_tables: Make XT_ALIGN align as strictly as necessary.

Or else we break on ppc32 and other 32-bit platforms.

Based upon a patch from Harald Welte.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/sridhar/lksctp-2.6
David S. Miller [Fri, 20 Jan 2006 00:53:02 +0000 (16:53 -0800)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/sridhar/lksctp-2.6

18 years ago[IA64] eliminate softlockup warning
John Hawkes [Thu, 19 Jan 2006 07:46:53 +0000 (23:46 -0800)]
[IA64] eliminate softlockup warning

Fix an unnecessary softlockup watchdog warning in the ia64
uncached_build_memmap() that occurs occasionally at 256p and always at
512p.  The problem occurs at boot time.

Signed-off-by: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64] sem2mutex: arch/ia64/kernel/perfmon.c
Jes Sorensen [Thu, 19 Jan 2006 07:46:52 +0000 (23:46 -0800)]
[IA64] sem2mutex: arch/ia64/kernel/perfmon.c

Migrate perfmon from using an old semaphore to a completion handler.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[IA64] sem2mutex: arch/ia64/ia32/sys_ia32.c
Jes Sorensen [Thu, 19 Jan 2006 07:46:51 +0000 (23:46 -0800)]
[IA64] sem2mutex: arch/ia64/ia32/sys_ia32.c

Migrate arch/ia64/ia32/sys_ia32 to using a mutex for mmap protection.

Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
18 years ago[SPARC]: Add support for *at(), ppoll, and pselect syscalls.
David S. Miller [Thu, 19 Jan 2006 10:42:49 +0000 (02:42 -0800)]
[SPARC]: Add support for *at(), ppoll, and pselect syscalls.

This also includes by necessity _TIF_RESTORE_SIGMASK support,
which actually resulted in a lot of cleanups.

The sparc signal handling code is quite a mess and I should
clean it up some day.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SPARC]: sparc32 needs PROMDEV_{I,O}RSC defines too.
David S. Miller [Thu, 19 Jan 2006 05:57:37 +0000 (21:57 -0800)]
[SPARC]: sparc32 needs PROMDEV_{I,O}RSC defines too.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years agoMerge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Thu, 19 Jan 2006 03:37:57 +0000 (19:37 -0800)]
Merge master.kernel.org:/pub/scm/linux/kernel/git/sfrench/cifs-2.6

18 years ago[PATCH] tlclk driver update
mark gross [Mon, 16 Jan 2006 01:37:30 +0000 (17:37 -0800)]
[PATCH] tlclk driver update

some driver clean ups, and a re-posting of changes that are needed

to match the updated TPS.

Signed-off-by: Mark Gross <mark.gross@intel.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] EDAC: core EDAC support code
Alan Cox [Thu, 19 Jan 2006 01:44:13 +0000 (17:44 -0800)]
[PATCH] EDAC: core EDAC support code

This is a subset of the bluesmoke project core code, stripped of the NMI work
which isn't ready to merge and some of the "interesting" proc functionality
that needs reworking or just has no place in kernel.  It requires no core
kernel changes except the added scrub functions already posted.

The goal is to merge further functionality only after the core code is
accepted and proven in the base kernel, and only at the point the upstream
extras are really ready to merge.

From: doug thompson <norsk5@xmission.com>

  This converts EDAC to sysfs and is the final chunk neccessary before EDAC
  has a stable user space API and can be considered for submission into the
  base kernel.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: doug thompson <norsk5@xmission.com>
Signed-off-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] EDAC: drivers for Radisys 82600
Alan Cox [Thu, 19 Jan 2006 01:44:12 +0000 (17:44 -0800)]
[PATCH] EDAC: drivers for Radisys 82600

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] EDAC: drivers for Intel i82860, i82875
Alan Cox [Thu, 19 Jan 2006 01:44:10 +0000 (17:44 -0800)]
[PATCH] EDAC: drivers for Intel i82860, i82875

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] EDAC: drivers for AMD 76x and Intel E750x, E752x
Alan Cox [Thu, 19 Jan 2006 01:44:08 +0000 (17:44 -0800)]
[PATCH] EDAC: drivers for AMD 76x and Intel E750x, E752x

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] EDAC: atomic scrub operations
Alan Cox [Thu, 19 Jan 2006 01:44:07 +0000 (17:44 -0800)]
[PATCH] EDAC: atomic scrub operations

EDAC requires a way to scrub memory if an ECC error is found and the chipset
does not do the work automatically.  That means rewriting memory locations
atomically with respect to all CPUs _and_ bus masters.  That means we can't
use atomic_add(foo, 0) as it gets optimised for non-SMP

This adds a function to include/asm-foo/atomic.h for the platforms currently
supported which implements a scrub of a mapped block.

It also adjusts a few other files include order where atomic.h is included
before types.h as this now causes an error as atomic_scrub uses u32.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Add pselect/ppoll system calls on i386
David Woodhouse [Thu, 19 Jan 2006 01:44:06 +0000 (17:44 -0800)]
[PATCH] Add pselect/ppoll system calls on i386

Add the sys_pselect6() and sys_poll() calls to the i386 syscall table.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Add pselect/ppoll system call implementation
David Woodhouse [Thu, 19 Jan 2006 01:44:05 +0000 (17:44 -0800)]
[PATCH] Add pselect/ppoll system call implementation

The following implementation of ppoll() and pselect() system calls
depends on the architecture providing a TIF_RESTORE_SIGMASK flag in the
thread_info.

These system calls have to change the signal mask during their
operation, and signal handlers must be invoked using the new, temporary
signal mask. The old signal mask must be restored either upon successful
exit from the system call, or upon returning from the invoked signal
handler if the system call is interrupted. We can't simply restore the
original signal mask and return to userspace, since the restored signal
mask may actually block the signal which interrupted the system call.

The TIF_RESTORE_SIGMASK flag deals with this by causing the syscall exit
path to trap into do_signal() just as TIF_SIGPENDING does, and by
causing do_signal() to use the saved signal mask instead of the current
signal mask when setting up the stack frame for the signal handler -- or
by causing do_signal() to simply restore the saved signal mask in the
case where there is no handler to be invoked.

The first patch implements the sys_pselect() and sys_ppoll() system
calls, which are present only if TIF_RESTORE_SIGMASK is defined. That
#ifdef should go away in time when all architectures have implemented
it. The second patch implements TIF_RESTORE_SIGMASK for the PowerPC
kernel (in the -mm tree), and the third patch then removes the
arch-specific implementations of sys_rt_sigsuspend() and replaces them
with generic versions using the same trick.

The fourth and fifth patches, provided by David Howells, implement
TIF_RESTORE_SIGMASK for FR-V and i386 respectively, and the sixth patch
adds the syscalls to the i386 syscall table.

This patch:

Add the pselect() and ppoll() system calls, providing core routines usable by
the original select() and poll() system calls and also the new calls (with
their semantics w.r.t timeouts).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: use generic sys_rt_sigsuspend
Jeff Dike [Thu, 19 Jan 2006 01:44:03 +0000 (17:44 -0800)]
[PATCH] uml: use generic sys_rt_sigsuspend

Use the generic sys_rt_sigsuspend.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: add TIF_RESTORE_SIGMASK support
Jeff Dike [Thu, 19 Jan 2006 01:44:02 +0000 (17:44 -0800)]
[PATCH] uml: add TIF_RESTORE_SIGMASK support

Add support for TIF_RESTORE_SIGMASK.  I copy the i386 handling of the flag.
sys_sigsuspend is also changed to follow i386.
Also a bit of cleanup -
   turn an if into a switch
   get rid of a couple more emacs formatting comments

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] TIF_RESTORE_SIGMASK support for arch/powerpc
David Woodhouse [Thu, 19 Jan 2006 01:44:01 +0000 (17:44 -0800)]
[PATCH] TIF_RESTORE_SIGMASK support for arch/powerpc

Implement the TIF_RESTORE_SIGMASK flag in the new arch/powerpc kernel, for
both 32-bit and 64-bit system call paths.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Handle TIF_RESTORE_SIGMASK for i386
David Howells [Thu, 19 Jan 2006 01:44:00 +0000 (17:44 -0800)]
[PATCH] Handle TIF_RESTORE_SIGMASK for i386

Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled:

        [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc
        [PATCH] 3/3 Generic sys_rt_sigsuspend

It does the following:

 (1) Declares TIF_RESTORE_SIGMASK for i386.

 (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set.

 (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved
     in current->saved_sigmask.

 (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead.

 (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK
     rather than attempting to fudge the return registers.

 (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping
     intrinsically.

 (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or
     -EFAULT rather than true/false to be consistent with the rest of the
     kernel.

Due to the fact do_signal() is then only called from one place:

 (8) Makes do_signal() no longer have a return value is it was just being
     ignored; force_sig() takes care of this.

 (9) Discards the old sigmask argument to do_signal() as it's no longer
     necessary.

(10) Makes do_signal() static.

(11) Marks the second argument to do_notify_resume() as unused. The unused
     argument should remain in the middle as the arguments are passed in as
     registers, and the ordering is specific in entry.S

Given the way do_signal() is now no longer called from sys_{,rt_}sigsuspend(),
they no longer need access to the exception frame, and so can just take
arguments normally.

This patch depends on sys_rt_sigsuspend patch.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Handle TIF_RESTORE_SIGMASK for FRV
David Howells [Thu, 19 Jan 2006 01:43:59 +0000 (17:43 -0800)]
[PATCH] Handle TIF_RESTORE_SIGMASK for FRV

Handle TIF_RESTORE_SIGMASK as added by David Woodhouse's patch entitled:

        [PATCH] 2/3 Add TIF_RESTORE_SIGMASK support for arch/powerpc
        [PATCH] 3/3 Generic sys_rt_sigsuspend

It does the following:

 (1) Declares TIF_RESTORE_SIGMASK for FRV.

 (2) Invokes it over to do_signal() when TIF_RESTORE_SIGMASK is set.

 (3) Makes do_signal() support TIF_RESTORE_SIGMASK, using the signal mask saved
     in current->saved_sigmask.

 (4) Discards sys_rt_sigsuspend() from the arch, using the generic one instead.

 (5) Makes sys_sigsuspend() save the signal mask and set TIF_RESTORE_SIGMASK
     rather than attempting to fudge the return registers.

 (6) Makes sys_sigsuspend() return -ERESTARTNOHAND rather than looping
     intrinsically.

 (7) Makes setup_frame(), setup_rt_frame() and handle_signal() return 0 or
     -EFAULT rather than true/false to be consistent with the rest of the
      kernel.

Due to the fact do_signal() is then only called from one place:

 (8) Make do_signal() no longer have a return value is it was just being
     ignored; force_sig() takes care of this.

 (9) Discards the old sigmask argument to do_signal() as it's no longer
     necessary.

This patch depends on the FRV signalling patches as well as the
sys_rt_sigsuspend patch.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Generic sys_rt_sigsuspend()
David Woodhouse [Thu, 19 Jan 2006 01:43:57 +0000 (17:43 -0800)]
[PATCH] Generic sys_rt_sigsuspend()

The TIF_RESTORE_SIGMASK flag allows us to have a generic implementation of
sys_rt_sigsuspend() instead of duplicating it for each architecture.  This
provides such an implementation and makes arch/powerpc use it.

It also tidies up the ppc32 sys_sigsuspend() to use TIF_RESTORE_SIGMASK.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] vfs: *at functions: x86_64
Ulrich Drepper [Thu, 19 Jan 2006 01:43:56 +0000 (17:43 -0800)]
[PATCH] vfs: *at functions: x86_64

Wire up the x86_64 syscalls.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] vfs: *at functions: i386
Ulrich Drepper [Thu, 19 Jan 2006 01:43:55 +0000 (17:43 -0800)]
[PATCH] vfs: *at functions: i386

Wire up the x86 syscalls

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] vfs: *at functions: core
Ulrich Drepper [Thu, 19 Jan 2006 01:43:53 +0000 (17:43 -0800)]
[PATCH] vfs: *at functions: core

Here is a series of patches which introduce in total 13 new system calls
which take a file descriptor/filename pair instead of a single file
name.  These functions, openat etc, have been discussed on numerous
occasions.  They are needed to implement race-free filesystem traversal,
they are necessary to implement a virtual per-thread current working
directory (think multi-threaded backup software), etc.

We have in glibc today implementations of the interfaces which use the
/proc/self/fd magic.  But this code is rather expensive.  Here are some
results (similar to what Jim Meyering posted before).

The test creates a deep directory hierarchy on a tmpfs filesystem.  Then
rm -fr is used to remove all directories.  Without syscall support I get
this:

real    0m31.921s
user    0m0.688s
sys     0m31.234s

With syscall support the results are much better:

real    0m20.699s
user    0m0.536s
sys     0m20.149s

The interfaces are for obvious reasons currently not much used.  But they'll
be used.  coreutils (and Jeff's posixutils) are already using them.
Furthermore, code like ftw/fts in libc (maybe even glob) will also start using
them.  I expect a patch to make follow soon.  Every program which is walking
the filesystem tree will benefit.

Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@ftp.linux.org.uk>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] exportfs: add find_acceptable_alias helper
Christoph Hellwig [Thu, 19 Jan 2006 01:43:52 +0000 (17:43 -0800)]
[PATCH] exportfs: add find_acceptable_alias helper

find_exported_dentry contains two duplicate loops to find an alias that the
acceptable callback likes.  Split this out to a new helper and switch from
list_for_each to list_for_each_entry to make it more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] knfsd: Provide missing NFSv2 part of patch for checking vfs_getattr.
David Shaw [Thu, 19 Jan 2006 01:43:51 +0000 (17:43 -0800)]
[PATCH] knfsd: Provide missing NFSv2 part of patch for checking vfs_getattr.

A recent patch which checked the return status of vfs_getattr in nfsd,
completely missed the nfsproc.c (NFSv2) part.  Here is it.

This patch moved the call to vfs_getattr from the xdr encoding (at which point
it is too late to return an error) to the call handling.  This means several
calls to vfs_getattr are needed in nfsproc.c.  Many are encapsulated in
nfsd_return_attrs and nfsd_return_dirop.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] knfsd: Fix some more errno/nfserr confusion in vfs.c
NeilBrown [Thu, 19 Jan 2006 01:43:50 +0000 (17:43 -0800)]
[PATCH] knfsd: Fix some more errno/nfserr confusion in vfs.c

nfsd_sync* return an errno, which usually needs to be converted to an errno,
sometimes immediately, sometimes a little later.

Also, nfsd_setattr returns an nfserr which SHOULDN'T be converted from
an errno (because it isn't one).

Also some tidyups of the form:
  err = XX
  err = nfserrno(err)
and
  err = XX
  if (err)
      err = nfserrno(err)
become
  err = nfserrno(XX)

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4_lock() returns bogus values to clients
Al Viro [Thu, 19 Jan 2006 01:43:48 +0000 (17:43 -0800)]
[PATCH] nfsd4_lock() returns bogus values to clients

missing nfserrno() in default case of a switch by return value of
posix_lock_file(); as the result we send negative host-endian to clients that
expect positive network-endian, preferably mentioned in RFC...  BTW, that case
is not impossible - posix_lock_file() can return -ENOLCK and we do not handle
that one explicitly.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] NFSERR_SERVERFAULT returned host-endian
Al Viro [Thu, 19 Jan 2006 01:43:47 +0000 (17:43 -0800)]
[PATCH] NFSERR_SERVERFAULT returned host-endian

->rp_status is network-endian and nobody byteswaps it before sending to
client; putting NFSERR_SERVERFAULT instead of nfserr_serverfault in there is
not nice...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4_truncate() bogus return value
Al Viro [Thu, 19 Jan 2006 01:43:46 +0000 (17:43 -0800)]
[PATCH] nfsd4_truncate() bogus return value

-EINVAL (in host order, no less) is not a good thing to return to client.

nfsd4_truncate() returns it in one case and its callers expect nfs_....  from
it.  AFAICS, it should be nfserr_inval

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd/vfs.c: endianness fixes
Al Viro [Thu, 19 Jan 2006 01:43:44 +0000 (17:43 -0800)]
[PATCH] nfsd/vfs.c: endianness fixes

Several failure exits return -E<something> instead of nfserr_<something> and
vice versa.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: clean up settattr code
Fred Isaman [Thu, 19 Jan 2006 01:43:43 +0000 (17:43 -0800)]
[PATCH] nfsd4: clean up settattr code

Clean up some unnecessary special-casing in the setattr code..

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: Fix bug in rdattr_error return
Fred Isaman [Thu, 19 Jan 2006 01:43:40 +0000 (17:43 -0800)]
[PATCH] nfsd4: Fix bug in rdattr_error return

Fix bug in rdattr_error return which causes correct error code to be
overwritten by nfserr_toosmall.

Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: fix open_downgrade
J. Bruce Fields [Thu, 19 Jan 2006 01:43:38 +0000 (17:43 -0800)]
[PATCH] nfsd4: fix open_downgrade

Bad bookkeeping of the share reservations when handling open upgrades was
causing open downgrade to fail.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: don't create on open that fails due to ERR_GRACE
J. Bruce Fields [Thu, 19 Jan 2006 01:43:36 +0000 (17:43 -0800)]
[PATCH] nfsd4: don't create on open that fails due to ERR_GRACE

In an earlier patch (commit b648330a1d741d5df8a5076b2a0a2519c69c8f41) I noted
that a too-early grace-period check was preventing us from bumping the
sequence id on open.  Unfortunately in that patch I stupidly moved the
grace-period check back too far, so now an open for create can succesfully
create the file while still returning ERR_GRACE.

The correct place for that check is after we've set the open_owner and handled
any replays, but before we actually start mucking with the filesystem.

Thanks to Avishay Traeger for reporting the bug.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: simplify process-open1 logic
J. Bruce Fields [Thu, 19 Jan 2006 01:43:34 +0000 (17:43 -0800)]
[PATCH] nfsd4: simplify process-open1 logic

nfsd4_process_open1 is very highly nested; flatten it out a bit.

Also, the preceding comment, which just outlines the logic, seems redundant.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: nfs4state.c miscellaneous goto removals
J. Bruce Fields [Thu, 19 Jan 2006 01:43:33 +0000 (17:43 -0800)]
[PATCH] nfsd4: nfs4state.c miscellaneous goto removals

Remove some goto's that made the logic here a little more tortuous than
necessary.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: no replays on unconfirmed owners
J. Bruce Fields [Thu, 19 Jan 2006 01:43:32 +0000 (17:43 -0800)]
[PATCH] nfsd4: no replays on unconfirmed owners

We shouldn't check for replays until after checking whether the open owner is
confirmed.  Clients are allowed to reuse openowners without bumping the seqid.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: handle replays of failed open reclaims
J. Bruce Fields [Thu, 19 Jan 2006 01:43:30 +0000 (17:43 -0800)]
[PATCH] nfsd4: handle replays of failed open reclaims

We need to make sure open reclaims are marked confirmed immediately so that we
can handle replays even if they fail (e.g.  with a seqid-incrementing error).
(See 8.1.8.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: recovery lookup dir check
J. Bruce Fields [Thu, 19 Jan 2006 01:43:29 +0000 (17:43 -0800)]
[PATCH] nfsd4: recovery lookup dir check

Make sure we get a directory when we look up the recovery directory.

Thanks to Christoph Hellwig for the bug report.

Based on feedback from Christoph and others, we may remove the need for this
lookup and just pass in a file descriptor from userspace instead, and/or
completely move the directory handling to userspace.  For now we're just
fixing the obvious bugs.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: fix open of recovery directory
J. Bruce Fields [Thu, 19 Jan 2006 01:43:27 +0000 (17:43 -0800)]
[PATCH] nfsd4: fix open of recovery directory

We should be opening this directory RDONLY, not RDWR.

Thanks to Christoph Hellwig for the bug report.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] svcrpc: gss: svc context creation error handling
J. Bruce Fields [Thu, 19 Jan 2006 01:43:26 +0000 (17:43 -0800)]
[PATCH] svcrpc: gss: svc context creation error handling

Allow mechanisms to return more varied errors on the context creation
downcall.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] svcrpc: gss: server context init failure handling
Kevin Coffman [Thu, 19 Jan 2006 01:43:25 +0000 (17:43 -0800)]
[PATCH] svcrpc: gss: server context init failure handling

We require the server's gssd to create a completed context before asking the
kernel to send a final context init reply.  However, gssd could be buggy, or
under some bizarre circumstances we might purge the context from our cache
before we get the chance to use it here.

Handle this case by returning GSS_S_NO_CONTEXT to the client.

Also move the relevant code here to a separate function rather than nesting
excessively.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] svcrpc: gss: handle the GSS_S_CONTINUE
Andy Adamson [Thu, 19 Jan 2006 01:43:24 +0000 (17:43 -0800)]
[PATCH] svcrpc: gss: handle the GSS_S_CONTINUE

Kerberos context initiation is handled in a single round trip, but other
mechanisms (including spkm3) may require more, so we need to handle the
GSS_S_CONTINUE case in svcauth_gss_accept.  Send a null verifier.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: operation debugging
J. Bruce Fields [Thu, 19 Jan 2006 01:43:23 +0000 (17:43 -0800)]
[PATCH] nfsd4: operation debugging

Simple, useful debugging printk: print the number of each op as we process it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: fix check_for_locks
J. Bruce Fields [Thu, 19 Jan 2006 01:43:22 +0000 (17:43 -0800)]
[PATCH] nfsd4: fix check_for_locks

Fix some bad logic.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: remove release_state_owner()
J. Bruce Fields [Thu, 19 Jan 2006 01:43:21 +0000 (17:43 -0800)]
[PATCH] nfsd4: remove release_state_owner()

It's confusing having both release_stateowner() and release_state_owner().

And as it turns out, release_state_owner() is short and only called from one
place; so just remove it.

Also note the confirmed check is superfluous there--preprocess_seqid_op
already check this.

And remove a redundant comment and a superfluous line assignment while we're
at it.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: rename lk_stateowner
J. Bruce Fields [Thu, 19 Jan 2006 01:43:19 +0000 (17:43 -0800)]
[PATCH] nfsd4: rename lk_stateowner

One of the things that's confusing about nfsd4_lock is that the lk_stateowner
field could be set to either of two different lockowners: the open owner or
the lock owner.  Rename to lk_replay_owner and add a comment to make it clear
that it's used for whichever stateowner has its sequence id bumped for replay
detection.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: fix nfsd4_lock cleanup on failure
J. Bruce Fields [Thu, 19 Jan 2006 01:43:18 +0000 (17:43 -0800)]
[PATCH] nfsd4: fix nfsd4_lock cleanup on failure

release_state_owner also puts the lock owner on the close_lru.  There's no
need for that, though; replays of the failed lock would be handled by the
openowner not the lockowner.

Also consolidate the cleanup a bit, fixing leaks that can happen if errors
occur between the time a new lock owner is allocated and the lock is done.

Remove a comment and dprintk that look a little redundant.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd4: misc lock fixes
Andy Adamson [Thu, 19 Jan 2006 01:43:17 +0000 (17:43 -0800)]
[PATCH] nfsd4: misc lock fixes

Logic fixes for LOCK and UNLOCK.

- Move the permission check on the current file handle outside of
  nfs4_lock_state()

- remove the file manager fl_release_private calls; fl_ops is not set.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] svcrpc: save and restore the daddr field when request deferred
J. Bruce Fields [Thu, 19 Jan 2006 01:43:16 +0000 (17:43 -0800)]
[PATCH] svcrpc: save and restore the daddr field when request deferred

The server code currently keeps track of the destination address on every
request so that it can reply using the same address.  However we forget to do
that in the case of a deferred request.  Remedy this oversight.  >From folks
at PolyServe.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd: remove inline from a couple of large NFS functions
NeilBrown [Thu, 19 Jan 2006 01:43:14 +0000 (17:43 -0800)]
[PATCH] nfsd: remove inline from a couple of large NFS functions

These are both called from two places close together.  I could rearrange that
code so there is only one call site, but just removing the 'inline' is
probably best.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] nfsd: check error status from nfsd_sync_dir
YAMAMOTO Takashi [Thu, 19 Jan 2006 01:43:13 +0000 (17:43 -0800)]
[PATCH] nfsd: check error status from nfsd_sync_dir

Change nfsd_sync_dir to return an error if ->sync fails, and pass that error
up through the stack.  This involves a number of rearrangements of error
paths, and care to distinguish between Linux -errno numbers and NFSERR
numbers.

In the 'create' routines, we continue with the 'setattr' even if a previous
sync_dir failed.

This patch is quite different from Takashi's in a few ways, but there is still
a strong lineage.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: set type/creator for symlinks
Roman Zippel [Thu, 19 Jan 2006 01:43:12 +0000 (17:43 -0800)]
[PATCH] hfs: set type/creator for symlinks

Set the correct type and creator for symlinks.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: set correct create date for links
Roman Zippel [Thu, 19 Jan 2006 01:43:10 +0000 (17:43 -0800)]
[PATCH] hfs: set correct create date for links

HFS+ also requires the correct creation date so recent version of OS X
recognize it as link.
Improve link handling:
- if something is wrong with the link, ignore the link attribute and treat
  it as regular file (this also fixes a missing unlock during lookup).
- check for incorrect link counts during unlink.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: set correct ctime
Roman Zippel [Thu, 19 Jan 2006 01:43:09 +0000 (17:43 -0800)]
[PATCH] hfs: set correct ctime

Read the correct ctime from disk (it was written but never read for some
reason).  Read also creation date, which is used in the next patch.  (Problem
found by Olivier Castan <olivier.castan@certa.ssi.gouv.fr>)

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: add HFSX support
David Elliott [Thu, 19 Jan 2006 01:43:08 +0000 (17:43 -0800)]
[PATCH] hfs: add HFSX support

Add support for HFSX, which allows for case-sensitive filenames.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: cleanup HFS prints
Roman Zippel [Thu, 19 Jan 2006 01:43:07 +0000 (17:43 -0800)]
[PATCH] hfs: cleanup HFS prints

Add the log level and a "hfs: " prefix to all kernel prints.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] hfs: cleanup HFS+ prints
Roman Zippel [Thu, 19 Jan 2006 01:43:05 +0000 (17:43 -0800)]
[PATCH] hfs: cleanup HFS+ prints

Add the log level and a "hfs: " prefix to all kernel prints.  (HFS and HFS+
will use the same prefix, as they share some code and could be merged at some
point.)

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] add missing syscall declarations
Arnd Bergmann [Thu, 19 Jan 2006 01:43:04 +0000 (17:43 -0800)]
[PATCH] add missing syscall declarations

All standard system calls should be declared in include/linux/syscalls.h.

Add some of the new additions that were previously missed.

Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix sched_setscheduler semantics
Jason Baron [Thu, 19 Jan 2006 01:43:03 +0000 (17:43 -0800)]
[PATCH] fix sched_setscheduler semantics

Currently, a negative policy argument passed into the
'sys_sched_setscheduler()' system call, will return with success.  However,
the manpage for 'sys_sched_setscheduler' says:

EINVAL The scheduling policy is not one of the recognized policies, or the
              parameter p does not make sense for the policy.

Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] v9fs: add readpage support
Eric Van Hensbergen [Thu, 19 Jan 2006 01:43:02 +0000 (17:43 -0800)]
[PATCH] v9fs: add readpage support

v9fs mmap support was originally removed from v9fs at Al Viro's request,
but recently there have been requests from folks who want readpage
functionality (primarily to enable execution of files mounted via 9P).
This patch adds readpage support (but not writepage which contained most of
the objectionable code).  It passes fsx-linux (and other regressions) so it
should be relatively safe.

Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml ubd code: fix a bit of whitespace
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:43:01 +0000 (17:43 -0800)]
[PATCH] uml ubd code: fix a bit of whitespace

Correct a bit of whitespace problems while working here.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: allow again to move backing file and to override saved location
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:43:00 +0000 (17:43 -0800)]
[PATCH] uml: allow again to move backing file and to override saved location

When the user specifies both a COW file and its backing file, if the previous
backing file is not found, currently UML tries again to use it and fails.

This can be corrected by changing same_backing_files() return value in that
case, so that the caller will try to change the COW file to point to the new
location, as already done in other cases.

Additionally, given the change in the meaning of the func, change its name,
invert its return value, so all values are inverted except when
stat(from_cow,&buf2) fails.  And add some comments and two minor bugfixes -
remove a fd leak (return err rather than goto out) and a repeated check.

Tested well.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: arch Kconfig menu cleanups
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:59 +0000 (17:42 -0800)]
[PATCH] uml: arch Kconfig menu cleanups

*) mark as "EXPERIMENTAL" various items that either aren't very stable or
   that are actively crashing the setup of users which don't really need them
   (i.e.  HIGHMEM and 3-level pagetables on x86 - nobody needs either,
   everybody reports "I'm using it and getting trouble").

*) move net/Kconfig near to the rest of network configurations, and
   drivers/block/Kconfig near "Block layer" submenu.

*) it's useless and doesn't work well to force NETDEVICES on and to disable
   the prompt like it's done.  Better remove the attempt, and change that to a
   simple "default y if UML".

*) drop the warning about "report problems about HPPFS" - it's redundant
   anyway, as that's the usual procedure, and HPPFS users are especially
   technical (i.e.  they know reporting bugs is _good_).

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: avoid malloc to sleep in atomic sections
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:58 +0000 (17:42 -0800)]
[PATCH] uml: avoid malloc to sleep in atomic sections

Ugly trick to help make malloc not sleeping - we can't do anything else.  But
this is not yet optimal, since spinlock don't trigger in_atomic() when
preemption is disabled.

Also, even if ugly, this was already used in one place, and was even more
bogus.  Fix it.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: sigio code - reduce spinlock hold time
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:57 +0000 (17:42 -0800)]
[PATCH] uml: sigio code - reduce spinlock hold time

In a previous patch I shifted an allocation to being atomic.

In this patch, a better but more intrusive solution is implemented, i.e.  hold
the lock only when really needing it, especially not over pipe operations, nor
over the culprit allocation.

Additionally, while at it, add a missing kfree in the failure path, and make
sure that if we fail in forking, write_sigio_pid is -1 and not, say, -ENOMEM.

And fix whitespace, at least for things I was touching anyway.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: fix spinlock recursion and sleep-inside-spinlock in error path
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:56 +0000 (17:42 -0800)]
[PATCH] uml: fix spinlock recursion and sleep-inside-spinlock in error path

In this error path, when the interface has had a problem, we call dev_close(),
which is disallowed for two reasons:

*) takes again the UML internal spinlock, inside the ->stop method of this
   device
*) can be called in process context only, while we're in interrupt context.

I've also thought that calling dev_close() may be a wrong policy to follow,
but it's not up to me to decide that.

However, we may end up with multiple dev_close() queued on the same device.
But the initial test for (dev->flags & IFF_UP) makes this harmless, though -
and dev_close() is supposed to care about races with itself.  So there's no
harm in delaying the shutdown, IMHO.

Something to mark the interface as "going to shutdown" would be appreciated,
but dev_deactivate has the same problems as dev_close(), so we can't use it
either.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: networking - clear transport-specific structure
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:55 +0000 (17:42 -0800)]
[PATCH] uml: networking - clear transport-specific structure

Pre-clear transport-specific private structure before passing it down.

In fact, I just got a slab corruption and kernel panic on exit because kfree()
was called on a pointer which probably was never allocated, BUT hadn't been
set to NULL by the driver.

As the code is full of such errors, I've decided for now to go the safe way
(we're talking about drivers), and to do the simple thing.  I'm also starting
to fix drivers, and already sent a patch for the daemon transport.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: make daemon transport behave properly
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:53 +0000 (17:42 -0800)]
[PATCH] uml: make daemon transport behave properly

Avoid uninitialized data in the daemon_data structure.  I used this transport
before doing proper setup before-hand, and I got some very nice SLAB
corruption due to freeing crap pointers.  So just make sure to clear
everything when appropriate.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: remove leftover from patch revertal
Paolo 'Blaisorblade' Giarrusso [Thu, 19 Jan 2006 01:42:52 +0000 (17:42 -0800)]
[PATCH] uml: remove leftover from patch revertal

I added this line to share this file with UML, but now it's no longer
shared so remove this useless leftover.

Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Acked-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: TT mode softint fixes
Bodo Stroesser [Thu, 19 Jan 2006 01:42:51 +0000 (17:42 -0800)]
[PATCH] uml: TT mode softint fixes

Some fixes to make softints work in tt mode.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: use setjmp/longjmp instead of sigsetjmp/siglongjmp
Jeff Dike [Thu, 19 Jan 2006 01:42:50 +0000 (17:42 -0800)]
[PATCH] uml: use setjmp/longjmp instead of sigsetjmp/siglongjmp

Now that we are doing soft interrupts, there's no point in using sigsetjmp and
siglongjmp.  Using setjmp and longjmp saves a sigprocmask on every jump.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: implement soft interrupts
Jeff Dike [Thu, 19 Jan 2006 01:42:49 +0000 (17:42 -0800)]
[PATCH] uml: implement soft interrupts

This patch implements soft interrupts.  Interrupt enabling and disabling no
longer map to sigprocmask.  Rather, a flag is set indicating whether
interrupts may be handled.  If a signal comes in and interrupts are marked as
OK, then it is handled normally.  If interrupts are marked as off, then the
signal handler simply returns after noting that a signal needs handling.  When
interrupts are enabled later on, this pending signals flag is checked, and the
IRQ handlers are called at that point.

The point of this is to reduce the cost of local_irq_save et al, since they
are very much more common than the signals that they are enabling and
disabling.  Soft interrupts produce a speed-up of ~25% on a kernel build.

Subtleties -

    UML uses sigsetjmp/siglongjmp to switch contexts.  sigsetjmp has been
    wrapped in a save_flags-like macro which remembers the interrupt state at
    setjmp time, and restores it when it is longjmp-ed back to.

    The enable_signals function has to loop because the IRQ handler
    disables interrupts before returning.  enable_signals has to return with
    signals enabled, and signals may come in between the disabling and the
    return to enable_signals.  So, it loops for as long as there are pending
    signals, ensuring that signals are enabled when it finally returns, and
    that there are no pending signals that need to be dealt with.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: eliminate some globals
Jeff Dike [Thu, 19 Jan 2006 01:42:48 +0000 (17:42 -0800)]
[PATCH] uml: eliminate some globals

Stop using global variables to hold the file descriptor and offset used to map
the skas0 stubs.  Instead, calculate them using the page physical addresses.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move libc-dependent skas process handling
Gennady Sharapov [Thu, 19 Jan 2006 01:42:46 +0000 (17:42 -0800)]
[PATCH] uml: move libc-dependent skas process handling

The serial UML OS-abstraction layer patch (um/kernel/skas dir).

This moves all systemcalls from skas/process.c file under os-Linux dir and
join skas/process.c and skas/process_kern.c files.

Signed-off-by: Gennady Sharapov <gennady.v.sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move libc-dependent skas memory mapping code
Gennady Sharapov [Thu, 19 Jan 2006 01:42:45 +0000 (17:42 -0800)]
[PATCH] uml: move libc-dependent skas memory mapping code

The serial UML OS-abstraction layer patch (um/kernel/skas dir).

This moves all systemcalls from skas/mem_user.c file under os-Linux dir and
join skas/mem_user.c and skas/mem.c files.

Signed-off-by: Gennady Sharapov <gennady.v.sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move headers to arch/um/include
Gennady Sharapov [Thu, 19 Jan 2006 01:42:44 +0000 (17:42 -0800)]
[PATCH] uml: move headers to arch/um/include

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves skas headers to arch/um/include.

Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: change interface to boot_timer_handler
Bodo Stroesser [Thu, 19 Jan 2006 01:42:43 +0000 (17:42 -0800)]
[PATCH] uml: change interface to boot_timer_handler

Current implementation of boot_timer_handler isn't usable for s390.  So I
changed its name to do_boot_timer_handler, taking (struct sigcontext *)sc as
argument.  do_boot_timer_handler is called from new boot_timer_handler() in
arch/um/os-Linux/signal.c, which uses the same mechanisms as other signal
handler to find out sigcontext pointer.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move libc-dependent time code
Gennady Sharapov [Thu, 19 Jan 2006 01:42:42 +0000 (17:42 -0800)]
[PATCH] uml: move libc-dependent time code

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all systemcalls from time.c file under os-Linux dir and joins
time.c and tine_kernel.c files

Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move libc-dependent utility procedures
Gennady Sharapov [Thu, 19 Jan 2006 01:42:41 +0000 (17:42 -0800)]
[PATCH] uml: move libc-dependent utility procedures

The serial UML OS-abstraction layer patch (um/kernel dir).

This moves all systemcalls from user_util.c file under os-Linux dir

Signed-off-by: Gennady Sharapov <Gennady.V.Sharapov@intel.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: move LDT creation
Bodo Stroesser [Thu, 19 Jan 2006 01:42:39 +0000 (17:42 -0800)]
[PATCH] uml: move LDT creation

s390 doesn't have a LDT.  So MM_COPY_SEGMENTS will not be supported on s390.

The only user of MM_COPY_SEGMENTS is new_mm(), but that's no longer useful, as
arch/sys-i386/ldt.c defines init_new_ldt(), which is called immediately after
new_mm().  So we should copy host's LDT in init_new_ldt(), if /proc/mm is
available, to have this subarch specific call in subarch code.

Signed-off-by: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] uml: add __raw_writel definition
Jeff Dike [Thu, 19 Jan 2006 01:42:38 +0000 (17:42 -0800)]
[PATCH] uml: add __raw_writel definition

Add implementations of the write* and __raw_write* functions.  __raw_writel is
needed by lib/iocopy.c, which shouldn't be used in UML, but which is
unconditionally linked in anyway.

Signed-off-by: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] mm: optimize numa policy handling in slab allocator
Christoph Lameter [Thu, 19 Jan 2006 01:42:37 +0000 (17:42 -0800)]
[PATCH] mm: optimize numa policy handling in slab allocator

Move the interrupt check from slab_node into ___cache_alloc and adds an
"unlikely()" to avoid pipeline stalls on some architectures.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] NUMA policies in the slab allocator V2
Christoph Lameter [Thu, 19 Jan 2006 01:42:36 +0000 (17:42 -0800)]
[PATCH] NUMA policies in the slab allocator V2

This patch fixes a regression in 2.6.14 against 2.6.13 that causes an
imbalance in memory allocation during bootup.

The slab allocator in 2.6.13 is not numa aware and simply calls
alloc_pages().  This means that memory policies may control the behavior of
alloc_pages().  During bootup the memory policy is set to MPOL_INTERLEAVE
resulting in the spreading out of allocations during bootup over all
available nodes.  The slab allocator in 2.6.13 has only a single list of
slab pages.  As a result the per cpu slab cache and the spinlock controlled
page lists may contain slab entries from off node memory.  The slab
allocator in 2.6.13 makes no effort to discern the locality of an entry on
its lists.

The NUMA aware slab allocator in 2.6.14 controls locality of the slab pages
explicitly by calling alloc_pages_node().  The NUMA slab allocator manages
slab entries by having lists of available slab pages for each node.  The
per cpu slab cache can only contain slab entries associated with the node
local to the processor.  This guarantees that the default allocation mode
of the slab allocator always assigns local memory if available.

Setting MPOL_INTERLEAVE as a default policy during bootup has no effect
anymore.  In 2.6.14 all node unspecific slab allocations are performed on
the boot processor.  This means that most of key data structures are
allocated on one node.  Most processors will have to refer to these
structures making the boot node a potential bottleneck.  This may reduce
performance and cause unnecessary memory pressure on the boot node.

This patch implements NUMA policies in the slab layer.  There is the need
of explicit application of NUMA memory policies by the slab allcator itself
since the NUMA slab allocator does no longer let the page_allocator control
locality.

The check for policies is made directly at the beginning of __cache_alloc
using current->mempolicy.  The memory policy is already frequently checked
by the page allocator (alloc_page_vma() and alloc_page_current()).  So it
is highly likely that the cacheline is present.  For MPOL_INTERLEAVE
kmalloc() will spread out each request to one node after another so that an
equal distribution of allocations can be obtained during bootup.

It is not possible to push the policy check to lower layers of the NUMA
slab allocator since the per cpu caches are now only containing slab
entries from the current node.  If the policy says that the local node is
not to be preferred or forbidden then there is no point in checking the
slab cache or local list of slab pages.  The allocation better be directed
immediately to the lists containing slab entries for the allowed set of
nodes.

This way of applying policy also fixes another strange behavior in 2.6.13.
alloc_pages() is controlled by the memory allocation policy of the current
process.  It could therefore be that one process is running with
MPOL_INTERLEAVE and would f.e.  obtain a new page following that policy
since no slab entries are in the lists anymore.  A page can typically be
used for multiple slab entries but lets say that the current process is
only using one.  The other entries are then added to the slab lists.  These
are now non local entries in the slab lists despite of the possible
availability of local pages that would provide faster access and increase
the performance of the application.

Another process without MPOL_INTERLEAVE may now run and expect a local slab
entry from kmalloc().  However, there are still these free slab entries
from the off node page obtained from the other process via MPOL_INTERLEAVE
in the cache.  The process will then get an off node slab entry although
other slab entries may be available that are local to that process.  This
means that the policy if one process may contaminate the locality of the
slab caches for other processes.

This patch in effect insures that a per process policy is followed for the
allocation of slab entries and that there cannot be a memory policy
influence from one process to another.  A process with default policy will
always get a local slab entry if one is available.  And the process using
memory policies will get its memory arranged as requested.  Off-node slab
allocation will require the use of spinlocks and will make the use of per
cpu caches not possible.  A process using memory policies to redirect
allocations offnode will have to cope with additional lock overhead in
addition to the latency added by the need to access a remote slab entry.

Changes V1->V2
- Remove #ifdef CONFIG_NUMA by moving forward declaration into
  prior #ifdef CONFIG_NUMA section.

- Give the function determining the node number to use a saner
  name.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sem2mutex: mm/slab.c
Ingo Molnar [Thu, 19 Jan 2006 01:42:33 +0000 (17:42 -0800)]
[PATCH] sem2mutex: mm/slab.c

Convert mm/swapfile.c's swapon_sem to swapon_mutex.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Zone reclaim: proc override
Christoph Lameter [Thu, 19 Jan 2006 01:42:32 +0000 (17:42 -0800)]
[PATCH] Zone reclaim: proc override

proc support for zone reclaim

This patch creates a proc entry /proc/sys/vm/zone_reclaim_mode that may be
used to override the automatic determination of the zone reclaim made on
bootup.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Zone reclaim: Reclaim logic
Christoph Lameter [Thu, 19 Jan 2006 01:42:31 +0000 (17:42 -0800)]
[PATCH] Zone reclaim: Reclaim logic

Some bits for zone reclaim exists in 2.6.15 but they are not usable.  This
patch fixes them up, removes unused code and makes zone reclaim usable.

Zone reclaim allows the reclaiming of pages from a zone if the number of
free pages falls below the watermarks even if other zones still have enough
pages available.  Zone reclaim is of particular importance for NUMA
machines.  It can be more beneficial to reclaim a page than taking the
performance penalties that come with allocating a page on a remote zone.

Zone reclaim is enabled if the maximum distance to another node is higher
than RECLAIM_DISTANCE, which may be defined by an arch.  By default
RECLAIM_DISTANCE is 20.  20 is the distance to another node in the same
component (enclosure or motherboard) on IA64.  The meaning of the NUMA
distance information seems to vary by arch.

If zone reclaim is not successful then no further reclaim attempts will
occur for a certain time period (ZONE_RECLAIM_INTERVAL).

This patch was discussed before. See

http://marc.theaimsgroup.com/?l=linux-kernel&m=113519961504207&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=113408418232531&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=113389027420032&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=113380938612205&w=2

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>