Paul Mundt [Fri, 2 Nov 2007 03:16:51 +0000 (12:16 +0900)]
sh: Decouple 4k and soft/hardirq stacks.
While using separate IRQ stacks can cut down on stack consumption,
many users can also use 4k stacks directly without the additional
need of separate stacks for soft and hardirqs.
With this split, we support the same rationale for 4KSTACKS as
m68knommu, with the IRQSTACKS abstraction as per ppc64.
Stuart Menefy [Fri, 2 Nov 2007 03:14:09 +0000 (12:14 +0900)]
sh: Fix optimized __copy_user() movca.l usage.
movca.l is restricted to SH-4 and up only, though compilers that
are unable to support ISA tuning (especially older versions of
binutils) will happily compile in the bogus opcode on older parts.
Conditionalize it to fix SH-3 regressions noted by Kristoffer.
Signed-off-by: Stuart Menefy <stuart.menefy@st.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
Paul Mundt [Wed, 31 Oct 2007 06:22:45 +0000 (15:22 +0900)]
sh: Clean up SR.RB Kconfig mess.
CPU_HAS_SR_RB is selected by both CPU_SH3 and CPU_SH4, so having a
dependency and default y on those additionally doesn't make much sense.
The select also has to be special cased for CPUs that don't support
this.
This is also something that has been abused too much as a result
of being user-visible, hence the addition of the select in the first
place. So just kill the user-visibility entirely while we're at it.
Paul Mundt [Tue, 30 Oct 2007 08:38:03 +0000 (17:38 +0900)]
sh: linker script tidying.
Some cleanups to the SH linker script. This reorders some of the
data sections for more optimal placement, general tabification,
and plugging in omitted generic definitions.
Paul Mundt [Tue, 30 Oct 2007 08:25:29 +0000 (17:25 +0900)]
sh: Kill off legacy embedded ramdisk section.
When the SH kernel used to support embedding a ramdisk in the
pre-initramfs days it was placed in a special section and made to
look like a regular initrd. Since that was removed ages ago, kill
off the remaining cruft that was missed.
Paul Mundt [Tue, 30 Oct 2007 08:18:08 +0000 (17:18 +0900)]
sh: Fix up early mem cmdline parsing.
memory_end was being clobbered by whatever the kernel config had
specified, rather than obeying the setup option. Fix this up so
that memory_end is only initialized if nothing has been set on
the command line.
Manuel Lauss [Tue, 30 Oct 2007 00:54:12 +0000 (09:54 +0900)]
sh: fix zImage build with >=binutils-2.18
Starting with binutils somewhere around 2.17.50.14 the vmlinux file
contains a ".note.gnu.build-id" section which doesn't get removed when
the zImage is built, resulting in a 2GB intermediate file and a broken
zImage.
Signed-off-by: Manuel Lauss <mano@roarinelk.homelinux.net> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
So as you can see the phys_to_page() macro doesn't wrap the 'phys'
parameter in parentheses, so we end up with:
	pte_val(x)&PTE_PHYS_MASK >> PAGE_SHIFT
which is not what we wanted, as '>>' has a higher precedence than bitwise
AND. I dug into the git repository and I believe this bug was added with
this commit (104b8deaa5c0144cccfc7d914413ff80c7176af1):
2006-03-27 KAMEZAWA Hiroyuki [PATCH] unify pfn_to_page: sh pfn_to_page
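The precedence problem can be reproduced in plain C. The following is only a sketch: the DEMO_* macros, the mask value and the helper functions are made up for illustration and are not the SH definitions.

```c
#include <stdint.h>

/* Hypothetical values, chosen only to make the effect visible. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PHYS_MASK  0x1ffff000u

/* Argument not parenthesized: 'a & b' expands to 'a & b >> 12'. */
#define DEMO_PFN_BUGGY(phys)  (phys >> DEMO_PAGE_SHIFT)
/* Argument parenthesized: the mask is applied before the shift. */
#define DEMO_PFN_FIXED(phys)  ((phys) >> DEMO_PAGE_SHIFT)

/* Evaluates as pte & (DEMO_PHYS_MASK >> DEMO_PAGE_SHIFT). */
uint32_t demo_pfn_buggy(uint32_t pte)
{
	return DEMO_PFN_BUGGY(pte & DEMO_PHYS_MASK);
}

/* Evaluates as (pte & DEMO_PHYS_MASK) >> DEMO_PAGE_SHIFT. */
uint32_t demo_pfn_fixed(uint32_t pte)
{
	return DEMO_PFN_FIXED(pte & DEMO_PHYS_MASK);
}
```

With these made-up values, the buggy form masks the unshifted value with the shifted mask, so the two helpers return different results for the same input.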
* git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched:
sched: fix style in kernel/sched.c
sched: fix style of swap() macro in kernel/sched_fair.c
sched: report CPU usage in CFS cgroup directories
sched: move rcu_head to task_group struct
sched: fix incorrect assumption that cpu 0 exists
sched: keep utime/stime monotonic
sched: make kernel/sched.c:account_guest_time() static
First off, testing in Fedora has shown it to cause boot failures,
bisected down by Martin Ebourne, and reported by Dave Jones. So the
commit will likely be reverted in the 2.6.23 stable kernels.
Secondly, in the 2.6.24 model, x86-64 has now grown support for
SPARSEMEM_VMEMMAP, which disables the relevant code anyway, so while the
bug is not visible any more, it's become invisible due to the code just
being irrelevant and no longer enabled on the only architecture that
this ever affected.
Reported-by: Dave Jones <davej@redhat.com> Tested-by: Martin Ebourne <fedora@ebourne.me.uk> Cc: Zou Nan hai <nanhai.zou@intel.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Acked-by: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
sched: clean up code under CONFIG_FAIR_GROUP_SCHED
Introduced an assumption of the existence of CPU0 via this line
cfs_rq = tg->cfs_rq[0];
If you have no CPU0, that will be NULL. The fix seems to be just to
take whatever cfs_rq queue comes out of the for_each_possible_cpu()
loop, since they're all equally good for the destruction operation.
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
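A rough sketch of the idea, outside the kernel: instead of indexing slot 0, keep whatever per-CPU queue the loop over possible CPUs visits last. The demo_* types, the bitmask standing in for the possible-CPU map and DEMO_NR_CPUS are all hypothetical; this is not the scheduler code.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4

struct demo_cfs_rq { int id; };

/* Hypothetical task group: slots of CPUs that are not possible stay NULL. */
struct demo_task_group {
	struct demo_cfs_rq *cfs_rq[DEMO_NR_CPUS];
};

/* Returns the queue of the last possible CPU visited, or NULL if there is none. */
struct demo_cfs_rq *demo_pick_cfs_rq(const struct demo_task_group *tg,
				     unsigned int possible_mask)
{
	struct demo_cfs_rq *rq = NULL;
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (!(possible_mask & (1u << cpu)))
			continue;	/* skip CPUs that cannot exist */
		rq = tg->cfs_rq[cpu];
	}
	return rq;
}
```

On a system whose possible CPUs are 1 and 2 only, slot 0 is NULL while the loop still yields a usable queue.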
Peter Zijlstra [Mon, 29 Oct 2007 20:18:11 +0000 (21:18 +0100)]
sched: keep utime/stime monotonic
keep utime/stime monotonic.
cpustats use utime/stime as a ratio against sum_exec_runtime; as a
consequence it can happen - when the ratio changes faster than time
accumulates - that either can appear to go backwards.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
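One way to picture the fix is a clamp against the largest value reported so far. The sketch below assumes a hypothetical demo_task structure and helper name; it illustrates the technique and is not a copy of the kernel change.

```c
#include <stdint.h>

/* Hypothetical per-task bookkeeping; not the kernel's task_struct. */
struct demo_task {
	uint64_t utime;            /* tick-sampled user time */
	uint64_t stime;            /* tick-sampled system time */
	uint64_t sum_exec_runtime; /* precise total runtime */
	uint64_t prev_utime;       /* largest user time reported so far */
};

/* Scale the precise runtime by the utime share, then clamp. */
uint64_t demo_task_utime(struct demo_task *t)
{
	uint64_t total = t->utime + t->stime;
	uint64_t scaled = t->utime;

	if (total != 0)
		scaled = t->sum_exec_runtime * t->utime / total;

	/* Never report less than an earlier call did. */
	if (scaled > t->prev_utime)
		t->prev_utime = scaled;
	return t->prev_utime;
}
```

If stime jumps between two calls, the scaled value drops, but the caller still sees the earlier, larger figure until real user time catches up.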
Ralf Baechle [Mon, 8 Oct 2007 15:38:37 +0000 (16:38 +0100)]
[MIPS] MT: Fix bug in multithreaded kernels.
When GDB writes a breakpoint into the address space of the inferior process,
the kernel needs to invalidate the modified memory in the inferior. This
is done by calling flush_cache_page, which in turn calls
r4k_flush_cache_page and, for VSMP or SMTC kernels,
local_r4k_flush_cache_page via r4k_on_each_cpu().
As the VSMP and SMTC SMP kernels for 34K are running on a single shared
cache, it is possible to get away without interprocessor function calls.
This optimization is implemented in r4k_on_each_cpu, so
local_r4k_flush_cache_page is only ever called on the local CPU.
This is where the following code in local_r4k_flush_cache_page() strikes:
	/*
	 * If ownes no valid ASID yet, cannot possibly have gotten
	 * this page into the cache.
	 */
	if (cpu_context(smp_processor_id(), mm) == 0)
		return;
On VSMP and SMTC, cpu_context() holds a separate value for each CPU (TC).
So if the CPU executing local_r4k_flush_cache_page has not accessed the mm
but one of the other CPUs has, there may be data to be flushed in the
cache, yet local_r4k_flush_cache_page will falsely return early, leaving
the I-cache inconsistent for the breakpoint.
While the issue was discovered with GDB it also exists in
local_r4k_flush_cache_range() and local_r4k_flush_cache().
Fixed by introducing a new function has_valid_asid which on MT kernels
returns true if a mm is active on any processor in the system.
This is relatively expensive since for memory accesses in that loop
cache misses have to be assumed but it seems the most viable solution
for 2.6.23 and older -stable kernels.
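The difference between the two checks can be sketched as follows. The demo_mm structure, DEMO_NR_CPUS and both helper names are stand-ins invented for this illustration; the real has_valid_asid operates on the kernel's mm and CPU data.

```c
#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for an mm's per-CPU (per-TC) ASID contexts. */
struct demo_mm {
	unsigned long context[DEMO_NR_CPUS];
};

/* Local-only check: looks just at the CPU (TC) running the flush. */
int demo_local_has_asid(const struct demo_mm *mm, int this_cpu)
{
	return mm->context[this_cpu] != 0;
}

/* MT-style check: true if any CPU holds a valid ASID for this mm. */
int demo_has_valid_asid(const struct demo_mm *mm)
{
	int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		if (mm->context[cpu] != 0)
			return 1;
	return 0;
}
```

When only CPU 1 has touched the mm and the flush runs on CPU 0, the local check says there is nothing to flush while the loop over all CPUs says otherwise. The loop is also where the extra cost mentioned above comes from.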
Contrary to the belief of some, the R3000 and related processors did have
caches, both a data and an instruction cache. Here is an implementation
of r3k_flush_cache_page(), which is the processor-specific back-end for
flush_cache_range(), done according to the spec in
Documentation/cachetlb.txt.
While at it, remove an unused local function: get_phys_page(), do some
trivial formatting fixes and modernise debugging facilities.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Atsushi Nemoto [Thu, 25 Oct 2007 15:53:02 +0000 (00:53 +0900)]
[MIPS] Store sign-extend register values for PTRACE_GETREGS
A comment on ptrace_getregs() states "Registers are sign extended to
fill the available space." but it is not true. Fix code to match the
comment. Also fix casts on each caller to get rid of some warnings.
David Daney [Sun, 28 Oct 2007 06:10:20 +0000 (23:10 -0700)]
[MIPS] Add len and addr validation for MAP_FIXED mappings.
Mmap with MAP_FIXED was not validating the addr and len parameters. This
leads to the failure of GCC's gcc.c-torture/execute/loop-2[fg].c testcases
when using the o32 ABI on a 64 bit kernel.
These testcases try to mmap 65536 bytes at 0x7fff8000 and then access all
the memory. In 2.6.18 and 2.6.23.1 (and likely other versions as well)
the kernel maps the requested memory, but since half of it is above
0x80000000 a SIGBUS is generated when it is accessed.
This patch moves the len validation above the MAP_FIXED processing so that
it is always validated. It also adds validation to the addr parameter for
MAP_FIXED mappings.
Signed-off-by: David Daney <ddaney@avtrex.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
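The kind of range check involved can be sketched like this. DEMO_TASK_SIZE32 is an assumed limit taken from the 0x80000000 boundary mentioned above, and demo_fixed_range_ok is a hypothetical helper, not the function the patch touches.

```c
#include <stdint.h>

/* Assumed o32 user address-space limit (the 0x80000000 boundary). */
#define DEMO_TASK_SIZE32 0x80000000ull

/* Returns 1 if [addr, addr + len) lies entirely below the limit, else 0. */
int demo_fixed_range_ok(uint64_t addr, uint64_t len)
{
	if (len > DEMO_TASK_SIZE32)
		return 0;
	/* Written as a subtraction so that addr + len cannot wrap. */
	if (addr > DEMO_TASK_SIZE32 - len)
		return 0;
	return 1;
}
```

For the testcase described above, 0x7fff8000 plus 65536 bytes ends at 0x80008000, so such a check rejects the request instead of letting half of the mapping land above the boundary.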
They break the timer interrupt initialization and only seem to be a kludge
for initialization happening in the wrong order. Further testing done by
Thiemo confirms the suspicion that the other invocations also seem to be
useless.
Atsushi Nemoto [Wed, 24 Oct 2007 16:34:09 +0000 (01:34 +0900)]
[MIPS] txx9tmr clockevent/clocksource driver
Convert jmr3927_clock_event_device to more generic
txx9tmr_clock_event_device which supports one-shot mode. The
txx9tmr_clock_event_device can be used for TX49 too if the cp0 timer
interrupt was not available.
Convert jmr3927_hpt_read to txx9_clocksource driver which does not
depend on jiffies anymore. The txx9_clocksource itself can be used for
TX49, but normally TX49 uses higher precision clocksource_mips.
Ralf Baechle [Fri, 26 Oct 2007 12:43:32 +0000 (13:43 +0100)]
[MIPS] time: Merge lasat plat_timer_setup into plat_time_init.
Since the cp0 compare interrupt handler isn't initialized by the time
plat_time_init is called don't set IE_IRQ5 anymore, cevt-r4k.c will do
that a little later itself.
Yoichi Yuasa [Tue, 23 Oct 2007 09:19:13 +0000 (18:19 +0900)]
[MIPS] time: Use non-interrupt locks in GT641xx clockevent driver
set_next_event() and set_mode() are always called with interrupts disabled,
so the irqsave and irqrestore spinlock variants are not necessary.
Pointed out by Atsushi Nemoto.
Kevin D. Kissell [Wed, 21 Mar 2007 12:28:37 +0000 (13:28 +0100)]
[MIPS] SMTC: Allow control over TC assignment to vpe0.
Modify the SMTC initialization code to allow boot-time specification not
only of how many VPEs and TCs to use, but also how many TCs out of the
allowed pool are to be bound to VPE 0. The new boot option is "vpe0tcs=N",
where N is an integer. Using it in combination with the existing options
allows arbitrary assignments across the 2 VPEs of a 34K. e.g. "maxtcs=3
vpe0tcs=1" forces VPE0 to have 1 TC, while VPE1 has 2, and "maxtcs=4
vpe0tcs=3" forces VPE0 to have 3 TCs, while VPE1 gets 1. If no vpe0tcs
option is specified, the traditional algorithm of evenly dividing TCs
between available VPEs, with the odd "slop" going to VPE0, is retained.
The reason for doing this is to allow a finer balancing of TCs which can
handle I/O interrupts on Malta (those on VPE 0) and those which cannot.
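The assignment rule can be sketched as a small function. demo_tcs_on_vpe0 is a hypothetical helper; passing a negative value to mean "option not given" and clamping the request to the number of TCs are assumptions of this sketch, not behaviour stated above.

```c
/*
 * Hypothetical helper: number of TCs bound to VPE 0.
 * ntcs    - TCs in use (e.g. from maxtcs=)
 * nvpes   - VPEs in use
 * vpe0tcs - requested count for VPE 0, or negative if not specified
 */
int demo_tcs_on_vpe0(int ntcs, int nvpes, int vpe0tcs)
{
	if (vpe0tcs >= 0)
		return vpe0tcs < ntcs ? vpe0tcs : ntcs;	/* assumed clamp */

	/* Traditional split: even share, remainder ("slop") to VPE 0. */
	return ntcs / nvpes + ntcs % nvpes;
}
```

With two VPEs, "maxtcs=3 vpe0tcs=1" gives VPE 0 one TC and leaves two for VPE 1, while three TCs without vpe0tcs put two on VPE 0.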
Linus Torvalds [Mon, 29 Oct 2007 19:12:34 +0000 (12:12 -0700)]
Merge branch 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'alpm' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] AHCI: add hw link power management support
[libata] Link power management infrastructure
Linus Torvalds [Mon, 29 Oct 2007 19:11:54 +0000 (12:11 -0700)]
Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev
* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
[libata] AHCI: fix newly introduced host-reset bug
[libata] sata_nv: fix SWNCQ enabling
libata: add MAXTOR 7V300F0/VA111900 to NCQ blacklist
libata: no need to speed down if already at PIO0
libata: relocate forcing PIO0 on reset
pata_ns87415: define SUPERIO_IDE_MAX_RETRIES
[libata] Address some checkpatch-spotted issues
[libata] fix 'if(' and similar areas that lack whitespace
libata: implement ata_wait_after_reset()
libata: track SLEEP state and issue SRST to wake it up
libata: relocate and fix post-command processing
[libata] AHCI: add hw link power management support
This patch will set the correct bits to turn on Aggressive
Link Power Management (ALPM) for the ahci driver. This
will cause the controller and disk to negotiate a lower
power state for the link when there is no activity (see
the AHCI 1.x spec for details). This feature is mutually
exclusive with Hot Plug, so when ALPM is enabled, Hot Plug
is disabled. ALPM will be enabled by default, but it is
settable via the scsi host sysfs interface. Possible
settings for this feature are:
Setting          Effect
----------------------------------------------------------
min_power        ALPM is enabled, and link set to enter
                 lowest power state (SLUMBER) when idle.
                 Hot plug not allowed.
max_performance  ALPM is disabled, Hot Plug is allowed.
medium_power     ALPM is enabled, and link set to enter
                 second lowest power state (PARTIAL) when
                 idle. Hot plug not allowed.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
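The settings listed above can be restated as a lookup table. The structure and function names below are invented for this sketch and do not correspond to libata or AHCI driver code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one policy setting. */
struct demo_policy {
	const char *name;
	int alpm_enabled;
	const char *idle_state;	/* NULL when ALPM is disabled */
	int hotplug_allowed;
};

static const struct demo_policy demo_policies[] = {
	{ "min_power",       1, "SLUMBER", 0 },
	{ "max_performance", 0, NULL,      1 },
	{ "medium_power",    1, "PARTIAL", 0 },
};

/* Returns the matching entry, or NULL for an unknown setting. */
const struct demo_policy *demo_lookup_policy(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_policies) / sizeof(demo_policies[0]); i++)
		if (strcmp(demo_policies[i].name, name) == 0)
			return &demo_policies[i];
	return NULL;
}
```

The table makes the trade-off explicit: every entry that enables ALPM also has hot plug turned off.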
Device Initiated Power Management, which is defined
in SATA 2.5 can be enabled for disks which support it.
This patch enables DIPM when the user sets the link
power management policy to "min_power".
Additionally, libata drivers can define a function
(enable_pm) that will perform hardware specific actions to
enable whatever power management policy the user set up
for Host Initiated Power management (HIPM).
This power management policy will be activated after all
disks have been enumerated and initialized. Drivers should
also define disable_pm, which will turn off link power
management, but not change link power management policy.
Documentation/scsi/link_power_management_policy.txt has additional
information.
Signed-off-by: Kristen Carlson Accardi <kristen.c.accardi@intel.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Linus Torvalds [Mon, 29 Oct 2007 14:49:28 +0000 (07:49 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
compat_ioctl: fix block device compat ioctl regression
[BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps
Fix a build error when BLOCK=n
block: use lock bitops for the tag map.
cciss: update copyright notices
cfq_get_queue: fix possible NULL pointer access
blk_sync_queue() should cancel request_queue->unplug_work
cfq_exit_queue() should cancel cfq_data->unplug_work
block layer: remove a unused argument of drive_stat_acct()
Linus Torvalds [Mon, 29 Oct 2007 14:49:10 +0000 (07:49 -0700)]
Merge branch 'sg' of git://git.kernel.dk/linux-2.6-block
* 'sg' of git://git.kernel.dk/linux-2.6-block:
Correction of "Update drivers to use sg helpers" patch for IMXMMC driver
sg_init_table() should use unsigned loop index variable
sg_last() should use unsigned loop index variable
Initialise scatter/gather list in sg driver
Initialise scatter/gather list in ata_sg_setup
x86: fix pci-gart failure handling
SG: s390-scsi: missing size parameter in zfcp_address_to_sg()
SG: clear termination bit in sg_chain()
Al Viro [Mon, 29 Oct 2007 05:08:38 +0000 (05:08 +0000)]
deal with resource allocation bugs in arcmsr
a) for type B we should _not_ iounmap() acb->pmu; it's not ioremapped.
b) for type B we should iounmap() two regions we _do_ ioremap.
c) if ioremap() fails, we need to bail out (and clean up).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 29 Oct 2007 05:11:28 +0000 (05:11 +0000)]
fix abuses of ptrdiff_t
Use of ptrdiff_t in places like
- if (!access_ok(VERIFY_WRITE, u_tmp->rx_buf, u_tmp->len))
+ if (!access_ok(VERIFY_WRITE, (u8 __user *)
+ (ptrdiff_t) u_tmp->rx_buf,
+ u_tmp->len))
is wrong; for one thing, it's a bad C (it's what uintptr_t is for; in general
we are not even promised that ptrdiff_t is large enough to hold a pointer,
just enough to hold a difference between two pointers within the same object).
For another, it confuses the fsck out of sparse.
Use unsigned long or uintptr_t instead. There are several places misusing
ptrdiff_t; fixed.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
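A minimal sketch of the suggested conversion in standard C, assuming a structure that carries the buffer address as a 64-bit integer. The demo_xfer type and both helpers are hypothetical and only illustrate the cast through uintptr_t.

```c
#include <stdint.h>

/* Hypothetical ABI-style struct holding a buffer address as an integer. */
struct demo_xfer {
	uint64_t rx_buf;
	uint32_t len;
};

/* uintptr_t is the integer type intended to hold a pointer value. */
void *demo_xfer_buf(const struct demo_xfer *x)
{
	return (void *)(uintptr_t)x->rx_buf;
}

void demo_xfer_set_buf(struct demo_xfer *x, void *buf)
{
	x->rx_buf = (uint64_t)(uintptr_t)buf;
}
```

Going through uintptr_t in both directions keeps the conversion well defined where the type exists, which ptrdiff_t does not promise.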
Al Viro [Mon, 29 Oct 2007 05:08:48 +0000 (05:08 +0000)]
arcmsr: endianness bug
initializing a field in data shared with the card with
cpu_to_le32(something) | 0x100000 is broken - the field is, indeed,
little-endian and we need cpu_to_le32() on both parts.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
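The effect is only visible on a big-endian CPU, where cpu_to_le32() swaps bytes. The sketch below emulates that case with a hypothetical demo_cpu_to_le32_be() helper so that it behaves the same on any host; none of these names come from the driver.

```c
#include <stdint.h>

/* Emulates cpu_to_le32() as it behaves on a big-endian CPU: a byte swap. */
uint32_t demo_cpu_to_le32_be(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Broken: the flag is ORed in CPU byte order onto a little-endian value. */
uint32_t demo_field_broken(uint32_t v)
{
	return demo_cpu_to_le32_be(v) | 0x100000u;
}

/* Fixed: both parts are converted before they are combined. */
uint32_t demo_field_fixed(uint32_t v)
{
	return demo_cpu_to_le32_be(v) | demo_cpu_to_le32_be(0x100000u);
}
```

In the emulated big-endian case the broken form sets a different bit of the stored field than the fixed form does; on a little-endian CPU the conversion is a no-op and the two agree, which is presumably why such a bug can go unnoticed.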
Al Viro [Mon, 29 Oct 2007 05:03:23 +0000 (05:03 +0000)]
SCTP endianness annotations regression
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Al Viro [Mon, 29 Oct 2007 04:37:58 +0000 (04:37 +0000)]
SUNRPC endianness annotations
rpcrdma stuff lacks endianness annotations for on-the-wire data.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The conversion of handlers to compat_blkdev_ioctl accidentally
disabled handling of most ioctl numbers on block devices because
of a typo. Fix the one line to enable it all again.
Jens Axboe [Thu, 25 Oct 2007 08:14:47 +0000 (10:14 +0200)]
[BLOCK] Fix bad sharing of tag busy list on queues with shared tag maps
For the locking to work, only the tag map and tag bit map may be shared
(incidentally, I was just explaining this to Nick yesterday, but I
apparently didn't review the code well enough myself). But we also share
the busy list! The busy_list must be queue private, or we need a
block_queue_tag covering lock as well.
So we have to move the busy_list to the queue. This'll work fine, and
it'll actually also fix a problem with blk_queue_invalidate_tags() which
will invalidate tags across all shared queues. This is a bit confusing,
the low level driver should call it for each queue separately since
otherwise you cannot kill tags on just a single queue for eg a hard
drive that stops responding. Since the function has no callers
currently, it's not an issue.
Emil Medve [Wed, 24 Oct 2007 12:18:32 +0000 (14:18 +0200)]
Fix a build error when BLOCK=n
mm/filemap.c: In function '__filemap_fdatawrite_range':
mm/filemap.c:200: error: implicit declaration of function
'mapping_cap_writeback_dirty'
This happens when we don't use/have any block devices and a NFS root
filesystem is used.
mapping_cap_writeback_dirty() is defined in linux/backing-dev.h which
used to be provided in mm/filemap.c by linux/blkdev.h until commit f5ff8422bbdd59f8c1f699df248e1b7a11073027 (Fix warnings with
!CONFIG_BLOCK).
Signed-off-by: Emil Medve <Emilian.Medve@Freescale.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Mike Miller [Wed, 24 Oct 2007 08:30:34 +0000 (10:30 +0200)]
cciss: update copyright notices
This patch updates the copyright information for the cciss driver. It
includes extending the year to 2007 (how timely) and some minor corrections
deemed necessary by HP legal and the Open Source Review Board. Please
consider this patch for inclusion.
Signed-off-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Oleg Nesterov [Tue, 23 Oct 2007 13:08:21 +0000 (15:08 +0200)]
cfq_get_queue: fix possible NULL pointer access
cfq_get_queue()->cfq_find_alloc_queue() can fail, check the returned value.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Note that this isn't a bug at the moment, since the regular IO path
does not call this path without __GFP_WAIT set. However, it could be a
future bug, so I've applied it.
Jerome Marchand [Tue, 23 Oct 2007 13:05:46 +0000 (15:05 +0200)]
block layer: remove a unused argument of drive_stat_acct()
The nr_sector argument of drive_stat_acct() is not used anymore since
the read and write sectors statistics are now updated in
end_that_request_first(). This patch removes the useless argument.
Tejun Heo [Mon, 29 Oct 2007 07:45:05 +0000 (16:45 +0900)]
libata: no need to speed down if already at PIO0
After reset, transfer mode is always PIO0 regardless of
dev->xfer_mask. Check dev->pio_mode before trying to slow down after
configuration failure. This prevents bogus speed down before device
is actually configured.
Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Tejun Heo [Mon, 29 Oct 2007 07:41:09 +0000 (16:41 +0900)]
libata: relocate forcing PIO0 on reset
Forcing PIO0 on reset was done inside ata_bus_softreset(), which is a
bit out of place as it should be applied to all resets - hard, soft
and implementations which don't use ata_bus_softreset(). Relocate it
such that...
* For new EH, it's done in ata_eh_reset() before calling prereset.
* For old EH, it's done before calling ap->ops->phy_reset() in
ata_bus_probe().
This makes PIO0 forced after all resets. Another difference is that
reset itself is done after PIO0 is forced.
Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Alan Cox <alan@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
Code copied from drivers/ide/pci/ns87415.c uses this, so copy the
definition from there as well.
Fixes the following build error:
CC [M] drivers/ata/pata_ns87415.o
drivers/ata/pata_ns87415.c: In function ‘ns87560_read_buggy’:
drivers/ata/pata_ns87415.c:228: error: ‘SUPERIO_IDE_MAX_RETRIES’ undeclared (first use in this function)
drivers/ata/pata_ns87415.c:228: error: (Each undeclared identifier is reported only once
drivers/ata/pata_ns87415.c:228: error: for each function it appears in.)
Signed-off-by: Frank Lichtenheld <frank@lichtenheld.de> Signed-off-by: Jeff Garzik <jeff@garzik.org>