Chris Mason [Tue, 31 Mar 2009 17:27:11 +0000 (13:27 -0400)]
Btrfs: add extra flushing for renames and truncates
Renames and truncates are both common ways to replace old data with new
data. The filesystem can make an effort to make sure the new data is
on disk before actually replacing the old data.
This is especially important for rename, which many applications use as
though it were atomic for both the data and the metadata involved. The
current btrfs code will happily replace a file that is fully on disk
with one that was just created and still has pending IO.
If we crash after transaction commit but before the IO is done, we'll end
up replacing a good file with a zero length file. The solution used
here is to create a list of inodes that need special ordering and force
them to disk before the commit is done. This is similar to the
ext3 style data=ordered, except it is only done on selected files.
Btrfs is able to get away with this because it does not wait on commits
very often, even for fsync (which uses a sub-commit).
For renames, we order the file when it wasn't already
on disk and when it is replacing an existing file. Larger files
are sent to filemap_flush right away (before the transaction handle is
opened).
For truncates, we order if the file goes from non-zero size down to
zero size. This is a little different, because at the time of the
truncate the file has no dirty bytes to order. But, we flag the inode
so that it is added to the ordered list on close (via release method). We
also immediately add it to the ordered list of the current transaction
so that we can try to flush down any writes the application sneaks in
before commit.
Signed-off-by: Chris Mason <chris.mason@oracle.com>
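For reference, the application-side pattern that the rename case above protects looks roughly like the sketch below. It is a minimal userspace illustration, not part of the patch; the helper name replace_file_atomically is hypothetical. An application that calls fsync before the rename, as shown here, does not depend on the extra flushing; the commit targets callers that skip that step.

```python
import os
import tempfile


def replace_file_atomically(path, data):
    """Replace `path` with `data` using write-to-temp, fsync, rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new data to disk before the rename makes it visible.
            os.fsync(tmp.fileno())
        # rename(2) semantics: the old file is replaced in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```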
TOMARI Hisanobu [Tue, 31 Mar 2009 18:15:34 +0000 (20:15 +0200)]
ide-pmac: IDE cable detection on Apple PowerBook
As the IDE cables used on Apple PowerBook/iBook laptops are always of the
"Short 40" type when the firmware says they are 80-conductor ones, the cable
detection should return ATA_CBL_PATA40_SHORT on those machines. This allows
UDMA5 to be used automatically even with drives that don't correctly detect
those cables on Apple laptops.
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:32 +0000 (20:15 +0200)]
ide: turn selectproc() method into dev_select() method (take 5)
Turn selectproc() method into dev_select() method by teaching it to write to the
device register and moving it from 'struct ide_port_ops' to 'struct ide_tp_ops'.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: benh@kernel.crashing.org
Cc: petkovbb@gmail.com
[bart: add ->dev_select to at91_ide.c and tx4939.c (__BIG_ENDIAN case)]
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
MAINTAINERS: move old ide-{floppy,tape} entries to CREDITS (take 2)
Ben Hutchings noticed that MAINTAINERS somehow still contains old
ide-{floppy,tape} entries. Fix it by moving them to CREDITS (kudos
to Gadi and Paul for all the early hard work on ide-{floppy,tape}).
v2:
Rename IDE/ATAPI CDROM DRIVER entry to IDE/ATAPI DRIVERS one.
Cc: Gadi Oxman <gadio@netvision.net.il>
Cc: Paul Bristow <paul@paulbristow.net>
Acked-by: Borislav Petkov <petkovbb@gmail.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:31 +0000 (20:15 +0200)]
ide: move data register access out of tf_{read|load}() methods (take 2)
Move IDE_FTFLAG_{IN|OUT}_DATA flag handling out of tf_{read|load}() methods
into the only two functions where these flags actually need to be handled:
do_rw_taskfile() and ide_complete_cmd()...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:31 +0000 (20:15 +0200)]
ide: call {in|out}put_data() methods from tf_{read|load}() methods (take 2)
Handle IDE_FTFLAG_{IN|OUT}_DATA flags in tf_{read|load}() methods by calling
{in|out}put_data() methods to transfer 2 bytes -- this will allow us to move
that handling out of those methods altogether...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:30 +0000 (20:15 +0200)]
ide: rename IDE_TFLAG_IN_[HOB_]FEATURE
The feature register has never been readable -- when its location is read, one
gets the error register value; hence rename IDE_TFLAG_IN_[HOB_]FEATURE into
IDE_TFLAG_IN_[HOB_]ERROR and introduce the 'hob_error' field into the 'struct
ide_taskfile' (despite the error register not really depending on the HOB bit).
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:28 +0000 (20:15 +0200)]
ide: add support for CFA specified transfer modes (take 3)
Add support for the CompactFlash specific PIO modes 5/6 and MWDMA modes 3/4.
Since there were no PIO5 capable hard drives produced and one would also need
66 MHz IDE clock to actually get the difference WRT the address setup timings
programmed, I decided to simply replace the old non-standard PIO mode 5 timings
with the CFA specified ones.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: stf_xl@wp.pl
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:27 +0000 (20:15 +0200)]
ide-iops: only clear DMA words on setting DMA mode
The bytes indicating current DMA mode in the identify data words 62, 63, and 88
should only change on setting a DMA mode, so stop clearing them on setting PIO
mode in ide_config_drive_speed(). While at it, correct SW/MW DMA mode masks...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
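The intended bookkeeping can be modelled roughly as below. This is an illustrative sketch, not the kernel code: the function name is hypothetical, the transfer-mode constants follow the usual XFER_* numbering, and the masks (bits 8-10 of words 62 and 63, the upper byte of word 88) are assumptions based on the ATA identify layout. It only covers SW/MW DMA modes 0-2.

```python
XFER_PIO_0 = 0x08
XFER_SW_DMA_0 = 0x10
XFER_MW_DMA_0 = 0x20
XFER_UDMA_0 = 0x40

WORD_SWDMA, WORD_MWDMA, WORD_UDMA = 62, 63, 88


def update_identify_after_set_mode(ident, speed):
    """Update the 'currently selected' DMA bits in cached identify data."""
    if speed < XFER_SW_DMA_0:
        # Setting a PIO mode: the DMA words must be left untouched.
        return
    # Setting a DMA mode: clear every 'selected' bit first (assumed masks).
    ident[WORD_SWDMA] &= ~0x0700 & 0xFFFF
    ident[WORD_MWDMA] &= ~0x0700 & 0xFFFF
    ident[WORD_UDMA] &= ~0xFF00 & 0xFFFF
    # Then mark the one mode that was just selected.
    if speed >= XFER_UDMA_0:
        ident[WORD_UDMA] |= 0x0100 << (speed - XFER_UDMA_0)
    elif speed >= XFER_MW_DMA_0:
        ident[WORD_MWDMA] |= 0x0100 << (speed - XFER_MW_DMA_0)
    else:
        ident[WORD_SWDMA] |= 0x0100 << (speed - XFER_SW_DMA_0)
```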
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:27 +0000 (20:15 +0200)]
ide: identify data word 53 bit 1 doesn't cover words 62 and 63 (take 3)
The IDE code assumed for years that the bit 1 of the identify data word 53 also
covers the validity of the SW/MW DMA information in words 62 and 63, but it has
always covered only words 64 thru 70, with words 62 and 63 being defined in the
original ATA spec, not in ATA-2...
This fix however should only concern *very* old hard disks and rather old CF
cards...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
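A small sketch of the difference, assuming the identify data is available as a list of 16-bit words. Both function names are hypothetical and the 0x07 mask (MW DMA modes 0-2 in the low bits of word 63) is an assumption for illustration; the first function models the old assumption, the second the corrected rule.

```python
FIELD_VALID_WORD = 53
WORDS_64_70_VALID = 1 << 1  # bit 1 of word 53
WORD_MWDMA = 63


def mwdma_modes_old(ident):
    """Old behaviour: word 63 was ignored unless word 53 bit 1 was set."""
    if not ident[FIELD_VALID_WORD] & WORDS_64_70_VALID:
        return 0
    return ident[WORD_MWDMA] & 0x07


def mwdma_modes_fixed(ident):
    """Fixed behaviour: word 63 is read regardless of word 53 bit 1,
    which only says whether words 64 through 70 are valid."""
    return ident[WORD_MWDMA] & 0x07
```

For a very old disk that reports MW DMA support in word 63 but leaves word 53 bit 1 clear, the first function returns 0 while the second returns the supported-mode mask.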
Sergei Shtylyov [Tue, 31 Mar 2009 18:15:27 +0000 (20:15 +0200)]
au1xxx-ide: auide_{in|out}sw() should be static
Make auide_{insw|outsw}() 'static' and mark them 'inline' as there's only one
call site for each: in the driver's {in|out}put_data() methods respectively...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Nowadays it is not worth having a separate config option for
Amiga IDE Doubler support so always include it (it still needs
to be explicitly enabled by module parameter).
ide: decrease size of ->pc_buf field in struct ide_atapi_pc
struct ide_atapi_pc is often allocated on the stack and the size of ->pc_buf
is 256 bytes. However, since only ide_floppy_create_read_capacity_cmd()
and idetape_create_inquiry_cmd() require such a size, allocate buffers for
these pc-s explicitly and decrease ->pc_buf size to 64 bytes.
ide-generic: remove no longer needed sysfs interface
Nowadays we have "ide_generic.probe_mask=" module parameter
and ide_platform host driver so sysfs interface for adding
IDE interfaces is no longer needed.
ide-cd: unify transfer padding in cdrom_newpc_intr()
* 'thislen' is always <= cmd->nleft for non-fs requests so the transfer
padding inside the 'while (thislen > 0)' loop can happen only for fs
requests -- then move it out of the loop and unify with the transfer
padding for non-fs requests ('thislen' == 'len' for fs requests).
* blk_dump_rq_flags() dumps all request flags so it is enough to pass
only the function name to it.
ide-cd: use scatterlists for PIO transfers (non-fs requests) (v2)
Convert ide-cd to use scatterlists for PIO transfers and get rid of
partial completions (except on error) also for non-fs requests.
v2:
Do not map dataless commands to an sg since it oopses on the virt_to_page()
translation check when DEBUG_VIRTUAL is enabled. (from Borislav Petkov,
reported/bisected-by Tetsuo Handa).
ide-cd: use scatterlists for PIO transfers (fs requests)
* Export ide_pio_bytes().
* Add ->last_xfer_len field to struct ide_cmd.
* Add ide_cd_error_cmd() helper to ide-cd.
* Convert ide-cd to use scatterlists also for PIO transfers (fs requests
only for now) and get rid of partial completions (except when the error
happens -- which is still subject to change later because looking at
ATAPI spec it seems that the device is free to error the whole transfer
with setting the Error bit only on the last transfer chunk).
Borislav Petkov [Tue, 31 Mar 2009 18:14:58 +0000 (20:14 +0200)]
ide-atapi: start DMA after issuing a packet command
Apparently, some ATAPI devices want to see the packet command first
before enabling DMA otherwise they simply hang indefinitely. Reorder the
two steps and start DMA only after having issued the command first.
Fix some incorrect IDE_FTFLAG_* changes which slipped in commit
"ide: add "flagged" taskfile flags to struct ide_taskfile (v2)"
(commit 19710d25d50ae0be05eebe4231ed8918b1092d82) a few days ago.
On m68k:
| drivers/ide/ide-atapi.c: In function 'ide_io_buffers':
| drivers/ide/ide-atapi.c:87: error: implicit declaration of function 'sg_page'
| drivers/ide/ide-atapi.c:87: warning: passing argument 1 of 'PageHighMem' makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:91: warning: passing argument 1 of 'kmap_atomic' makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:96: error: implicit declaration of function 'sg_virt'
| drivers/ide/ide-atapi.c:96: warning: assignment makes pointer from integer without a cast
| drivers/ide/ide-atapi.c:107: error: implicit declaration of function 'sg_next'
| drivers/ide/ide-atapi.c:107: warning: assignment makes pointer from integer without a cast
[bart: Dmitri Vorobiev submitted similar patch fixing MIPS]
Elias Oltmanns [Tue, 31 Mar 2009 18:14:56 +0000 (20:14 +0200)]
ide: Fix code dealing with sleeping devices in do_ide_request()
Unfortunately, I missed a catch when reviewing the patch committed as 201bffa4. Here is the fix to the currently broken handling of sleeping
devices. In particular, this is required to get the disk shock
protection code working again.
Reported-by: Christian Thaeter <ct@pipapo.org>
Cc: stable@kernel.org
Signed-off-by: Elias Oltmanns <eo@nebensachen.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Sebastian Ott [Tue, 31 Mar 2009 17:16:07 +0000 (19:16 +0200)]
[S390] cio: online_store - trigger recognition for boxed devices
Start a new device recognition if someone writes to the sysfs online attribute
of a boxed ccw device. The current test will fail, since cu_type != 0
for devices which were recognized before.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Tue, 31 Mar 2009 17:16:05 +0000 (19:16 +0200)]
[S390] cio: introduce notifier for boxed state
If a ccw device did not respond in time during internal io, we set it
into boxed state. With this patch we have the following behaviour:
* the ccw driver will get a notification if the device was online and
goes into the boxed state
* if the device was disconnected and got boxed nothing special is to be
done (it will be handled in reprobing later)
* if the device got boxed during initial sensing it will be unregistered
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Introduce ccw_device_schedule_sch_unregister as a wrapper for queuing
ccw_device_call_sch_unregister on the slow_path_wq. This wrapper
will be used in the next patch.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Tue, 31 Mar 2009 17:16:03 +0000 (19:16 +0200)]
[S390] cio: wake up on failed recognition
Wake up even on failed device recognition, since this may be triggered
from a user trying to force a device online. With this patch a write
to the online sysfs attribute will not block for ever but return with
-EAGAIN in this case.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
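From userspace, the new behaviour means a write to the online attribute can now fail with EAGAIN instead of hanging. A minimal sketch of a caller that handles this follows; the helper name set_online is hypothetical, and the attribute path is passed in by the caller rather than assumed here.

```python
import errno


def set_online(attr_path, value="1"):
    """Write `value` to a sysfs online attribute.

    Returns True on success and False if the kernel answered -EAGAIN
    (for example because device recognition failed); other errors
    propagate to the caller.
    """
    try:
        with open(attr_path, "w") as attr:
            attr.write(value)
    except OSError as exc:
        if exc.errno == errno.EAGAIN:
            return False
        raise
    return True
```

The caller can then decide whether to retry later or report the device as unavailable.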
Heiko Carstens [Tue, 31 Mar 2009 17:16:02 +0000 (19:16 +0200)]
[S390] fix hypfs build failure
Fix build breakage below which probably was introduced with
("rcu: don't include unnecessary headers, allow kmemtrace w/ tracepoints").
CC arch/s390/hypfs/hypfs_diag.o
arch/s390/hypfs/hypfs_diag.c: In function 'diag204_free_buffer':
arch/s390/hypfs/hypfs_diag.c:364: error: implicit declaration of function 'free_pages'
arch/s390/hypfs/hypfs_diag.c: In function 'diag204_alloc_rbuf':
arch/s390/hypfs/hypfs_diag.c:384: error: implicit declaration of function '__get_free_pages'
arch/s390/hypfs/hypfs_diag.c:384: error: 'GFP_KERNEL' undeclared (first use in this function)
arch/s390/hypfs/hypfs_diag.c:384: error: (Each undeclared identifier is reported only once
arch/s390/hypfs/hypfs_diag.c:384: error: for each function it appears in.)
Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Tue, 31 Mar 2009 17:13:13 +0000 (19:13 +0200)]
[PATCH] sysrq: include interrupt.h instead of irq.h
With "cpumask: update irq_desc to use cpumask_var_t"
we get this build failure on s390:
CC drivers/char/sysrq.o
In file included from drivers/char/sysrq.c:38:
include/linux/irq.h: In function 'init_alloc_desc_masks':
include/linux/irq.h:442: error: dereferencing pointer to incomplete type
drivers/char/sysrq.c should include interrupt.h instead of irq.h.
Cc: Mike Travis <travis@sgi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Boaz Harrosh [Tue, 28 Oct 2008 14:11:41 +0000 (16:11 +0200)]
exofs: super_operations and file_system_type
This patch ties all operation vectors into a file system superblock
and registers the exofs file_system_type at module's load time.
* The file system control block (AKA on-disk superblock) resides in
an object with a special ID (defined in common.h).
Information included in the file system control block is used to
fill the in-memory superblock structure at mount time. This object
is created by mkexofs.c before the file system is used. It contains
information such as:
- The file system's magic number
- The next inode number to be allocated
Boaz Harrosh [Tue, 28 Oct 2008 13:38:12 +0000 (15:38 +0200)]
exofs: dir_inode and directory operations
Implementation of directory and inode operations.
* A directory is treated as a file, and essentially contains a list
of <file name, inode #> pairs for files that are found in that
directory. The object IDs correspond to the files' inode numbers
and are allocated using a 64bit incrementing global counter.
* Each file's control block (AKA on-disk inode) is stored in its
object's attributes. This applies to both regular files and other
types (directories, device files, symlinks, etc.).
Boaz Harrosh [Mon, 27 Oct 2008 17:31:34 +0000 (19:31 +0200)]
exofs: address_space_operations
OK, now we start to read and write from osd-objects. We try to
collect as many contiguous pages as possible in a single write/read.
The first page index is the object's offset.
TODO:
In 64-bit a single bio can carry at most 128 pages.
Add support of chaining multiple bios
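The page-collection idea can be sketched as follows. This is a toy model rather than the exofs code: the function name, the 4096-byte page size, and the use of plain page indices are assumptions; the 128-page cap is the per-bio limit mentioned in the TODO above.

```python
PAGE_SIZE = 4096        # assumed page size
MAX_PAGES_PER_IO = 128  # per-bio limit mentioned in the TODO


def coalesce_pages(page_indices):
    """Group page indices into (object_byte_offset, page_count) runs.

    Each run is contiguous and at most MAX_PAGES_PER_IO pages long; the
    offset of a run is derived from the index of its first page.
    """
    runs = []
    start = count = None
    for index in sorted(set(page_indices)):
        extends_run = (
            count is not None
            and index == start + count
            and count < MAX_PAGES_PER_IO
        )
        if extends_run:
            count += 1
            continue
        if count is not None:
            runs.append((start * PAGE_SIZE, count))
        start, count = index, 1
    if count is not None:
        runs.append((start * PAGE_SIZE, count))
    return runs
```

With this model a run longer than 128 pages is simply split into several I/Os, which is the situation the bio-chaining TODO is meant to improve.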
Boaz Harrosh [Mon, 27 Oct 2008 16:37:02 +0000 (18:37 +0200)]
exofs: file and file_inode operations
Implementation of the file_operations and inode_operations for
regular data files.
Most file_operations are generic vfs implementations except:
- exofs_truncate will truncate the OSD object as well
- Generic file_fsync is not good for non-bd devices so open code it
- The default for .flush in Linux is to do nothing so call exofs_fsync
on the file.
trace_seq_reserve() allows a caller to reserve space in a trace_seq and
write directly into it. This makes it easier to export binary data to
userspace via the tracing interface, by simply filling in a struct.
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
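The reserve-then-fill idea can be illustrated with a toy buffer. This is not the kernel API: the SeqBuffer class, its methods, and the record layout are all hypothetical and only show the pattern of reserving a slot and packing a struct directly into it.

```python
import struct


class SeqBuffer:
    """Toy append-only buffer with a reserve() operation."""

    def __init__(self, size):
        self._buf = bytearray(size)
        self._len = 0

    def reserve(self, nbytes):
        """Return a writable view of `nbytes` bytes, or None if it won't fit."""
        if nbytes > len(self._buf) - self._len:
            return None
        view = memoryview(self._buf)[self._len:self._len + nbytes]
        self._len += nbytes
        return view

    def contents(self):
        """Return a copy of the bytes written or reserved so far."""
        return bytes(self._buf[:self._len])


# Hypothetical binary record: a 32-bit id followed by a 64-bit timestamp.
RECORD = struct.Struct("<IQ")

seq = SeqBuffer(64)
slot = seq.reserve(RECORD.size)
if slot is not None:
    # Fill the reserved space in place instead of formatting text.
    RECORD.pack_into(slot, 0, 7, 123456789)
```

The caller must check for a failed reservation before writing, since the buffer may already be full.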
Li Zefan [Fri, 27 Mar 2009 02:21:23 +0000 (10:21 +0800)]
blktrace: extract duplicate code
Impact: cleanup
blk_trace_event_print() and blk_tracer_print_line() share most of the code.
text data bss dec hex filename
8605 393 12 9010 2332 kernel/trace/blktrace.o.orig
text data bss dec hex filename
8555 393 12 8960 2300 kernel/trace/blktrace.o
This patch also prepares for the next patch, that prints out BLK_TN_MESSAGE.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
and registers/unregisters tracepoints when:
# echo blk/nop > /debugfs/tracing/current_tracer
or
# echo 1/0 > /debugfs/tracing/tracing_enable
The separation of allocation and registration causes 2 problems:
1. current user-space blktrace still calls ioctl(TEARDOWN) when
ioctl(SETUP) failed:
# echo 1 > /sys/block/sda/sda1/trace/enable
# blktrace /dev/sda
BLKTRACESETUP: Device or resource busy
^C
and now blk_probes_ref == -1
2. Another way to make blk_probes_ref == -1:
# plugin sdb && mount sdb1
# echo 1 > /sys/block/sdb/sdb1/trace/enable
# remove sdb
This patch does the allocation and registration when writing
sdaX/trace/enable.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Li Zefan [Wed, 25 Mar 2009 09:21:26 +0000 (17:21 +0800)]
blktrace: fix the original blktrace
Currently the original blktrace, which is using relay and is used via
ioctl, is broken. You can use ftrace to see the output of blktrace,
but user-space blktrace is unusable.
Kumar Gala [Tue, 31 Mar 2009 13:46:25 +0000 (08:46 -0500)]
powerpc/85xx: Use fsl,mpc85.. as prefix for memory ctrl & l2-cache nodes
Older device trees used "fsl,85.." instead of the preferred
"fsl,mpc85.." for the memory controller & l2 cache controller nodes.
The EDAC code is the only use of these and has been updated for some
time to support both "fsl,85.." and "fsl,mpc85.."
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Grant Likely [Sat, 28 Mar 2009 21:07:16 +0000 (15:07 -0600)]
powerpc: Remove unused symbols from fsl_devices.h
Remove old artifacts left over from the gianfar and fsl_i2c
platform drivers. These symbols became unused when the drivers
were migrated over to use the of_platform bus.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Josh Boyer [Tue, 31 Mar 2009 12:05:50 +0000 (08:05 -0400)]
powerpc: Make LOWMEM_CAM_NUM depend on FSL_BOOKE
The recent addition of CONFIG_LOWMEM_CAM_BOOL and
CONFIG_LOWMEM_CAM_NUM causes the latter to show up in configs
that do not need it during 'make oldconfig'. Make LOWMEM_CAM_NUM
depend on FSL_BOOKE.
Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Peter Zijlstra [Fri, 13 Mar 2009 11:21:27 +0000 (12:21 +0100)]
hrtimer: fix rq->lock inversion (again)
It appears I inadvertently introduced rq->lock recursion to the
hrtimer_start() path when I delegated running already expired
timers to softirq context.
This patch fixes it by introducing a __hrtimer_start_range_ns()
method that will not use raise_softirq_irqoff() but
__raise_softirq_irqoff() which avoids the wakeup.
It then also changes schedule() to check for pending softirqs and
do the wakeup then. I'm not quite sure I like this last bit, nor
am I convinced it's really needed.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
LKML-Reference: <20090313112301.096138802@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>