pilppa.com Git - linux-2.6-omap-h63xx.git/log

Btrfs: do less aggressive btree readahead

Just before reading a leaf, btrfs scans the node for blocks that are
close by and reads them too. It tries to build up a large window
of IO looking for blocks that are within a max distance from the top
and bottom of the IO window.

This patch changes things to just look for blocks within 64k of the
target block. It will trigger less IO and make for lower latencies on
the read size.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fiemap support

Now that bmap support is gone, this is the only way to get extent
mappings for userland. These are still not valid for IO, but they
can tell us if a file has holes or how much fragmentation there is.

Signed-off-by: Yehuda Sadeh <yehuda@hq.newdream.net>

Btrfs: stop providing a bmap operation to avoid swapfile corruptions

Swapfiles use bmap to build a list of extents belonging to the file,
and they assume these extents won't change over the life of the file.
They also use resulting list to do IO directly to the block device.

This causes problems for btrfs in a few ways:

btrfs returns logical block numbers through bmap, and these are not suitable
for IO. They might translate to different devices, raid etc.

COW means that file block mappings are going to change frequently.

Using swapfiles on btrfs will lead to corruption, so we're avoiding the
problem for now by dropping bmap support entirely. A later commit
will add fiemap support for people that really want to know how
a file is laid out.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix tree logs parallel sync

To improve performance, btrfs_sync_log merges tree log sync
requests. But it wrongly merges sync requests for different
tree logs. If multiple tree logs are synced at the same time,
only one of them actually gets synced.

This patch has following changes to fix the bug:

Move most tree log related fields in btrfs_fs_info to
btrfs_root. This allows merging sync requests separately
for each tree log.

Don't insert root item into the log root tree immediately
after log tree is allocated. Root item for log tree is
inserted when log tree get synced for the first time. This
allows syncing the log root tree without first syncing all
log trees.

At tree-log sync, btrfs_sync_log first sync the log tree;
then updates corresponding root item in the log root tree;
sync the log root tree; then update the super block.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: open_ctree() error handling can oops on fs_info

a bug in open_ctree:

struct btrfs_root *open_ctree(..)
{
....
if (!extent_root || !tree_root || !fs_info ||
!chunk_root || !dev_root || !csum_root) {
err = -ENOMEM;
goto fail;
//When code flow goes to "fail", fs_info may be NULL or uninitialized.
}
....

fail:
btrfs_close_devices(fs_info->fs_devices);// !
btrfs_mapping_tree_free(&fs_info->mapping_tree);// !

kfree(extent_root);
kfree(tree_root);
bdi_destroy(&fs_info->bdi);// !
...
)

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix stop searching test in replace_one_extent

replace_one_extent searches tree leaves for references to a given extent. It
stops searching if it goes beyond the last possible position.

The last possible position is computed by adding the starting offset of a found
file extent to the full size of the extent. The code uses physical size of the
extent as the full size. This is incorrect when compression is used.

The fix is get the full size from ram_bytes field of file extent item.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: change/remove typedef

Change one typedef to a regular enum, and remove an unused one.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: remove duplicated #include

Removed duplicated #include "compat.h"in
fs/btrfs/extent-tree.c

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix infinite loop in btrfs_extent_post_op

btrfs_extent_post_op calls finish_current_insert and del_pending_extents. They
both may enter infinite loops.

finish_current_insert enters infinite loop if it only finds some backrefs to
update. The fix is to check for pending backref updates before restarting the
loop.

The infinite loop in del_pending_extents is due to a the skipped variable
not being properly reset before looping around.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: fix locking issue in btrfs_remove_block_group

We should hold the block_group_cache_lock while modifying the
block groups red-black tree. Thank you,

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: simplify iteration codes

Merge list_for_each* and list_entry to list_for_each_entry*

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: check return value for kthread_run() correctly

kthread_run() returns the kthread or ERR_PTR(-ENOMEM), not NULL.

Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Remove extra KERN_INFO in the middle of a line

The "devid <xxx> transid <xxx>" printk in btrfs_scan_one_device()
actually follows another printk that doesn't end in a newline (since the
intention is for the two printks to make one line of output), so the
KERN_INFO just ends up messing up the output:

device label exp <6>devid 1 transid 9 /dev/sda5

Fix this by changing the extra KERN_INFO to KERN_CONT.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: removed unused #include <version.h>'s

Removed unused #include <version.h>'s in btrfs

Signed-off-by: Huang Weiyi <weiyi.huang@gmail.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: cleanup xattr code

Andrew's review of the xattr code revealed some minor issues that this patch
addresses. Just an error return fix, got rid of a useless statement and
commented one of the trickier parts of __btrfs_getxattr.

Signed-off-by: Josef Bacik <jbacik@redhat.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: MAINTAINERS entry

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: cleanup fs/btrfs/super.c::btrfs_control_ioctl()

- Remove the unused local variable 'len';
- Check return value of kmalloc().

Signed-off-by: Wang Cong <wangcong@zeuux.org>
Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: fix ioctl arg size (userland incompatible change!)

The structure used to send device in btrfs ioctl calls was not
properly aligned, and so 32 bit ioctls would not work properly on
64 bit kernels.

We could fix this with compat ioctls, but we're just one byte away
and it doesn't make sense at this stage to carry about the compat ioctls
forever at this stage in the project.

This patch brings the ioctl arg up to an evenly aligned 4k.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Clear the device->running_pending flag before bailing on congestion

Btrfs maintains a queue of async bio submissions so the checksumming
threads don't have to wait on get_request_wait. In order to avoid
extra wakeups, this code has a running_pending flag that is used
to tell new submissions they don't need to wake the thread.

When the threads notice congestion on a single device, they
may decide to requeue the job and move on to other devices. This
makes sure the running_pending flag is cleared before the
job is requeued.

It should help avoid IO stalls by making sure the task is woken up
when new submissions come in.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: explicitly mark the tree log root for writeback

Each subvolume has an extent_state_tree used to mark metadata
that needs to be sent to disk while syncing the tree. This is
used in addition to the dirty bits on the pages themselves so that
a single subvolume can be sent to disk efficiently in disk order.

Normally this marking happens in btrfs_alloc_free_block, which also does
special recording of dirty tree blocks for the tree log roots.

Yan Zheng noticed that when the root of the log tree is allocated, it is added
to the wrong writeback list. The fix used here is to explicitly set
it dirty as part of tree log creation.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Drop the hardware crc32c asm code

This is already in the arch specific directories in mainline and
shouldn't be copied into btrfs.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Add Documentation/filesystem/btrfs.txt, remove old COPYING

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: kmap_atomic(KM_USER0) is safe for btrfs_readpage_end_io_hook

None of the checksum verification code schedules, so we can use the faster
kmap_atomic

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Don't use kmap_atomic(..., KM_IRQ0) during checksum verifies

Checksum verification happens in a helper thread, and there is no
need to mess with interrupts. This switches to kmap() instead.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: tree logging checksum fixes

This patch contains following things.

1) Limit the max size of btrfs_ordered_sum structure to PAGE_SIZE.  This
struct is kmalloced so we want to keep it reasonable.

2) Replace copy_extent_csums by btrfs_lookup_csums_range.  This was
duplicated code in tree-log.c

3) Remove replay_one_csum. csum items are replayed at the same time as
   replaying file extents. This guarantees we only replay useful csums.

4) nbytes accounting fix.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: don't change file extent's ram_bytes in btrfs_drop_extents

btrfs_drop_extents doesn't change file extent's ram_bytes
in the case of booked extent. To be consistent, we should
also not change ram_bytes when truncating existing extent.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: Use btrfs_join_transaction to avoid deadlocks during snapshot creation

Snapshot creation happens at a specific time during transaction commit. We
need to make sure the code called by snapshot creation doesn't wait
for the running transaction to commit.

This changes btrfs_delete_inode and finish_pending_snaps to use
btrfs_join_transaction instead of btrfs_start_transaction to avoid deadlocks.

It would be better if btrfs_delete_inode didn't use the join, but the
call path that triggers it is:

btrfs_commit_transaction->create_pending_snapshots->
create_pending_snapshot->btrfs_lookup_dentry->
fixup_tree_root_location->btrfs_read_fs_root->
btrfs_read_fs_root_no_name->btrfs_orphan_cleanup->iput

This will be fixed in a later patch by moving the orphan cleanup to the
cleaner thread.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: drop remaining LINUX_KERNEL_VERSION checks and compat code

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable

Btrfs: drop EXPORT symbols from extent_io.c

They should stay out until this is turned into generic code.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix checkpatch.pl warnings

There were many, most are fixed now. struct-funcs.c generates some warnings
but these are bogus.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix free block discard calls down to the block layer

This is a patch to fix discard semantic to make Btrfs work with FTL and SSD.
We can improve FTL's performance by telling it which sectors are freed by file
system. But if we don't tell FTL the information of free sectors in proper
time, the transaction mechanism of Btrfs will be destroyed and Btrfs could not
roll back the previous transaction under the power loss condition.

There are some problems in the old implementation:
1, In __free_extent(), the pinned down extents should not be discarded.
2, In free_extents(), the free extents are all pinned, so they need to
be discarded in transaction committing time instead of free_extents().
3, The reserved extent used by log tree should be discard too.

This patch change discard behavior as follows:
1, For the extents which need to be free at once,
   we discard them in update_block_group().
2, Delay discarding the pinned extent in btrfs_finish_extent_commit()
   when committing transaction.
3, Remove discarding from free_extents() and __free_extent()
4, Add discard interface into btrfs_free_reserved_extent()
5, Discard sectors before updating the free space cache, otherwise,
   FTL will destroy file system data.

Btrfs: avoid orphan inode caused by log replay

drop_one_dir_item does not properly update inode's link count. It can be
reproduced by executing following commands:

#touch test
#sync
#rm -f test
#dd if=/dev/zero bs=4k count=1 of=test conv=fsync
#echo b > /proc/sysrq-trigger

This fixes it by adding an BTRFS_ORPHAN_ITEM_KEY for the inode

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: avoid potential super block corruption

The data in fs_info->super_for_commit are zeros before the
first transaction commit. If tree log sync and system crash
both occur before the first transaction commit, super block
will get corrupted.

This fixes it by properly filling in the super_for_commit field at
open time.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: do not call kfree if kmalloc failed in btrfs_sysfs_add_super

Signed-off-by: Shen Feng <shen@cn.fujitsu.com>

Btrfs: fix a memory leak in btrfs_get_sb

subvol_name should be freed if error occurs.

Signed-off-by: Shen Feng <shen@cn.fujitsu.com>

Btrfs: Fix typo in clear_state_cb

In clear_state_cb, we should check 'tree->ops->clear_bit_hook' instead
of 'tree->ops->set_bit_hook'.

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: Fix memset length in btrfs_file_write

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Btrfs: update directory's size when creating subvol/snapshot

Make sure directory's size properly updated when creating
subvol/snapshot.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: add permission checks to the ioctls

Only root can add/remove devices
Only root can defrag subtrees
Only files open for writing can be defragged
Only files open for writing can be the destination for a clone

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Linux 2.6.28

Happy holidays..

Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6:
  V4L/DVB (9920): em28xx: fix NULL pointer dereference in call to VIDIOC_INT_RESET command
  V4L/DVB (9908a): MAINTAINERS: mark linux-uvc-devel as subscribers only
  V4L/DVB (9906): v4l2-compat: test for unlocked_ioctl as well.
  V4L/DVB (9885): drivers/media Kconfig's: fix bugzilla #12204
  V4L/DVB (9875): gspca - main: Fix vidioc_s_jpegcomp locking.
  V4L/DVB (9781): [PATCH] Cablestar 2 I2C retries (fix CableStar2 support)
  V4L/DVB (9780): dib0700: Stop repeating after user stops pushing button

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: disable X86_PTRACE_BTS

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Add missing terminators in patch_sigmatel.c

ALSA: hda - Add missing terminators in patch_sigmatel.c

Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br>
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

x86: disable X86_PTRACE_BTS

there's a new ptrace arch level feature in .28:

  config X86_PTRACE_BTS
  bool "Branch Trace Store"

it has broken fork() handling: the old DS area gets copied over into
a new task without clearing it.

Fixes exist but they came too late:

  c5dee61: x86, bts: memory accounting
  bf53de9: x86, bts: add fork and exit handling

and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.

Signed-off-by: Ingo Molnar <mingo@elte.hu>

parisc: disable UP-optimized flush_tlb_mm

flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.

Debugged-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/radeon: fix correctness of irq_enabled check for radeon.

edac: fix edac core deadlock when removing a device

When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure.  Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed.  This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier.  The edac_poller will wake up sleepers when
it is found.

EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices.  They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device.  This is exactly
where edac_poller and rmmod would have a great chance to deadlock.

In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,

device_ctls_mutex would have already been grabbed by
edac_device_del_device().  So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.

edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex.  Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.

Moreover, an edac_dev.work should check first if it is being removed.  If
this is the case, then it should bail out immediately.  Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed.  The current edac_dev.op_state can
be used to serve this purpose.

The original deadlock problem and the solution have been witnessed and
tested on actual hardware.  Without the solution, rmmod an edac driver
would result in below deadlock:

root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()

(hang for a moment)

INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller   D 00000000     0  2030      2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod         D 0ff2c9fc     0  2062   1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroups: avoid accessing uninitialized data in failure path

If cgroup_get_rootdir() failed, free_cg_links() will be called in the
failure path, but tmp_cg_links hasn't been initialized at that time.

I introduced this bug in the 2.6.27 merge window.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroups: suppress bogus warning messages

Remove spurious warning messages that are thrown onto the console during
cgroup operations.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Sharyathi Nagesh <sharyathi@in.ibm.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

w1: fix slave selection on big-endian systems

During test of the w1-gpio driver i found that in "w1.c:679
w1_slave_found()" the device id is converted to little-endian with
"cpu_to_le64()", but its not converted back to cpu format in "w1_io.c:293
w1_reset_select_slave()".

Based on a patch created by Andreas Hummel.

[akpm@linux-foundation.org: remove unneeded cast]
Reported-by: Andreas Hummel <andi_hummel@gmx.de>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

rtc: rtc-isl1208: reject invalid dates

This patch for the rtc-isl1208 driver makes it reject invalid dates.

Signed-off-by: Chris Elston <celston@katalix.com>
[a.zummo@towertech.it: added comment explaining the check]
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Hebert Valerio Riedel <hvr@gnu.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

V4L/DVB (9920): em28xx: fix NULL pointer dereference in call to VIDIOC_INT_RESET command

Fix a NULL pointer dereference that would occur if the video decoder tied to
the em28xx supports the VIDIOC_INT_RESET call (for example: the cx25840 driver)

Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

drm/radeon: fix correctness of irq_enabled check for radeon.

This check was introduced with the logic the wrong way around.

Fixes regression: http://bugzilla.kernel.org/show_bug.cgi?id=12216

Tested-by: François Valenduc <francois.valenduc@tvcablenet.be>
Signed-off-by: Dave Airlie <airlied@redhat.com>

Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
ACPI: don't cond_resched() when irqs_disabled()
ACPI: fix 2.6.28 acpi.debug_level regression

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
drivers/ide/{cs5530.c,sc1200.c}: Move a dereference below a NULL test

drivers/ide/{cs5530.c,sc1200.c}: Move a dereference below a NULL test

In each case, if the NULL test is necessary, then the dereference should be
moved below the NULL test.

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
type T;
expression E;
identifier i,fld;
statement S;
@@

- T i = E->fld;
+ T i;
  ... when != E
      when != i
  if (E == NULL) S
+ i = E->fld;
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus

* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
MIPS: MIPS64R2: Fix buggy __arch_swab64
MIPS: Fix preprocessor warnings flaged by GCC 4.4

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
ppp: fix segfaults introduced by netdev_priv changes
net: Fix module refcount leak in kernel_accept()

MIPS: MIPS64R2: Fix buggy __arch_swab64

The way the code is written it was assuming dshd has the function of a
hypothetical dshw instruction ...

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

MIPS: Fix preprocessor warnings flaged by GCC 4.4

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

Null pointer deref with hrtimer_try_to_cancel()

Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW

commit 2d42244ae71d6c7b0884b5664cf2eda30fb2ae68 (clocksource:
introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
available to read out the raw not NTP adjusted system time.

The above commit did not prevent that a posix timer can be created
with that clockid. The timer_create() syscall succeeds and initializes
the timer to a non existing hrtimer base. When the timer is deleted
either by timer_delete() or by the exit() cleanup the kernel crashes.

Prevent the creation of timers for CLOCK_MONOTONIC_RAW by setting the
posix clock function to no_timer_create which returns an error code.

Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  fs/9p: change simple_strtol to simple_strtoul
  9p: convert d_iname references to d_name.name
  9p: Remove potentially bad parameter from function entry debug print.

Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix resume (S2R) broken by Intel microcode module, on A110L
  x86 gart: don't complain if no AMD GART found
  AMD IOMMU: panic if completion wait loop fails
  AMD IOMMU: set cmd buffer pointers to zero manually
  x86: re-enable MCE on secondary CPUS after suspend/resume
  AMD IOMMU: allocate rlookup_table with __GFP_ZERO

x86: fix resume (S2R) broken by Intel microcode module, on A110L

Impact: fix deadlock

This is in response to the following bug report:

Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12100
Subject         : resume (S2R) broken by Intel microcode module, on A110L
Submitter       : Andreas Mohr <andi@lisas.de>
Date            : 2008-11-25 08:48 (19 days old)
Handled-By      : Dmitry Adamushko <dmitry.adamushko@gmail.com>

[ The deadlock scenario has been discovered by Andreas Mohr ]

I think I might have a logical explanation why the system:

  (http://bugzilla.kernel.org/show_bug.cgi?id=12100)

might hang upon resuming, OTOH it should have likely hanged each and every time.

(1) possible deadlock in microcode_resume_cpu() if either 'if' section is
taken;

(2) now, I don't see it in spec. and can't experimentally verify it (newer
ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
be back to its original one (we need to reload ucode anyway so it doesn't seem
logical if a cpu doesn't drop the version)... if so, the comparison with
memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
the aforementioned 'if' sections might have been triggered - leading to a
deadlock.

Obviously, in my tests I simulated loading/resuming with the ucode of the same
version (just to see that the file is loaded/re-loaded upon resuming) so this
issue has never popped up.

I'd appreciate if someone with an appropriate system might give a try to the
2nd patch (titled "fix a comparison && deadlock...").

In any case, the deadlock situation is a must-have fix.

Reported-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Tested-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>

fs/9p: change simple_strtol to simple_strtoul

Since v9ses->uid is unsigned, it would seem better to use simple_strtoul that
simple_strtol.

A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@r2@
long e;
position p;
@@

e = simple_strtol@p(...)

@@
position p != r2.p;
type T;
T e;
@@

e =
- simple_strtol@p
+ simple_strtoul
(...)
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>

9p: convert d_iname references to d_name.name

d_iname is rubbish for long file names.
Use d_name.name in printks instead.

Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>

9p: Remove potentially bad parameter from function entry debug print.

Signed-off-by: Duane Griffin <duaneg@dghda.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>

Btrfs: Fix compile warning around num_online_cpus() in a min statement

Signed-off-by: Chris Mason <chris.mason@oracle.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] mpt fusion: clear list of outstanding commands on host reset
  [SCSI] scsi_lib: only call scsi_unprep_request() under queue lock
  [SCSI] ibmvstgt: move crq_queue_create to the end of initialization
  [SCSI] libiscsi REGRESSION: fix passthrough support with older iscsi tools
  [SCSI] aacraid: disable Dell Percraid quirk on Adaptec 2200S and 2120S

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
  ALSA: Fix a Oops bug in omap soc driver.
  ALSA: hda - Remove non-working headphone control for Dell laptops
  ALSA: hda - Add no-jd model for IDT 92HD73xx
  ALSA: Revert "ALSA: hda: removed unneeded hp_nid references"
  ALSA: hda - Add quirk for Dell Studio 17
  ALSA: hda - Fix silent HP output on D975

Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cciss: fix problem that deleting multiple logical drives could cause a panic

Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6

* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: GEM on PAE has problems - disable it for now.
drm/i915: Don't return busy for buffers left on the flushing list.

Merge branch 'for-linus' of git://neil.brown.name/md

* 'for-linus' of git://neil.brown.name/md:
md: Don't read past end of bitmap when reading bitmap.

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI hotplug: ibmphp: Fix module ref count underflow
  PCI hotplug: acpiphp wants a 64-bit _SUN
  PCI: pciehp: fix unexpected power off with pciehp_force
  PCI: fix aer resume sanity check

Btrfs: set EXTENT_BOUNDARY bit before marking extent delalloc.

There is a race in relocate_inode_pages, it happens when
find_delalloc_range finds the delalloc extent before the
boundary bit is set. Thank you,

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: properly update block accounting for metadata

This adds the missing block accounting code to finish_current_insert and makes
block accounting for root item properly protected by the delalloc spin lock.

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Btrfs: Add missing mnt_drop_write in ioctl.c

This patch adds the missing mnt_drop_write to match
mnt_want_write in btrfs_ioctl_defrag and btrfs_ioctl_clone

Signed-off-by: Yan Zheng <zheng.yan@oracle.com>

Merge branch 'fix/asoc' into for-linus

ALSA: Fix a Oops bug in omap soc driver.

There will be a Oops or frequent underrun messages when playing music with
omap soc driver, this is because a data region is incorretly sized, other data
region will be overwriten when writing to this data region.

Signed-off-by: Stanley Miao <stanley.miao@windriver.com>
Acked-by: Jarkko Nikula <jarkko.nikula@nokia.com>
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: hda - Remove non-working headphone control for Dell laptops

The previous commit re-enabled hp_nid setup for IDT92HD73*, but
it's unneeded indeed for Dell laptops that have multiple headphones.
Setting the extra hp_nid results in a non-working "Headpohne" mixer
control. Thus hp_nid should be 0 for these dell models.

Also, the automatic addition of hp_nid should check whether it's
a dual-HP model or not. For dual-HPs, the pins are already checked
by the early workaround.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

ACPI: don't cond_resched() when irqs_disabled()

The ACPI interpreter usually runs with irqs enabled.
However, during suspend/resume it runs with
irqs disabled to evaluate _GTS/_BFS, as well as
by irqrouter_resume() which evaluates _CRS, _PRS, _SRS.

http://bugzilla.kernel.org/show_bug.cgi?id=12252

Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ACPI: fix 2.6.28 acpi.debug_level regression

acpi_early_init() was changed to over-write the cmdline param,
making it really inconvenient to set debug flags at boot-time.

Also,
This sets the default level to "info", which is what all the ACPI
drivers use. So to enable messages from drivers, you only have to
supply the "layer" (a.k.a. "component"). For non-"info" ACPI core
and ACPI interpreter messages, you have to supply both level and
layer masks, as before.

Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>

ALSA: hda - Add no-jd model for IDT 92HD73xx

Added the model without the jack-detection for some desktops that
have really no jack-detection. The recent driver caused regressions
regarding the sound output on such machines.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

ALSA: Revert "ALSA: hda: removed unneeded hp_nid references"

This reverts commit 07f455f779acfb3eba4921fd1399761559b10fa9.
ALSA: hda: removed unneeded hp_nid references

Removed unneeded hp_nid references for 92hd73xx codec family.

This caused the silent output on some Intel desktops due to missing
routing of widget 0x0a and 0x0d.

Signed-off-by: Takashi Iwai <tiwai@suse.de>

cciss: fix problem that deleting multiple logical drives could cause a panic

Fix problem that deleting multiple logical drives could cause a panic.

It fixes a panic which can be easily reproduced in the following way: Just
create several "arrays," each with multiple logical drives via hpacucli,
then delete the first array, and it will blow up in deregister_disk(), in
the call to get_host() when it tries to dig the hba pointer out of a NULL
queue pointer.

The problem has been present since my code to make rebuild_lun_table
behave better went in.

Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

ALSA: hda - Add quirk for Dell Studio 17

Added the matching model=dell-m6 for Dell Studio 17 laptop.

Signed-off-by: Joerg Schirottke <master@kanotix.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>

drm/i915: GEM on PAE has problems - disable it for now.

On PAE systems, GEM allocates pages using shmem, and passes these
pages to be bound into AGP, however the AGP interfaces + the x86
set_memory interfaces all take unsigned long not dma_addr_t.

The initial fix for this was a mess, so we need to do this correctly
for 2.6.29.

Signed-off-by: Dave Airlie <airlied@redhat.com>

drm/i915: Don't return busy for buffers left on the flushing list.

These buffers don't have active rendering still occurring to them, they just
need either a flush to be emitted or a retire_requests to occur so that we
notice they're done. Return unbusy so that one of the two occurs. The two
expected consumers of this interface (OpenGL and libdrm_intel BO cache) both
want this behavior.

Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

md: Don't read past end of bitmap when reading bitmap.

When we read the write-intent-bitmap off the device, we currently
read a whole number of pages.
When PAGE_SIZE is 4K, this works due to the alignment we enforce
on the superblock and bitmap.
When PAGE_SIZE is 64K, this case read past the end-of-device
which causes an error.

When we write the superblock, we ensure to clip the last page
to just be the required size. Copy that code into the read path
to just read the required number of sectors.

Signed-off-by: Neil Brown <neilb@suse.de>
Cc: stable@kernel.org

ppp: fix segfaults introduced by netdev_priv changes

This patch fixes a segfault in ppp_shutdown_interface() and
ppp_destroy_interface() when a PPP connection is closed. I bisected
the problem to the following commit:

  commit c8019bf3aff653cceb64f66489fc299ee5957b57
  Author: Wang Chen <wangchen@cn.fujitsu.com>
  Date:   Thu Nov 20 04:24:17 2008 -0800

    netdevice ppp: Convert directly reference of netdev->priv

    1. Use netdev_priv(dev) to replace dev->priv.
    2. Alloc netdev's private data by alloc_netdev().

Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original ppp_generic code treated the netdev and struct ppp as
independent data structures which were freed separately. In moving the
ppp struct into the netdev, it is now possible for the private data to
be freed before the call to ppp_shutdown_interface(), which is bad.

The kfree(ppp) in ppp_destroy_interface() is also wrong; presumably
ppp hasn't worked since the above commit.

The following patch fixes both problems.

Signed-off-by: James Chapman <jchapman@katalix.com>
Reviewed-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Fix module refcount leak in kernel_accept()

The kernel_accept() does not hold the module refcount of newsock->ops->owner,
so we need __module_get(newsock->ops->owner) code after call kernel_accept()
by hand.
In sunrpc, the module refcount is missing to hold. So this cause kernel panic.

Used following script to reproduct:

while [ 1 ];
do
    mount -t nfs4 192.168.0.19:/ /mnt
    touch /mnt/file
    umount /mnt
    lsmod | grep ipv6
done

This patch fixed the problem by add __module_get(newsock->ops->owner) to
kernel_accept(). So we do not need to used __module_get(newsock->ops->owner)
in every place when used kernel_accept().

Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Linux 2.6.28-rc9

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  async_xor: dma_map destination DMA_BIDIRECTIONAL
  dmaengine: protect 'id' from concurrent registrations
  ioat: wait for self-test completion

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6:
  avr32: favr-32 build fix
  ATSTK1006: Fix boot from NAND flash
  avr32: remove .note.gnu.build-id section when making vmlinux.bin
  avr32: Enable pullup on USART TX lines

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  bnx2: Fix bug in bnx2_free_rx_mem().
  irda: Add irda_skb_cb qdisc related padding
  jme: Fixed a typo
  net: kernel BUG at drivers/net/phy/mdio_bus.c:165!
  drivers/net: starfire: Fix napi ->poll() weight handling
  tlan: Fix pci memory unmapping
  enc28j60: use netif_rx_ni() to deliver RX packets
  tlan: Fix small (< 64 bytes) datagram transmissions
  netfilter: ctnetlink: fix missing CTA_NAT_SEQ_UNSPEC

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
sparc: We need to implement arch_ptrace_stop().

Maintainer email fixes for inotify

Update John McCutchan and Robert Love's email addresses for
maintenance of inotify

Signed-off-by: John McCutchan <john@johnmccutchan.com>
Acked-by: Robert Love <rlove@rlove.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

avr32: favr-32 build fix

The favr-32 board code still refers to the old asm/arch header files
which were moved to mach/ some time ago.

Signed-off-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>