Johannes Berg [Mon, 30 Mar 2009 10:02:35 +0000 (12:02 +0200)]
toshiba-acpi: remove MAINTAINERS entry
"I'm not much opposed to marking this driver orphaned. I haven't used
a Toshiba laptop in four years or so, and disagree with the recent
additions of bluetooth and wireless control to the driver.
--John"
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: John Belmonte <john@neggie.net> Signed-off-by: Len Brown <len.brown@intel.com>
Arjan van de Ven [Sat, 10 Jan 2009 19:19:05 +0000 (14:19 -0500)]
ACPI: battery: asynchronous init
The battery driver tends to take quite some time to initialize
(100ms-300ms is quite typical).
This patch initializes the batter driver asynchronously, so that other
things in the kernel can initialize in parallel to this 300 msec.
As part of this, the battery driver had to move to the back
of the ACPI init order (hence the Makefile change).
Without this move, the next ACPI driver would just block
on the ACPI/devicee layer semaphores until the battery driver was
done anyway, not gaining any boot time.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Andy Whitcroft [Sat, 4 Apr 2009 08:33:34 +0000 (09:33 +0100)]
acer-wmi: Cleanup the failure cleanup handling
Cleanup the failure cleanup handling for brightness and email led.
[cc: Split out from another patch]
Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk> Signed-off-by: Len Brown <len.brown@intel.com>
Carlos Corbacho [Sat, 4 Apr 2009 08:33:29 +0000 (09:33 +0100)]
acer-wmi: Blacklist Acer Aspire One
The Aspire One's ACPI-WMI interface is a placeholder that does nothing,
and the invalid results that we get from it are now causing userspace
problems as acer-wmi always returns that the rfkill is enabled (i.e. the
radio is off, when it isn't). As it's hardware controlled, acer-wmi
isn't needed on the Aspire One either.
Thanks to Andy Whitcroft at Canonical for tracking down Ubuntu's userspace
issues to this.
Signed-off-by: Carlos Corbacho <carlos@strangeworlds.co.uk> Reported-by: Andy Whitcroft <apw@canonical.com> Cc: stable@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>
Refactor and redesign the brightness control backend...
In order to fix bugzilla #11750...
Add a new brightness control mode: support direct NVRAM checkpointing
of the backlight level (i.e. store directly to NVRAM without the need
for UCMS calls), and use that together with the EC-based control.
Disallow UCMS+EC, thus avoiding races with the SMM firmware.
Switch the models that define HBRV (EC Brightness Value) in the DSDT
to the new mode. These are: T40-T43, R50-R52, R50e, R51e, X31-X41.
Change the default for all other IBM ThinkPads to UCMS-only. The
Lenovo models already default to UCMS-only.
Reported-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
thinkpad-acpi: enhanced debugging messages for rfkill subdrivers
Enhance debugging messages for all rfkill subdrivers in thinkpad-acpi.
Also, log a warning if the deprecated sysfs attributes are in use.
These attributes are going to be removed sometime in 2010.
There is an user-visible side-effect: we now coalesce attempts to
enable/disable bluetooth or WWAN in the procfs interface, instead of
hammering the firmware with multiple requests.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
thinkpad-acpi: restrict access to some firmware LEDs
Some of the ThinkPad LEDs indicate critical conditions that can cause
data loss or cause hardware damage when ignored (e.g. force-ejecting
a powered up bay; ignoring a failing battery, or empty battery; force-
undocking with the dock buses still active, etc).
On almost all ThinkPads, LED access is write-only, and the firmware
usually does fire-and-forget signaling on them, so you effectively
lose whatever message the firmware was trying to convey to the user
when you override the LED state, without any chance to restore it.
Restrict access to all LEDs that can convey important alarms, or that
could mislead the user into incorrectly operating the hardware. This
will make the Lenovo engineers less unhappy about the whole issue.
Allow users that really want it to still control all LEDs, it is the
unaware user that we have to worry about.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
The HKEY disable functionality basically cripples the entire event
model of the ThinkPad firmware and of the thinkpad-acpi driver.
Remove this functionality from the driver. HKEY must be enabled at
all times while thinkpad-acpi is loaded, and disabled otherwise.
For sysfs, according to the sysfs ABI and the thinkpad-acpi sysfs
rules of engagement, we will just remove the attributes. This will be
done in two stages: disable their function now, after two kernel
releases, remove the attributes.
For procfs, we call WARN(). If nothing triggers it, I will simply
remove the enable/disable commands entirely in the future along with
the sysfs attributes.
I don't expect much, if any fallout from this. There really isn't any
reason to mess with hotkey_enable or with the enable/disable commands
to /proc/acpi/ibm/hotkey, and this has been true for years...
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
thinkpad-acpi: add new debug helpers and warn of deprecated atts
Add a debug helper that discloses the TGID of the userspace task
attempting to access the driver. This is highly useful when dealing
with bug reports, since often the user has no idea that some userspace
application is accessing thinkpad-acpi...
Also add a helper to log warnings about sysfs attributes that are
deprecated.
Use the new helpers to issue deprecation warnings for bluetooth_enable
and wwan_enabled, that have been deprecated for a while, now.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
Add missing log levels in a standalone commit, to avoid dependencies in
future unrelated changes, just because they wanted to use one of the
missing log levels.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
The driver was renamed two years ago, on 2.6.21. Drop the old
compatibility alias, we have given everybody quite enough time
to update their configs to the new name.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Len Brown <len.brown@intel.com>
Harald Welte [Wed, 14 Jan 2009 06:01:17 +0000 (14:01 +0800)]
panasonic-laptop: use snprintf with PAGE_SIZE in sysfs attributes
Instead of just sprintf() into the page-sized buffer provided
by the sysfs/device_attribute API, we use snprintf with PAGE_SIZE
as an additional safeguard.
Signed-off-by: Martin Lucina <mato@kotelna.sk> Signed-off-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Len Brown <len.brown@intel.com>
Harald Welte [Wed, 14 Jan 2009 05:59:50 +0000 (13:59 +0800)]
panasonic-laptop: Fix autoloading
This patch adds MODULE_DEVICE_TABLE() to panasonic-laptop.c in order
to ensure automatic loading of the module on systems with the respective
"MAT*" ACPI devices.
Signed-off-by: Martin Lucina <mato@kotelna.sk> Signed-off-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Len Brown <len.brown@intel.com>
Matthew Garrett [Fri, 9 Jan 2009 20:17:11 +0000 (20:17 +0000)]
dell-wmi: new driver for hotkey control
Add a WMI driver for Dell laptops. Currently it does nothing but send a
generic input event when a button with a picture of a battery on it is
pressed, but maybe other uses will appear over time.
Signed-off-by: Matthew Garrett <mjg@redhat.com> Cc: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
(This is an update to the patch presented earlier in
http://lkml.org/lkml/2008/12/8/284, with new error handling.)
This patch sets the power of PnP ACPI devices to D0 when they
are activated and to D3 when they are disabled. The latter is
in correspondence with the ACPI 3.0 specification, whereas the
former is added in order to be able to power up a device after
it has been previously disabled (or when booting up a system).
(As a consequence, the patch makes the PnP ACPI code more ACPI
compliant.)
Section 6.2.2 of the ACPI Specification (at least versions 1.0b
and 3.0a) states: "Prior to running this control method [_DIS],
the OS[PM] will have already put the device in the D3 state."
Unfortunately, there is no clear statement as to when to put
a device in the D0 state. :-( Therefore, the patch executes the
method calls as _PS3/_DIS and _SRS/_PS0. What is clear: "If the
device is disabled, _SRS enables the device at the specified
resources." (From the ACPI 3.0a Specification.)
The patch fixes a problem with some IBM ThinkPads (at least the
600E and the 600X) where the serial ports have a dedicated
power source that needs to be brought up before the serial port
can be used. Without this patch, the serial port is enabled
but has no power. (In the past, the tpctl utility had to be
utilized to turn on the power, but support for this feature
stopped with version 5.9 as it did not support the more recent
kernel versions.)
The error handlers that handle any errors that can occur during
the power up/power down phases return the error codes to the
caller directly. Comments welcome! :-)
No regressions were observed on hardware that does not require
this patch.
The patch is applied against 2.6.27.x.
Signed-off-by: Witold Szczeponik <Witold.Szczeponik@gmx.net> Acked-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Merge branch 'stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
symbols, stacktrace: look up init symbols after module symbols
Merge branch 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: rcu_barrier VS cpu_hotplug: Ensure callbacks in dead cpu are migrated to online cpu
Merge branch 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'ipi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
s390: remove arch specific smp_send_stop()
panic: clean up kernel/panic.c
panic, smp: provide smp_send_stop() wrapper on UP too
panic: decrease oops_in_progress only after having done the panic
generic-ipi: eliminate WARN_ON()s during oops/panic
generic-ipi: cleanups
generic-ipi: remove CSD_FLAG_WAIT
generic-ipi: remove kmalloc()
generic IPI: simplify barriers and locking
Merge branch 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
locking: rename trace_softirq_[enter|exit] => lockdep_softirq_[enter|exit]
lockdep: remove duplicate CONFIG_DEBUG_LOCKDEP definitions
lockdep: require framepointers for x86
lockdep: remove extra "irq" string
lockdep: fix incorrect state name
Suresh Siddha [Mon, 30 Mar 2009 21:55:30 +0000 (13:55 -0800)]
x86, ACPI: add support for x2apic ACPI extensions
All logical processors with APIC ID values of 255 and greater will have their
APIC reported through Processor X2APIC structure (type-9 entry type) and all
logical processors with APIC ID less than 255 will have their APIC reported
through legacy Processor Local APIC (type-0 entry type) only. This is the
same case even for NMI structure reporting.
The Processor X2APIC Affinity structure provides the association between the
X2APIC ID of a logical processor and the proximity domain to which the logical
processor belongs.
For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device()
objects in the ACPI namespace.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: remove compat stuff
HID: constify arrays of struct apple_key_translation
HID: add support for Kye/Genius Ergo 525V
HID: Support Apple mini aluminum keyboard
HID: support for Kensington slimblade device
HID: DragonRise game controller force feedback driver
HID: add support for another version of 0e8f:0003 device in hid-pl
HID: fix race between usb_register_dev() and hiddev_open()
HID: bring back possibility to specify vid/pid ignore on module load
HID: make HID_DEBUG defaults consistent
HID: autosuspend -- fix lockup of hid on reset
HID: hid_reset_resume() needs to be defined only when CONFIG_PM is set
HID: fix USB HID devices after STD with autosuspend
HID: do not try to compile PM code with CONFIG_PM unset
HID: autosuspend support for USB HID
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
trivial: Update my email address
trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
trivial: Fix misspelling of "Celsius".
trivial: remove unused variable 'path' in alloc_file()
trivial: fix a pdlfush -> pdflush typo in comment
trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
trivial: wusb: Storage class should be before const qualifier
trivial: drivers/char/bsr.c: Storage class should be before const qualifier
trivial: h8300: Storage class should be before const qualifier
trivial: fix where cgroup documentation is not correctly referred to
trivial: Give the right path in Documentation example
trivial: MTD: remove EOL from MODULE_DESCRIPTION
trivial: Fix typo in bio_split()'s documentation
trivial: PWM: fix of #endif comment
trivial: fix typos/grammar errors in Kconfig texts
trivial: Fix misspelling of firmware
trivial: cgroups: documentation typo and spelling corrections
trivial: Update contact info for Jochen Hein
trivial: fix typo "resgister" -> "register"
...
* git://git.kernel.org/pub/scm/linux/kernel/git/czankel/xtensa-2.6: (21 commits)
xtensa: we don't need to include asm/io.h
xtensa: only build platform or variant if they contain a Makefile
xtensa: make startup code discardable
xtensa: ccount clocksource
xtensa: remove platform rtc hooks
xtensa: use generic sched_clock()
xtensa: platform: s6105
xtensa: let platform override KERNELOFFSET
xtensa: s6000 variant
xtensa: s6000 variant core definitions
xtensa: variant irq set callbacks
xtensa: variant-specific code
xtensa: nommu support
xtensa: add flat support
xtensa: enforce slab alignment to maximum register width
xtensa: cope with ram beginning at higher addresses
xtensa: don't make bootmem bitmap larger than required
xtensa: fix init_bootmem_node() argument order
xtensa: use correct stack pointer for stack traces
xtensa: beat Kconfig into shape
...
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
Btrfs: BUG to BUG_ON changes
Btrfs: remove dead code
Btrfs: remove dead code
Btrfs: fix typos in comments
Btrfs: remove unused ftrace include
Btrfs: fix __ucmpdi2 compile bug on 32 bit builds
Btrfs: free inode struct when btrfs_new_inode fails
Btrfs: fix race in worker_loop
Btrfs: add flushoncommit mount option
Btrfs: notreelog mount option
Btrfs: introduce btrfs_show_options
Btrfs: rework allocation clustering
Btrfs: Optimize locking in btrfs_next_leaf()
Btrfs: break up btrfs_search_slot into smaller pieces
Btrfs: kill the pinned_mutex
Btrfs: kill the block group alloc mutex
Btrfs: clean up find_free_extent
Btrfs: free space cache cleanups
Btrfs: unplug in the async bio submission threads
Btrfs: keep processing bios for a given bdev if our proc is batching
x86, PAT: Remove duplicate memtype reserve in pci mmap
pci mmap code was doing memtype reserve for a while now. Recently we
added memtype tracking in remap_pfn_range, and pci code indirectly calls
remap_pfn_range. So, we don't need seperate tracking in pci code
anymore. Which means a patch that removes ~50 lines of code :-).
Also, recently we found out that the pci tracking is not working as we expect
it to work in some cases. Specifically, userlevel X mmap of pci, with some
recent version of X, is having a problem with vm_page_prot getting reset.
The pci tracking uses vm_page_prot to pass on the protection type from parent
to child during fork.
a) Parent does a pci mmap
b) We look at PAT and get either UC_MINUS or WC mapping for parent
c) Store that mapping type in vma vm_page_prot for future use
d) This thread does a fork
e) Fork results in mmap_ops ->open for the child process
f) We get the vm_page_prot from vma and reserve that type for the child process
But, between c) and e) above, the vma vm_page_prot is getting reset to zero.
This results in PAT reserve failing at the time of fork as in here.
http://marc.info/?l=linux-kernel&m=123858163103240&w=2
This cleanup makes the above problem go away as we do not depend on
vm_page_prot in our PAT code anymore.
Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
ocfs2: recover orphans in offline slots during recovery and mount
ocfs2: Pagecache usage optimization on ocfs2
ocfs2: fix rare stale inode errors when exporting via nfs
ocfs2/dlm: Tweak mle_state output
ocfs2/dlm: Do not purge lockres that is being migrated dlm_purge_lockres()
ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry
ocfs2/dlm: Show the number of lockres/mles in dlm_state
ocfs2/dlm: dlm_set_lockres_owner() and dlm_change_lockres_owner() inlined
ocfs2/dlm: Improve lockres counts
ocfs2/dlm: Track number of mles
ocfs2/dlm: Indent dlm_cleanup_master_list()
ocfs2/dlm: Activate dlm->master_hash for master list entries
ocfs2/dlm: Create and destroy the dlm->master_hash
ocfs2/dlm: Refactor dlm_clean_master_list()
ocfs2/dlm: Clean up struct dlm_lock_name
ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list
ocfs2: Optimize inode group allocation by recording last used group.
ocfs2: Allocate inode groups from global_bitmap.
ocfs2: Optimize inode allocation by remembering last group
ocfs2: fix leaf start calculation in ocfs2_dx_dir_rebalance()
...
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
dma: Add SoF and EoF debugging to ipu_idmac.c, minor cleanup
dw_dmac: add cyclic API to DW DMA driver
dmaengine: Add privatecnt to revert DMA_PRIVATE property
dmatest: add dma interrupts and callbacks
dmatest: add xor test
dmaengine: allow dma support for async_tx to be toggled
async_tx: provide __async_inline for HAS_DMA=n archs
dmaengine: kill some unused headers
dmaengine: initialize tx_list in dma_async_tx_descriptor_init
dma: i.MX31 IPU DMA robustness improvements
dma: improve section assignment in i.MX31 IPU DMA driver
dma: ipu_idmac driver cosmetic clean-up
dmaengine: fail device registration if channel registration fails
Srinivas Eeda [Fri, 6 Mar 2009 22:21:46 +0000 (14:21 -0800)]
ocfs2: recover orphans in offline slots during recovery and mount
During recovery, a node recovers orphans in it's slot and the dead node(s). But
if the dead nodes were holding orphans in offline slots, they will be left
unrecovered.
If the dead node is the last one to die and is holding orphans in other slots
and is the first one to mount, then it only recovers it's own slot, which
leaves orphans in offline slots.
This patch queues complete_recovery to clean orphans for all offline slots
during mount and node recovery.
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Hisashi Hifumi [Thu, 5 Mar 2009 08:22:21 +0000 (17:22 +0900)]
ocfs2: Pagecache usage optimization on ocfs2
A page can have multiple buffers and even if a page is not uptodate, some buffers
can be uptodate on pagesize != blocksize environment.
This aops checks that all buffers which correspond to a part of a file
that we want to read are uptodate. If so, we do not have to issue actual
read IO to HDD even if a page is not uptodate because the portion we
want to read are uptodate.
"block_is_partially_uptodate" function is already used by ext2/3/4.
With the following patch random read/write mixed workloads or random read after
random write workloads can be optimized and we can get performance improvement.
Signed-off-by: Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
wengang wang [Fri, 6 Mar 2009 13:29:10 +0000 (21:29 +0800)]
ocfs2: fix rare stale inode errors when exporting via nfs
For nfs exporting, ocfs2_get_dentry() returns the dentry for fh.
ocfs2_get_dentry() may read from disk when the inode is not in memory,
without any cross cluster lock. this leads to the file system loading a
stale inode.
This patch fixes above problem.
Solution is that in case of inode is not in memory, we get the cluster
lock(PR) of alloc inode where the inode in question is allocated from (this
causes node on which deletion is done sync the alloc inode) before reading
out the inode itsself. then we check the bitmap in the group (the inode in
question allcated from) to see if the bit is clear. if it's clear then it's
stale. if the bit is set, we then check generation as the existing code
does.
We have to read out the inode in question from disk first to know its alloc
slot and allot bit. And if its not stale we read it out using ocfs2_iget().
The second read should then be from cache.
And also we have to add a per superblock nfs_sync_lock to cover the lock for
alloc inode and that for inode in question. this is because ocfs2_get_dentry()
and ocfs2_delete_inode() lock on them in reverse order. nfs_sync_lock is locked
in EX mode in ocfs2_get_dentry() and in PR mode in ocfs2_delete_inode(). so
that mutliple ocfs2_delete_inode() can run concurrently in normal case.
[mfasheh@suse.com: build warning fixes and comment cleanups] Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com> Acked-by: Joel Becker <joel.becker@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:47 +0000 (15:00 -0800)]
ocfs2/dlm: Remove struct dlm_lock_name in struct dlm_master_list_entry
This patch removes struct dlm_lock_name and adds the entries directly
to struct dlm_master_list_entry. Under the new scheme, both mles that
are backed by a lockres or not, will have the name populated in mle->mname.
This allows us to get rid of code that was figuring out the location of
the mle name.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:44 +0000 (15:00 -0800)]
ocfs2/dlm: Improve lockres counts
This patch replaces the lockres counts that tracked the number number of
locally and remotely mastered lockres' with a current and total count. The
total count is the number of lockres' that have been created since the dlm
domain was created.
The number of locally and remotely mastered counts can be computed using
the locking_state output.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:43 +0000 (15:00 -0800)]
ocfs2/dlm: Track number of mles
The lifetime of a mle is limited to the duration of the lockres mastery
process. While typically this lifetime is fairly short, we have noticed
the number of mles explode under certain circumstances. This patch tracks
the number of each different types of mles and should help us determine
how best to speed up the mastery process.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:41 +0000 (15:00 -0800)]
ocfs2/dlm: Activate dlm->master_hash for master list entries
With this patch, the mles are stored in a hash and not a simple list.
This should improve the mle lookup time when the number of outstanding
masteries is large.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:38 +0000 (15:00 -0800)]
ocfs2/dlm: Clean up struct dlm_lock_name
For master mle, the name it stored in the attached lockres in struct qstr.
For block and migration mle, the name is stored inline in struct dlm_lock_name.
This patch attempts to make struct dlm_lock_name look like a struct qstr. While
we could use struct qstr, we don't because we want to avoid having to malloc
and free the lockname string as the mle's lifetime is fairly short.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Sunil Mushran [Thu, 26 Feb 2009 23:00:37 +0000 (15:00 -0800)]
ocfs2/dlm: Encapsulate adding and removing of mle from dlm->master_list
This patch encapsulates adding and removing of the mle from the
dlm->master_list. This patch is part of the series of patches that
converts the mle list to a mle hash.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Tao Ma [Tue, 24 Feb 2009 16:53:25 +0000 (00:53 +0800)]
ocfs2: Optimize inode group allocation by recording last used group.
In ocfs2, the block group search looks for the "emptiest" group
to allocate from. So if the allocator has many equally(or almost
equally) empty groups, new block group will tend to get spread
out amongst them.
So we add osb_inode_alloc_group in ocfs2_super to record the last
used inode allocation group.
For more details, please see
http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
I have done some basic test and the results are a ten times improvement on
some cold-cache stat workloads.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Tao Ma [Tue, 24 Feb 2009 16:53:24 +0000 (00:53 +0800)]
ocfs2: Allocate inode groups from global_bitmap.
Inode groups used to be allocated from local alloc file,
but since we want all inodes to be contiguous enough, we
will try to allocate them directly from global_bitmap.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Tao Ma [Tue, 24 Feb 2009 16:53:23 +0000 (00:53 +0800)]
ocfs2: Optimize inode allocation by remembering last group
In ocfs2, the inode block search looks for the "emptiest" inode
group to allocate from. So if an inode alloc file has many equally
(or almost equally) empty groups, new inodes will tend to get
spread out amongst them, which in turn can put them all over the
disk. This is undesirable because directory operations on conceptually
"nearby" inodes force a large number of seeks.
So we add ip_last_used_group in core directory inodes which records
the last used allocation group. Another field named ip_last_used_slot
is also added in case inode stealing happens. When claiming new inode,
we passed in directory's inode so that the allocation can use this
information.
For more details, please see
http://oss.oracle.com/osswiki/OCFS2/DesignDocs/InodeAllocationStrategy.
Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Mark Fasheh [Thu, 19 Feb 2009 21:17:05 +0000 (13:17 -0800)]
ocfs2: fix leaf start calculation in ocfs2_dx_dir_rebalance()
ocfs2_dx_dir_rebalance() is passed the block offset of a dx leaf which needs
rebalancing. Since we rebalance an entire cluster at a time however, this
function needs to calculate the beginning of that cluster, in blocks. The
calculation was wrong, which would result in a read of non-leaf blocks. Fix
the calculation by adding ocfs2_block_to_cluster_start() which is a more
straight-forward way of determining this.
Reported-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Mark Fasheh [Wed, 18 Feb 2009 19:41:38 +0000 (11:41 -0800)]
ocfs2: re-order ocfs2_empty_dir checks
ocfs2_empty_dir() is far more expensive than checking link count. Since both
need to be checked at the same time, we can improve performance by checking
link count first.
Mark Fasheh [Fri, 21 Nov 2008 01:54:57 +0000 (17:54 -0800)]
ocfs2: Increase max links count
Since we've now got a directory format capable of handling a large number of
entries, we can increase the maximum link count supported. This only gets
increased if the directory indexing feature is turned on.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
Mark Fasheh [Fri, 30 Jan 2009 02:17:46 +0000 (18:17 -0800)]
ocfs2: Introduce dir free space list
The only operation which doesn't get faster with directory indexing is
insert, which still has to walk the entire unindexed directory portion to
find a free block. This patch provides an improvement in directory insert
performance by maintaining a singly linked list of directory leaf blocks
which have space for additional dirents.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
Mark Fasheh [Tue, 25 Nov 2008 01:02:08 +0000 (17:02 -0800)]
ocfs2: Store dir index records inline
Allow us to store a small number of directory index records in the
ocfs2_dx_root_block. This saves us a disk read on small to medium sized
directories (less than about 250 entries). The inline root is automatically
turned into a root block with extents if the directory size increases beyond
it's capacity.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
Mark Fasheh [Thu, 13 Nov 2008 00:27:44 +0000 (16:27 -0800)]
ocfs2: Add a name indexed b-tree to directory inodes
This patch makes use of Ocfs2's flexible btree code to add an additional
tree to directory inodes. The new tree stores an array of small,
fixed-length records in each leaf block. Each record stores a hash value,
and pointer to a block in the traditional (unindexed) directory tree where a
dirent with the given name hash resides. Lookup exclusively uses this tree
to find dirents, thus providing us with constant time name lookups.
Some of the hashing code was copied from ext3. Unfortunately, it has lots of
unfixed checkpatch errors. I left that as-is so that tracking changes would
be easier.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
Mark Fasheh [Wed, 12 Nov 2008 23:43:34 +0000 (15:43 -0800)]
ocfs2: Introduce dir lookup helper struct
Many directory manipulation calls pass around a tuple of dirent, and it's
containing buffer_head. Dir indexing has a bit more state, but instead of
adding yet more arguments to functions, we introduce 'struct
ocfs2_dir_lookup_result'. In this patch, it simply holds the same tuple, but
future patches will add more state.
Signed-off-by: Mark Fasheh <mfasheh@suse.com> Acked-by: Joel Becker <joel.becker@oracle.com>
Sunil Mushran [Wed, 17 Dec 2008 22:17:43 +0000 (14:17 -0800)]
ocfs2: Expose the file system state via debugfs
This patch creates a per mount debugfs file, fs_state, which exposes
information like, cluster stack in use, states of the downconvert, recovery
and commit threads, number of journal txns, some allocation stats, list of
all slots, etc.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Merge branch 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'ext3-latency-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext3: Add replace-on-rename hueristics for data=writeback mode
ext3: Add replace-on-truncate hueristics for data=writeback mode
ext3: Use WRITE_SYNC for commits which are caused by fsync()
block_write_full_page: Use synchronous writes for WBC_SYNC_ALL writebacks
Joseph Cihula [Mon, 30 Mar 2009 21:03:01 +0000 (14:03 -0700)]
x86: disable stack-protector for __restore_processor_state()
The __restore_processor_state() fn restores %gs on resume from S3. As
such, it cannot be protected by the stack-protector guard since %gs will
not be correct on function entry.
There are only a few other fns in this file and it should not negatively
impact kernel security that they will also have the stack-protector
guard removed (and so it's not worth moving them to another file).
Without this change, S3 resume on a kernel built with
CONFIG_CC_STACKPROTECTOR_ALL=y will fail.
Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Tested-by: Chris Wright <chrisw@sous-sol.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Tejun Heo <tj@kernel.org>
LKML-Reference: <49D13385.5060900@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>