Alexey Dobriyan [Thu, 28 Aug 2008 09:53:51 +0000 (02:53 -0700)]
net: more #ifdef CONFIG_COMPAT
All users of struct proto::compat_[gs]etsockopt and
struct inet_connection_sock_af_ops::compat_[gs]etsockopt are under
#ifdef already, so use it in structure definition too.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 28 Aug 2008 08:11:25 +0000 (01:11 -0700)]
ip: speedup /proc/net/rt_cache handling
When scanning route cache hash table, we can avoid taking locks for
empty buckets. Both /proc/net/rt_cache and NETLINK RTM_GETROUTE
interface are taken into account.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andi Kleen [Thu, 28 Aug 2008 08:08:02 +0000 (01:08 -0700)]
tcp: Skip empty hash buckets faster in /proc/net/tcp
On most systems most of the TCP established/time-wait hash buckets are empty.
When walking the hash table for /proc/net/tcp their read locks would
always be acquired just to find out they're empty. This patch changes the code
to check first if the buckets have any entries before taking the lock, which
is much cheaper than taking a lock. Since the hash tables are large
this makes a measurable difference on processing /proc/net/tcp,
especially on architectures with slow read_lock (e.g. PPC).
On a 2GB Core2 system time cat /proc/net/tcp > /dev/null (with a mostly
empty hash table) goes from 0.046s to 0.005s.
On systems with slower atomics (like P4 or POWER4) or larger hash tables
(more RAM) the difference is much higher.
This can be noticeable because there are some daemons around that regularly
scan /proc/net/tcp.
Original idea for this patch from Marcus Meissner, but redone by me.
Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
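The idea of peeking at a bucket before locking it can be sketched in user-space C as follows. This is only an illustration, not the kernel code: struct demo_bucket, demo_lock() and the other demo_* names are hypothetical, and a plain counter stands in for the real read lock so that the number of acquisitions can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash bucket; a counter replaces the lock. */
struct demo_bucket {
    int lock_taken;   /* how many times the "lock" was acquired */
    int nr_entries;   /* entries chained in this bucket */
};

static void demo_lock(struct demo_bucket *b)   { b->lock_taken++; }
static void demo_unlock(struct demo_bucket *b) { (void)b; }

/* Walk the table, locking only buckets that look non-empty.
 * Returns the number of entries visited. */
static int demo_walk(struct demo_bucket *tbl, size_t n)
{
    int seen = 0;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].nr_entries == 0)   /* cheap unlocked peek */
            continue;
        demo_lock(&tbl[i]);
        seen += tbl[i].nr_entries;    /* re-read under the lock */
        demo_unlock(&tbl[i]);
    }
    return seen;
}

/* Total number of lock acquisitions across the table. */
static int demo_locks_taken(const struct demo_bucket *tbl, size_t n)
{
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += tbl[i].lock_taken;
    return total;
}
```

With a mostly empty table the number of lock acquisitions tracks the number of occupied buckets rather than the table size. The unlocked peek can miss an entry that is being added concurrently, which seems acceptable for a best-effort listing.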
Gerrit Renker [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp ccid-3: Replace lazy BUG_ON with condition
The BUG_ON(w_tot == 0) only holds if there is no more than 1 loss interval in
the loss history. If there is only a single loss interval, the calc_i_mean()
routine need in fact not be called (RFC 3448, 6.3.1).
Gerrit Renker [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp: Toggle debug output without module unloading
This sets the sysfs permissions so that root can toggle the `debug'
parameter available for nearly every DCCP module. This is useful
since there are various module inter-dependencies. The debug flag
can now be toggled at runtime using
Gerrit Renker [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp: Empty the write queue when disconnecting
dccp_disconnect() can be called due to several reasons:
1. when the connection setup failed (inet_stream_connect());
2. when shutting down (inet_shutdown(), inet_csk_listen_stop());
3. when aborting the connection (dccp_close() with 0 linger time).
In case (1) the write queue is empty. This patch empties the write queue,
if in case (2) or (3) it was not yet empty.
This avoids triggering the write-queue BUG_TRAP in sk_stream_kill_queues()
later on.
It also seems natural to do: when breaking an association, to delete all
packets that were originally intended for the soon-disconnected end (compare
with call to tcp_write_queue_purge in tcp_disconnect()).
Gerrit Renker [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp: Fill in the Data fields for "Option Error" Resets
This updates the use of the `out_invalid_option' label, which produces a
Reset (code 5, "Option Error"), to fill in the Data1...Data3 fields as
specified in RFC 4340, 5.6.
Gerrit Renker [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp: Silently ignore options with nonsensical lengths
This updates the option-parsing code with regard to RFC 4340, 5.8:
"[..] options with nonsensical lengths (length byte less than two or more
than the remaining space in the options portion of the header) MUST be
ignored, and any option space following an option with nonsensical length
MUST likewise be ignored."
Hence in the following cases erratic options will be ignored:
1. The type byte of a multi-byte option is the last byte of the header
options (i.e. effective option length of 1).
2. The value of the length byte is less than the minimum 2. This has been
changed from previously 3: although no multi-byte option with a length
less than 3 yet exists (cf. table 3 in 5.8), a length of 2 is valid.
(The switch-statement in dccp_parse has further per-option length checks.)
3. The option length exceeds the length of the remaining option space.
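The three cases above can be sketched as a small stand-alone parser. This is a simplified illustration and not the kernel's option parser: demo_count_options() is a hypothetical helper that only counts well-formed options. It relies on the RFC 4340 convention that option types 0 to 31 are single-byte options and that all other types carry a length byte covering the type and length bytes themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Count well-formed options in an options area. Parsing stops at the
 * first option with a nonsensical length, so that option and all option
 * space following it are ignored. */
static int demo_count_options(const unsigned char *opt, size_t len)
{
    size_t pos = 0;
    int count = 0;

    assert(opt != NULL || len == 0);
    while (pos < len) {
        unsigned char type = opt[pos];
        size_t remaining = len - pos;
        size_t optlen;

        if (type < 32) {           /* single-byte option */
            pos++;
            count++;
            continue;
        }
        if (remaining < 2)         /* case 1: type byte is the last byte */
            break;
        optlen = opt[pos + 1];     /* includes type and length bytes */
        if (optlen < 2)            /* case 2: below the minimum of 2 */
            break;
        if (optlen > remaining)    /* case 3: exceeds remaining space */
            break;
        pos += optlen;
        count++;
    }
    return count;
}
```

A real parser would additionally apply per-option length checks, as the switch statement mentioned in case 2 does.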
Wei Yongjun [Sat, 23 Aug 2008 11:28:27 +0000 (13:28 +0200)]
dccp: Always generate a Reset in response to option errors
RFC4340 states that if a packet is received with an option error (such as a
Mandatory Option as the last byte of the option list), the endpoint should
respond with a Reset.
In the LISTEN and RESPOND states, the endpoint correctly responds with Reset,
while in the REQUEST/OPEN states, packets with option errors are just ignored.
Julius Volz [Fri, 22 Aug 2008 12:06:12 +0000 (14:06 +0200)]
IPVS: Integrate ESP protocol into ip_vs_proto_ah.c
Rename all ah_* functions to ah_esp_* (and adjust comments). Move ESP
protocol definition into ip_vs_proto_ah.c and remove all usage of
ip_vs_proto_esp.c.
Make the compilation of ip_vs_proto_ah.c dependent on a new config
variable, IP_VS_PROTO_AH_ESP, which is selected either by
IP_VS_PROTO_ESP or IP_VS_PROTO_AH. Only compile the selected protocols'
structures within this file.
Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: Simon Horman <horms@verge.net.au>
Ilpo Järvinen [Sat, 23 Aug 2008 12:10:12 +0000 (05:10 -0700)]
tcp: Add tcp_validate_incoming & put duplicated code there
Large block of code duplication removed.
Sadly, the return value thing is a bit tricky here but it
seems the most sensible way to return positive from validator
on success rather than negative.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
David Kilroy [Thu, 21 Aug 2008 22:28:01 +0000 (23:28 +0100)]
orinoco: Use a macro to define wireless handlers
The macro identifiers for the various ioctls required for WPA support
are longer than those currently used by the driver. This makes it messy
to keep line length below 80 characters.
By defining a macro to initialise the handler table, we recover the
common text.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
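A macro of this kind might look like the sketch below. All names and ioctl numbers here (DEMO_HANDLER, DEMO_IOCTL_FIRST and so on) are hypothetical stand-ins rather than the identifiers used by the driver; the point is only that the index arithmetic and the designated initialiser are written once.

```c
#include <stddef.h>

typedef int (*demo_handler)(int arg);

/* Hypothetical ioctl numbering, for illustration only. */
#define DEMO_IOCTL_FIRST          0x8B00
#define DEMO_SET_ENCODE_EXTENDED  0x8B02
#define DEMO_GET_ENCODE_EXTENDED  0x8B03

#define DEMO_IOCTL_IDX(cmd)       ((cmd) - DEMO_IOCTL_FIRST)
/* One short table entry per handler, however long the ioctl name is. */
#define DEMO_HANDLER(cmd, func)   [DEMO_IOCTL_IDX(cmd)] = (func)

static int demo_set_encode_ext(int arg) { return arg + 1; }
static int demo_get_encode_ext(int arg) { return arg * 2; }

static const demo_handler demo_handlers[] = {
    DEMO_HANDLER(DEMO_SET_ENCODE_EXTENDED, demo_set_encode_ext),
    DEMO_HANDLER(DEMO_GET_ENCODE_EXTENDED, demo_get_encode_ext),
};

/* Look up and call a handler; -1 if the command has no handler. */
static int demo_dispatch(int cmd, int arg)
{
    size_t nr = sizeof(demo_handlers) / sizeof(demo_handlers[0]);
    size_t idx;

    if (cmd < DEMO_IOCTL_FIRST)
        return -1;
    idx = (size_t)DEMO_IOCTL_IDX(cmd);
    if (idx >= nr || !demo_handlers[idx])
        return -1;
    return demo_handlers[idx](arg);
}
```

Slots that are not named in the table are left as null pointers by the designated initialisers, so the dispatcher has to check for them.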
David Kilroy [Thu, 21 Aug 2008 22:27:59 +0000 (23:27 +0100)]
orinoco: Don't use boolean parameter to record encoding type
For WPA support we need to encode NONE, WEP and TKIP in the encoding
parameter. In anticipation of this we need to change the usage away from
the current boolean usage.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Kilroy [Thu, 21 Aug 2008 22:27:54 +0000 (23:27 +0100)]
orinoco: Invoke firmware download in main driver
Firmware download is enabled for Agere in orinoco_cs. Symbol firmware
download has been moved out of spectrum_cs into orinoco_cs. Firmware
download is not enabled for Intersil.
Symbol based firmware is restricted to only download on spectrum_cs
based cards.
The firmware names are hardcoded for each firmware type.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Kilroy [Thu, 21 Aug 2008 22:27:53 +0000 (23:27 +0100)]
orinoco: Extend hermes_dld routines for Agere firmware
Add programming initialisation and termination functions.
Add checks to avoid overrunning the firmware image or PDA areas.
Extra algorithm to program PDA values using defaults where necessary.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
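A check against overrunning an image can be sketched as below. demo_block_in_image() is a hypothetical helper, not a function from hermes_dld; it shows one way to test that a block lies inside a buffer without letting offset + length overflow.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if a block of block_len bytes starting at offset lies wholly
 * inside an image of image_len bytes. The subtraction form avoids
 * overflow in offset + block_len. */
static bool demo_block_in_image(size_t image_len, size_t offset,
                                size_t block_len)
{
    if (offset > image_len)
        return false;
    return block_len <= image_len - offset;
}
```

The same test applies to any fixed-size region, such as a PDA area, when a length field is read from untrusted data.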
David Kilroy [Thu, 21 Aug 2008 22:27:52 +0000 (23:27 +0100)]
orinoco: Make firmware download logic more generic
Ensure PDA read is terminated.
Prevent invalid programming blocks from causing reads outside the
firmware image.
Turn off aux stuff when finished.
Option to program in limited block sizes (controlled by macro).
Option to read PDA from EEPROM.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Kilroy [Thu, 21 Aug 2008 22:27:50 +0000 (23:27 +0100)]
orinoco: Add function to execute Hermes initialisation commands synchronously
The current synchronous execution function doesn't work
for certain Hermes commands which clear the MAGIC number from
SWSUPPORT0. These commands seem to be related to initialisation or
programming, for example HERMES_CMD_INIT.
Replicate hermes_docmd_wait for commands which clear the MAGIC number
from SWSUPPORT0. This version accepts two extra arguments which are
passed straight to the firmware.
Functionality copied out of hermes_init.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Kilroy [Thu, 21 Aug 2008 22:27:47 +0000 (23:27 +0100)]
orinoco: Update scan translation
Report channel, beacon interval and capabilities.
Use WEXT defines instead of magic numbers.
State quality stats in dB.
Also a few changes to keep line length less than 80.
Signed-off-by: David Kilroy <kilroyd@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ron Rindjunsky [Sat, 9 Aug 2008 00:02:19 +0000 (03:02 +0300)]
mac80211: add direct probe before association
This patch adds a direct probe request as first step in the association
flow if data we have is not up to date. Motivation of this step is to make
sure that the bss information we have is correct, since last scan could
have been done a while ago, and beacons do not fully answer this need as
there are potential differences between them and probe responses (e.g.
WMM parameter element).
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
It's been a long time, but fullmac prism54 driver is still around...
I think we should rename every prism54* in order to avoid some
confusion about "what is actually what" in the future ;-).
Thanks-to: Maxi <maxi@daemonizer.de> Signed-off-by: Christian Lamparter <chunkeey@web.de> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ron Rindjunsky [Thu, 7 Aug 2008 22:50:46 +0000 (01:50 +0300)]
mac80211: change number of pre-assoc scans
This patch fixes a problem noticed in noisy environments with 50+ APs:
the scan fails to find the requested AP on the first try, which
leads to connection refusal. A second scan has empirically proven to fix
this problem in almost all cases.
Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com> Signed-off-by: Esti Kummer <ester.kummer@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Wed, 6 Aug 2008 15:27:31 +0000 (17:27 +0200)]
rt2x00: Add module parameter to disable HW crypto
Add a module parameter to rt61 and rt73 to disable
HW crypto. The option should only be checked when
determining if the SUPPORT_HW_CRYPTO flag should
be set or not.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Wed, 6 Aug 2008 14:22:17 +0000 (16:22 +0200)]
rt2x00: Move lna_gain calculation to config() callback
We can optimize lna calculation in IRQ context by
calculating most of the value during the config() callback
when most of the value is actually influenced.
This will be required later by rt2800pci and rt2800usb as
well, since they need the lna_gain value during config().
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Tomas Winkler [Wed, 6 Aug 2008 11:22:01 +0000 (14:22 +0300)]
mac80211: cleanup mlme state namespace
This patch adds STA_MLME to the station mlme state defines.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 4 Aug 2008 14:38:47 +0000 (16:38 +0200)]
rt2x00: Gather channel information in structure
Channel information which is read from EEPROM should
be read into an array containing per-channel information.
This removes the requirement of multiple arrays and makes
the channel handling a bit cleaner and easier to expand.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Ivo van Doorn [Mon, 4 Aug 2008 14:37:44 +0000 (16:37 +0200)]
rt2x00: Implement HW encryption
Various rt2x00 devices support hardware encryption.
Most of them require the IV/EIV to be generated by mac80211,
but require it to be provided separately instead of within
the frame itself. This means that rt2x00lib should extract
the data from the frame and place it in the frame descriptor.
During RX the IV/EIV is provided in the descriptor by the
hardware which means that it should be inserted into the
frame by rt2x00lib.
Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
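The two directions described above can be sketched on a plain byte buffer. This is a simplified illustration, not the rt2x00lib code: demo_strip_iv() and demo_insert_iv() are hypothetical helpers, the header and IV lengths are passed in by the caller, and the iv buffer stands in for the descriptor field.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_IV_MAX 8   /* assumed upper bound on IV/EIV size */

/* TX: copy the iv_len bytes that follow the header into iv[] and close
 * the gap in the frame. Returns the new frame length, 0 on bad input. */
static size_t demo_strip_iv(unsigned char *frame, size_t len,
                            size_t hdr_len, size_t iv_len,
                            unsigned char *iv)
{
    if (iv_len > DEMO_IV_MAX || hdr_len > len || iv_len > len - hdr_len)
        return 0;
    memcpy(iv, frame + hdr_len, iv_len);
    memmove(frame + hdr_len, frame + hdr_len + iv_len,
            len - hdr_len - iv_len);
    return len - iv_len;
}

/* RX: reopen the gap after the header and put the IV back.
 * cap is the capacity of the frame buffer. Returns the new length,
 * 0 on bad input. */
static size_t demo_insert_iv(unsigned char *frame, size_t len, size_t cap,
                             size_t hdr_len, size_t iv_len,
                             const unsigned char *iv)
{
    if (len > cap || hdr_len > len || iv_len > cap - len)
        return 0;
    memmove(frame + hdr_len + iv_len, frame + hdr_len, len - hdr_len);
    memcpy(frame + hdr_len, iv, iv_len);
    return len + iv_len;
}
```

Stripping and then re-inserting the same IV should give back the original frame, which is a convenient property to test.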
Tomas Winkler [Sun, 3 Aug 2008 11:32:01 +0000 (14:32 +0300)]
mac80211: filter probes in ieee80211_rx_mgmt_probe_resp
This patch moves filtering statement from ieee80211_rx_bss_info
which is called for both beacon and probe to ieee80211_rx_mgmt_probe_resp
and saves a few cycles in beacon parsing.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
replace net_device arguments with ieee80211_{local,sub_if_data} as appropriate
This patch replaces net_device arguments to mac80211 internal functions
with ieee80211_{local,sub_if_data} as appropriate.
It also does the same for many 802.11s mesh functions, and changes the
mesh path table to be indexed on sub_if_data rather than net_device.
If the mesh part needs to be a separate patch let me know, but since
mesh uses a lot of mac80211 functions which were being converted anyway,
the changes go hand-in-hand somewhat.
This patch probably does not convert all the functions which could be
converted, but it is a large chunk and followup patches will be
provided.
Signed-off-by: Jasper Bryant-Greene <jasper@amiton.co.nz> Signed-off-by: John W. Linville <linville@tuxdriver.com>
ETH_P_PAE belongs in if_ether.h with the other ETH_P_* definitions. This
patch moves it there.
Signed-off-by: Jasper Bryant-Greene <jasper@amiton.co.nz> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
While it is interesting to not add last-enum-markers because it allows gcc
to warn us of switch() statements missing a valid state, we really should
be handling memory corruption on a rfkill state with default clauses,
anyway.
So add RFKILL_STATE_MAX and use it where applicable. It makes for safer
code in the long run.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
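The trade-off described above can be illustrated with a small enum. The demo_* names are hypothetical and do not reproduce the rfkill definitions; the sketch only shows a last-enum marker used for range checks together with a default clause that catches corrupted values.

```c
#include <string.h>

enum demo_state {
    DEMO_STATE_SOFT_BLOCKED = 0,
    DEMO_STATE_UNBLOCKED,
    DEMO_STATE_HARD_BLOCKED,
    DEMO_STATE_MAX,   /* marker: one past the last valid state */
};

/* Range check that keeps working when states are added before MAX. */
static int demo_state_valid(unsigned int state)
{
    return state < DEMO_STATE_MAX;
}

static const char *demo_state_name(unsigned int state)
{
    switch (state) {
    case DEMO_STATE_SOFT_BLOCKED:
        return "soft-blocked";
    case DEMO_STATE_UNBLOCKED:
        return "unblocked";
    case DEMO_STATE_HARD_BLOCKED:
        return "hard-blocked";
    default:
        /* Covers DEMO_STATE_MAX and any corrupted value. */
        return "invalid";
    }
}
```

The cost is the one the message mentions: with a default clause present, the compiler no longer warns when a newly added state is missing from the switch.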
rfkill is not a small, mere detail in wireless support. Once it starts
supporting rfkill and users start counting on that support, a wireless
device is at risk of operating in dangerous conditions should rfkill
support fail to properly activate.
Therefore, add the required __must_check annotations on some key functions
of the rfkill API, for which the wireless drivers absolutely MUST handle
the failure mode safely in order to avoid a potentially dangerous situation
where the wireless transmitter is left enabled when the user doesn't want it
to be.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Matthew Garrett <mjg@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
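Outside the kernel, the effect of such an annotation can be sketched with the GCC/Clang warn_unused_result attribute. The demo_* names below are hypothetical and are not part of the rfkill API; the fail_next field exists only to simulate a failure.

```c
#include <errno.h>

/* Stand-in for the kernel annotation, spelled out for GCC and Clang. */
#define demo_must_check __attribute__((warn_unused_result))

struct demo_radio {
    int tx_enabled;
    int fail_next;   /* simulate a failure on the next request */
};

/* Ignoring this return value produces a compiler warning. */
static demo_must_check int demo_block_radio(struct demo_radio *r)
{
    if (r->fail_next)
        return -EIO;
    r->tx_enabled = 0;
    return 0;
}

/* A caller that propagates the failure instead of silently assuming
 * the transmitter is now off. */
static int demo_user_requested_block(struct demo_radio *r)
{
    int err = demo_block_radio(r);

    if (err)
        return err;   /* transmitter may still be enabled */
    return 0;
}
```

Writing a bare `demo_block_radio(&r);` statement would compile with a warning, which is the nudge the annotation is meant to provide.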
Add a second set of global states, "rfkill_default_states", to track the
state that will be used when the first rfkill class of a given type is
registered, and also to save "undo" information when rfkill_epo is called.
Add a new exported function, rfkill_set_default(), which can be used by
platform drivers to restore radio state saved by the platform across
reboots or shutdown.
Also, fix rfkill_epo to properly update rfkill_states, but still preserve a
copy of the state so that we can undo the effect of rfkill_epo later if we
want to. Add rfkill_restore_states() to restore rfkill_states from the
copy.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Detect and abort with -EEXIST if rfkill_register is called twice on the
same rfkill struct. And WARN_ON(it) for good measure.
While at it, flag when we are adding the first switch of a type, we will
need that information later.
Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Ivo van Doorn <IvDoorn@gmail.com> Cc: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
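The double-registration check and the first-of-type flag can be sketched as follows. This is a hypothetical user-space model with made-up names (demo_switch, demo_register), not the rfkill_register() implementation, and it omits the WARN_ON.

```c
#include <errno.h>

#define DEMO_NR_TYPES 3

struct demo_switch {
    int registered;   /* set once registration succeeded */
    int type;         /* 0 .. DEMO_NR_TYPES - 1 */
};

static int demo_type_count[DEMO_NR_TYPES];

/* Returns 0 on success, -EEXIST if sw is already registered and
 * -EINVAL for an unknown type. *first is set to 1 when sw is the
 * first switch of its type, else 0. */
static int demo_register(struct demo_switch *sw, int *first)
{
    if (sw->type < 0 || sw->type >= DEMO_NR_TYPES)
        return -EINVAL;
    if (sw->registered)
        return -EEXIST;
    *first = (demo_type_count[sw->type]++ == 0);
    sw->registered = 1;
    return 0;
}
```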
Luis Carlos Cobo [Thu, 14 Aug 2008 17:40:57 +0000 (10:40 -0700)]
libertas_tf: main.c, data paths and mac80211 handlers
This patch contains most of the libertas_tf driver, just lacking command helper
functions and usb specific functions. Currently, monitor, managed, ap and mesh
interfaces are supported. Even though this driver supports the same hardware as
the "libertas" driver, it uses a different (thin) firmware, that makes it
suitable for a mac80211 driver.
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Luis Carlos Cobo [Thu, 14 Aug 2008 17:40:48 +0000 (10:40 -0700)]
mac80211: allow no mac address until firmware load
Originally by Johannes Berg. This patch adds support for devices that do not
report their MAC address until the firmware is loaded. While the address is not
known, a multicast one is used.
Signed-off-by: Luis Carlos Cobo <luisca@cozybit.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Harvey Harrison [Wed, 16 Jul 2008 01:44:05 +0000 (18:44 -0700)]
mac80211: explicitly check skb->len
ieee80211_get_hdrlen_from_skb internally checks the skb is long enough to
hold the full ieee80211_hdr, else it returns zero. Use ieee80211_hdrlen
which always returns the hdrlen and check the remaining room in the
skb explicitly when removing encryption headers or the qos control field.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Harvey Harrison [Wed, 16 Jul 2008 01:44:02 +0000 (18:44 -0700)]
ath5k: explicitly check skb->len
ieee80211_get_hdrlen_from_skb internally checks that the skb is long
enough to hold the full header, or it returns 0 if not. The check in
ath5k does not check this case and assumes it always got the actual
header length which it then checks against the skb->len plus some headroom.
Change to ieee80211_hdrlen which always returns the hdrlen and keep the
existing headroom check.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Harvey Harrison [Wed, 16 Jul 2008 01:43:56 +0000 (18:43 -0700)]
b43legacy: use le16 frame control directly, avoid byteswapping
Acked-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Esti Kummer [Mon, 4 Aug 2008 08:00:45 +0000 (16:00 +0800)]
iwlwifi: add level for debugging host command
This patch adds another level for debugging host command. This adds an
option to suppress the debug prints for sensitivity and link quality
commands.
Signed-off-by: Esti Kummer <ester.kummer@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Tomas Winkler [Mon, 4 Aug 2008 08:00:41 +0000 (16:00 +0800)]
iwlwifi: kill struct iwl4965_lq_mngr
This patch removes struct iwl4965_lq_mngr, since it is not used.
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Bruno Randolf [Wed, 30 Jul 2008 15:12:58 +0000 (17:12 +0200)]
ath5k: rates cleanup
cleanup the rates structures used by ath5k. instead of separate driver and
mac80211 rate structures we now setup a static ieee80211_rate array and use it
directly. no conversion between two different rate structures has to be done
any more. a lot of unused and confusing junk was deleted.
renamed ath5k_getchannels into ath5k_setup_bands because this is what it does.
rewrote it to copy the bitrates correctly for each band. this is necessary for
running different hardware with the same driver (e.g. 5211 and 5212 based
cards).
add special handling of rates for AR5211 chipsets: it uses different rate codes
for CCK rates (which are actually like the other chips but with a 0xF mask).
setup a hardware code to rate index reverse mapping table for getting the rate
index of received frames.
the rates for control frames which have to be set in
ath5k_hw_write_rate_duration are now in one single array.
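A reverse mapping table of this kind can be sketched as below. The rate table, the hardware codes and every demo_* name are illustrative assumptions, not values or identifiers taken from the driver.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_HW_CODES 32   /* assumed size of the hardware code space */

struct demo_rate {
    int bitrate;             /* in units of 100 kbit/s */
    unsigned char hw_code;   /* code reported by the hardware */
};

/* Illustrative rate table; the codes are made up for this example. */
static const struct demo_rate demo_rates[] = {
    { 10, 0x1b }, { 20, 0x1a }, { 55, 0x19 }, { 110, 0x18 },
};

static signed char demo_rate_idx[DEMO_HW_CODES];

/* Fill the reverse table once; unused codes map to -1. */
static void demo_build_reverse_map(void)
{
    size_t n = sizeof(demo_rates) / sizeof(demo_rates[0]);

    memset(demo_rate_idx, -1, sizeof(demo_rate_idx));
    for (size_t i = 0; i < n; i++)
        demo_rate_idx[demo_rates[i].hw_code] = (signed char)i;
}

/* Constant-time lookup for a received frame's hardware rate code. */
static int demo_hw_to_rate_idx(unsigned int hw_code)
{
    if (hw_code >= DEMO_HW_CODES)
        return -1;
    return demo_rate_idx[hw_code];
}
```

Building the table once per band trades a few bytes of memory for a lookup that does not need to scan the rate array on every received frame.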
There were 3 code copy and pastes of reset. Unify the resets and place
in separate function.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Nick Kossifidis <mickflemm@gmail.com> Cc: Luis R. Rodriguez <mcgrof@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Julia Lawall [Wed, 16 Jul 2008 14:34:54 +0000 (16:34 +0200)]
net/ieee80211: adjust error handling
Converts a test in error handling code to a sequence of labels.
The semantic match that found the problem is:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
expression E,E1,E2;
@@
E = alloc_etherdev(...)
... when != E = E1
if (...) { ... free_netdev(E); ... return ...; }
... when != E = E2
(
if (...)
{
... when != free_netdev(E);
return dev; }
|
* if (...)
{
... when != free_netdev(E);
return ...; }
|
register_netdev(E)
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: John W. Linville <linville@tuxdriver.com>
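The general shape of error handling written as a sequence of labels can be sketched in user-space C. The code is not taken from the patch: demo_setup(), the allocation counter and the simulated failure are hypothetical, and malloc()/free() stand in for alloc_etherdev()/free_netdev().

```c
#include <errno.h>
#include <stdlib.h>

struct demo_dev {
    int *priv;
};

static int demo_live_allocs;   /* tracks outstanding allocations */

/* Each failure jumps to the label that undoes exactly what has been
 * set up so far, so no path can forget to free the device. */
static int demo_setup(int fail_late, struct demo_dev **out)
{
    struct demo_dev *dev;
    int err;

    dev = malloc(sizeof(*dev));
    if (!dev)
        return -ENOMEM;
    demo_live_allocs++;

    dev->priv = malloc(sizeof(*dev->priv));
    if (!dev->priv) {
        err = -ENOMEM;
        goto err_free_dev;
    }
    demo_live_allocs++;

    if (fail_late) {           /* simulated late failure */
        err = -EIO;
        goto err_free_priv;
    }

    *out = dev;
    return 0;

err_free_priv:
    free(dev->priv);
    demo_live_allocs--;
err_free_dev:
    free(dev);
    demo_live_allocs--;
    return err;
}

static void demo_teardown(struct demo_dev *dev)
{
    free(dev->priv);
    free(dev);
    demo_live_allocs -= 2;
}
```

Because the labels fall through in reverse order of setup, adding a new step means adding one label rather than editing every earlier error branch.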
Brian Cavagnolo [Mon, 21 Jul 2008 18:02:46 +0000 (11:02 -0700)]
libertas: support boot commands to write persistent firmware and bootloader
Add locking and non-locking versions of if_usb_prog_firmware to support
programming firmware after and before driver bring-up respectively. Add more
suitable error codes for firmware programming process. Add capability checks
for persistent features before attempting to use them.
Based on patches from Brajesh Dave and Priyank Singh.
Signed-off-by: Brian Cavagnolo <brian@cozybit.com> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Jarek Poplawski [Thu, 21 Aug 2008 12:11:14 +0000 (05:11 -0700)]
pkt_sched: Fix qdisc_watchdog() vs. dev_deactivate() race
dev_deactivate() can skip rescheduling of a qdisc by qdisc_watchdog()
or other timer calling netif_schedule() after dev_queue_deactivate().
We prevent this checking aliveness before scheduling the timer. Since
during deactivation the root qdisc is available only as qdisc_sleeping
additional accessor qdisc_root_sleeping() is created.
With feedback from Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Al Viro [Wed, 20 Aug 2008 21:09:24 +0000 (14:09 -0700)]
cramfs: fix named-pipe handling
After commit a97c9bf33f4612e2aed6f000f6b1d268b6814f3c (fix cramfs
making duplicate entries in inode cache) in kernel 2.6.14, named pipes
on cramfs do not work properly.
It seems the commit makes all named pipes on cramfs share their inode
(and named-pipe buffer).
Make ..._test() refuse to merge inodes with ->i_ino == 1, take inode setup
back to get_cramfs_inode() and make ->drop_inode() evict ones with ->i_ino
== 1 immediately.
Reported-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@kernel.org> [2.6.14 and later] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
relying on this behaviour was incorrect in any case and the BUG also
appeared when the device node was on an ext3 filesystem.
v2: override a_ops at open() time rather than mmap() time to minimise
races per AKPM's concerns.
Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Cc: Jaya Kumar <jayakumar.lkml@gmail.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Hugh Dickins <hugh@veritas.com> Cc: Johannes Weiner <hannes@saeurebad.de> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Kel Modderman <kel@otaku42.de> Cc: Markus Armbruster <armbru@redhat.com> Cc: Krzysztof Helt <krzysztof.h1@poczta.fm> Cc: <stable@kernel.org> [14fcc23fd is in 2.6.25.14 and 2.6.26.1] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Wed, 20 Aug 2008 21:09:20 +0000 (14:09 -0700)]
mm: xip/ext2 fix block allocation race
XIP can call into get_xip_mem concurrently with the same file,offset with
create=1. This usually maps down to get_block, which expects the page
lock to prevent such a situation. This causes ext2 to explode for one
reason or another.
Serialise those calls for the moment. For common usages today, I suspect
get_xip_mem rarely is called to create new blocks. In future as XIP
technologies evolve we might need to look at which operations require
scalability, and rework the locking to suit.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jared Hulbert <jaredeh@gmail.com> Acked-by: Carsten Otte <cotte@freenet.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Wed, 20 Aug 2008 21:09:20 +0000 (14:09 -0700)]
mm: xip fix fault vs sparse page invalidate race
XIP has a race between sparse pages being inserted into page tables, and
sparse pages being zapped when it's time to put a non-sparse page in.
What can happen is that a process can be left with a dangling sparse page
in a MAP_SHARED mapping, while the rest of the world sees the non-sparse
version, i.e. data corruption.
Guard these operations with a seqlock, making fault-in-sparse-pages the
slowpath, and try-to-unmap-sparse-pages the fastpath.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jared Hulbert <jaredeh@gmail.com> Acked-by: Carsten Otte <cotte@freenet.de> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nick Piggin [Wed, 20 Aug 2008 21:09:18 +0000 (14:09 -0700)]
mm: dirty page tracking race fix
There is a race with dirty page accounting where a page may not properly
be accounted for.
clear_page_dirty_for_io() calls page_mkclean; then TestClearPageDirty.
page_mkclean walks the rmaps for that page, and for each one it cleans and
write protects the pte if it was dirty. It uses page_check_address to
find the pte. That function has a shortcut to avoid the ptl if the pte is
not present. Unfortunately, the pte can be switched to not-present then
back to present by other code while holding the page table lock -- this
should not be a signal for page_mkclean to ignore that pte, because it may
be dirty.
For example, powerpc64's set_pte_at will clear a previously present pte
before setting it to the desired value. There may also be other code in
core mm or in arch which do similar things.
The consequence of the bug is loss of data integrity due to msync, and
loss of dirty page accounting accuracy. XIP's __xip_unmap could easily
also be unreliable (depending on the exact XIP locking scheme), which can
lead to data corruption.
Fix this by having an option to always take ptl to check the pte in
page_check_address.
It's possible to retain this optimization for page_referenced and
try_to_unmap.
Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Jared Hulbert <jaredeh@gmail.com> Cc: Carsten Otte <cotte@freenet.de> Cc: Hugh Dickins <hugh@veritas.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When a user calls sys_setpriority(PRIO_PGRP ...) on an NPTL-style multi-LWP
process, only the task leader of the process is affected; all other
sibling LWP threads do not receive the setting. The problem was that the
iterator used in sys_setpriority() only iterates over one task for each
process, ignoring all other sibling threads.
Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk
each thread of a process. Convert 4 call sites in {set/get}priority and
ioprio_{set/get}.
Signed-off-by: Ken Chen <kenchen@google.com> Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Roland McGrath <roland@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
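The difference between the two iteration schemes can be modelled in user space. This is a deliberately simplified sketch with hypothetical types and plain functions; it does not reproduce the kernel macros or task structures.

```c
#include <stddef.h>

#define DEMO_MAX_THREADS 4

struct demo_task {
    int nice;
};

struct demo_process {
    int pgrp;
    size_t nr_threads;                          /* at least 1 */
    struct demo_task threads[DEMO_MAX_THREADS]; /* threads[0] is the leader */
};

/* Old behaviour: one task (the leader) per process in the group. */
static void demo_setprio_leader_only(struct demo_process *procs, size_t n,
                                     int pgrp, int nice)
{
    for (size_t i = 0; i < n; i++)
        if (procs[i].pgrp == pgrp)
            procs[i].threads[0].nice = nice;
}

/* New behaviour: every thread of every process in the group. */
static void demo_setprio_all_threads(struct demo_process *procs, size_t n,
                                     int pgrp, int nice)
{
    for (size_t i = 0; i < n; i++) {
        if (procs[i].pgrp != pgrp)
            continue;
        for (size_t t = 0; t < procs[i].nr_threads; t++)
            procs[i].threads[t].nice = nice;
    }
}
```

The inner loop is what the new iterator pair adds conceptually: the outer walk still selects processes by group, and the inner walk visits each sibling thread.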