connector: create connector workqueue only while needed once
The netlink connector uses its own workqueue to relay the datas sent
from userspace to the appropriate callback. If you launch the test
from Documentation/connector and change it a bit to send a high flow
of data, you will see thousands of events coming to the "cqueue"
workqueue by looking at the workqueue tracer.
This flow of events can be sent very quickly. So, to not encumber the
kevent workqueue and delay other jobs, the "cqueue" workqueue should
remain.
But this workqueue is pointless most of the time, it will always be
created (assuming you have built it of course) although only
developpers with specific needs will use it.
So avoid this "most of the time useless task", this patch proposes to
create this workqueue only when needed once. The first jobs to be
sent to connector callbacks will be sent to kevent while the "cqueue"
thread creation will be scheduled to kevent too.
The following jobs will continue to be scheduled to keventd until the
cqueue workqueue is created, and then the rest of the jobs will
continue to perform as usual, through this dedicated workqueue.
Each time I tested this patch, only the first event was sent to
keventd, the rest has been sent to cqueue which have been created
quickly.
Also, this patch fixes some trailing whitespaces on the connector files.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 1 Feb 2009 09:24:55 +0000 (01:24 -0800)]
gro: Fix handling of imprecisely split packets
The commit 89a1b249edcf9be884e71f92df84d48355c576aa (gro: Avoid
copying headers of unmerged packets) only worked for packets
which are either completely linear, completely non-linear, or
packets which exactly split at the boundary between headers and
payload.
Anything else would cause bits in the header to go missing if
the packet is held by GRO.
This may have broken drivers such as ixgbe.
This patch fixes the places that assumed or only worked with
the above cases.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
ixgbe: Update copyright dates, bump the driver version number
New year, new copyright date ranges. Also bump the driver version
number to reflect many of the recent changes.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Our current MSI-X allocation mechanism does not support new hardware
at all. It also isn't getting the actual number of supported MSI-X vectors
from the device.
This patch allows the number of MSI-X vectors to be specific to a device,
plus it gets the number of MSI-X vectors available from PCIe configuration
space.
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Sun, 1 Feb 2009 09:18:23 +0000 (01:18 -0800)]
ixgbe: Add 82598 support for BX mezzanine devices
Add the device ID for BX devices using the 82598 MAC.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Sun, 1 Feb 2009 09:13:22 +0000 (01:13 -0800)]
pkt_sched: sch_htb: Use workqueue to schedule after too many events.
Patrick McHardy <kaber@trash.net> suggested using a workqueue instead
of hrtimers to trigger netif_schedule() when there is a problem with
setting exact time of this event: 'The differnce - yeah, it shouldn't
make much, mainly wake up the qdisc earlier (but not too early) after
"too many events" occured _and_ no further enqueue events wake up the
qdisc anyways.'
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy <kaber@trash.net> suggested:
> How about making this flag and the warning message (in a out-of-line
> function) globally available? Other qdiscs (f.i. HFSC) can't deal with
> inner non-work-conserving qdiscs as well.
This patch uses qdisc->flags field of "suspected" child qdisc.
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This adds another inet device option to enable gratuitous ARP
when device is brought up or address change. This is handy for
clusters or virtualization.
Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sascha Hauer [Wed, 28 Jan 2009 23:03:11 +0000 (23:03 +0000)]
FEC: Turn FEC driver into platform device driver
This turns the fec driver into a platform device driver for new
platforms. Old platforms are still supported through a FEC_LEGACY define
till they are also ported.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Sascha Hauer [Wed, 28 Jan 2009 23:03:09 +0000 (23:03 +0000)]
fec: replace flush_dcache_range with dma_sync_single
flush_dcache_range is not portable across architectures. Use
dma_sync_single instead. Also, the memory must be synchronised in the
receive path aswell.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Acked-by: Greg Ungerer <gerg@uclinux.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Sun, 1 Feb 2009 08:54:16 +0000 (00:54 -0800)]
gianfar: Fix sparse warnings
This patch fixes following sparse warnings:
CHECK gianfar_ethtool.c
gianfar_ethtool.c:610:26: warning: symbol 'gfar_ethtool_ops' was not declared. Should it be static?
CHECK gianfar_mii.c
gianfar_mii.c:108:35: warning: cast adds address space to expression (<asn:2>)
gianfar_mii.c:119:35: warning: cast adds address space to expression (<asn:2>)
gianfar_mii.c:128:35: warning: cast adds address space to expression (<asn:2>)
gianfar_mii.c:272:5: warning: cast removes address space of expression
gianfar_mii.c:271:15: warning: cast adds address space to expression (<asn:2>)
gianfar_mii.c:340:11: warning: cast adds address space to expression (<asn:2>)
CHECK gianfar_sysfs.c
gianfar_sysfs.c:84:1: warning: symbol 'dev_attr_bd_stash' was not declared. Should it be static?
gianfar_sysfs.c:133:1: warning: symbol 'dev_attr_rx_stash_size' was not declared. Should it be static?
gianfar_sysfs.c:175:1: warning: symbol 'dev_attr_rx_stash_index' was not declared. Should it be static?
gianfar_sysfs.c:213:1: warning: symbol 'dev_attr_fifo_threshold' was not declared. Should it be static?
gianfar_sysfs.c:250:1: warning: symbol 'dev_attr_fifo_starve' was not declared. Should it be static?
gianfar_sysfs.c:287:1: warning: symbol 'dev_attr_fifo_starve_off' was not declared. Should it be static?
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Sun, 1 Feb 2009 08:53:34 +0000 (00:53 -0800)]
phylib: Rework suspend/resume code to check netdev wakeup capability
In most cases (e.g. PCI drivers) MDIO and MAC controllers are
represented by the same device. But for SOC ethernets we have
separate devices. So, in SOC case, checking whether MDIO
controller may wakeup is not only makes little sense, but also
prevents us from doing per-netdevice wakeup management.
This patch reworks suspend/resume code so that now it checks
for net device's wakeup flags, not MDIO controller's ones.
Each netdevice should manage its wakeup flags, and phylib will
decide whether suspend an attached PHY or not.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Harvey Harrison [Sun, 1 Feb 2009 08:43:54 +0000 (00:43 -0800)]
wimax: replace uses of __constant_{endian}
Base versions handle constant folding now.
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jarek Poplawski [Sun, 1 Feb 2009 08:41:42 +0000 (00:41 -0800)]
net: Optimize memory usage when splicing from sockets.
The recent fix of data corruption when splicing from sockets uses
memory very inefficiently allocating a new page to copy each chunk of
linear part of skb. This patch uses the same page until it's full
(almost) by caching the page in sk_sndmsg_page field.
With changes from David S. Miller <davem@davemloft.net>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
smsc911x: allow mac address to be saved before device reset
Some platforms (for example pcm037) do not have an EEPROM fitted,
instead storing their mac address somewhere else. The bootloader
fetches this and configures the ethernet adapter before the kernel is
started.
This patch allows a platform to indicate to the driver via the
SMSC911X_SAVE_MAC_ADDRESS flag that the mac address has already been
configured via such a mechanism, and should be saved before resetting
the chip.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Tested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
On LAN9115/LAN9117/LAN9215/LAN9217, external phys are supported. These
are usually indicated by a hardware strap which sets an "external PHY
detected" bit in the HW_CFG register.
In some cases it is desirable to override this hardware strap and force
use of either the internal phy or an external PHY. This patch adds
SMSC911X_FORCE_INTERNAL_PHY and SMSC911X_FORCE_EXTERNAL_PHY flags so a
platform can indicate this preference via its platform_data.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Tested-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
smsc911x: add support for platform-specific irq flags
this patch adds support for the platform_device's resources to indicate
additional flags to use when registering the irq, for example
IORESOURCE_IRQ_LOWLEVEL (which corresponds to IRQF_TRIGGER_LOW). These
should be set in the irq resource flags field.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 30 Jan 2009 22:12:06 +0000 (14:12 -0800)]
packet: Avoid lock_sock in mmap handler
As the mmap handler gets called under mmap_sem, and we may grab
mmap_sem elsewhere under the socket lock to access user data, we
should avoid grabbing the socket lock in the mmap handler.
Since the only thing we care about in the mmap handler is for
pg_vec* to be invariant, i.e., to exclude packet_set_ring, we
can achieve this by simply using a new mutex.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Tested-by: Martin MOKREJŠ <mmokrejs@ribosome.natur.cuni.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Thu, 29 Jan 2009 18:00:07 +0000 (18:00 +0000)]
sfc: Replace stats_enabled flag with a disable count
Currently we use a spin-lock to serialise statistics fetches and also
to inhibit them for short periods of time, plus a flag to
enable/disable statistics fetches for longer periods of time, during
online reset. This was apparently insufficient to deal with the several
reasons for stats being disabled.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Hodgson [Thu, 29 Jan 2009 17:51:15 +0000 (17:51 +0000)]
sfc: Test for PHYXS faults whenever we cannot test link state bits
Depending on the loopback mode, there may be no pertinent link state
bits. In this case we test the PHYXS RX fault bit instead. Make
sure to do this in all cases where there are no link state bits.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Hodgson [Thu, 29 Jan 2009 17:49:59 +0000 (17:49 +0000)]
sfc: Fix post-reset MAC selection
Modify falcon_switch_mac() to always set NIC_STAT_REG, even if the the
MAC is the same as it was before. This ensures that the value is
correct after an online reset.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Steve Hodgson [Thu, 29 Jan 2009 17:48:43 +0000 (17:48 +0000)]
sfc: SFX7101: Remove workaround for bad link training
Early versions of the SFX7101 firmware could complete link training in
a state where it would not adequately cancel noise (Solarflare bug
10750). We previously worked around this by resetting the PHY after
seeing many Ethernet CRC errors. This workaround is unsafe since it
takes no account of the interval between errors; it also appears to
be unnecessary with production firmware. Therefore remove it.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Fri, 30 Jan 2009 21:45:31 +0000 (13:45 -0800)]
sky2: fix hard hang with netconsoling and iface going up
Printing anything over netconsole before hw is up and running is,
of course, not going to work.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
with current kernels, tulip 21142 ethernet controllers fail to connect
to a 10Mbps only (i.e. without negotiation-partner) network. It used
to work in 2.4 kernels. Fix that. Tested on a 21142 Rev 0x11.
Signed-off-by: Philippe De Muyter <phdm@macqel.be> Signed-off-by: David S. Miller <davem@davemloft.net>
Now phylib turns off PHY power during suspend, and thus WOL
doesn't work anymore.
This patch workarounds the issue by enabling wakeup in the MDIO
device, i.e. just restores the old behaviour for the gianfar
driver. Note that this way all PHYs on a given MDIO bus won't
be turned off during suspend, which isn't good from the power
saving point of view.
A proper, per netdevice wakeup management support will need
a bit reworked phylib suspend/resume logic.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roel Kluin [Fri, 30 Jan 2009 01:30:00 +0000 (17:30 -0800)]
smsc911x: timeout reaches -1
With a postfix decrement the timeout will reach -1 rather than 0,
so the warning will not be issued.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
smsc9420 performs an interrupt signalling test when the interface is
brought up. The current code mistakenly sets its test flag to false
AFTER enabling the software interrupt source, making failure quite
likely.
This patch changes the code to set the test flag BEFORE enabling
interrupts. I've also removed an smp_wmb because the following spinlock
provides an implicit memory barrier.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Haiying Wang [Fri, 30 Jan 2009 01:28:04 +0000 (17:28 -0800)]
ucc_geth: Change uec phy id to the same format as gianfar's
The commit b31a1d8b41513b96e9c7ec2f68c5734cef0b26a4 ("gianfar: Convert
gianfar to an of_platform_driver") changes the gianfar's phy id to the
format like "mdio@xxxx:xx", but uec still uses the old format like
"xxxxxxxx:xx". For the board whose UEC uses gianfar-mdio like
MPC8568MDS, the phy can not be attached because of the incompatible
phy id format. This patch changes uec's phy id to the same format as
gianfar's.
Signed-off-by: Haiying Wang <Haiying.Wang@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The definitions needed for the wimax stack and i2400m driver debug
infrastructure was, by mistake, compiled depending on CONFIG_DEBUG_FS
(by them being placed in the debugfs.c files); thus the build broke in
2.6.29-rc3 when debugging was enabled (CONFIG_WIMAX_DEBUG) and
DEBUG_FS was disabled.
These definitions are always needed if debug is enabled at compile
time (independently of DEBUG_FS being or not enabled), so moving them
to a file that is always compiled fixes the issue.
Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 29 Jan 2009 14:19:52 +0000 (14:19 +0000)]
gro: Open-code memcpy in napi_fraginfo_skb
This patch optimises napi_fraginfo_skb to only copy the bits
necessary. We also open-code the memcpy so that the alignment
information is always available to gcc.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 29 Jan 2009 14:19:51 +0000 (14:19 +0000)]
gro: Do not merge paged packets into frag_list
gro: Do not merge paged packets into frag_list
Bigger is not always better :)
It was easy to continue to merged packets into frag_list after the
page array is full. However, this turns out to be worse than LRO
because frag_list is a much less efficient form of storage than the
page array. So we're better off stopping the merge and starting
a new entry with an empty page array.
In future we can optimise this further by doing frag_list merging
but making sure that we continue to fill in the page array.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 29 Jan 2009 14:19:50 +0000 (14:19 +0000)]
gro: Avoid copying headers of unmerged packets
Unfortunately simplicity isn't always the best. The fraginfo
interface turned out to be suboptimal. The problem was quite
obvious. For every packet, we have to copy the headers from
the frags structure into skb->head, even though for 99% of the
packets this part is immediately thrown away after the merge.
LRO didn't have this problem because it directly read the headers
from the frags structure.
This patch attempts to address this by creating an interface
that allows GRO to access the headers in the first frag without
having to copy it. Because all drivers that use frags place the
headers in the first frag this optimisation should be enough.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 29 Jan 2009 14:19:48 +0000 (14:19 +0000)]
gro: Move common completion code into helpers
Currently VLAN still has a bit of common code handling the aftermath
of GRO that's shared with the common path. This patch moves them
into shared helpers to reduce code duplication.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Actually I did some testing and added a few printks and found that the
st->cur_skb->data was 0 and hence the ptr used by iscsi_tcp was null.
This caused the kernel panic.
I enabled the debug_tcp and with a few printks found that the code did
not go to the next_skb label and could find that the sequence being
followed was this -
Reversing the two conditions the attached patch fixes the issue for me
on top of Herbert's patches.
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Fri, 30 Jan 2009 00:05:19 +0000 (16:05 -0800)]
netxen: revert jumbo ringsize
Reducing jumbo ring size below 1024 reduces throughput for old
firmwares (3.4.216 and older) running on older (NX2031) chip,
so restore it back to 1024.
Bob Copeland [Thu, 22 Jan 2009 13:44:16 +0000 (08:44 -0500)]
ath5k: fix locking in ath5k_config
ath5k_config updates the software context without taking sc->lock.
Changes-licensed-under: 3-Clause-BSD
Signed-off-by: Bob Copeland <me@bobcopeland.com> Acked-by: Nick Kossifidis <mickflemm@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
When CONFIG_CFG80211_REG_DEBUG is enabled and an intersection
occurs we are printing the regulatory domain passed by CRDA
and indicating its the intersected regulatory domain. Lets fix
this and print the intersection as originally intended.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
cfg80211: Fix sanity check on 5 GHz when processing country IE
This fixes two issues with the sanity check loop when processing
the country IE:
1. Do not use frequency for the current subband channel check,
this was a big fat typo.
2. Apply the 5 GHz 4-channel steps when considering max channel
on each subband as was done with a recent patch.
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Zhu, Yi [Fri, 23 Jan 2009 21:45:22 +0000 (13:45 -0800)]
iwlwifi: fix kernel oops when ucode DMA memory allocation failure
The patch fixes memcpy to NULL address when the ucode DMA allocation failure.
This is a fix to bug
http://www.intellinuxwireless.org/bugzilla/show_bug.cgi?id=1861
Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Tue, 27 Jan 2009 18:31:23 +0000 (12:31 -0600)]
rtl8187: Fix error in setting OFDM power settings for RTL8187L
After reports of poor performance, a review of the latest vendor driver
(rtl8187_linux_26.1025.0328.2007) for RTL8187L devices was undertaken.
A difference was found in the code used to index the OFDM power tables. When
the Linux driver was changed, my unit works at a much greater range than
before. I think this fixes Bugzilla #12380 and has been tested by at least
two other users.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Martín Ernesto Barreyro <barreyromartin@gmail.com> Cc: Stable <stable@kernel.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Thomas Goff [Wed, 28 Jan 2009 06:39:59 +0000 (22:39 -0800)]
IPv6: Fix multicast routing bugs.
This patch addresses the IPv6 multicast routing issues described
below. It was tested with XORP 1.4/1.5 as the IPv6 PIM-SM routing
daemon against FreeBSD peers.
net/ipv6/ip6_input.c:
- Don't try to forward link-local multicast packets.
- Don't reset skb2->dev before calling ip6_mr_input() so packets can
be identified as coming from the PIM register vif properly.
net/ipv6/ip6mr.c:
- Fix incoming PIM register messages processing:
* The IPv6 pseudo-header should be included when checksumming PIM
messages (RFC 4601 section 4.9; RFC 3973 section 4.7.1).
* Packets decapsulated from PIM register messages should have
skb->protocol ETH_P_IPV6.
- Enable/disable IPv6 multicast forwarding on the corresponding
interface when a routing daemon adds/removes a multicast virtual
interface.
- Remove incorrect skb_pull() to fix userspace signaling.
- Enable/disable global IPv6 multicast forwarding when an IPv6
multicast routing socket is opened/closed.
net/ipv6/route.c:
- Don't use strict routing logic for packets decapsulated from PIM
register messages (similar to disabling rp_filter for the IPv4
case).
Signed-off-by: Thomas Goff <thomas.goff@boeing.com> Reviewed-by: Fred Templin <fred.l.templin@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Wed, 28 Jan 2009 06:30:19 +0000 (22:30 -0800)]
net: fix xfrm reverse flow lookup for icmp6
This patch fixes the xfrm reverse flow lookup for icmp6 so that icmp6 packets
don't get lost over ipsec tunnels. Similar patch is in RHEL5 kernel for a quite
long time and I do not see why it isn't in mainline.
Signed-off-by: Jiri Pirko <jpirko@redhat.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 28 Jan 2009 01:45:10 +0000 (17:45 -0800)]
net: wrong test in inet_ehash_locks_alloc()
In commit 9db66bdcc83749affe61c61eb8ff3cf08f42afec (net: convert
TCP/DCCP ehash rwlocks to spinlocks), I forgot to change one
occurrence of rwlock_t to spinlock_t
I believe sizeof(raw_spinlock_t) might be > 0 on !CONFIG_SMP if
CONFIG_DEBUG_SPINLOCK while sizeof(raw_rwlock_t) should be 0 in this
case.
Fortunatly, CONFIG_DEBUG_SPINLOCK adds fields to both spinlock_t and
rwlock_t, but at this might change in the future (being able to debug
spinlocks but not rwlocks for example), better to be safe.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jesse Brandeburg [Wed, 28 Jan 2009 00:41:58 +0000 (16:41 -0800)]
e1000: fix bug with shared interrupt during reset
A nasty bug was found where an MTU change (or anything else that caused a
reset) could race with the interrupt code. The interrupt code was entered
by a shared interrupt during the MTU change.
This change prevents the interrupt code from running while the driver is in
the middle of its reset path.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 28 Jan 2009 00:34:47 +0000 (16:34 -0800)]
net: Get rid of by-hand TX queue hashing.
We now only TX hash on pre-computed SKB properties.
The thinking is:
1) High performance routing and firewalling setups will
have a multiqueue capable card used for receive, and
therefore would have RX queue recordings made into
the SKB which can be used for the TX side hash.
2) Locally generated packets will have an attached socket
and thus a valid sk->sk_hash to make use of.
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: Fix length tcp_splice_data_recv passes to skb_splice_bits.
tcp_splice_data_recv has two lengths to consider: the len parameter it
gets from tcp_read_sock, which specifies the amount of data in the skb,
and rd_desc->count, which is the amount of data the splice caller still
wants. Currently it passes just the latter to skb_splice_bits, which then
splices min(rd_desc->count, skb->len - offset) bytes.
Most of the time this is fine, except when the skb contains urgent data.
In that case len goes only up to the urgent byte and is less than
skb->len - offset. By ignoring len tcp_splice_data_recv may a) splice
data tcp_read_sock told it not to, b) return to tcp_read_sock a value > len.
Now, tcp_read_sock doesn't handle used > len and leaves the socket in a
bad state (both sk_receive_queue and copied_seq are bad at that point)
resulting in duplicated data and corruption.
Fix by passing min(rd_desc->count, len) to skb_splice_bits.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 27 Jan 2009 05:35:35 +0000 (21:35 -0800)]
udp: optimize bind(0) if many ports are in use
commit 9088c5609584684149f3fb5b065aa7f18dcb03ff
(udp: Improve port randomization) introduced a regression for UDP bind() syscall
to null port (getting a random port) in case lot of ports are already in use.
This is because we do about 28000 scans of very long chains (220 sockets per chain),
with many spin_lock_bh()/spin_unlock_bh() calls.
Fix this using a bitmap (64 bytes for current value of UDP_HTABLE_SIZE)
so that we scan chains at most once.
Instead of 250 ms per bind() call, we get after patch a time of 2.9 ms
Based on a report from Vitaly Mayatskikh
Reported-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Tested-by: Vitaly Mayatskikh <v.mayatskih@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
smsc911x_set_multicast_list currently performs the only non-atomic
read-modify-write of INT_EN. This patch permanently enables the
RXSTOP_INT interrupt, and changes the ISR to only conditionally run the
multicast filter workaround code.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexey Dobriyan [Tue, 27 Jan 2009 05:10:08 +0000 (21:10 -0800)]
pppol2tp: stop using proc internals
PDE_NET usage in driver code is a sign and, indeed, switching
to seq_open_net/seq_release_net saves code and fixes bogus things, like
user triggerabble BUG_ON(!net) after maybe_get_net, and NULLifying ->private.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Incoming packets and sockets are already gone.
The netdevice notifier is unregistered under the RTNL lock
There remains a race with the rtnetlink handlers unregistration, but it
is a generic RTNL issue that was already present before this change.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ira W. Snyder [Tue, 27 Jan 2009 05:00:33 +0000 (21:00 -0800)]
virtio_net: use correct accessors for scatterlists
Without this fix, virtio_net makes incorrect usage of scatterlists. It sets
the end of the scatterlist chain after the first element, despite the fact
that more entries come after it.
If you try to run dma_map_sg() on one of the scatterlists given to you by
add_buf(), you will get a null pointer oops.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 27 Jan 2009 04:57:51 +0000 (20:57 -0800)]
ixgbe: add support KX/KX4 device
And support for the KX/KX4 mezzanine card. Device id 0x10B6.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Don Skidmore [Tue, 27 Jan 2009 04:57:17 +0000 (20:57 -0800)]
ixgbe: fix slow load times on 82598 nics
Load times for NICs that use i2c to communicate with the phy were taking
~4.5 sec per port. This fix first checks to see if the link is already
up before calling get_link_capabilities, since if it is we don't need
query the phy for link state.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teras [Tue, 27 Jan 2009 04:56:10 +0000 (20:56 -0800)]
gre: optimize hash lookup
Instead of keeping candidate tunnel device from all categories,
keep only one candidate with best score. This optimizes stack
usage and speeds up exit code.
Signed-off-by: Timo Teras <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>