* master.kernel.org:/home/rmk/linux-2.6-serial:
[SERIAL] 8250: add locking to console write function
[SERIAL] Remove unconditional enable of TX irq for console
[SERIAL] 8250: set divisor register correctly for AMD Alchemy SoC uart
[SERIAL] AMD Alchemy UART: claim memory range
[SERIAL] Clean up serial locking when obtaining a reference to a port
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[NET_SCHED]: HFSC: fix thinko in hfsc_adjust_levels()
[IPV6]: skb leakage in inet6_csk_xmit
[BRIDGE]: Do sysfs registration inside rtnl.
[NET]: Do sysfs registration as part of register_netdevice.
[TG3]: Fix possible NULL deref in tg3_run_loopback().
[NET] linkwatch: Handle jiffies wrap-around
[IRDA]: Switching to a workqueue for the SIR work
[IRDA]: smsc-ircc: Minimal hotplug support.
[IRDA]: Removing unused EXPORT_SYMBOLs
[IRDA]: New maintainer.
[NET]: Make netdev_chain a raw notifier.
[IPV4]: ip_options_fragment() has no effect on fragmentation
[NET]: Add missing operstates documentation.
Jens Axboe [Thu, 11 May 2006 06:20:16 +0000 (08:20 +0200)]
[BLOCK] limit request_fn recursion
Don't recurse back into the driver even if the unplug threshold is met,
when the driver asks for a requeue. This is both silly from a logical
point of view (requeues typically happen due to driver/hardware
shortage), and also dangerous since we could hit an endless request_fn
-> requeue -> unplug -> request_fn loop and crash on stack overrun.
Also limit blk_run_queue() to one level of recursion, similar to how
blk_start_queue() works.
This patch fixed a real problem with SLES10 and lpfc, and it could hit
any SCSI lld that returns non-zero from it's ->queuecommand() handler.
Linus Torvalds [Thu, 11 May 2006 18:08:49 +0000 (11:08 -0700)]
ptrace_attach: fix possible deadlock schenario with irqs
Eric Biederman points out that we can't take the task_lock while holding
tasklist_lock for writing, because another CPU that holds the task lock
might take an interrupt that then tries to take tasklist_lock for writing.
Which would be a nasty deadlock, with one CPU spinning forever in an
interrupt handler (although admittedly you need to really work at
triggering it ;)
Since the ptrace_attach() code is special and very unusual, just make it
be extra careful, and use trylock+repeat to avoid the possible deadlock.
Cc: Oleg Nesterov <oleg@tv-sign.ru> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Roland McGrath <roland@redhat.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Addresses for ioremap must be calculated off of pci_resource_start;
we can't directly use the bus address as seen by the HCA. Fix the
code that remaps device memory for FMR access.
Based on patch by Klaus Smolin.
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
James Cameron [Wed, 10 May 2006 20:33:29 +0000 (13:33 -0700)]
sis900: phy for FoxCon motherboard
661FX7MI-S motherboard which uses the SiS 661FX chipset. The patch adds
an entry to mii_chip_info for the transceiver.
The PHY ids were found using the sis900_c_122.diff patch from
http://brownhat.org/sis900.html but that patch didn't solve the problem,
because the PHY at address 1 was already being chosen.
Without my patch, when bursts of packets arrive from other hosts on a
LAN, the interface dropped one roughly 10% of the time, causing
retransmits. There were fifth second pauses in refresh of large xterms,
and it made Netrek suck. I can provide further test data.
Workaround in lieu of patch is to use mii-tool to advertise
100baseTx-HD, then force renegotiation.
I wasn't able to identify the actual transceiver, so the description
field is a guess.
This patch is similar to Artur Skawina's patch:
http://marc.theaimsgroup.com/?l=linux-netdev&m=114297516729079&w=2
I'm not sure, but I wonder if it means the default behaviour should be
changed, so as to better handle future transceivers.
Diff is against 2.6.16.13.
Signed-off-by: James Cameron <james.cameron@hp.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
phy: mdiobus_register(): initialize all phy_map entries
make sure phy_map entries whose PHY address is masked are initialized
to NULL, given that other code (such as mdiobus_unregister for
instance) assumes that non-NULL phy_map entries are allocated
phy_devices
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Harald Welte [Wed, 10 May 2006 11:28:52 +0000 (13:28 +0200)]
[Cardman 40x0] Fix udev device creation
This patch corrects the order of the calls to register_chrdev() and
pcmcia_register_driver(). Now udev correctly creates userspace device
files /dev/cmmN and /dev/cmxN respectively.
Based on an earlier patch by Jan Niehusmann <jan@gondor.com>.
Signed-off-by: Harald Welte <laforge@netfilter.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Now that netdevice sysfs registration is done as part of
register_netdevice; bridge code no longer has to be tricky when adding
it's kobjects to bridges.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[NET]: Do sysfs registration as part of register_netdevice.
The last step of netdevice registration was being done by a delayed
call, but because it was delayed, it was impossible to return any error
code if the class_device registration failed.
Side effects:
* one state in registration process is unnecessary.
* register_netdevice can sleep inside class_device registration/hotplug
* code in netdev_run_todo only does unregistration so it is simpler.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
mdr@sgi.com [Mon, 1 May 2006 18:07:04 +0000 (13:07 -0500)]
[SCSI] mptfc: race between mptfc_register_dev and mptfc_target_alloc
A race condition exists in mptfc between the thread registering a device
with the fc transport and the scan work generated by the transport.
This race existed prior to the application of the mptfc bug fix patch.
mptfc_register_dev() calls fc_remote_port_add() with the FC_RPORT_ROLE_TARGET
bit set in the rport ids passed to the function. Having this bit set causes
fc_remote_port_add() to schedule a scan of the device.
This scan can execute before mptfc_register_dev() can fill in the dd_data
in the rport structure. When this happens, mptfc_target_alloc() will fail
because dd_data is null.
Attached is a patch which fixes the problem. The patch changes the rport ids
passed to fc_remote_port_add() to not have the TARGET bit set. This prevents
the scan from being scheduled. After mptfc_register_dev() fills in the rport
dd_data field, fc_remote_port_rolechg() is called, changing the role of the
rport to TARGET. Thus, the scan is scheduled after dd_data is filled
in which prevents the failure in mptfc_target_alloc().
Signed-off-by: Michael Reed <mdr@sgi.com> Signed-off-by: Eric Moore <Eric.Moore@lsil.com> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Jesper Juhl [Wed, 10 May 2006 06:14:35 +0000 (23:14 -0700)]
[TG3]: Fix possible NULL deref in tg3_run_loopback().
tg3_run_loopback doesn't check that dev_alloc_skb() returns anything
useful.
Even if dev_alloc_skb() fails to return an skb to us we'll happily go
on and assume it did, so we risk dereferencing a NULL pointer. Much
better to fail gracefully by returning -ENOMEM than crashing here.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roland Dreier [Wed, 10 May 2006 05:54:59 +0000 (22:54 -0700)]
IPoIB: Free child interfaces properly
When deleting a child interface with a non-default P_Key via
/sys/class/net/ibX/delete_child, the interface must be freed with
free_netdev() (rather than kfree() on the private data).
Herbert Xu [Tue, 9 May 2006 22:27:54 +0000 (15:27 -0700)]
[NET] linkwatch: Handle jiffies wrap-around
The test used in the linkwatch does not handle wrap-arounds correctly.
Since the intention of the code is to eliminate bursts of messages we
can afford to delay things up to a second. Using that fact we can
easily handle wrap-arounds by making sure that we don't delay things
by more than one second.
This is based on diagnosis and a patch by Stefan Rompf.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stefan Rompf <stefan@loplof.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Since sir_kthread.c pretty much duplicates the workqueue
functionality, we'd better switch. The SIR fsm has been merged into
sir_dev.c and thus sir_kthread.c is deleted.
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Brownell [Tue, 9 May 2006 22:26:11 +0000 (15:26 -0700)]
[IRDA]: smsc-ircc: Minimal hotplug support.
Minimal PNP hotplug support for the smsc-ircc2 driver. A modular
driver will be modprobed via hotplug, but still bypasses driver model
probing.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <samuel.ortiz@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 9 May 2006 21:14:28 +0000 (22:14 +0100)]
[ARM] Fix thread struct allocator for SMP case
The ARM thread struct allocator is racy on SMP systems. Fix it by
turning it into a per-cpu based allocator. This also allows keeps
the cache cache warm for thread structs and kernel stacks.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Roland Dreier [Tue, 9 May 2006 17:50:29 +0000 (10:50 -0700)]
IB/mthca: Fix race in reference counting
Fix races in in destroying various objects. If a destroy routine
waits for an object to become free by doing
wait_event(&obj->wait, !atomic_read(&obj->refcount));
/* now clean up and destroy the object */
and another place drops a reference to the object by doing
if (atomic_dec_and_test(&obj->refcount))
wake_up(&obj->wait);
then this is susceptible to a race where the wait_event() and final
freeing of the object occur between the atomic_dec_and_test() and the
wake_up(). And this is a use-after-free, since wake_up() will be
called on part of the already-freed object.
Fix this in mthca by replacing the atomic_t refcounts with plain old
integers protected by a spinlock. This makes it possible to do the
decrement of the reference count and the wake_up() so that it appears
as a single atomic operation to the code waiting on the wait queue.
While touching this code, also simplify mthca_cq_clean(): the CQ being
cleaned cannot go away, because it still has a QP attached to it. So
there's no reason to be paranoid and look up the CQ by number; it's
perfectly safe to use the pointer that the callers already have.
Roland Dreier [Tue, 9 May 2006 17:50:28 +0000 (10:50 -0700)]
IB/srp: Fix tracking of pending requests during error handling
If a SCSI abort completes, or the command completes successfully, then
the driver must remove the command from its queue of pending
commands. Similarly, if a device reset succeeds, then all commands
queued for the given device must be removed from the queue.
Ralph Campbell [Tue, 9 May 2006 17:50:28 +0000 (10:50 -0700)]
IB: Fix display of 4-bit port counters in sysfs
The code to display local_link_integrity_errors and
excessive_buffer_overrun_errors in
/sys/class/infiniband/<hca>/ports/<n>/counters/
uses the wrong shift to extract the 4 bit values.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
Linus Torvalds [Tue, 9 May 2006 17:18:35 +0000 (10:18 -0700)]
Merge branch 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/netdev-2.6
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/netdev-2.6:
[PATCH] bcm43xx: Fix access to non-existent PHY registers
[PATCH] bcm43xx: Fix array overrun in bcm43xx_geo_init
[PATCH] bcm43xx: check for valid MAC address in SPROM
[PATCH] ieee80211: Fix A band channel count (resent)
[PATCH] bcm43xx: fix iwmode crash when down
[PATCH] softmac: make non-operational after being stopped
[PATCH] softmac: don't reassociate if user asked for deauthentication
spidernet: enable support for bcm5461 ethernet phy
spidernet: introduce new setting
Fix RTL8019AS init for Toshiba RBTX49xx boards
au1000_eth.c: use ether_crc() from <linux/crc32.h>
sky2: version 1.3
Add more support for the Yukon Ultra chip found in dual core centino laptops.
sky2: synchronize irq on remove
sky2: dont write status ring
sky2: edge triggered workaround enhancement
sky2: use mask instead of modulo operation
sky2: tx ring index mask fix
sky2: status irq hang fix
sky2: backout NAPI reschedule
Marcelo Tosatti [Fri, 5 May 2006 20:09:29 +0000 (17:09 -0300)]
[PATCH] ppc32/8xx: Fix r3 trashing due to 8MB TLB page instantiation
Instantiation of 8MB pages on the TLB cache for the kernel static
mapping trashes r3 register on !CONFIG_8xx_CPU6 configurations.
This ensures r3 gets saved and restored.
Signed-off-by: Marcelo Tosatti <marcelo@kvack.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
Paul Mackerras [Tue, 9 May 2006 06:00:59 +0000 (16:00 +1000)]
powerpc/32: Define an is_kernel_addr() to fix ARCH=ppc compilation
My commit 6bfd93c32a5065d0e858780b3beb0b667081601c broke the ARCH=ppc
compilation by using the is_kernel_addr() macro in asm/uaccess.h.
This fixes it by defining a suitable is_kernel_addr() for ARCH=ppc.
Linus Torvalds [Tue, 9 May 2006 00:41:05 +0000 (17:41 -0700)]
Merge git://oss.sgi.com:8090/xfs-2.6
* git://oss.sgi.com:8090/xfs-2.6:
[XFS] Fix a possible metadata buffer (AGFL) refcount leak when fixing an
[XFS] Fix a project quota space accounting leak on rename.
[XFS] Fix a possible forced shutdown due to mishandling write barriers
Jens Osterkamp [Thu, 4 May 2006 09:59:41 +0000 (05:59 -0400)]
spidernet: enable support for bcm5461 ethernet phy
A newer board revision changed the type of ethernet phy.
Moreover, this generalizes the way that a phy gets switched
into fiber mode when autodetection is not available.
Sergei Shtylyov [Mon, 8 May 2006 20:58:28 +0000 (00:58 +0400)]
Fix RTL8019AS init for Toshiba RBTX49xx boards
Ensure that 8-bit mode is selected for the on-board Realtek RTL8019AS chip
on Toshiba RBHMA4x00, get rid of the duplicate #ifdef's when setting
ei_status.word16.
The chip's datasheet says that the PSTOP register shouldn't exceed 0x60 in
8-bit mode -- ensure this too.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
au1000_eth.c: use ether_crc() from <linux/crc32.h>
since the au1000 driver already selects the CRC32 routines, simply replace
the internal ether_crc() implementation with the semantically equivalent
one from <linux/crc32.h>
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Add more support for the Yukon Ultra chip found in dual core centino laptops.
The newest Yukon Ultra chipset's require more special tweaks.
They seem to be like the Yukon XL chipsets. This code is transliterated
from the latest SysKonnect driver; I don't have any Ultra hardware.
Signed-off-by: Stephe Hemminger <shemminger@osdl.org> Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Need to make the edge-triggered workaround timer faster to get marginally
better peformance. The test_and_set_bit in schedule_prep() acts as a barrier
already. Make it a module parameter so that laptops who are concerned
about power can set it to 0; and user's stuck with broken BIOS's
can turn the driver into pure polling.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Mask for transmit ring status was picking up bits from the
unused sync ring. They were always zero, so far...
Also, make sure to remind self not to make tx ring too big.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
The status interrupt flag should be cleared before processing,
not afterwards to avoid race. Need to process in poll routine
even if no new interrupt status. This is a normal occurrence when
more than 64 frames (NAPI weight) are processed in one poll routine.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
"It was discussed on mips list but apparently the fix was bogus. I
will not have time to look into it so mips can carry this local fix
until we get a proper fix in mainline."
Andi Kleen [Mon, 8 May 2006 13:17:31 +0000 (15:17 +0200)]
[PATCH] x86_64: Move ondemand timer into own work queue
Taking the cpu hotplug semaphore in a normal events workqueue
is unsafe because other tasks can wait for any workqueues with
it hold. This results in a deadlock.
Move the DBS timer into its own work queue which is not
affected by other work queue flushes to avoid this.
Andi Kleen [Mon, 8 May 2006 13:17:28 +0000 (15:17 +0200)]
[PATCH] x86_64: Avoid EBDA area in early boot allocator
Based on analysis&patch from Robert Hentosch
Observed on a Dell PE6850 with 16GB
The problem occurs very early on, when the kernel allocates space for the
temporary memory map called bootmap. The bootmap overlaps the EBDA region.
EBDA region is not historically reserved in the e820 mapping. When the
bootmap is freed it marks the EBDA region as usable.
If you notice in setup.c there is already code to work around the EBDA
in reserve_ebda_region(), this check however occurs after the bootmap
is allocated and doesn't prevent the bootmap from using this range.
AK: I redid the original patch. Thanks also to Jan Beulich for
spotting some mistakes.
Corey Minyard [Mon, 8 May 2006 13:17:25 +0000 (15:17 +0200)]
[PATCH] x86_64: add nmi_exit to die_nmi
Playing with NMI watchdog on x86_64, I discovered that it didn't
do what I expected. It always panic-ed, even when it didn't
happen from interrupt context. This patch solves that
problem for me. Also, in this case, do_exit() will be called
with interrupts disabled, I believe. Would it be wise to also
call local_irq_enable() after nmi_exit()?
[Yes I added it -AK]
Currently, on x86_64, any NMI watchdog timeout will cause a panic
because the irq count will always be set to be in an interrupt
when do_exit() is called from die_nmi(). If we add nmi_exit() to
the die_nmi() call (since the nmi will never exit "normally")
it seems to solve this problem. The following small program
can be used to trigger the NMI watchdog to reproduce this:
main ()
{
iopl(3);
for (;;) asm("cli");
}
Corey Minyard [Mon, 8 May 2006 13:17:22 +0000 (15:17 +0200)]
[PATCH] x86_64: fix die_lock nesting
I noticed this when poking around in this area.
The oops_begin() function in x86_64 would only conditionally claim
the die_lock if the call is nested, but oops_end() would always
release the spinlock. This patch adds a nest count for the die lock
so that the release of the lock is only done on the final oops_end().
Kimball Murray [Mon, 8 May 2006 13:17:16 +0000 (15:17 +0200)]
[PATCH] x86_64: avoid IRQ0 ioapic pin collision
The patch addresses a problem with ACPI SCI interrupt entry, which gets
re-used, and the IRQ is assigned to another unrelated device. The patch
corrects the code such that SCI IRQ is skipped and duplicate entry is
avoided. Second issue came up with VIA chipset, the problem was caused by
original patch assigning IRQs starting 16 and up. The VIA chipset uses
4-bit IRQ register for internal interrupt routing, and therefore cannot
handle IRQ numbers assigned to its devices. The patch corrects this
problem by allowing PCI IRQs below 16.
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
kbuild: Do not overwrite makefile as anohter user
kbuild: drivers/video/logo/ - fix ident glitch
kbuild: fix gen_initramfs_list.sh
kbuild modpost - relax driver data name
kbuild: removing .tmp_versions considered harmful
kbuild: fix modpost segfault for 64bit mipsel kernel
Trond Myklebust [Mon, 8 May 2006 03:02:42 +0000 (23:02 -0400)]
[PATCH] fs/locks.c: Fix lease_init
It is insane to be giving lease_init() the task of freeing the lock it is
supposed to initialise, given that the lock is not guaranteed to be
allocated on the stack. This causes lockups in fcntl_setlease().
Problem diagnosed by Daniel Hokka Zakrisson <daniel@hozac.com>
Also fix a slab leak in __setlease() due to an uninitialised return value.
Problem diagnosed by Björn Steinbrink.
Jan Beulich [Tue, 2 May 2006 10:33:20 +0000 (12:33 +0200)]
kbuild: Do not overwrite makefile as anohter user
Change the conditional of the outputmakefile rule to be evaluated entirely
in make, and add a conditional to not touch the generated makefile when e.g.
running 'make install' as root while the build was done as non-root. Also
adjust the comment describing this, and move the message printing and
redirection to mkmakefile.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
This holds the task lock (and, for ptrace_attach, the tasklist_lock)
over the actual attach event, which closes a race between attacking to a
thread that is either doing a PTRACE_TRACEME or getting de-threaded.
Thanks to Oleg Nesterov for reminding me about this, and Chris Wright
for noticing a lost return value in my first version.
Hua Zhong [Sun, 7 May 2006 01:11:39 +0000 (18:11 -0700)]
[IPV4]: Remove likely in ip_rcv_finish()
This is another result from my likely profiling tool
(dwalker@mvista.com just sent the patch of the profiling tool to
linux-kernel mailing list, which is similar to what I use).
On my system (not very busy, normal development machine within a
VMWare workstation), I see a 6/5 miss/hit ratio for this "likely".
Signed-off-by: Hua Zhong <hzhong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
[NET]: Create netdev attribute_groups with class_device_add
Atomically create attributes when class device is added. This avoids
the race between registering class_device (which generates hotplug
event), and the creation of attribute groups.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the support of attribute groups in class_device's to allow
groups to be created as part of the registration process. This allows
network device's to avoid race between registration and creating
groups.
Note that unlike attributes that are a property of the class object,
the groups are a property of the class_device object. This is done
because there are different types of network devices (wireless for
example).
Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Sat, 6 May 2006 10:29:21 +0000 (11:29 +0100)]
[ARM] rtc-sa1100: fix compiler warnings and error cleanup
Fix:
drivers/rtc/rtc-sa1100.c: In function `sa1100_rtc_proc':
drivers/rtc/rtc-sa1100.c:298: warning: unsigned int format, long unsigned int arg (arg 3)
and arrange for sa1100_rtc_open() to pass the devid to free_irq()
rather than NULL.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 6 May 2006 10:26:30 +0000 (11:26 +0100)]
[ARM] Allow SA1100 RTC alarm to be configured for wakeup
The SA1100 RTC alarm can be configured to wake up the CPU
from sleep mode, and the RTC driver has been using the
API to configure this mode. Unfortunately, the code was
which sets the required bit in the hardware was missing.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
John Heffner [Sat, 6 May 2006 00:41:44 +0000 (17:41 -0700)]
[TCP]: Fix snd_cwnd adjustments in tcp_highspeed.c
Xiaoliang (David) Wei wrote:
> Hi gurus,
>
> I am reading the code of tcp_highspeed.c in the kernel and have a
> question on the hstcp_cong_avoid function, specifically the following
> AI part (line 136~143 in net/ipv4/tcp_highspeed.c ):
>
> /* Do additive increase */
> if (tp->snd_cwnd < tp->snd_cwnd_clamp) {
> tp->snd_cwnd_cnt += ca->ai;
> if (tp->snd_cwnd_cnt >= tp->snd_cwnd) {
> tp->snd_cwnd++;
> tp->snd_cwnd_cnt -= tp->snd_cwnd;
> }
> }
>
> In this part, when (tp->snd_cwnd_cnt == tp->snd_cwnd),
> snd_cwnd_cnt will be -1... snd_cwnd_cnt is defined as u16, will this
> small chance of getting -1 becomes a problem?
> Shall we change it by reversing the order of the cwnd++ and cwnd_cnt -=
> cwnd?
Absolutely correct. Thanks.
Signed-off-by: John Heffner <jheffner@psc.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
Ralf Baechle [Sat, 6 May 2006 00:19:26 +0000 (17:19 -0700)]
[NETROM/ROSE]: Kill module init version kernel log messages.
There are out of date and don't tell the user anything useful.
The similar messages which IPV4 and the core networking used
to output were killed a long time ago.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sat, 6 May 2006 00:09:13 +0000 (17:09 -0700)]
[DCCP]: Fix sock_orphan dead lock
Calling sock_orphan inside bh_lock_sock in dccp_close can lead to dead
locks. For example, the inet_diag code holds sk_callback_lock without
disabling BH. If an inbound packet arrives during that admittedly tiny
window, it will cause a dead lock on bh_lock_sock. Another possible
path would be through sock_wfree if the network device driver frees the
tx skb in process context with BH enabled.
We can fix this by moving sock_orphan out of bh_lock_sock.
The tricky bit is to work out when we need to destroy the socket
ourselves and when it has already been destroyed by someone else.
By moving sock_orphan before the release_sock we can solve this
problem. This is because as long as we own the socket lock its
state cannot change.
So we simply record the socket state before the release_sock
and then check the state again after we regain the socket lock.
If the socket state has transitioned to DCCP_CLOSED in the time being,
we know that the socket has been destroyed. Otherwise the socket is
still ours to keep.
This problem was discoverd by Ingo Molnar using his lock validator.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
[SCTP]: Prevent possible infinite recursion with multiple bundled DATA.
There is a rare situation that causes lksctp to go into infinite recursion
and crash the system. The trigger is a packet that contains at least the
first two DATA fragments of a message bundled together. The recursion is
triggered when the user data buffer is smaller that the full data message.
The problem is that we clone the skb for every fragment in the message.
When reassembling the full message, we try to link skbs from the "first
fragment" clone using the frag_list. However, since the frag_list is shared
between two clones in this rare situation, we end up setting the frag_list
pointer of the second fragment to point to itself. This causes
sctp_skb_pull() to potentially recurse indefinitely.
Proposed solution is to make a copy of the skb when attempting to link
things using frag_list.
Signed-off-by: Vladislav Yasevich <vladsilav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Sat, 6 May 2006 00:02:09 +0000 (17:02 -0700)]
[SCTP]: Allow spillover of receive buffer to avoid deadlock.
This patch fixes a deadlock situation in the receive path by allowing
temporary spillover of the receive buffer.
- If the chunk we receive has a tsn that immediately follows the ctsn,
accept it even if we run out of receive buffer space and renege data with
higher TSNs.
- Once we accept one chunk in a packet, accept all the remaining chunks
even if we run out of receive buffer space.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Mark Butler <butlerm@middle.net> Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Pitre [Fri, 5 May 2006 21:32:24 +0000 (22:32 +0100)]
[ARM] 3500/1: fix PXA27x DMA allocation priority
Patch from Nicolas Pitre
Intel PXA27x developers manual section 5.4.1.1 lists a priority
distribution for the DMA channels differently than what the code
currently assumes. This patch fixes that.
Noticed by Simon Vogl <vogl@soft.uni-linz.ac.at>
Signed-off-by: Nicolas Pitre <nico@cam.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>