pilppa.com Git - linux-2.6-omap-h63xx.git/log
linux-2.6-omap-h63xx.git
18 years ago[NETFILTER]: conntrack: fix race condition in early_drop
Pablo Neira Ayuso [Wed, 20 Sep 2006 19:01:06 +0000 (12:01 -0700)]
[NETFILTER]: conntrack: fix race condition in early_drop

On SMP systems the maximum number of conntracks can be exceeded
under heavy load because of an existing race condition.

        CPU A                   CPU B
     atomic_read()               ...
     early_drop()                ...
        ...                  atomic_read()
   allocate conntrack      allocate conntrack
     atomic_inc()             atomic_inc()

This patch moves the counter increment before the early drop stage.
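
The ordering described above can be sketched in userspace C11 as follows.
This is only an illustration, not the kernel code: ct_count, ct_max,
try_early_drop() and ct_reserve() are hypothetical names, and the eviction
hook is a stub. The point is that a slot is reserved atomically before the
limit is checked, so two CPUs cannot both pass a stale read of the counter.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the conntrack counter and its limit. */
static atomic_int ct_count;
static int ct_max = 2;

/* Hypothetical early-drop hook: returns true if an old entry was evicted.
 * A real eviction would also decrement ct_count when the entry is freed. */
static bool try_early_drop(void)
{
    return false; /* nothing to evict in this sketch */
}

/* Reserve a slot first, then decide whether the reservation may stand. */
static bool ct_reserve(void)
{
    /* atomic_fetch_add returns the previous value, so +1 includes our slot. */
    int now = atomic_fetch_add(&ct_count, 1) + 1;

    if (now > ct_max && !try_early_drop()) {
        atomic_fetch_sub(&ct_count, 1); /* give the slot back */
        return false;
    }
    return true;
}
```

With ct_max set to 2, the third call to ct_reserve() fails and the counter
stays at 2, whatever the interleaving of callers.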

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ctnetlink: simplify the code to dump the conntrack table
Pablo Neira Ayuso [Wed, 20 Sep 2006 19:00:45 +0000 (12:00 -0700)]
[NETFILTER]: ctnetlink: simplify the code to dump the conntrack table

Merge the bits to dump the conntrack table and the ones to dump and
zero counters in a single piece of code. This patch does not change
the default behaviour if accounting is not enabled.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: x_tables: small check_entry & module_refcount cleanup
Dmitry Mishin [Wed, 20 Sep 2006 19:00:21 +0000 (12:00 -0700)]
[NETFILTER]: x_tables: small check_entry & module_refcount cleanup

Although standard_target has target->me == NULL, module_put() should be
called for it as for the others, because try_module_get() was called earlier.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ip6table_mangle: reroute when nfmark changes in NF_IP6_LOCAL_OUT
Patrick McHardy [Wed, 20 Sep 2006 18:59:42 +0000 (11:59 -0700)]
[NETFILTER]: ip6table_mangle: reroute when nfmark changes in NF_IP6_LOCAL_OUT

Now that IPv6 supports policy routing we need to reroute in NF_IP6_LOCAL_OUT
when the mark value changes.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: xt_limit: don't reset state on unrelated rule updates
Patrick McHardy [Wed, 20 Sep 2006 18:59:25 +0000 (11:59 -0700)]
[NETFILTER]: xt_limit: don't reset state on unrelated rule updates

The limit match reinitializes its state whenever the ruleset changes,
which means it will forget about previously used credits.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ipt_TCPMSS: misc cleanup
Patrick McHardy [Wed, 20 Sep 2006 18:59:06 +0000 (11:59 -0700)]
[NETFILTER]: ipt_TCPMSS: misc cleanup

- remove debugging cruft
- remove printk for reallocation failures
- remove unused addition

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ipt_TCPMSS: remove impossible condition
Patrick McHardy [Wed, 20 Sep 2006 18:58:50 +0000 (11:58 -0700)]
[NETFILTER]: ipt_TCPMSS: remove impossible condition

Every skb must have a dst_entry at this point.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: ipt_TCPMSS: reformat
Patrick McHardy [Wed, 20 Sep 2006 18:58:35 +0000 (11:58 -0700)]
[NETFILTER]: ipt_TCPMSS: reformat

- fix whitespace error
- break lines at 80 characters
- reformat some expressions to be more readable

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: xt_conntrack: clean up overly long lines
Patrick McHardy [Wed, 20 Sep 2006 18:58:17 +0000 (11:58 -0700)]
[NETFILTER]: xt_conntrack: clean up overly long lines

Also fix some whitespace errors and use the NAT bits instead of deriving
the state manually.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: kill listhelp.h
Patrick McHardy [Wed, 20 Sep 2006 18:57:53 +0000 (11:57 -0700)]
[NETFILTER]: kill listhelp.h

Kill listhelp.h and use the list.h functions instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: remove unused include file
Patrick McHardy [Wed, 20 Sep 2006 18:57:09 +0000 (11:57 -0700)]
[NETFILTER]: remove unused include file

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: ipip and ip_gre encapsulation bugs
Al Viro [Tue, 19 Sep 2006 20:23:19 +0000 (13:23 -0700)]
[IPV4]: ipip and ip_gre encapsulation bugs

Handling of ipip and ip_gre ICMP error relaying is b0rken; it accesses
8bit field + 3 reserved octets as host-endian 32bit, does comparison,
subtraction and stuffs the result back.  That breaks on big-endian.

Fixed, made endian-clean.
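
A minimal sketch of an endian-clean way to handle such a field, assuming
the layout named above (one 8-bit value followed by three reserved octets,
stored in network byte order). The helper names are hypothetical and this
is not the code from the patch; it only shows the conversion being done
explicitly instead of doing arithmetic on the raw 32-bit word.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Read the 8-bit value from a word kept in network byte order. */
static uint8_t pp_get_pointer(uint32_t word_be)
{
    return (uint8_t)(ntohl(word_be) >> 24);
}

/* Adjust the 8-bit value by delta and return the word in network byte
 * order; the three reserved octets are left as zero. */
static uint32_t pp_shift_pointer(uint32_t word_be, int delta)
{
    uint8_t ptr = (uint8_t)(pp_get_pointer(word_be) + delta);

    return htonl((uint32_t)ptr << 24);
}
```

Because the value is extracted with ntohl() and stored back with htonl(),
the first octet on the wire is the same on little-endian and big-endian
hosts.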

[ Note that the affected code is permanently commented out with
  an ifdef, so this error couldn't actually cause problems for
  anyone. -DaveM ]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Add helper functions for changing important CCID2 state
Andrea Bittau [Tue, 19 Sep 2006 20:15:33 +0000 (13:15 -0700)]
[DCCP] CCID2: Add helper functions for changing important CCID2 state

Introduce methods which manipulate interesting congestion control
state such as pipe and rtt estimate.  This is useful for people
wishing to monitor the variables of CCID and instrument the code
[perhaps using Kprobes].  Personally, I am a fan of
encapsulation---that justifies this change =D.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Halve cwnd once upon multiple losses in a single RTT
Andrea Bittau [Tue, 19 Sep 2006 20:14:43 +0000 (13:14 -0700)]
[DCCP] CCID2: Halve cwnd once upon multiple losses in a single RTT

When multiple losses occur in one RTT, the window should be halved
only once [a single "congestion event"].  This is now implemented,
although not perfectly.  Slightly changed the interface for changing
the cwnd: pass hctx instead of dp.  This is required in order to allow
for change_cwnd to be called from _init().
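
One way to express "halve only once per congestion event" is sketched
below. The structure and field names are hypothetical and are not taken
from the CCID2 code; the sketch assumes a loss counts as part of the
previous event when it arrives less than one RTT after the last halving.

```c
#include <stdbool.h>

/* Hypothetical sender state; field names are illustrative only. */
struct cc_state {
    unsigned int cwnd;        /* congestion window, in packets */
    unsigned int ssthresh;    /* slow-start threshold */
    unsigned long last_cong;  /* time of the last congestion event */
    unsigned long rtt;        /* current RTT estimate, same unit */
    bool seen_cong;           /* false until the first congestion event */
};

static void cc_on_loss(struct cc_state *s, unsigned long now)
{
    /* Losses within one RTT of the last halving belong to the same event. */
    if (s->seen_cong && now - s->last_cong < s->rtt)
        return;

    s->cwnd = s->cwnd / 2 > 0 ? s->cwnd / 2 : 1; /* keep at least one packet */
    s->ssthresh = s->cwnd;
    s->last_cong = now;
    s->seen_cong = true;
}
```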

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Allocate seq records on demand
Andrea Bittau [Tue, 19 Sep 2006 20:13:37 +0000 (13:13 -0700)]
[DCCP] CCID2: Allocate seq records on demand

Allocate more sequence state on demand.  Each time a packet is sent
out by CCID2, a record of it needs to be kept.  This list of records
grows proportionally to cwnd.  Previously, the length of this list was
hardcoded and therefore the cwnd could only grow to this value (of
128).  Now, records are allocated on demand as necessary---cwnd may
grow as it wishes.  The exceptional case of when memory is not
available is not handled gracefully.  Perhaps, cwnd should be capped
at that point.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Add Kconfig option for CCID2 debug
Andrea Bittau [Tue, 19 Sep 2006 20:12:44 +0000 (13:12 -0700)]
[DCCP] CCID2: Add Kconfig option for CCID2 debug

Allow the user to choose whether or not to enable CCID2 debugging via
Kconfig.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Tell DCCP to quickly check whether cwnd is available
Andrea Bittau [Tue, 19 Sep 2006 20:10:11 +0000 (13:10 -0700)]
[DCCP] CCID2: Tell DCCP to quickly check whether cwnd is available

If not enough cwnd is available, tell the sender to check again as
soon as possible.  This will increase CPU utilization (polling
frequently for cwnd) but will improve network performance.  That is,
the sender will need to wait less before detecting the increase of
cwnd.  A better architecture would be for the CCID to call-back (or
dequeue) from DCCP when it is able to transmit traffic -- not the
other way around as it currently occurs.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[ATM]: proper prototypes in net/atm/mpc.h (and reduce ifdef clutter)
Adrian Bunk [Fri, 22 Sep 2006 21:28:11 +0000 (14:28 -0700)]
[ATM]: proper prototypes in net/atm/mpc.h (and reduce ifdef clutter)

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Initialize ssthresh to infinity
Andrea Bittau [Tue, 19 Sep 2006 20:07:20 +0000 (13:07 -0700)]
[DCCP] CCID2: Initialize ssthresh to infinity

Initialize the slow-start threshold to infinity.  This way, upon connection
initiation, slow-start will be exited only upon a packet loss.  This patch will
allow connections to quickly gain speed.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] CCID2: Fix jiffie wrap issues
Andrea Bittau [Tue, 19 Sep 2006 20:06:46 +0000 (13:06 -0700)]
[DCCP] CCID2: Fix jiffie wrap issues

Jiffies are now handled correctly (I hope) in CCID2.  If they wrap, no
problem.
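
For readers unfamiliar with the problem: comparing two tick counters with
a plain "<" gives the wrong answer once the counter wraps. A wrap-safe
comparison looks at the unsigned difference instead. The helper below is
a userspace sketch with a hypothetical name; the kernel offers the
time_after() family of macros for the same purpose, and the patch itself
may handle this differently.

```c
#include <limits.h>
#include <stdbool.h>

/* True if tick a is later than tick b, assuming the two values are less
 * than half the counter range apart. */
static bool tick_after(unsigned long a, unsigned long b)
{
    /* b - a wraps modulo 2^N; a result in the upper half of the range
     * means b lies behind a. */
    return (b - a) > (unsigned long)LONG_MAX;
}
```

For example, tick_after(2, ULONG_MAX - 1) is true: the counter wrapped
between the two samples, yet 2 is still recognised as the later tick.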

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] ackvec: Remove unused variables
Andrea Bittau [Tue, 19 Sep 2006 20:06:16 +0000 (13:06 -0700)]
[DCCP] ackvec: Remove unused variables

Get rid of unused variables in ackvector state.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] ackvec: Fix how DCCP_ACKVEC_STATE_NOT_RECEIVED is used
Andrea Bittau [Tue, 19 Sep 2006 20:05:35 +0000 (13:05 -0700)]
[DCCP] ackvec: Fix how DCCP_ACKVEC_STATE_NOT_RECEIVED is used

Fix the way state is masked out.  DCCP_ACKVEC_STATE_NOT_RECEIVED is
defined as it appears in the packet, therefore bit shifting is not
required.  This fix allows CCID2 to correctly detect losses.
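
To illustrate why shifting is wrong when the constant is already in wire
position, here is a small sketch. It assumes the Ack Vector cell layout
from the DCCP specification (two state bits on top, six run-length bits
below); the macro and function names are local to this example and are
not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants: the state occupies the top two bits of each
 * cell and the values are given in wire position (already shifted). */
#define ACKVEC_STATE_MASK          0xC0u
#define ACKVEC_LEN_MASK            0x3Fu
#define ACKVEC_STATE_RECEIVED      0x00u
#define ACKVEC_STATE_ECN_MARKED    0x40u
#define ACKVEC_STATE_NOT_RECEIVED  0xC0u

static bool ackvec_not_received(uint8_t cell)
{
    /* Mask only. Shifting the masked value down by 6 first would
     * compare 0x03 against 0xC0, which can never match. */
    return (cell & ACKVEC_STATE_MASK) == ACKVEC_STATE_NOT_RECEIVED;
}

static unsigned int ackvec_run_length(uint8_t cell)
{
    return cell & ACKVEC_LEN_MASK;
}
```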

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP] ACKVEC: fix ackvector length calculation
Andrea Bittau [Tue, 19 Sep 2006 20:04:54 +0000 (13:04 -0700)]
[DCCP] ACKVEC: fix ackvector length calculation

Fix ackvector length calculation upon receiving an "ack-of-ack".  This
patch prevents the ackvector from growing too large, which would cause
it not to be inserted into packets.

Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Fix wildcard as tunnel source
Patrick McHardy [Tue, 19 Sep 2006 19:57:34 +0000 (12:57 -0700)]
[XFRM]: Fix wildcard as tunnel source

Hashing SAs by source address breaks templates with wildcards as tunnel
source since the source address used for hashing/lookup is still 0/0.
Move source address lookup to xfrm_tmpl_resolve_one() so we can use the
real address in the lookup.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Send ACKs each 2nd received segment.
Alexey Kuznetsov [Tue, 19 Sep 2006 19:52:50 +0000 (12:52 -0700)]
[TCP]: Send ACKs each 2nd received segment.

It does not affect either mss-sized connections (obviously) or
connections controlled by Nagle (because there is only one small
segment in flight).

The idea is to record the fact that a small segment arrives on a
connection, where one small segment has already been received and
still not-ACKed. In this case ACK is forced after tcp_recvmsg() drains
receive buffer.

In other words, it is a "soft" each-2nd-segment ACK, which is enough
to preserve ACK clock even when ABC is enabled.

Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SELINUX]: Fix bug in security_sid_mls_copy
Venkat Yekkirala [Tue, 19 Sep 2006 17:24:19 +0000 (10:24 -0700)]
[SELINUX]: Fix bug in security_sid_mls_copy

The following fixes a bug where random memory is tampered with in the
non-MLS case; encountered by Joshua Brindle on a Gentoo box.

Signed-off-by: Venkat Yekkirala <vyekkirala@TrustedCS.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: James Morris <jmorris@namei.org>
18 years ago[SCTP]: Cleanups
Adrian Bunk [Mon, 18 Sep 2006 07:40:38 +0000 (00:40 -0700)]
[SCTP]: Cleanups

This patch contains the following cleanups:
- make the following needlessly global function static:
  - socket.c: sctp_apply_peer_addr_params()
- add proper prototypes for the several global functions in
  include/net/sctp/sctp.h

Note that this fixes wrong prototypes for the following functions:
- sctp_snmp_proc_exit()
- sctp_eps_proc_exit()
- sctp_assocs_proc_exit()

The latter was spotted by the GNU C compiler and reported
by David Woodhouse.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Support NLM_F_EXCL when adding addresses
Thomas Graf [Mon, 18 Sep 2006 07:13:46 +0000 (00:13 -0700)]
[IPV6] address: Support NLM_F_EXCL when adding addresses

iproute2 doesn't provide the NLM_F_CREATE flag when adding addresses;
it is assumed to be implied. The existing code checks for this flag
when the modify operation fails (likely due to ENOENT) before
continuing to create the address. This leads to a hard-to-predict
result, so the NLM_F_CREATE check is removed.
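
One possible reading of the resulting semantics, as a self-contained
sketch. REQ_F_EXCL, enum addr_action and newaddr_decide() are hypothetical
names for this example only, and the real handler may treat other flags
as well. The sketch assumes that a missing address is always created and
that the exclusive flag turns "already exists" into an error.

```c
#include <errno.h>
#include <stdbool.h>

#define REQ_F_EXCL 0x1u /* hypothetical stand-in for NLM_F_EXCL */

enum addr_action { ADDR_CREATE, ADDR_MODIFY };

/* Returns 0 and sets *action, or a negative errno value. */
static int newaddr_decide(bool exists, unsigned int flags,
                          enum addr_action *action)
{
    if (!exists) {
        /* No "create" flag is demanded: it is treated as implied. */
        *action = ADDR_CREATE;
        return 0;
    }
    if (flags & REQ_F_EXCL)
        return -EEXIST; /* caller asked for exclusive creation */

    *action = ADDR_MODIFY;
    return 0;
}
```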

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Allow address changes while device is administrative down
Thomas Graf [Mon, 18 Sep 2006 07:13:07 +0000 (00:13 -0700)]
[IPV6] address: Allow address changes while device is administrative down

Same behaviour as IPv4; using IFF_UP is a no-no anyway.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Convert address dumping to new netlink api
Thomas Graf [Mon, 18 Sep 2006 07:12:35 +0000 (00:12 -0700)]
[IPV6] address: Convert address dumping to new netlink api

Replaces INET6_IFADDR_RTA_SPACE with a new function calculating
the total required message size for all address messages.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Add put_ifaddrmsg() and rt_scope()
Thomas Graf [Mon, 18 Sep 2006 07:11:52 +0000 (00:11 -0700)]
[IPV6] address: Add put_ifaddrmsg() and rt_scope()

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Add put_cacheinfo() to dump struct cacheinfo
Thomas Graf [Mon, 18 Sep 2006 07:11:24 +0000 (00:11 -0700)]
[IPV6] address: Add put_cacheinfo() to dump struct cacheinfo

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Convert address lookup to new netlink api
Thomas Graf [Mon, 18 Sep 2006 07:10:50 +0000 (00:10 -0700)]
[IPV6] address: Convert address lookup to new netlink api

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Convert address deletion to new netlink api
Thomas Graf [Mon, 18 Sep 2006 07:10:19 +0000 (00:10 -0700)]
[IPV6] address: Convert address deletion to new netlink api

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] address: Convert address addition to new netlink api
Thomas Graf [Mon, 18 Sep 2006 07:09:49 +0000 (00:09 -0700)]
[IPV6] address: Convert address addition to new netlink api

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: Change tunables to __read_mostly
Brian Haley [Mon, 18 Sep 2006 07:05:22 +0000 (00:05 -0700)]
[NETFILTER]: Change tunables to __read_mostly

Change some netfilter tunables to __read_mostly.  Also fixed some
incorrect file reference comments while I was in there.

(this will be my last __read_mostly patch unless someone points out
something else that needs it)

Signed-off-by: Brian Haley <brian.haley@hp.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[SCTP]: Change globals to __read_mostly
Brian Haley [Mon, 18 Sep 2006 07:04:22 +0000 (00:04 -0700)]
[SCTP]: Change globals to __read_mostly

Change sctp globals to __read_mostly.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[BRIDGE]: Change sysctl tunables to __read_mostly
Brian Haley [Mon, 18 Sep 2006 07:03:41 +0000 (00:03 -0700)]
[BRIDGE]: Change sysctl tunables to __read_mostly

Change some bridge sysctl tunables to __read_mostly.

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[GENL]: Provide more information to userspace about registered genl families
Thomas Graf [Mon, 18 Sep 2006 07:01:59 +0000 (00:01 -0700)]
[GENL]: Provide more information to userspace about registered genl families

Additionally exports the following information when providing
the list of registered generic netlink families:
  - protocol version
  - header size
  - maximum number of attributes
  - list of available operations including
      - id
      - flags
      - availability of policy and doit/dumpit function

libnl HEAD provides a utility to read this new information:

0x0010 nlctrl version 1
    hdrsize 0 maxattr 6
      op GETFAMILY (0x03) [POLICY,DOIT,DUMPIT]
0x0011 NLBL_MGMT version 1
    hdrsize 0 maxattr 0
      op unknown (0x02) [DOIT]
      op unknown (0x03) [DOIT]
      ....

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[RTNETLINK]: Fix netdevice name corruption
Patrick McHardy [Thu, 14 Sep 2006 03:35:36 +0000 (20:35 -0700)]
[RTNETLINK]: Fix netdevice name corruption

When changing a device by ifindex without including an IFLA_IFNAME
attribute, the ifname variable contains random garbage and is used
to change the device name.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: remove xerr_idxp from __xfrm_policy_check()
James Morris [Fri, 1 Sep 2006 07:32:12 +0000 (00:32 -0700)]
[XFRM]: remove xerr_idxp from __xfrm_policy_check()

It seems that during the MIPv6 respin, some code which was originally
conditionally compiled around CONFIG_XFRM_ADVANCED was accidentally left
in after the config option was removed.

This patch removes an extraneous pointer (xerr_idxp) which is no
longer needed.

Signed-off-by: James Morris <jmorris@namei.org>
Acked-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPSEC]: output mode to take an xfrm state as input param
Jamal Hadi Salim [Fri, 1 Sep 2006 00:42:59 +0000 (17:42 -0700)]
[IPSEC]: output mode to take an xfrm state as input param

Expose IPSEC modes output path to take an xfrm state as input param.
This makes it consistent with the input mode processing (which already
takes the xfrm state as a param).

Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Fix sk->sk_filter field access
Dmitry Mishin [Thu, 31 Aug 2006 22:28:39 +0000 (15:28 -0700)]
[NET]: Fix sk->sk_filter field access

Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg
needlock = 0, while socket is not locked at that moment. In order to avoid
this and similar issues in the future, use rcu for sk->sk_filter field read
protection.

Signed-off-by: Dmitry Mishin <dim@openvz.org>
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
18 years ago[IPV6] MIP6: Fix to update IP6CB when cloned skbuff is received at HAO.
Masahide NAKAMURA [Thu, 31 Aug 2006 22:18:49 +0000 (15:18 -0700)]
[IPV6] MIP6: Fix to update IP6CB when cloned skbuff is received at HAO.

Signed-off-by: Masahide NAKAMURA <nakam@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM] STATE: Fix flushing with hash mask.
Masahide NAKAMURA [Thu, 31 Aug 2006 22:14:32 +0000 (15:14 -0700)]
[XFRM] STATE: Fix flushing with hash mask.

This is a minor fix about transformation state flushing
for net-2.6.19. Please apply it.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP]: Fix rcv mss estimate for LRO
Herbert Xu [Thu, 31 Aug 2006 22:11:02 +0000 (15:11 -0700)]
[TCP]: Fix rcv mss estimate for LRO

By passing a Linux-generated TSO packet straight back into Linux, Xen
becomes our first LRO user :) Unfortunately, there is at least one spot
in our stack that needs to be changed to cope with this.

The receive MSS estimate is computed from the raw packet size.  This is
broken if the packet is GSO/LRO.  Fortunately the real MSS can be found
in gso_size so we simply need to use that if it is non-zero.

Real LRO NICs should of course set the gso_size field in future.
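
The rule described above fits in a few lines. The structure below is a
reduced, hypothetical stand-in for the receive buffer and is not the
kernel's sk_buff; only the choice between the raw length and gso_size is
taken from the description.

```c
/* Hypothetical, reduced view of a received buffer. */
struct rx_pkt {
    unsigned int len;      /* total payload length */
    unsigned int gso_size; /* per-segment size, 0 if not aggregated */
};

static unsigned int rcv_mss_sample(const struct rx_pkt *p)
{
    /* An aggregated packet spans several segments, so its raw length
     * overstates the MSS; prefer the recorded segment size. */
    return p->gso_size ? p->gso_size : p->len;
}
```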

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[RTNETLINK]: Fix typo causing wrong skb to be freed
Thomas Graf [Thu, 31 Aug 2006 22:04:30 +0000 (15:04 -0700)]
[RTNETLINK]: Fix typo causing wrong skb to be freed

A typo introduced by myself which leads to freeing the skb
containing the netlink message when it should free the newly
allocated skb for the reply.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[AF_UNIX]: Change max_dgram_qlen sysctl to __read_mostly
Brian Haley [Thu, 31 Aug 2006 22:03:36 +0000 (15:03 -0700)]
[AF_UNIX]: Change max_dgram_qlen sysctl to __read_mostly

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Change somaxconn sysctl to __read_mostly
Brian Haley [Thu, 31 Aug 2006 22:03:02 +0000 (15:03 -0700)]
[NET]: Change somaxconn sysctl to __read_mostly

Signed-off-by: Brian Haley <brian.haley@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[PKT_SCHED] act_simple.c: make struct simp_hash_info static
Adrian Bunk [Wed, 30 Aug 2006 22:03:07 +0000 (15:03 -0700)]
[PKT_SCHED] act_simple.c: make struct simp_hash_info static

This patch makes the needlessly global struct simp_hash_info static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: add some missing #includes to various header files
Paul Moore [Wed, 30 Aug 2006 00:56:04 +0000 (17:56 -0700)]
[NetLabel]: add some missing #includes to various header files

Add some missing include files to the NetLabel related header files.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: uninline selinux_netlbl_inode_permission()
Paul Moore [Wed, 30 Aug 2006 00:55:38 +0000 (17:55 -0700)]
[NetLabel]: uninline selinux_netlbl_inode_permission()

Uninline the selinux_netlbl_inode_permission() at the request of
Andrew Morton.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: Cleanup ebitmap_import()
Paul Moore [Wed, 30 Aug 2006 00:55:11 +0000 (17:55 -0700)]
[NetLabel]: Cleanup ebitmap_import()

Rewrite ebitmap_import() so it is a bit cleaner and easier to read.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: Comment corrections.
Paul Moore [Wed, 30 Aug 2006 00:54:41 +0000 (17:54 -0700)]
[NetLabel]: Comment corrections.

Fix some incorrect comments.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: remove unused function prototypes
Paul Moore [Wed, 30 Aug 2006 00:54:17 +0000 (17:54 -0700)]
[NetLabel]: remove unused function prototypes

Removed some older function prototypes for functions that no longer exist.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NetLabel]: Correctly initialize the NetLabel fields.
Paul Moore [Wed, 30 Aug 2006 00:53:48 +0000 (17:53 -0700)]
[NetLabel]: Correctly initialize the NetLabel fields.

Fix a problem where the NetLabel specific fields of the sk_security_struct
structure were not being initialized early enough in some cases.

Signed-off-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Tidyup CCID3 list handling
Ian McDonald [Wed, 30 Aug 2006 00:50:19 +0000 (17:50 -0700)]
[DCCP]: Tidyup CCID3 list handling

As Arnaldo Carvalho de Melo points out, I should be using list_entry in
case the structure changes in future. The current code works but relies
on the position of the list member and requires a type cast.

While doing this I noticed that I had one more variable than I needed,
so I am removing that as well.
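
For context, this is what list_entry buys over a cast. The sketch is a
userspace imitation with hypothetical names (list_entry_sketch,
struct hist_entry); the kernel's own macro lives in its list header. A
plain cast from the list node to the containing structure only works
while the node is the first member, whereas subtracting the member
offset works wherever the node sits.

```c
#include <stddef.h>

struct list_node { struct list_node *next, *prev; };

/* Userspace imitation of list_entry(): step back from the embedded
 * node to the start of the containing structure. */
#define list_entry_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical entry; the node is deliberately not the first member. */
struct hist_entry {
    unsigned long seqno;
    struct list_node node;
};

static struct hist_entry *hist_from_node(struct list_node *n)
{
    return list_entry_sketch(n, struct hist_entry, node);
}
```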

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] bridge: debug message fixes
Stephen Hemminger [Wed, 30 Aug 2006 00:49:31 +0000 (17:49 -0700)]
[NETFILTER] bridge: debug message fixes

If CONFIG_NETFILTER_DEBUG is enabled, it shouldn't change the
actions of the filtering. The message about skb->dst being NULL
is commonly triggered by dhclient, so it is useless. Make sure all
messages end in newline.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] bridge: simplify nf_bridge_pad
Stephen Hemminger [Wed, 30 Aug 2006 00:48:57 +0000 (17:48 -0700)]
[NETFILTER] bridge: simplify nf_bridge_pad

Do some simple optimization on the nf_bridge_pad() function
and don't use magic constants. Eliminate a double call and
the #ifdef'd code for CONFIG_BRIDGE_NETFILTER.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] bridge: code rearrangement for clarity
Stephen Hemminger [Wed, 30 Aug 2006 00:48:17 +0000 (17:48 -0700)]
[NETFILTER] bridge: code rearrangement for clarity

Cleanup and rearrangement for better style and clarity:
- split the function nf_bridge_maybe_copy_header into two pieces
- move the copy portion out of line
- use Ethernet header size macros
- use the header file to handle CONFIG_NETFILTER_BRIDGE differences

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Make struct sockaddr_in::sin_port __be16
Alexey Dobriyan [Tue, 29 Aug 2006 06:58:32 +0000 (23:58 -0700)]
[IPV4]: Make struct sockaddr_in::sin_port __be16

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Make struct in_addr::s_addr __be32
Alexey Dobriyan [Tue, 29 Aug 2006 06:57:56 +0000 (23:57 -0700)]
[IPV4]: Make struct in_addr::s_addr __be32

There will be a relatively small increase in sparse endian warnings, but
this patch (together with the sin_port one) is a first step towards
making the networking code endian-clean.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: xt_CONNMARK.c build fix
Benoit Boissinot [Tue, 29 Aug 2006 00:50:37 +0000 (17:50 -0700)]
[NETFILTER]: xt_CONNMARK.c build fix

net/netfilter/xt_CONNMARK.c: In function 'target':
net/netfilter/xt_CONNMARK.c:59: warning: implicit declaration of
function 'nf_conntrack_event_cache'

The warning is due to the following .config:
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_CONNTRACK_MARK=y
# CONFIG_IP_NF_CONNTRACK_EVENTS is not set
CONFIG_IP_NF_CONNTRACK_NETLINK=m

This change was introduced by:
http://www.kernel.org/git/?p=linux/kernel/git/davem/net-2.6.19.git;a=commit;h=76e4b41009b8a2e9dd246135cf43c7fe39553aa5

Proposed solution (based on the define in
include/net/netfilter/nf_conntrack_compat.h):

Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] ROUTE: Fix dst reference counting in ip6_pol_route_lookup().
YOSHIFUJI Hideaki [Mon, 28 Aug 2006 20:19:30 +0000 (13:19 -0700)]
[IPV6] ROUTE: Fix dst reference counting in ip6_pol_route_lookup().

In ip6_pol_route_lookup(), when we finish backtracking at the
top-level root entry, we need to hold it.

Bug noticed by Mitsuru Chinen <CHINEN@jp.ibm.com>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validation
Thomas Graf [Sun, 27 Aug 2006 03:13:18 +0000 (20:13 -0700)]
[NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validation

Converts existing NLA_STRING attributes to use the new
validation features, saving a couple of temporary buffers.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: Improve string attribute validation
Thomas Graf [Sun, 27 Aug 2006 03:11:47 +0000 (20:11 -0700)]
[NETLINK]: Improve string attribute validation

Introduces a new attribute type NLA_NUL_STRING to support NUL
terminated strings. Attributes of this kind must carry a
terminating NUL within the maximum specified in the policy.

The `old' NLA_STRING which is not required to be NUL terminated
is extended to provide means to specify a maximum length of the
string.

This aims to ease the pain of using nla_strlcpy() on temporary
buffers.
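
The two checks described above might look roughly like this in plain C.
This is a sketch under stated assumptions, not the kernel's validation
code: the function names are hypothetical, a maximum of 0 is taken to
mean "no limit" for the plain kind, and a trailing NUL on a plain string
is assumed not to count toward the limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* NUL-terminated kind: a '\0' must occur within the first maxlen + 1
 * bytes of the attribute payload. */
static bool nul_string_ok(const char *data, size_t attrlen, size_t maxlen)
{
    size_t scan = attrlen < maxlen + 1 ? attrlen : maxlen + 1;

    return scan > 0 && memchr(data, '\0', scan) != NULL;
}

/* Plain kind: no terminator required, only an upper bound on length. */
static bool plain_string_ok(const char *data, size_t attrlen, size_t maxlen)
{
    if (attrlen < 1)
        return false;
    if (data[attrlen - 1] == '\0')
        attrlen--; /* assumption: a trailing NUL is not counted */
    return maxlen == 0 || attrlen <= maxlen; /* 0 means no limit here */
}
```

Once a payload has passed nul_string_ok(), it can be used as a C string
in place, which is where the saving on temporary buffers comes from.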

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[UDP]: saddr_cmp function should take const socket pointers
David S. Miller [Sun, 27 Aug 2006 03:10:15 +0000 (20:10 -0700)]
[UDP]: saddr_cmp function should take const socket pointers

This also kills a warning while building ipv6:

net/ipv6/udp.c: In function ‘udp_v6_get_port’:
net/ipv6/udp.c:66: warning: passing argument 3 of ‘udp_get_port’ from incompatible pointer type

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[UDP]: Mark udp_port_rover static.
David S. Miller [Sun, 27 Aug 2006 03:06:49 +0000 (20:06 -0700)]
[UDP]: Mark udp_port_rover static.

It is not referenced outside of net/ipv4/udp.c any longer.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[UDP]: Unify UDPv4 and UDPv6 ->get_port()
Gerrit Renker [Sun, 27 Aug 2006 03:06:05 +0000 (20:06 -0700)]
[UDP]: Unify UDPv4 and UDPv6 ->get_port()

This patch creates one common function which is called by
udp_v4_get_port() and udp_v6_get_port(). As a result,
  * duplicated code is removed
  * udp_port_rover and local port lookup can now be removed from udp.h
  * further savings follow since the same function will be used by UDP-Litev4
    and UDP-Litev6
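
The shape of the unification can be sketched in userspace as one common
function that takes the address comparison as a callback. All names here
(sketch_sock, sketch_get_port, sketch_v4_saddr_equal) are hypothetical, the
bound-socket table is a plain array, and port selection is reduced to a
conflict check. This is an illustration, not the code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced socket: a port and an IPv4 source address. */
struct sketch_sock {
    unsigned short port;
    unsigned int saddr;     /* 0 stands for the wildcard address */
};

/* Per-family comparison; takes const pointers since it only reads. */
typedef bool (*sketch_saddr_cmp)(const struct sketch_sock *a,
                                 const struct sketch_sock *b);

bool sketch_v4_saddr_equal(const struct sketch_sock *a,
                           const struct sketch_sock *b)
{
    return a->saddr == 0 || b->saddr == 0 || a->saddr == b->saddr;
}

/*
 * Common part shared by every caller: returns 0 and records the port if no
 * bound socket conflicts, -1 otherwise. Only saddr_cmp differs per family.
 */
int sketch_get_port(const struct sketch_sock *bound, size_t n,
                    struct sketch_sock *sk, unsigned short port,
                    sketch_saddr_cmp saddr_cmp)
{
    for (size_t i = 0; i < n; i++) {
        if (bound[i].port == port && saddr_cmp(sk, &bound[i]))
            return -1;
    }
    sk->port = port;
    return 0;
}
```

An IPv6 or UDP-Lite caller would pass its own comparison function and reuse
sketch_get_port() unchanged.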

In contrast to the patch sent in response to Yoshifuji's comments
(fixed by this variant), the code below also removes the
EXPORT_SYMBOL(udp_port_rover), since udp_port_rover can now remain
local to net/ipv4/udp.c.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER]: Fix nf_conntrack_ftp.c build.
David S. Miller [Sun, 27 Aug 2006 02:48:49 +0000 (19:48 -0700)]
[NETFILTER]: Fix nf_conntrack_ftp.c build.

Noticed by Adrian Bunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Use SLAB_PANIC
Alexey Dobriyan [Sun, 27 Aug 2006 02:25:52 +0000 (19:25 -0700)]
[NET]: Use SLAB_PANIC

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETLINK]: remove third bogus argument from NLA_PUT_FLAG
Johannes Berg [Sun, 27 Aug 2006 02:17:53 +0000 (19:17 -0700)]
[NETLINK]: remove third bogus argument from NLA_PUT_FLAG

This patch removes the 'value' argument from NLA_PUT_FLAG which is
unused anyway. The documentation comment was already correct so it
doesn't need an update :)

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Introduce tx buffering
Ian McDonald [Sun, 27 Aug 2006 02:16:45 +0000 (19:16 -0700)]
[DCCP]: Introduce tx buffering

This adds transmit buffering to DCCP.

I have tested with CCID2/3 and with loss and rate limiting.

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DCCP]: Shift sysctls into feat.h
Ian McDonald [Sun, 27 Aug 2006 02:15:35 +0000 (19:15 -0700)]
[DCCP]: Shift sysctls into feat.h

This shifts further sysctls into feat.h. No change in
functionality - shifting code only.

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET]: Use BUILD_BUG_ON() for checking size of skb->cb.
YOSHIFUJI Hideaki [Fri, 1 Sep 2006 07:29:06 +0000 (00:29 -0700)]
[NET]: Use BUILD_BUG_ON() for checking size of skb->cb.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Fix routing by fwmark
Patrick McHardy [Sat, 26 Aug 2006 23:50:20 +0000 (16:50 -0700)]
[IPV6]: Fix routing by fwmark

Fix the mark comparison. Also dump the mask to userspace when the mask
is zero but the mark is not (in that case the mark is dumped, so the
mask is needed to make sense of it).

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] Congestion control (modulo lp, bic): use BUILD_BUG_ON
Alexey Dobriyan [Sat, 26 Aug 2006 00:10:33 +0000 (17:10 -0700)]
[TCP] Congestion control (modulo lp, bic): use BUILD_BUG_ON

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET_SCHED]: Add mask support to fwmark classifier
Patrick McHardy [Fri, 25 Aug 2006 23:11:42 +0000 (16:11 -0700)]
[NET_SCHED]: Add mask support to fwmark classifier

Support masking the nfmark value before the search. The mask value is
global for all filters contained in one instance. It can only be set
when a new instance is created; all filters must specify the same mask.
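
A minimal sketch of "mask before the search", with one mask per classifier
instance. The structures and names (sketch_fw_head, sketch_fw_filter,
sketch_fw_classify) are hypothetical, and the linear scan stands in for
whatever lookup structure the real classifier uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter: matches one masked mark value. */
struct sketch_fw_filter {
    uint32_t id;
    int classid;
};

/* Hypothetical classifier instance: a single mask shared by all filters. */
struct sketch_fw_head {
    uint32_t mask;
    const struct sketch_fw_filter *filters;
    size_t n;
};

/* Returns the class of the first matching filter, or -1 if none match. */
int sketch_fw_classify(const struct sketch_fw_head *head, uint32_t nfmark)
{
    uint32_t id = nfmark & head->mask;  /* mask first, then search */

    for (size_t i = 0; i < head->n; i++) {
        if (head->filters[i].id == id)
            return head->filters[i].classid;
    }
    return -1;
}
```

Because the mask lives in the instance rather than in each filter, the lookup
key is computed once per packet regardless of how many filters are installed.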

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[DECNET]: Add support for fwmark masks in routing rules
Patrick McHardy [Fri, 25 Aug 2006 23:11:08 +0000 (16:11 -0700)]
[DECNET]: Add support for fwmark masks in routing rules

Add support for fwmark masks. For compatibility a mask of 0xFFFFFFFF is used
when a mark value != 0 is sent without a mask.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Steven Whitehouse <steve@chygwyn.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV4]: Add support for fwmark masks in routing rules
Patrick McHardy [Fri, 25 Aug 2006 23:10:14 +0000 (16:10 -0700)]
[IPV4]: Add support for fwmark masks in routing rules

Add a FRA_FWMASK attribute for fwmark masks. For compatibility a mask of
0xFFFFFFFF is used when a mark value != 0 is sent without a mask.
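
The compatibility rule can be sketched as follows. The function names are
hypothetical, and the XOR-and-mask comparison in sketch_rule_matches() is an
assumption about how a masked mark would typically be compared, not a quote
from the patch.

```c
#include <stdint.h>

/*
 * Pick the mask for a new rule. If userspace sent no mask attribute, a
 * non-zero mark gets an all-ones mask so that old tools keep their
 * exact-match behaviour. A zero mark without a mask matches everything.
 */
uint32_t sketch_effective_mask(uint32_t mark, int has_mask, uint32_t mask)
{
    if (has_mask)
        return mask;
    return mark ? 0xFFFFFFFFu : 0;
}

/* Returns 1 if the packet mark equals the rule mark on the masked bits. */
int sketch_rule_matches(uint32_t rule_mark, uint32_t rule_mask,
                        uint32_t pkt_mark)
{
    return ((rule_mark ^ pkt_mark) & rule_mask) == 0;
}
```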

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6]: Fix build with fwmark disabled.
David S. Miller [Fri, 25 Aug 2006 23:07:48 +0000 (16:07 -0700)]
[IPV6]: Fix build with fwmark disabled.

Based upon a patch by Brian Haley.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] ROUTE: Add support for fwmask in routing rules.
YOSHIFUJI Hideaki [Fri, 25 Aug 2006 23:05:43 +0000 (16:05 -0700)]
[IPV6] ROUTE: Add support for fwmask in routing rules.

Add support for fwmark masks.
A mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask.

Based on patch for net/ipv4/fib_rules.c by Patrick McHardy <kaber@trash.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] ROUTE: Fix size of fib6_rule_policy.
YOSHIFUJI Hideaki [Fri, 25 Aug 2006 23:05:00 +0000 (16:05 -0700)]
[IPV6] ROUTE: Fix size of fib6_rule_policy.

It should not be RTA_MAX+1 but FRA_MAX+1.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[IPV6] ROUTE: Fix FWMARK support.
YOSHIFUJI Hideaki [Fri, 25 Aug 2006 23:04:29 +0000 (16:04 -0700)]
[IPV6] ROUTE: Fix FWMARK support.

- Add missing nla_policy entry.
- type of fwmark is u32, not u8.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Respect priority in policy lookups.
David S. Miller [Fri, 25 Aug 2006 22:46:46 +0000 (15:46 -0700)]
[XFRM]: Respect priority in policy lookups.

Even if we find an exact match in the hash table,
we must inspect the inexact list to look for a match
with a better priority.
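
A sketch of the lookup order described above. The types and names are
hypothetical; the candidates in the inexact array are assumed to have already
matched the flow, and a numerically lower priority value is assumed to be the
better one.

```c
#include <stddef.h>

/* Hypothetical policy: only the fields needed for the comparison. */
struct sketch_policy {
    unsigned int priority;  /* assumption: lower value wins */
    int id;
};

/*
 * exact is the hash table hit (or NULL on a miss). Even with a hit, the
 * inexact candidates are still scanned for a better priority.
 */
const struct sketch_policy *
sketch_pick_policy(const struct sketch_policy *exact,
                   const struct sketch_policy *inexact, size_t n)
{
    const struct sketch_policy *best = exact;

    for (size_t i = 0; i < n; i++) {
        if (!best || inexact[i].priority < best->priority)
            best = &inexact[i];
    }
    return best;
}
```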

Noticed by Masahide NAKAMURA <nakam@linux-ipv6.org>.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] tcp_bic: use BUILD_BUG_ON
Alexey Dobriyan [Fri, 25 Aug 2006 07:38:03 +0000 (00:38 -0700)]
[TCP] tcp_bic: use BUILD_BUG_ON

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[TCP] tcp_lp: use BUILD_BUG_ON
Alexey Dobriyan [Fri, 25 Aug 2006 07:37:24 +0000 (00:37 -0700)]
[TCP] tcp_lp: use BUILD_BUG_ON

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NET] in6_pton: Kill errant printf statement.
David S. Miller [Fri, 25 Aug 2006 07:27:09 +0000 (00:27 -0700)]
[NET] in6_pton: Kill errant printf statement.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[NETFILTER] NF_CONNTRACK_FTP: Use in6_pton() to convert address string.
YOSHIFUJI Hideaki [Sun, 18 Jun 2006 18:20:32 +0000 (03:20 +0900)]
[NETFILTER] NF_CONNTRACK_FTP: Use in6_pton() to convert address string.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[NET]: Add common helper functions to convert IPv6/IPv4 address string to network...
YOSHIFUJI Hideaki [Sun, 25 Jun 2006 14:54:55 +0000 (23:54 +0900)]
[NET]: Add common helper functions to convert IPv6/IPv4 address string to network address structure.

These helpers can be used in netfilter, cifs etc.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] ROUTE: Routing by FWMARK.
YOSHIFUJI Hideaki [Mon, 21 Aug 2006 10:22:01 +0000 (19:22 +0900)]
[IPV6] ROUTE: Routing by FWMARK.

Based on patch by Jean Lorchat <lorchat@sfc.wide.ad.jp>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] ROUTE: Routing by Traffic Class.
YOSHIFUJI Hideaki [Mon, 21 Aug 2006 10:18:57 +0000 (19:18 +0900)]
[IPV6] ROUTE: Routing by Traffic Class.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPV6] MIP6: Several obvious clean-ups.
YOSHIFUJI Hideaki [Thu, 24 Aug 2006 14:18:12 +0000 (23:18 +0900)]
[IPV6] MIP6: Several obvious clean-ups.

- Remove redundant code.  Pointed out by Brian Haley <brian.haley@hp.com>.
- Unify code paths with/without CONFIG_IPV6_MIP.
- Use NIP6_FMT for IPv6 address textual presentation.
- Fold long line.  Pointed out by David Miller <davem@davemloft.net>.

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
18 years ago[IPSEC] esp: Defer output IV initialization to first use.
David S. Miller [Fri, 22 Sep 2006 22:17:35 +0000 (15:17 -0700)]
[IPSEC] esp: Defer output IV initialization to first use.

First of all, if the xfrm_state only gets used for input
packets this entropy is a complete waste.

Secondly, it is often the case that a configuration loads
many rules (perhaps even dynamically) and they don't all
necessarily ever get used.

This get_random_bytes() call was showing up in the profiles
for xfrm_state inserts which is how I noticed this.
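
The deferral amounts to lazy initialization, sketched here in userspace. The
structure and function names are hypothetical, and rand() is only a
placeholder for get_random_bytes(); it is not suitable for cryptographic use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-state ESP data. */
struct sketch_esp {
    unsigned char iv[8];
    int iv_initialized;
};

/* State setup no longer spends any entropy. */
void sketch_esp_init(struct sketch_esp *esp)
{
    esp->iv_initialized = 0;
}

/* Output path: fill the IV the first time it is actually needed. */
const unsigned char *sketch_esp_output_iv(struct sketch_esp *esp)
{
    if (!esp->iv_initialized) {
        /* Placeholder for get_random_bytes(); NOT cryptographically safe. */
        for (size_t i = 0; i < sizeof(esp->iv); i++)
            esp->iv[i] = (unsigned char)rand();
        esp->iv_initialized = 1;
    }
    return esp->iv;
}
```

A state that only ever handles input packets never reaches
sketch_esp_output_iv(), so it never pays for the random bytes.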

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Extract common hashing code into xfrm_hash.[ch]
David S. Miller [Thu, 24 Aug 2006 11:50:50 +0000 (04:50 -0700)]
[XFRM]: Extract common hashing code into xfrm_hash.[ch]

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Hash policies when non-prefixed.
David S. Miller [Thu, 24 Aug 2006 11:45:07 +0000 (04:45 -0700)]
[XFRM]: Hash policies when non-prefixed.

This idea is from Alexey Kuznetsov.

It is common for policies to be non-prefixed.  And for
that case we can optimize lookups, insert, etc. quite
a bit.

For each direction, we have a dynamically sized policy
hash table for non-prefixed policies.  We also have a
hash table on policy->index.

For prefixed policies, we have a list per-direction which
we will consult on lookups when a non-prefix hashtable
lookup fails.
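
One way to picture the split, IPv4 only. The names are hypothetical, the hash
is a trivial placeholder, and treating "non-prefixed" as "both prefix lengths
are 32" is an assumption made for this sketch. An out-of-range bucket number
(hmask + 1) signals that the policy belongs on the per-direction list.

```c
#include <stdint.h>

/* Hypothetical IPv4-only selector. */
struct sketch_selector {
    uint32_t daddr;
    uint32_t saddr;
    uint8_t prefixlen_d;
    uint8_t prefixlen_s;
};

/*
 * Returns a bucket in [0, hmask] for non-prefixed selectors, or hmask + 1
 * to tell the caller to use the per-direction inexact list instead.
 */
unsigned int sketch_policy_bucket(const struct sketch_selector *sel,
                                  unsigned int hmask)
{
    uint32_t h;

    if (sel->prefixlen_d != 32 || sel->prefixlen_s != 32)
        return hmask + 1;   /* prefixed: not hashable */

    h = sel->daddr ^ sel->saddr;    /* placeholder hash */
    return (h ^ (h >> 16)) & hmask;
}
```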

This still isn't as efficient as I would like it.  There
are four immediate problems:

1) Lots of excessive refcounting, which can be fixed just
   like xfrm_state was
2) We do 2 hash probes on insert, one to look for dups and
   one to allocate a unique policy->index.  Although I wonder
   how much this matters since xfrm_state inserts do up to
   3 hash probes and that seems to perform fine.
3) xfrm_policy_insert() is very complex because of the priority
   ordering and entry replacement logic.
4) Lots of counter bumping, in addition to policy refcounts,
   in the form of xfrm_policy_count[].  This is merely used
   to let code path(s) know that some IPSEC rules exist.  So
   this count is indexed per-direction, maybe that is overkill.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Hash xfrm_state objects by source address too.
David S. Miller [Thu, 24 Aug 2006 11:00:03 +0000 (04:00 -0700)]
[XFRM]: Hash xfrm_state objects by source address too.

The source address is always non-prefixed so we should use
it to help give entropy to the bydst hash.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Kill excessive refcounting of xfrm_state objects.
David S. Miller [Thu, 24 Aug 2006 10:54:22 +0000 (03:54 -0700)]
[XFRM]: Kill excessive refcounting of xfrm_state objects.

The refcounting done for timers and hash table insertions
is just wasted cycles.  We can eliminate all of this
refcounting because:

1) The implicit refcount when the xfrm_state object is active
   will always be held while the object is in the hash tables.
   We never kfree() the xfrm_state until long after we've made
   sure that it has been unhashed.

2) Timers are even easier.  Once we mark x->km.state as
   anything other than XFRM_STATE_VALID (__xfrm_state_delete
   sets it to XFRM_STATE_DEAD), any timer that fires will
   do nothing and return without rearming the timer.

   Therefore we can defer the del_timer calls until when the
   object is about to be freed up during GC.  We have to use
   del_timer_sync() and defer it to GC because we can't do
   a del_timer_sync() while holding x->lock which all callers
   of __xfrm_state_delete hold.

This makes SA changes even more light-weight.
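
Point 2 can be sketched in userspace like this. The names and the
timer_armed flag are hypothetical; the flag merely models whether a timer is
pending, and sketch_gc_destroy() stands in for the del_timer_sync() call done
at GC time.

```c
/* Hypothetical reduced state object. */
enum sketch_km_state { SKETCH_STATE_VALID, SKETCH_STATE_DEAD };

struct sketch_xfrm_state {
    enum sketch_km_state state;
    int timer_armed;    /* models a pending timer */
    int expirations;    /* work done by the timer while valid */
};

/* Timer callback: once the state is not VALID, do nothing and do not rearm. */
void sketch_timer_handler(struct sketch_xfrm_state *x)
{
    x->timer_armed = 0;     /* the timer has fired */
    if (x->state != SKETCH_STATE_VALID)
        return;
    x->expirations++;
    x->timer_armed = 1;     /* rearm for the next period */
}

/* Deletion only flips the state; it does not touch the timer. */
void sketch_state_delete(struct sketch_xfrm_state *x)
{
    x->state = SKETCH_STATE_DEAD;
}

/* GC time, no lock held: the place where the timer is finally stopped. */
void sketch_gc_destroy(struct sketch_xfrm_state *x)
{
    x->timer_armed = 0;
}
```

Since a late timer is harmless after deletion, the object does not need an
extra reference just to keep the timer safe.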

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Purge dst references to deleted SAs passively.
David S. Miller [Thu, 24 Aug 2006 10:30:28 +0000 (03:30 -0700)]
[XFRM]: Purge dst references to deleted SAs passively.

Just let GC and other normal mechanisms take care of getting
rid of DST cache references to deleted xfrm_state objects
instead of walking all the policy bundles.

Signed-off-by: David S. Miller <davem@davemloft.net>
18 years ago[XFRM]: Do not flush all bundles on SA insert.
David S. Miller [Thu, 24 Aug 2006 10:29:04 +0000 (03:29 -0700)]
[XFRM]: Do not flush all bundles on SA insert.

Instead, simply set all potentially aliasing existing xfrm_state
objects to have the current generation counter value.

This will make routes get relooked up the next time an existing
route mentioning these aliased xfrm_state objects gets used,
via xfrm_dst_check().
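
The generation counter idea, sketched with hypothetical names. A cached route
remembers the generation of the state it was built from; bumping the state's
generation makes the next check fail, so only routes that are actually used
get looked up again.

```c
#include <stddef.h>

/* Hypothetical state and cached-route types. */
struct sketch_state {
    unsigned int genid;
};

struct sketch_dst {
    const struct sketch_state *state;
    unsigned int genid;     /* generation seen when the route was built */
};

unsigned int sketch_genid_counter;  /* global generation counter */

void sketch_dst_attach(struct sketch_dst *dst, const struct sketch_state *st)
{
    dst->state = st;
    dst->genid = st->genid;
}

/* On insert: stamp every potentially aliasing state with a new generation. */
void sketch_bump_aliases(struct sketch_state *aliases, size_t n)
{
    unsigned int g = ++sketch_genid_counter;

    for (size_t i = 0; i < n; i++)
        aliases[i].genid = g;
}

/* Returns 1 if the cached route is still current, 0 if it must be redone. */
int sketch_dst_check(const struct sketch_dst *dst)
{
    return dst->genid == dst->state->genid;
}
```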

Signed-off-by: David S. Miller <davem@davemloft.net>