Patrick McHardy [Tue, 18 Dec 2007 06:39:55 +0000 (22:39 -0800)]
[NETFILTER]: nfnetlink_log: fix checks in nfulnl_recv_config
Similar to the nfnetlink_queue fixes:
The peer_pid must be checked in all cases when a logging instance exists,
additionally we must check whether an instance exists before attempting
to configure it to avoid NULL ptr dereferences.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 18 Dec 2007 06:38:20 +0000 (22:38 -0800)]
[NETFILTER]: nf_nat: pass manip type instead of hook to nf_nat_setup_info
nf_nat_setup_info gets the hook number and translates that to the
manip type to perform. This is a relict from the time when one
manip per hook could exist, the exact hook number doesn't matter
anymore, its converted to the manip type. Most callers already
know what kind of NAT they want to perform, so pass the maniptype
in directly.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: nf_conntrack_sctp: add ctnetlink support
This patch adds support for SCTP to ctnetlink.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for James Morris' connsecmark.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: add support for master tuple event notification and dumping
This patch adds support for master tuple event notification and
dumping. Conntrackd needs this information to recover related
connections appropriately.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
[NETFILTER]: ctnetlink: add support for NAT sequence adjustments
The combination of NAT and helpers may produce TCP sequence adjustments.
In failover setups, this information needs to be replicated in order to
achieve a successful recovery of mangled, related connections. This patch is
particularly useful for conntrackd, see:
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise [Tue, 18 Dec 2007 06:27:36 +0000 (22:27 -0800)]
[NETFILTER]: xt_TCPMSS: don't allow netfilter --setmss to increase mss
When terminating DSL connections for an assortment of random customers, I've
found it necessary to use iptables to clamp the MSS used for connections to
work around the various ICMP blackholes in the greater net. Unfortunately,
the current behaviour in Linux is imperfect and actually make things worse,
so I'm proposing the following: increasing the MSS in a packet can never be
a good thing, so make --set-mss only lower the MSS in a packet.
Yes, I am aware of --clamp-mss-to-pmtu, but it doesn't work for outgoing
connections from clients (ie web traffic), as it only looks at the PMTU on
the destination route, not the source of the packet (the DSL interfaces in
question have a 1442 byte MTU while the destination ethernet interface is
1500 -- there are problematic hosts which use a 1300 byte MTU). Reworking
that is probably a good idea at some point, but it's more work than this is.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 18 Dec 2007 05:47:32 +0000 (21:47 -0800)]
[NETFILTER]: ip_tables: fix compat types
Use compat types and compat iterators when dealing with compat entries for
clarity. This doesn't actually make a difference for ip_tables, but is
needed for ip6_tables and arp_tables.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Patrick McHardy [Tue, 18 Dec 2007 05:47:14 +0000 (21:47 -0800)]
[NETFILTER]: ip_tables: account for struct ipt_entry/struct compat_ipt_entry size diff
Account for size differences when dumping entries or calculating the
entry positions. This doesn't actually make any difference for IPv4
since the structures have the same size, but its logically correct
and needed for IPv6.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 4 Dec 2007 19:33:40 +0000 (20:33 +0100)]
wireless: make drivers include the TSF RX flag where appropriate
These drivers pass full mactime information to the stack, make them
indicate this via the new RX_FLAG_TSFT to get mac80211 to show this
information in monitor mode.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Wed, 12 Dec 2007 15:25:07 +0000 (10:25 -0500)]
introduce WEXT scan capabilities
Introduce scan capabilities to WEXT so that userspace can do intelligent
things with scan behavior such as handling hidden SSIDs more gracefully.
If the driver reports a specific scan capability, the driver must
respect the options specified in the iw_scan_req structure when handling
the SIOCSIWSCAN call, unless it's mode or state does not allow it to do
so, in which case it must return an error.
This version switches to Dave Kilroy's suggestion of claiming unused
padding space for the scan_capa field.
Signed-off-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 11 Dec 2007 20:33:42 +0000 (21:33 +0100)]
mac80211: conditionally include timestamp in radiotap information
This makes mac80211 include the low-level MAC timestamp
in the radiotap header if the driver indicated (by a new
RX flag) that the timestamp is valid.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Dec 2007 14:58:04 +0000 (12:58 -0200)]
[DCCP]: Remove unused inline function
The function follows48(), which is a special-case of dccp_delta_seqno(),
is nowhere used in the DCCP code, thus removed by this patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Dec 2007 14:57:43 +0000 (12:57 -0200)]
[CCID3]: Nofeedback timer according to rfc3448bis
This implements the changes to the nofeedback timer handling suggested
in draft rfc3448bis00, section 4.4. In particular, these changes mean:
* better handling of the lossless case (p == 0)
* the timestamp for computing t_ld becomes obsolete
* much more recent document (RFC 3448 is almost 5 years old)
* concepts in rfc3448bis arose from a real, working implementation
(cf. sec. 12)
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Dec 2007 14:48:47 +0000 (12:48 -0200)]
[CCID3]: Implement rfc3448bis changes to feedback reception
This implements the algorithm to update the allowed sending rate X upon
receiving feedback packets, as described in draft rfc3448bis, 4.2/4.3.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Dec 2007 12:25:06 +0000 (10:25 -0200)]
[CCID3]: Remove two irrelevant states in TX feedback handling
* the NO_SENT state is only triggered in bidirectional mode,
costing unnecessary processing.
* the TERM (terminating) state is irrelevant.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Mon, 17 Dec 2007 12:07:44 +0000 (10:07 -0200)]
[CCID3]: Use a function to update p_inv, and p is never used
This patch
1) concentrates previously scattered computation of p_inv into one function;
2) removes the `p' element of the CCID3 RX sock (it is redundant);
3) makes the tfrc_rx_info structure standalone, only used on demand.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sun, 16 Dec 2007 22:06:41 +0000 (14:06 -0800)]
[SCTP]: Use crc32c library for checksum calculations.
The crc32c library used an identical table and algorithm
as SCTP. Switch to using the library instead of carrying
our own table. Using crypto layer proved to have too
much overhead compared to using the library directly.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sun, 16 Dec 2007 22:04:02 +0000 (14:04 -0800)]
[PACKET]: Fix /proc/net/packet crash due to bogus private pointer
The seq_open_net patch changed the meaning of seq->private.
Unfortunately it missed two spots in AF_PACKET, which still
used the old way of dereferencing seq->private, thus causing
weird and wonderful crashes when reading /proc/net/packet.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 16 Dec 2007 21:32:48 +0000 (13:32 -0800)]
[IPV4]: Switch users of ipv4_devconf(_all) to use the pernet one
These are scattered over the code, but almost all the
"critical" places already have the proper struct net
at hand except for snmp proc showing function and routing
rtnl handler.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 16 Dec 2007 21:31:14 +0000 (13:31 -0800)]
[IPV4]: Store the net pointer on devinet's ctl tables
Some handers and strategies of devinet sysctl tables need
to know the net to propagate the ctl change to all the
net devices.
I use the (currently unused) extra2 pointer on the tables
to get it.
Holding the reference on the struct net is not possible,
because otherwise we'll get a net->ctl_table->net circular
dependency. But since the ctl tables are unregistered during
the net destruction, this is safe to get it w/o additional
protection.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 16 Dec 2007 21:30:39 +0000 (13:30 -0800)]
[IPV4]: Pass the net pointer to the arp_req_set_proxy()
This one will need to set the IPV4_DEVCONF_ALL(PROXY_ARP), but
there's no ways to get the net right in place, so we have to
pull one from the inet_ioctl's struct sock.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Sun, 16 Dec 2007 21:30:07 +0000 (13:30 -0800)]
[IPV4]: Make __devinet_sysctl_register return an error
Currently, this function is void, so failures in creating
sysctls for new/renamed devices are not reported to anywhere.
Fixing this is another complex (needed?) task, but this
return value is needed during the namespaces creation to
handle the case, when we failed to create "all" and "default"
entries.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Pavel Emelyanov [Fri, 14 Dec 2007 19:38:04 +0000 (11:38 -0800)]
[XFRM]: Fix potential race vs xfrm_state(only)_find and xfrm_hash_resize.
The _find calls calculate the hash value using the
xfrm_state_hmask, without the xfrm_state_lock. But the
value of this mask can change in the _resize call under
the state_lock, so we risk to fail in finding the desired
entry in hash.
I think, that the hash value is better to calculate
under the state lock.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
[PPP] synchronous tty: convert dead_sem to completion
PPP synchronous tty channel driver: convert the semaphore dead_sem to a
completion
Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 14 Dec 2007 19:25:26 +0000 (11:25 -0800)]
[UDP]: Move udp_stats_in6 into net/ipv4/udp.c
Now that external users may increment the counters directly, we need
to ensure that udp_stats_in6 is always available. Otherwise we'd
either have to requrie the external users to be built as modules or
ipv6 to be built-in.
This isn't too bad because udp_stats_in6 is just a pair of pointers
plus an EXPORT, e.g., just 40 (16 + 24) bytes on x86-64.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Fri, 14 Dec 2007 01:37:55 +0000 (23:37 -0200)]
[DCCP]: Introducing CCMPS
This introduces a CCMPS field for setting a CCID-specific upper bound on the application payload
size, as is defined in RFC 4340, section 14.
Only the TX CCID is considered in setting this limit, since the RX CCID generates comparatively
small (DCCP-Ack) feedback packets. The CCMPS field includes network and transport layer header
lengths. The only current CCMPS customer is CCID4 (via RFC 4828).
A wrapper is used to allow querying the CCMPS even at times where the CCID modules may not have
been fully negotiated yet.
In dccp_sync_mss() the variable `mss_now' has been renamed into `cur_mps', to reflect that we are
dealing with an MPS, but not an MSS.
Since the DCCP code closely follows the TCP code, the identifiers `dccp_sync_mss' and
`dccps_mss_cache' have been kept, as they have direct TCP counterparts.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
To allow spaces in the name, the slab name string has been changed to
refer to the numeric CCID identifier, using the same format as before.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Fri, 14 Dec 2007 01:31:14 +0000 (23:31 -0200)]
[DCCP]: Documentation for CCID operations
This adds documentation for the ccid_operations structure.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Denis V. Lunev [Thu, 13 Dec 2007 17:47:57 +0000 (09:47 -0800)]
[IPV4]: Thresholds in fib_trie.c are used as consts, so make them const.
There are several thresholds for trie fib hash management. They are used
in the code as a constants. Make them constants from the compiler point of
view.
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Thu, 13 Dec 2007 17:47:00 +0000 (09:47 -0800)]
[IPV6] sit: Rebinding of SIT tunnels to other interfaces
This is similar to the change already done for IPIP tunnels.
Once created, a SIT tunnel can't be bound to another device.
To reproduce:
# create a tunnel:
ip tunnel add tunneltest0 mode sit remote 10.0.0.1 dev eth0
# try to change the bounding device from eth0 to eth1:
ip tunnel change tunneltest0 dev eth1
# show the result:
ip tunnel show tunneltest0
tunneltest0: ipv6/ip remote 10.0.0.1 local any dev eth0 ttl inherit
Notice the bound device has not changed from eth0 to eth1.
This patch fixes it. When changing the binding, it also recalculates the
MTU according to the new bound device's MTU.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Schmidt [Thu, 13 Dec 2007 17:46:32 +0000 (09:46 -0800)]
[IP_GRE]: Rebinding of GRE tunnels to other interfaces
This is similar to the change already done for IPIP tunnels.
Once created, a GRE tunnel can't be bound to another device.
To reproduce:
# create a tunnel:
ip tunnel add tunneltest0 mode gre remote 10.0.0.1 dev eth0
# try to change the bounding device from eth0 to eth1:
ip tunnel change tunneltest0 dev eth1
# show the result:
ip tunnel show tunneltest0
tunneltest0: gre/ip remote 10.0.0.1 local any dev eth0 ttl inherit
Notice the bound device has not changed from eth0 to eth1.
This patch fixes it. When changing the binding, it also recalculates the
MTU according to the new bound device's MTU.
Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 13 Dec 2007 17:30:59 +0000 (09:30 -0800)]
[IPSEC]: Fix zero return value in xfrm_lookup on error
Further testing shows that my ICMP relookup patch can cause xfrm_lookup
to return zero on error which isn't very nice since it leads to the caller
dying on null pointer dereference. The bug is due to not setting err
to ENOENT just before we leave xfrm_lookup in case of no policy.
This patch moves the err setting to where it should be.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Thu, 13 Dec 2007 14:48:19 +0000 (12:48 -0200)]
[DCCP]: Ignore feature negotiation on Data packets
This implements [RFC 4340, p. 32]: "any feature negotiation options received
on DCCP-Data packets MUST be ignored".
Also added a FIXME for further processing, since the code currently (wrongly)
classifies empty Confirm options as invalid - this needs to be resolved in
a separate patch.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Thu, 13 Dec 2007 14:41:46 +0000 (12:41 -0200)]
[DCCP]: Make code assumptions explicit
This removes several `XXX' references which indicate a missing support
for non-1-byte feature values: this is unnecessary, as all currently known
(standardised) SP feature values are 1-byte quantities.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Thu, 13 Dec 2007 14:40:40 +0000 (12:40 -0200)]
[DCCP]: Remove unused and redundant validation functions
This removes two inlines which were both called in a single function only:
1) dccp_feat_change() is always called with either DCCPO_CHANGE_L or DCCPO_CHANGE_R as argument
* from dccp_set_socktopt_change() via do_dccp_setsockopt() with DCCP_SOCKOPT_CHANGE_R/L
* from __dccp_feat_init() via dccp_feat_init() also with DCCP_SOCKOPT_CHANGE_R/L.
Hence the dccp_feat_is_valid_type() is completely unnecessary and always returns true.
2) Due to (1), the length test reduces to 'len >= 4', which in turn makes
dccp_feat_is_valid_length() unnecessary.
Furthermore, the inline function dccp_feat_is_reserved() was unfolded,
since only called in a single place.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gerrit Renker [Thu, 13 Dec 2007 14:38:11 +0000 (12:38 -0200)]
[DCCP]: Support inserting options during the 3-way handshake
This provides a separate routine to insert options during the initial handshake.
The main purpose is to conduct feature negotiation, for the moment the only user
is the timestamp echo needed for the (CCID3) handshake RTT sample.
Padding of options has been put into a small separate routine, to be shared among
the two functions. This could also be used as a generic routine to finish inserting
options.
Also removed an `XXX' comment since its content was obvious.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>