Diffstat (limited to 'Documentation/networking')
11 files changed, 282 insertions, 21 deletions
diff --git a/Documentation/networking/00-INDEX b/Documentation/networking/00-INDEX
index bbce1215434..9ad9ddeb384 100644
--- a/Documentation/networking/00-INDEX
+++ b/Documentation/networking/00-INDEX
@@ -144,6 +144,8 @@ nfc.txt
- The Linux Near Field Communication (NFC) subsystem.
- IBM PCI Pit/Pit-Phy/Olympic Token Ring driver info.
+ - Open vSwitch developer documentation.
- Overview of network interface operational states.
diff --git a/Documentation/networking/batman-adv.txt b/Documentation/networking/batman-adv.txt
index c86d03f18a5..221ad0cdf11 100644
--- a/Documentation/networking/batman-adv.txt
+++ b/Documentation/networking/batman-adv.txt
@@ -200,15 +200,16 @@ abled during run time. Following log_levels are defined:
0 - All debug output disabled
1 - Enable messages related to routing / flooding / broadcasting
-2 - Enable route or tt entry added / changed / deleted
-3 - Enable all messages
+2 - Enable messages related to route added / changed / deleted
+4 - Enable messages related to translation table operations
+7 - Enable all messages
The debug output can be changed at runtime using the file
/sys/class/net/bat0/mesh/log_level. e.g.
# echo 2 > /sys/class/net/bat0/mesh/log_level
-will enable debug messages for when routes or TTs change.
+will enable debug messages for when routes change.
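The new values 1, 2, 4 and 7 suggest that log_level is a bitmask, so categories can presumably be combined by OR-ing them. A minimal Python sketch of that reading follows; the enum and member names are hypothetical, and only the numeric values come from the text above.

```python
from enum import IntFlag


class LogLevel(IntFlag):
    """Hypothetical names for the documented log_level values."""
    ROUTING = 1            # routing / flooding / broadcasting
    ROUTE_CHANGES = 2      # route added / changed / deleted
    TRANSLATION_TABLE = 4  # translation table operations
    ALL = 7                # every category above


def log_level_value(*categories: LogLevel) -> int:
    """OR the selected categories into the integer written to log_level."""
    value = 0
    for category in categories:
        value |= int(category)
    return value


if __name__ == "__main__":
    print(log_level_value(LogLevel.ROUTE_CHANGES, LogLevel.TRANSLATION_TABLE))
```

Under this assumption, writing the combined value to /sys/class/net/bat0/mesh/log_level would enable route and translation table messages while leaving routing messages off.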
diff --git a/Documentation/networking/bonding.txt b/Documentation/networking/bonding.txt
index 91df678fb7f..080ad26690a 100644
--- a/Documentation/networking/bonding.txt
+++ b/Documentation/networking/bonding.txt
@@ -196,6 +196,23 @@ or, for backwards compatibility, the option value. E.g.,
The parameters are as follows:
+active_slave
+ Specifies the new active slave for modes that support it
+ (active-backup, balance-alb and balance-tlb). Possible values
+ are the name of any currently enslaved interface, or an empty
+ string. If a name is given, the slave and its link must be up in order
+ to be selected as the new active slave. If an empty string is
+ specified, the current active slave is cleared, and a new active
+ slave is selected automatically.
+ Note that this is only available through the sysfs interface. No module
+ parameter by this name exists.
+ The normal value of this option is the name of the currently
+ active slave, or the empty string if there is no active slave or
+ the current mode does not use an active slave.
Specifies the 802.3ad aggregation selection logic to use. The
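Because the active slave option is only reachable through sysfs, a management script has to write the attribute file directly. The sketch below assumes the usual bonding sysfs layout (/sys/class/net/<bond>/bonding/active_slave); the function name and the sysfs_root parameter are hypothetical and exist only so the helper can be exercised against a scratch directory.

```python
from pathlib import Path


def set_active_slave(bond: str, slave: str = "",
                     sysfs_root: str = "/sys/class/net") -> Path:
    """Write the requested active slave to the bond's sysfs attribute.

    An empty string asks the driver to clear the current active slave
    and pick a new one automatically, as described above.
    """
    attr = Path(sysfs_root) / bond / "bonding" / "active_slave"
    attr.write_text(slave + "\n")
    return attr
```

On a real system this needs root privileges, and the write is expected to fail if the named slave or its link is down.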
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index f41ea240522..1dc1c24a754 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -78,3 +78,30 @@ in software. This is currently WIP.
See header include/net/mac802154.h and several drivers in drivers/ieee802154/.
+6LoWPAN Linux implementation
+The IEEE 802.15.4 standard specifies an MTU of 127 bytes, yielding about 80
+octets of actual MAC payload once security is turned on, on a wireless link
+with a link throughput of 250 kbps or less. The 6LoWPAN adaptation format
+[RFC4944] was specified to carry IPv6 datagrams over such constrained links,
+taking into account limited bandwidth, memory, or energy resources that are
+expected in applications such as wireless sensor networks. [RFC4944] defines
+a Mesh Addressing header to support sub-IP forwarding, a Fragmentation header
+to support the IPv6 minimum MTU requirement [RFC2460], and stateless header
+compression for IPv6 datagrams (LOWPAN_HC1 and LOWPAN_HC2) to reduce the
+relatively large IPv6 and UDP headers down to (in the best case) several bytes.
+In September 2011 an update to the standard was published: [RFC6282].
+It deprecates HC1 and HC2 compression and defines the IPHC encoding format,
+which is used in this Linux implementation.
+All the code related to 6lowpan can be found in: net/ieee802154/6lowpan.*
+To set up a 6lowpan interface you need (busybox release > 1.17.0):
+1. Add IEEE802.15.4 interface and initialize PANid;
+2. Add 6lowpan interface by command like:
+ # ip link add link wpan0 name lowpan0 type lowpan
+3. Set the MAC address (if needed):
+ # ip link set lowpan0 address de:ad:be:ef:ca:fe:ba:be
+4. Bring up the 'lowpan0' interface
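To give a feel for why the Fragmentation header matters, the following sketch estimates how many link frames a minimum-MTU IPv6 datagram needs when roughly 80 octets of payload are available per frame. It assumes the RFC 4944 fragment header sizes (4 octets for the first fragment, 5 for later ones) and the rule that fragment offsets are multiples of 8 octets; it ignores header compression, and the function name is hypothetical.

```python
FRAG1_HEADER = 4  # octets, first-fragment header (RFC 4944)
FRAGN_HEADER = 5  # octets, subsequent-fragment header (RFC 4944)


def fragments_needed(datagram_len: int, frame_payload: int) -> int:
    """Estimate the number of link frames needed for one datagram."""
    if datagram_len <= frame_payload:
        return 1  # fits without fragmentation
    # Every fragment except the last carries a multiple of 8 octets.
    first = (frame_payload - FRAG1_HEADER) // 8 * 8
    rest = (frame_payload - FRAGN_HEADER) // 8 * 8
    if first <= 0 or rest <= 0:
        raise ValueError("frame payload too small to carry fragments")
    remaining = datagram_len - first
    return 1 + -(-remaining // rest)  # ceiling division


if __name__ == "__main__":
    print(fragments_needed(1280, 80))
```

With these assumptions a 1280-octet datagram is split into 72-octet pieces, which is why the compression defined in [RFC6282] is worthwhile on such links.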
diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c
index 65968fbf1e4..ac5debb2f16 100644
--- a/Documentation/networking/ifenslave.c
+++ b/Documentation/networking/ifenslave.c
@@ -539,12 +539,14 @@ static int if_getconfig(char *ifname)
metric = 0;
} else
metric = ifr.ifr_metric;
+ printf("The result of SIOCGIFMETRIC is %d\n", metric);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFMTU, &ifr) < 0)
mtu = 0;
else
mtu = ifr.ifr_mtu;
+ printf("The result of SIOCGIFMTU is %d\n", mtu);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFDSTADDR, &ifr) < 0) {
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index cb7f3148035..ad3e80e17b4 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -20,7 +20,7 @@ ip_no_pmtu_disc - BOOLEAN
default FALSE
min_pmtu - INTEGER
- default 562 - minimum discovered Path MTU
+ default 552 - minimum discovered Path MTU
route/max_size - INTEGER
Maximum number of routes allowed in the kernel. Increase
@@ -31,6 +31,16 @@ neigh/default/gc_thresh3 - INTEGER
when using large numbers of interfaces and when communicating
with large numbers of directly-connected peers.
+neigh/default/unres_qlen_bytes - INTEGER
+ The maximum number of bytes which may be used by packets
+ queued for each unresolved address by other network layers.
+ (added in Linux 3.3)
+neigh/default/unres_qlen - INTEGER
+ The maximum number of packets which may be queued for each
+ unresolved address by other network layers.
+ (deprecated in Linux 3.3): use unres_qlen_bytes instead.
mtu_expires - INTEGER
Time, in seconds, that cached PMTU information is kept.
@@ -165,6 +175,9 @@ tcp_congestion_control - STRING
connections. The algorithm "reno" is always available, but
additional choices may be available based on kernel configuration.
Default is set as part of kernel configuration.
+ For passive connections, the listener congestion control choice
+ is inherited.
+ [see setsockopt(listenfd, SOL_TCP, TCP_CONGESTION, "name" ...) ]
tcp_cookie_size - INTEGER
Default size of TCP Cookie Transactions (TCPCT) option, that may be
@@ -282,11 +295,11 @@ tcp_max_ssthresh - INTEGER
Default: 0 (off)
tcp_max_syn_backlog - INTEGER
- Maximal number of remembered connection requests, which are
- still did not receive an acknowledgment from connecting client.
- Default value is 1024 for systems with more than 128Mb of memory,
- and 128 for low memory machines. If server suffers of overload,
- try to increase this number.
+ Maximum number of remembered connection requests that have not yet
+ received an acknowledgment from the connecting client.
+ The minimum value is 128 for low-memory machines, and it
+ increases in proportion to the machine's memory.
+ If the server suffers from overload, try increasing this number.
tcp_max_tw_buckets - INTEGER
Maximal number of timewait sockets held by system simultaneously.
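The tcp_congestion_control note above says that passive connections inherit the listener's choice. A minimal sketch of how that could be observed from userspace on Linux follows, using "reno" because the text states it is always available. The helper names are hypothetical; IPPROTO_TCP has the same value as the SOL_TCP level named in the setsockopt() reference.

```python
import socket
from typing import Tuple


def congestion_control(sock: socket.socket) -> str:
    """Read the congestion control algorithm name of a TCP socket."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    return raw.split(b"\0", 1)[0].decode()


def accept_with_listener_cc(name: str = "reno") -> Tuple[str, str]:
    """Set the algorithm on a listener, accept one loopback connection,
    and return (listener algorithm, accepted socket algorithm)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                            name.encode())
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        with socket.create_connection(listener.getsockname()):
            accepted, _ = listener.accept()
            with accepted:
                return (congestion_control(listener),
                        congestion_control(accepted))


if __name__ == "__main__":
    print(accept_with_listener_cc("reno"))
```

If the inheritance described above is in effect, both values should name the same algorithm even when the system default differs.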
diff --git a/Documentation/networking/openvswitch.txt b/Documentation/networking/openvswitch.txt
new file mode 100644
index 00000000000..b8a048b8df3
--- /dev/null
+++ b/Documentation/networking/openvswitch.txt
@@ -0,0 +1,195 @@
+Open vSwitch datapath developer documentation
+The Open vSwitch kernel module allows flexible userspace control over
+flow-level packet processing on selected network devices. It can be
+used to implement a plain Ethernet switch, network device bonding,
+VLAN processing, network access control, flow-based network control,
+and so on.
+The kernel module implements multiple "datapaths" (analogous to
+bridges), each of which can have multiple "vports" (analogous to ports
+within a bridge). Each datapath also has associated with it a "flow
+table" that userspace populates with "flows" that map from keys based
+on packet headers and metadata to sets of actions. The most common
+action forwards the packet to another vport; other actions are also
+implemented.
+When a packet arrives on a vport, the kernel module processes it by
+extracting its flow key and looking it up in the flow table. If there
+is a matching flow, it executes the associated actions. If there is
+no match, it queues the packet to userspace for processing (as part of
+its processing, userspace will likely set up a flow to handle further
+packets of the same type entirely in-kernel).
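The lookup-or-upcall behavior described above can be sketched in a few lines. This is only an illustration of the control flow, not the kernel's data structures: the flow table is modelled as a plain dict, flow keys as hashable tuples, and all names are hypothetical.

```python
from typing import Dict, Hashable, List, Optional, Sequence


def process_packet(flow_table: Dict[Hashable, Sequence[str]],
                   flow_key: Hashable,
                   upcalls: List[Hashable]) -> Optional[Sequence[str]]:
    """Return the actions for a packet, or queue it for userspace on a miss."""
    actions = flow_table.get(flow_key)
    if actions is not None:
        return actions          # hit: execute the associated actions
    upcalls.append(flow_key)    # miss: hand the packet to userspace
    return None


if __name__ == "__main__":
    table = {("in_port", 1): ["output:2"]}
    pending: List[Hashable] = []
    print(process_packet(table, ("in_port", 1), pending))
    print(process_packet(table, ("in_port", 3), pending), pending)
```

After a miss, userspace would typically add an entry to the table so that later packets with the same key take the first branch.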
+Flow key compatibility
+Network protocols evolve over time. New protocols become important
+and existing protocols lose their prominence. For the Open vSwitch
+kernel module to remain relevant, it must be possible for newer
+versions to parse additional protocols as part of the flow key. It
+might even be desirable, someday, to drop support for parsing
+protocols that have become obsolete. Therefore, the Netlink interface
+to Open vSwitch is designed to allow carefully written userspace
+applications to work with any version of the flow key, past or future.
+To support this forward and backward compatibility, whenever the
+kernel module passes a packet to userspace, it also passes along the
+flow key that it parsed from the packet. Userspace then extracts its
+own notion of a flow key from the packet and compares it against the
+kernel-provided version:
+ - If userspace's notion of the flow key for the packet matches the
+ kernel's, then nothing special is necessary.
+ - If the kernel's flow key includes more fields than the userspace
+ version of the flow key, for example if the kernel decoded IPv6
+ headers but userspace stopped at the Ethernet type (because it
+ does not understand IPv6), then again nothing special is
+ necessary. Userspace can still set up a flow in the usual way,
+ as long as it uses the kernel-provided flow key to do it.
+ - If the userspace flow key includes more fields than the
+ kernel's, for example if userspace decoded an IPv6 header but
+ the kernel stopped at the Ethernet type, then userspace can
+ forward the packet manually, without setting up a flow in the
+ kernel. This case is bad for performance because every packet
+ that the kernel considers part of the flow must go to userspace,
+ but the forwarding behavior is correct. (If userspace can
+ determine that the values of the extra fields would not affect
+ forwarding behavior, then it could set up a flow anyway.)
+How flow keys evolve over time is important to making this work, so
+the following sections go into detail.
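The three cases above amount to a small decision procedure. The sketch below models each flow key as the set of attribute names it contains, which is a simplification, and uses hypothetical names throughout; it only decides whether userspace may install a kernel flow or must forward the packet itself.

```python
from enum import Enum
from typing import Iterable


class Strategy(Enum):
    INSTALL_FLOW = "install flow using the kernel-provided key"
    FORWARD_IN_USERSPACE = "forward manually, no kernel flow"


def upcall_strategy(kernel_attrs: Iterable[str],
                    user_attrs: Iterable[str]) -> Strategy:
    """Pick a strategy by comparing which attributes each side parsed."""
    kernel, user = set(kernel_attrs), set(user_attrs)
    if user <= kernel:
        # Keys match, or the kernel parsed more than userspace did.
        return Strategy.INSTALL_FLOW
    # Userspace parsed fields the kernel does not know about.
    return Strategy.FORWARD_IN_USERSPACE


if __name__ == "__main__":
    print(upcall_strategy({"eth", "eth_type", "ipv6"}, {"eth", "eth_type"}))
    print(upcall_strategy({"eth", "eth_type"}, {"eth", "eth_type", "ipv6"}))
```

The parenthetical exception in the last case (extra fields that do not affect forwarding) is deliberately left out of the sketch.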
+Flow key format
+A flow key is passed over a Netlink socket as a sequence of Netlink
+attributes. Some attributes represent packet metadata, defined as any
+information about a packet that cannot be extracted from the packet
+itself, e.g. the vport on which the packet was received. Most
+attributes, however, are extracted from headers within the packet,
+e.g. source and destination addresses from Ethernet, IP, or TCP
+headers.
+The <linux/openvswitch.h> header file defines the exact format of the
+flow key attributes. For informal explanatory purposes here, we write
+them as comma-separated strings, with parentheses indicating arguments
+and nesting. For example, the following could represent a flow key
+corresponding to a TCP packet that arrived on vport 1:
+ in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
+ eth_type(0x0800), ipv4(src=, dst=, proto=6, tos=0,
+ frag=no), tcp(src=49163, dst=80)
+Often we ellipsize arguments not important to the discussion, e.g.:
+ in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
+Basic rule for evolving flow keys
+Some care is needed to really maintain forward and backward
+compatibility for applications that follow the rules listed under
+"Flow key compatibility" above.
+The basic rule is obvious:
+ ------------------------------------------------------------------
+ New network protocol support must only supplement existing flow
+ key attributes. It must not change the meaning of already defined
+ flow key attributes.
+ ------------------------------------------------------------------
+This rule does have less-obvious consequences so it is worth working
+through a few examples. Suppose, for example, that the kernel module
+did not already implement VLAN parsing. Instead, it just interpreted
+the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
+packet. The flow key for any packet with an 802.1Q header would look
+essentially like this, ignoring metadata:
+ eth(...), eth_type(0x8100)
+Naively, to add VLAN support, it makes sense to add a new "vlan" flow
+key attribute to contain the VLAN tag, then continue to decode the
+encapsulated headers beyond the VLAN tag using the existing field
+definitions. With this change, a TCP packet in VLAN 10 would have a
+flow key much like this:
+ eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
+But this change would negatively affect a userspace application that
+has not been updated to understand the new "vlan" flow key attribute.
+The application could, following the flow compatibility rules above,
+ignore the "vlan" attribute that it does not understand and therefore
+assume that the flow contained IP packets. This is a bad assumption
+(the flow only contains IP packets if one parses and skips over the
+802.1Q header) and it could cause the application's behavior to change
+across kernel versions even though it follows the compatibility rules.
+The solution is to use a set of nested attributes. This is, for
+example, why 802.1Q support uses nested attributes. A TCP packet in
+VLAN 10 is actually expressed as:
+ eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
+ ip(proto=6, ...), tcp(...))
+Notice how the "eth_type", "ip", and "tcp" flow key attributes are
+nested inside the "encap" attribute. Thus, an application that does
+not understand the "vlan" key will not see any of those attributes
+and therefore will not misinterpret them. (Also, the outer eth_type
+is still 0x8100, not changed to 0x0800.)
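The difference between the naive and the nested layout can be illustrated with a toy model. Here a flow key is a list of (name, value) pairs, nested keys are nested lists, and an old application simply skips top-level attributes it does not know. The representation and all names are hypothetical; real keys are Netlink attributes.

```python
KNOWN_TO_OLD_APP = {"eth", "eth_type", "ip", "tcp"}

# Naive layout: "vlan" sits next to the encapsulated headers.
FLAT_KEY = [
    ("eth", "..."),
    ("vlan", (10, 0)),
    ("eth_type", 0x0800),
    ("ip", "proto=6"),
    ("tcp", "..."),
]

# Nested layout: encapsulated headers are hidden inside "encap".
NESTED_KEY = [
    ("eth", "..."),
    ("eth_type", 0x8100),
    ("vlan", (10, 0)),
    ("encap", [("eth_type", 0x0800), ("ip", "proto=6"), ("tcp", "...")]),
]


def understood(key, known):
    """Return the top-level attributes an application recognizes."""
    return [(name, value) for name, value in key if name in known]


if __name__ == "__main__":
    print(understood(FLAT_KEY, KNOWN_TO_OLD_APP))
    print(understood(NESTED_KEY, KNOWN_TO_OLD_APP))
```

With the flat key the old application believes it is looking at an untagged IP/TCP flow; with the nested key it sees only an Ethernet frame of type 0x8100, which is the safe interpretation.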
+Handling malformed packets
+Don't drop packets in the kernel for malformed protocol headers, bad
+checksums, etc. This would prevent userspace from implementing a
+simple Ethernet switch that forwards every packet.
+Instead, in such a case, include an attribute with "empty" content.
+It doesn't matter if the empty content could be valid protocol values,
+as long as those values are rarely seen in practice, because userspace
+can always have all packets with those values forwarded to it and
+handle them individually.
+For example, consider a packet that contains an IP header that
+indicates protocol 6 for TCP, but which is truncated just after the IP
+header, so that the TCP header is missing. The flow key for this
+packet would include a tcp attribute with all-zero src and dst, like
+this:
+ eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
+As another example, consider a packet with an Ethernet type of 0x8100,
+indicating that a VLAN TCI should follow, but which is truncated just
+after the Ethernet type. The flow key for this packet would include
+an all-zero-bits vlan and an empty encap attribute, like this:
+ eth(...), eth_type(0x8100), vlan(0), encap()
+Unlike a TCP packet with source and destination ports 0, an
+all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
+VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
+attribute expressly to allow this situation to be distinguished.
+Thus, the flow key in this second example unambiguously indicates a
+missing or malformed VLAN TCI.
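The role of the CFI bit can be made concrete with a little bit arithmetic. The sketch assumes the standard 802.1Q TCI layout (3-bit PCP, 1-bit CFI at bit 12, 12-bit VID); the function names are hypothetical and the value is a model of the vlan attribute, not the exact Netlink encoding.

```python
VLAN_VID_MASK = 0x0FFF
VLAN_CFI_BIT = 0x1000   # bit 12 of the TCI
VLAN_PCP_SHIFT = 13


def vlan_attr(vid: int, pcp: int = 0) -> int:
    """Build the value of a vlan attribute for a tag that is present."""
    if not 0 <= vid <= VLAN_VID_MASK or not 0 <= pcp <= 7:
        raise ValueError("vid or pcp out of range")
    return (pcp << VLAN_PCP_SHIFT) | VLAN_CFI_BIT | vid


def tci_missing(attr: int) -> bool:
    """True if the attribute marks a missing or malformed TCI."""
    return (attr & VLAN_CFI_BIT) == 0


if __name__ == "__main__":
    print(hex(vlan_attr(10)), tci_missing(vlan_attr(0)), tci_missing(0))
```

A genuine tag with VID 0 and PCP 0 therefore still yields a non-zero attribute, so it cannot be confused with the all-zero-bits placeholder.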
+Other rules
+The other rules for flow keys are much less subtle:
+ - Duplicate attributes are not allowed at a given nesting level.
+ - Ordering of attributes is not significant.
+ - When the kernel sends a given flow key to userspace, it always
+ composes it the same way. This allows userspace to hash and
+ compare entire flow keys that it may not be able to fully
+ interpret.
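The first two rules can be checked mechanically. The sketch below reuses a toy representation in which a flow key is a list of (name, value) pairs and nested keys are nested lists; it rejects duplicates within one nesting level and produces an order-independent form suitable for comparison. Names and representation are hypothetical.

```python
def canonical(key):
    """Return an order-independent, hashable form of a flow key.

    Raises ValueError if an attribute name repeats at one nesting level.
    The same name may still appear at different levels.
    """
    names = [name for name, _ in key]
    if len(names) != len(set(names)):
        raise ValueError("duplicate attribute at one nesting level")
    return frozenset(
        (name, canonical(value) if isinstance(value, list) else value)
        for name, value in key
    )


if __name__ == "__main__":
    a = [("eth_type", 0x8100), ("encap", [("eth_type", 0x0800)])]
    b = [("encap", [("eth_type", 0x0800)]), ("eth_type", 0x8100)]
    print(canonical(a) == canonical(b))
```

For keys received from the kernel this normalization should not be needed, since the third rule lets userspace compare or hash the raw attribute bytes directly.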
diff --git a/Documentation/networking/packet_mmap.txt b/Documentation/networking/packet_mmap.txt
index 4acea660372..1c08a4b0981 100644
--- a/Documentation/networking/packet_mmap.txt
+++ b/Documentation/networking/packet_mmap.txt
@@ -155,7 +155,7 @@ As capture, each frame contains two parts:
/* fill sockaddr_ll struct to prepare binding */
my_addr.sll_family = AF_PACKET;
- my_addr.sll_protocol = ETH_P_ALL;
+ my_addr.sll_protocol = htons(ETH_P_ALL);
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
/* bind socket to eth0 */
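The htons() fix matters because sll_protocol is expected in network byte order. A short Python sketch shows the effect on the bytes stored in memory; ETH_P_ALL is 0x0003 in <linux/if_ether.h>, and the function name is hypothetical.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # value from <linux/if_ether.h>


def sll_protocol_bytes(protocol: int = ETH_P_ALL) -> bytes:
    """Return the in-memory bytes of a 16-bit field set to htons(protocol)."""
    # "=H" stores the integer in native byte order, like a C assignment.
    return struct.pack("=H", socket.htons(protocol))


if __name__ == "__main__":
    print(sll_protocol_bytes().hex())
```

Without the conversion, a little-endian host would store the bytes in the opposite order and the bind would select a different protocol value.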
diff --git a/Documentation/networking/scaling.txt b/Documentation/networking/scaling.txt
index a177de21d28..579994afbe0 100644
--- a/Documentation/networking/scaling.txt
+++ b/Documentation/networking/scaling.txt
@@ -208,7 +208,7 @@ The counter in rps_dev_flow_table values records the length of the current
CPU's backlog when a packet in this flow was last enqueued. Each backlog
queue has a head counter that is incremented on dequeue. A tail counter
is computed as head counter + queue length. In other words, the counter
-in rps_dev_flow_table[i] records the last element in flow i that has
+in rps_dev_flow[i] records the last element in flow i that has
been enqueued onto the currently designated CPU for flow i (of course,
entry i is actually selected by hash and multiple flows may hash to the
same entry i).
@@ -224,7 +224,7 @@ following is true:
- The current CPU's queue head counter >= the recorded tail counter
value in rps_dev_flow[i]
-- The current CPU is unset (equal to NR_CPUS)
+- The current CPU is unset (equal to RPS_NO_CPU)
- The current CPU is offline
After this check, the packet is sent to the (possibly updated) current
@@ -235,7 +235,7 @@ CPU.
==== RFS Configuration
-RFS is only available if the kconfig symbol CONFIG_RFS is enabled (on
+RFS is only available if the kconfig symbol CONFIG_RPS is enabled (on
by default for SMP). The functionality remains disabled until explicitly
configured. The number of entries in the global flow table is set through:
@@ -258,7 +258,7 @@ For a single queue device, the rps_flow_cnt value for the single queue
would normally be configured to the same value as rps_sock_flow_entries.
For a multi-queue device, the rps_flow_cnt for each queue might be
configured as rps_sock_flow_entries / N, where N is the number of
-queues. So for instance, if rps_flow_entries is set to 32768 and there
+queues. So for instance, if rps_sock_flow_entries is set to 32768 and there
are 16 configured receive queues, rps_flow_cnt for each queue might be
configured as 2048.
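The sizing rule above is simple division, sketched here together with the files that scaling.txt names for these settings. The function name is hypothetical, and the sketch only computes values; it does not write anything.

```python
from typing import Dict


def rfs_settings(dev: str, num_queues: int,
                 sock_flow_entries: int) -> Dict[str, int]:
    """Map each RFS control file to the value suggested by the rule above."""
    per_queue = sock_flow_entries // num_queues
    settings = {"/proc/sys/net/core/rps_sock_flow_entries": sock_flow_entries}
    for queue in range(num_queues):
        path = f"/sys/class/net/{dev}/queues/rx-{queue}/rps_flow_cnt"
        settings[path] = per_queue
    return settings


if __name__ == "__main__":
    for path, value in rfs_settings("eth0", 16, 32768).items():
        print(value, path)
```

For the example in the text (32768 entries, 16 receive queues) every per-queue file gets 2048.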
diff --git a/Documentation/networking/stmmac.txt b/Documentation/networking/stmmac.txt
index 8d67980fabe..d0aeeadd264 100644
--- a/Documentation/networking/stmmac.txt
+++ b/Documentation/networking/stmmac.txt
@@ -4,14 +4,16 @@ Copyright (C) 2007-2010 STMicroelectronics Ltd
Author: Giuseppe Cavallaro <peppe.cavallaro@st.com>
This is the driver for the MAC 10/100/1000 on-chip Ethernet controllers
-(Synopsys IP blocks); it has been fully tested on STLinux platforms.
+(Synopsys IP blocks).
Currently this network device driver is for all STM embedded MAC/GMAC
-(i.e. 7xxx/5xxx SoCs) and it's known working on other platforms i.e. ARM SPEAr.
+(i.e. 7xxx/5xxx SoCs), SPEAr (arm), Loongson1B (mips) and XILINX XC2V3000
+FF1152AMT0221 D1215994A VIRTEX FPGA board.
-DWC Ether MAC 10/100/1000 Universal version 3.41a and DWC Ether MAC 10/100
-Universal version 4.0 have been used for developing the first code
+DWC Ether MAC 10/100/1000 Universal version 3.60a (and older) and DWC Ether MAC 10/100
+Universal version 4.0 have been used for developing this driver.
+This driver supports both the platform bus and PCI.
Please, for more information also visit: www.stlinux.com
@@ -277,5 +279,5 @@ In fact, these can generate an huge amount of debug messages.
6) TODO:
o XGMAC is not supported.
- o Review the timer optimisation code to use an embedded device that will be
- available in new chip generations.
+ o Add the EEE - Energy Efficient Ethernet
+ o Add the PTP - Precision Time Protocol
diff --git a/Documentation/networking/team.txt b/Documentation/networking/team.txt
new file mode 100644
index 00000000000..5a013686b9e
--- /dev/null
+++ b/Documentation/networking/team.txt
@@ -0,0 +1,2 @@
+Team devices are driven from userspace via the libteam library, which is available here:
+ https://github.com/jpirko/libteam