RFC 1180 [1]: TCP/IP Guide

Network Working Group                                      T. Socolofsky
Request for Comments:  1180                                      C. Kale
                                                  Spider Systems Limited
                                                            January 1991

A TCP/IP Tutorial

Status of this Memo

This RFC is a tutorial on the TCP/IP protocol suite, focusing
particularly on the steps in forwarding an IP datagram from source
host to destination host through a router. It does not specify an
Internet standard. Distribution of this memo is unlimited.

Table of Contents

[1](https://tools.ietf.org/html/rfc1180#section-1).  Introduction
[2](https://tools.ietf.org/html/rfc1180#section-2).  TCP/IP Overview
[3](https://tools.ietf.org/html/rfc1180#section-3).  Ethernet
[4](https://tools.ietf.org/html/rfc1180#section-4).  ARP
[5](https://tools.ietf.org/html/rfc1180#section-5).  Internet Protocol
[6](https://tools.ietf.org/html/rfc1180#section-6).  User Datagram Protocol
[7](https://tools.ietf.org/html/rfc1180#section-7).  Transmission Control Protocol
[8](https://tools.ietf.org/html/rfc1180#section-8).  Network Applications
[9](https://tools.ietf.org/html/rfc1180#section-9).  Other Information

10. References
11. Relation to other RFCs
12. Security Considerations
13. Authors' Addresses

1. Introduction

This tutorial contains only one view of the salient points of TCP/IP,
and therefore it is the "bare bones" of TCP/IP technology. It omits
the history of development and funding, the business case for its
use, and its future as compared to ISO OSI. Indeed, a great deal of
technical information is also omitted. What remains is a minimum of
information that must be understood by the professional working in a
TCP/IP environment. These professionals include the systems
administrator, the systems programmer, and the network manager.

This tutorial uses examples from the UNIX TCP/IP environment; however,
the main points apply across all implementations of TCP/IP.

Note that the purpose of this memo is explanation, not definition.
If any question arises about the correct specification of a protocol,
please refer to the actual standards defining RFC.

Socolofsky & Kale [Page 1]



RFC 1180 A TCP/IP Tutorial January 1991

The next section is an overview of TCP/IP, followed by detailed
descriptions of individual components.

2. TCP/IP Overview

The generic term "TCP/IP" usually means anything and everything
related to the specific protocols of TCP and IP. It can include
other protocols, applications, and even the network medium. Examples
of these protocols are UDP, ARP, and ICMP. Examples of these
applications are TELNET, FTP, and rcp. A more accurate term is
"internet technology". A network that uses internet technology is
called an "internet".

2.1 Basic Structure

To understand this technology you must first understand the following
logical structure:

                 ----------------------------
                 |    network applications  |
                 |                          |
                 |...  \ | /  ..  \ | /  ...|
                 |     -----      -----     |
                 |     |TCP|      |UDP|     |
                 |     -----      -----     |
                 |         \      /         |
                 |         --------         |
                 |         |  IP  |         |
                 |  -----  -*------         |
                 |  |ARP|   |               |
                 |  -----   |               |
                 |      \   |               |
                 |      ------              |
                 |      |ENET|              |
                 |      ---@--              |
                 ----------|-----------------
                           |
     ----------------------o---------
         Ethernet Cable

              Figure 1.  Basic TCP/IP Network Node

This is the logical structure of the layered protocols inside a
computer on an internet. Each computer that can communicate using
internet technology has such a logical structure. It is this logical
structure that determines the behavior of the computer on the
internet. The boxes represent processing of the data as it passes
through the computer, and the lines connecting boxes show the path of
data. The horizontal line at the bottom represents the Ethernet
cable; the "o" is the transceiver. The "*" is the IP address and the
"@" is the Ethernet address. Understanding this logical structure is
essential to understanding internet technology; it is referred to
throughout this tutorial.

2.2 Terminology

The name of a unit of data that flows through an internet is
dependent upon where it exists in the protocol stack. In summary: if
it is on an Ethernet it is called an Ethernet frame; if it is between
the Ethernet driver and the IP module it is called an IP packet; if it
is between the IP module and the UDP module it is called a UDP
datagram; if it is between the IP module and the TCP module it is
called a TCP segment (more generally, a transport message); and if it
is in a network application it is called an application message.

These definitions are imperfect. Actual definitions vary from one
publication to the next. More specific definitions can be found in
RFC 1122, section 1.3.3.

A driver is software that communicates directly with the network
interface hardware. A module is software that communicates with a
driver, with network applications, or with another module.

The terms driver, module, Ethernet frame, IP packet, UDP datagram,
TCP message, and application message are used where appropriate
throughout this tutorial.

2.3 Flow of Data

Let's follow the data as it flows down through the protocol stack
shown in Figure 1. For an application that uses TCP (Transmission
Control Protocol), data passes between the application and the TCP
module. For applications that use UDP (User Datagram Protocol), data
passes between the application and the UDP module. FTP (File
Transfer Protocol) is a typical application that uses TCP. Its
protocol stack in this example is FTP/TCP/IP/ENET. SNMP (Simple
Network Management Protocol) is an application that uses UDP. Its
protocol stack in this example is SNMP/UDP/IP/ENET.

The TCP module, UDP module, and the Ethernet driver are n-to-1
multiplexers. As multiplexers they switch many inputs to one output.
They are also 1-to-n de-multiplexers. As de-multiplexers they switch
one input to many outputs according to the type field in the protocol
header.

     1   2 3 ...   n                   1   2 3 ...   n
      \  | |      /    |                \  | |      /       ^
       \ | |     /     |                 \ | |     /        |
     -------------   flow              ----------------   flow
     |multiplexer|    of               |de-multiplexer|    of
     -------------   data              ----------------   data
          |            |                     |              |
          |            v                     |              |
          1                                  1

    Figure 2.  n-to-1 multiplexer and 1-to-n de-multiplexer

If an Ethernet frame comes up into the Ethernet driver off the
network, the packet can be passed upwards to either the ARP (Address
Resolution Protocol) module or to the IP (Internet Protocol) module.
The value of the type field in the Ethernet frame determines whether
the Ethernet frame is passed to the ARP or the IP module.

If an IP packet comes up into IP, the unit of data is passed upwards
to either TCP or UDP, as determined by the value of the protocol
field in the IP header.

If the UDP datagram comes up into UDP, the application message is
passed upwards to the network application based on the value of the
port field in the UDP header. If the TCP message comes up into TCP,
the application message is passed upwards to the network application
based on the value of the port field in the TCP header.
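
The three upward decisions just described can be sketched as table
look-ups. This is only an illustration and not part of the protocol
definitions: the function name demultiplex and the dictionary layout
are hypothetical. The numeric values are the standard assigned numbers
(Ethernet type 0x0800 for IP and 0x0806 for ARP, IP protocol 6 for TCP
and 17 for UDP, port 21 for FTP and port 161 for SNMP); they are not
given in this tutorial.

```python
# Hypothetical sketch of upward de-multiplexing; all names are illustrative.
ETHERNET_TYPES = {0x0800: "IP", 0x0806: "ARP"}        # Ethernet type field
IP_PROTOCOLS = {6: "TCP", 17: "UDP"}                  # IP protocol field
PORTS = {("TCP", 21): "FTP", ("UDP", 161): "SNMP"}    # port field


def demultiplex(ether_type, ip_protocol=None, port=None):
    """Return the list of modules a unit of data visits on its way up."""
    path = ["ENET"]
    module = ETHERNET_TYPES.get(ether_type)
    if module is None:
        return path                  # unknown type: nothing above the driver
    path.append(module)
    if module != "IP":
        return path                  # ARP packets stop at the ARP module
    transport = IP_PROTOCOLS.get(ip_protocol)
    if transport is None:
        return path
    path.append(transport)
    application = PORTS.get((transport, port))
    if application is not None:
        path.append(application)
    return path
```

Calling demultiplex(0x0800, 6, 21) follows the FTP/TCP/IP/ENET stack
from the bottom up, and demultiplex(0x0806) stops at the ARP module.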

The downwards multiplexing is simple to perform because from each
starting point there is only the one downward path; each protocol
module adds its header information so the packet can be de-
multiplexed at the destination computer.
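
The way headers accumulate on the way down and are removed on the way
up can be sketched as follows. This is a toy model under stated
assumptions: the headers are short text tags rather than real wire
formats, and the names STACK_DOWN, send_down, and climb_up are
hypothetical.

```python
# Hypothetical sketch: headers are text tags, not real protocol headers.
STACK_DOWN = ["TCP", "IP", "ENET"]     # order in which headers are added


def send_down(message: bytes) -> bytes:
    """Each module prepends its own header as the message goes down."""
    for layer in STACK_DOWN:
        message = f"[{layer}]".encode() + message
    return message


def climb_up(frame: bytes) -> bytes:
    """Each module strips its own header as the message climbs up."""
    for layer in reversed(STACK_DOWN):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame
```

The outermost header is the one added last, so the Ethernet driver at
the destination sees its header first.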

Data passing out from the applications through either TCP or UDP
converges on the IP module and is sent downwards through the lower
network interface driver.

Although internet technology supports many different network media,
Ethernet is used for all examples in this tutorial because it is the
most common physical network used under IP. The computer in Figure 1
has a single Ethernet connection. The 6-byte Ethernet address is
unique for each interface on an Ethernet and is located at the lower
interface of the Ethernet driver.

The computer also has a 4-byte IP address. This address is located
at the lower interface to the IP module. The IP address must be
unique for an internet.

A running computer always knows its own IP address and Ethernet
address.

2.4 Two Network Interfaces

A computer connected to 2 separate Ethernets is structured as shown in
Figure 3.

            ----------------------------
            |    network applications  |
            |                          |
            |...  \ | /  ..  \ | /  ...|
            |     -----      -----     |
            |     |TCP|      |UDP|     |
            |     -----      -----     |
            |         \      /         |
            |         --------         |
            |         |  IP  |         |
            |  -----  -*----*-  -----  |
            |  |ARP|   |    |   |ARP|  |
            |  -----   |    |   -----  |
            |      \   |    |   /      |
            |      ------  ------      |
            |      |ENET|  |ENET|      |
            |      ---@--  ---@--      |
            ----------|-------|---------
                      |       |
                      |    ---o---------------------------
                      |             Ethernet Cable 2
       ---------------o----------
         Ethernet Cable 1

         Figure 3.  TCP/IP Network Node on 2 Ethernets

Please note that this computer has 2 Ethernet addresses and 2 IP
addresses.

It is seen from this structure that for computers with more than one
physical network interface, the IP module is both an n-to-m
multiplexer and an m-to-n de-multiplexer.

     1   2 3 ...   n                   1   2 3 ...   n
      \  | |      /    |                \  | |      /       ^
       \ | |     /     |                 \ | |     /        |
     -------------   flow              ----------------   flow
     |multiplexer|    of               |de-multiplexer|    of
     -------------   data              ----------------   data
       / | |     \     |                 / | |     \        |
      /  | |      \    v                /  | |      \       |
     1   2 3 ...   m                   1   2 3 ...   m

    Figure 4.  n-to-m multiplexer and m-to-n de-multiplexer

It performs this multiplexing in either direction to accommodate
incoming and outgoing data. An IP module with more than 1 network
interface is more complex than our original example in that it can
forward data onto the next network. Data can arrive on any network
interface and be sent out on any other.

                       TCP      UDP
                         \      /
                          \    /
                      --------------
                      |     IP     |
                      |            |
                      |    ---     |
                      |   /   \    |
                      |  /     v   |
                      --------------
                       /         \
                      /           \
                   data           data
                  comes in         goes out
                 here               here

        Figure 5.  Example of IP Forwarding an IP Packet

The process of sending an IP packet out onto another network is
called "forwarding" an IP packet. A computer that has been dedicated
to the task of forwarding IP packets is called an "IP-router".

As you can see from the figure, the forwarded IP packet never touches
the TCP and UDP modules on the IP-router. Some IP-router
implementations do not have a TCP or UDP module.

2.5 IP Creates a Single Logical Network

The IP module is central to the success of internet technology. Each
module or driver adds its header to the message as the message passes
down through the protocol stack. Each module or driver strips the
corresponding header from the message as the message climbs the
protocol stack up towards the application. The IP header contains
the IP address, which builds a single logical network from multiple
physical networks. This interconnection of physical networks is the
source of the name: internet. A set of interconnected physical
networks that limit the range of an IP packet is called an
"internet".

2.6 Physical Network Independence

IP hides the underlying network hardware from the network
applications. If you invent a new physical network, you can put it
into service by implementing a new driver that connects to the
internet underneath IP. Thus, the network applications remain intact
and are not vulnerable to changes in hardware technology.

2.7 Interoperability

If two computers on an internet can communicate, they are said to
"interoperate"; if an implementation of internet technology is good,
it is said to have "interoperability". Users of general-purpose
computers benefit from the installation of an internet because of the
interoperability in computers on the market. Generally, when you buy
a computer, it will interoperate. If the computer does not have
interoperability, and interoperability cannot be added, it occupies
a rare and special niche in the market.

2.8 After the Overview

With the background set, we will answer the following questions:

When sending out an IP packet, how is the destination Ethernet
address determined?

How does IP know which of multiple lower network interfaces to use
when sending out an IP packet?

How does a client on one computer reach the server on another?

Why do both TCP and UDP exist, instead of just one or the other?

What network applications are available?

These will be explained, in turn, after an Ethernet refresher.

3. Ethernet

This section is a short review of Ethernet technology.

An Ethernet frame contains the destination address, source address,
type field, and data.

An Ethernet address is 6 bytes. Every device has its own Ethernet
address and listens for Ethernet frames with that destination
address. All devices also listen for Ethernet frames with a wild-
card destination address of "FF-FF-FF-FF-FF-FF" (in hexadecimal),
called a "broadcast" address.
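
As a rough illustration, the frame fields and the listening rule can
be expressed in a few lines. The 6-byte destination, 6-byte source,
and 2-byte type layout used here is the common Ethernet II header
layout, which this tutorial does not spell out; the preamble and
checksum are ignored, and the names parse_frame and accepts are
hypothetical.

```python
import struct

BROADCAST = b"\xff" * 6          # FF-FF-FF-FF-FF-FF


def parse_frame(frame: bytes):
    """Split a frame into destination, source, type field, and data."""
    destination, source, ether_type = struct.unpack("!6s6sH", frame[:14])
    return destination, source, ether_type, frame[14:]


def accepts(own_address: bytes, frame: bytes) -> bool:
    """A device listens for its own address and for the broadcast address."""
    destination = parse_frame(frame)[0]
    return destination in (own_address, BROADCAST)
```

A device whose address is not the destination simply ignores the
frame, unless the destination is the broadcast address.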

Ethernet uses CSMA/CD (Carrier Sense Multiple Access with
Collision Detection). CSMA/CD means that all devices communicate on
a single medium, that only one can transmit at a time, and that they
can all receive simultaneously. If 2 devices try to transmit at the
same instant, the transmit collision is detected, and both devices
wait a random (but short) period before trying to transmit again.

3.1 A Human Analogy

A good analogy of Ethernet technology is a group of people talking in
a small, completely dark room. In this analogy, the physical network
medium is sound waves on air in the room instead of electrical
signals on a coaxial cable.

Each person can hear the words when another is talking (Carrier
Sense). Everyone in the room has equal capability to talk (Multiple
Access), but none of them give lengthy speeches because they are
polite. If a person is impolite, he is asked to leave the room
(i.e., thrown off the net).

No one talks while another is speaking. But if two people start
speaking at the same instant, each of them knows this because each
hears something they haven't said (Collision Detection). When these
two people notice this condition, they wait for a moment, then one
begins talking. The other hears the talking and waits for the first
to finish before beginning his own speech.

Each person has a unique name (unique Ethernet address) to avoid
confusion. Every time one of them talks, he prefaces the message
with the name of the person he is talking to and with his own name
(Ethernet destination and source address, respectively), e.g., "Hello
Jane, this is Jack, ..blah blah blah...". If the sender wants to
talk to everyone he might say "everyone" (broadcast address), e.g.,
"Hello Everyone, this is Jack, ..blah blah blah...".

4. ARP

When sending out an IP packet, how is the destination Ethernet
address determined?

ARP (Address Resolution Protocol) is used to translate IP addresses
to Ethernet addresses. The translation is done only for outgoing IP
packets, because this is when the IP header and the Ethernet header
are created.

4.1 ARP Table for Address Translation

The translation is performed with a table look-up. The table, called
the ARP table, is stored in memory and contains a row for each
computer. There is a column for IP address and a column for Ethernet
address. When translating an IP address to an Ethernet address, the
table is searched for a matching IP address. The following is a
simplified ARP table:

              ------------------------------------
              |IP address       Ethernet address |
              ------------------------------------
              |223.1.2.1        08-00-39-00-2F-C3|
              |223.1.2.3        08-00-5A-21-A7-22|
              |223.1.2.4        08-00-10-99-AC-54|
              ------------------------------------
                  TABLE 1.  Example ARP Table

By convention, a 4-byte IP address is written with each byte in
decimal and the bytes separated by periods. A 6-byte Ethernet address
is written with each byte in hexadecimal and the bytes separated by
either minus signs or colons.
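
The two notations can be sketched as small formatting helpers. The
function names format_ip and format_ethernet are hypothetical and are
used only for this illustration.

```python
def format_ip(address: bytes) -> str:
    """Write a 4-byte IP address with each byte in decimal."""
    if len(address) != 4:
        raise ValueError("an IP address is 4 bytes")
    return ".".join(str(byte) for byte in address)


def format_ethernet(address: bytes, separator: str = "-") -> str:
    """Write a 6-byte Ethernet address with each byte in hexadecimal."""
    if len(address) != 6:
        raise ValueError("an Ethernet address is 6 bytes")
    return separator.join(f"{byte:02X}" for byte in address)
```

For the first row of Table 1, format_ip(bytes([223, 1, 2, 1])) gives
the IP address column and format_ethernet(bytes.fromhex("080039002FC3"))
gives the Ethernet address column.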

The ARP table is necessary because the IP address and Ethernet
address are selected independently; you cannot use an algorithm to
translate IP address to Ethernet address. The IP address is selected
by the network manager based on the location of the computer on the
internet. When the computer is moved to a different part of an
internet, its IP address must be changed. The Ethernet address is
selected by the manufacturer based on the Ethernet address space
licensed by the manufacturer. When the Ethernet hardware interface
board changes, the Ethernet address changes.

4.2 Typical Translation Scenario

During normal operation a network application, such as TELNET, sends
an application message to TCP, then TCP sends the corresponding TCP
message to the IP module. The destination IP address is known by the
application, the TCP module, and the IP module. At this point the IP
packet has been constructed and is ready to be given to the Ethernet
driver, but first the destination Ethernet address must be
determined.

The ARP table is used to look up the destination Ethernet address.
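
A minimal sketch of this look-up, assuming the table is held as a
mapping keyed by IP address. The contents are those of Table 1; the
names arp_table and translate are hypothetical.

```python
# Contents of Table 1, keyed by IP address.
arp_table = {
    "223.1.2.1": "08-00-39-00-2F-C3",
    "223.1.2.3": "08-00-5A-21-A7-22",
    "223.1.2.4": "08-00-10-99-AC-54",
}


def translate(ip_address):
    """Return the Ethernet address for ip_address, or None if unknown."""
    return arp_table.get(ip_address)
```

A result of None corresponds to the case where the table cannot be
used to translate the address.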

4.3 ARP Request/Response Pair

But how does the ARP table get filled in the first place? The answer
is that it is filled automatically by ARP on an "as-needed" basis.

Two things happen when the ARP table cannot be used to translate an
address:

 1. An ARP request packet with a broadcast Ethernet address is sent
    out on the network to every computer.

 2. The outgoing IP packet is queued.

Every computer's Ethernet interface receives the broadcast Ethernet
frame. Each Ethernet driver examines the Type field in the Ethernet
frame and passes the ARP packet to the ARP module. The ARP request
packet says "If your IP address matches this target IP address, then
please tell me your Ethernet address". An ARP request packet looks
something like this:

            ---------------------------------------
            |Sender IP Address   223.1.2.1        |
            |Sender Enet Address 08-00-39-00-2F-C3|
            ---------------------------------------
            |Target IP Address   223.1.2.2        |
            |Target Enet Address (blank)          |
            ---------------------------------------
                 TABLE 2.  Example ARP Request

Each ARP module examines the IP address and if the Target IP address
matches its own IP address, it sends a response directly to the
source Ethernet address. The ARP response packet says "Yes, that
target IP address is mine, let me give you my Ethernet address". An
ARP response packet has the sender/target field contents swapped as
compared to the request. It looks something like this:

            ---------------------------------------
            |Sender IP Address   223.1.2.2        |
            |Sender Enet Address 08-00-28-00-38-A9|
            ---------------------------------------
            |Target IP Address   223.1.2.1        |
            |Target Enet Address 08-00-39-00-2F-C3|
            ---------------------------------------
                 TABLE 3.  Example ARP Response

The response is received by the original sender computer. The
Ethernet driver looks at the Type field in the Ethernet frame, then
passes the ARP packet to the ARP module. The ARP module examines the
ARP packet and adds the sender's IP and Ethernet addresses to its ARP
table.

The updated table now looks like this:

               ----------------------------------
               |IP address     Ethernet address |
               ----------------------------------
               |223.1.2.1      08-00-39-00-2F-C3|
               |223.1.2.2      08-00-28-00-38-A9|
               |223.1.2.3      08-00-5A-21-A7-22|
               |223.1.2.4      08-00-10-99-AC-54|
               ----------------------------------
               TABLE 4.  ARP Table after Response
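
The request/response exchange can be sketched with plain dictionaries.
This is a simplified model, not the ARP packet format: the field names
and the functions make_response and learn are hypothetical.

```python
def make_response(request, my_ip, my_enet):
    """Answer an ARP request if its target IP address is ours."""
    if request["target_ip"] != my_ip:
        return None                  # not our IP address: stay silent
    # The sender/target fields are swapped relative to the request.
    return {
        "sender_ip": my_ip,
        "sender_enet": my_enet,
        "target_ip": request["sender_ip"],
        "target_enet": request["sender_enet"],
    }


def learn(arp_table, response):
    """Add the responder's IP and Ethernet addresses to the ARP table."""
    arp_table[response["sender_ip"]] = response["sender_enet"]
```

Feeding the request of Table 2 to the computer at 223.1.2.2 produces
the response of Table 3, and learn() then adds the new row shown in
Table 4.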

4.4 Scenario Continued

The new translation has now been installed automatically in the
table, just milliseconds after it was needed. As you remember from
step 2 above, the outgoing IP packet was queued. Next, the IP
address to Ethernet address translation is performed by look-up in
the ARP table, then the Ethernet frame is transmitted on the Ethernet.
Therefore, with the new steps 3, 4, and 5, the scenario for the
sender computer is:

 1. An ARP request packet with a broadcast Ethernet address is sent
    out on the network to every computer.

 2. The outgoing IP packet is queued.

 3. The ARP response arrives with the IP-to-Ethernet address
    translation for the ARP table.

 4. For the queued IP packet, the ARP table is used to translate the
    IP address to the Ethernet address.

 5. The Ethernet frame is transmitted on the Ethernet.

In summary, when the translation is missing from the ARP table, one
IP packet is queued. The translation data is quickly filled in with
ARP request/response and the queued IP packet is transmitted.
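
The five steps can be modelled in a short sketch. It is an assumption-
laden toy, not an implementation of ARP: the class Sender and its
attributes are hypothetical, packets are plain strings, and the queue
is allowed to hold more than one packet for simplicity.

```python
class Sender:
    """Hypothetical model of the sender's ARP table and packet queue."""

    def __init__(self):
        self.arp_table = {}
        self.queue = []        # IP packets waiting for a translation
        self.requests = []     # target IP addresses of ARP requests sent
        self.sent = []         # (destination Ethernet address, packet)

    def send_ip_packet(self, dest_ip, packet):
        enet = self.arp_table.get(dest_ip)
        if enet is None:
            self.requests.append(dest_ip)           # step 1: ARP request
            self.queue.append((dest_ip, packet))    # step 2: queue packet
        else:
            self.sent.append((enet, packet))

    def receive_arp_response(self, sender_ip, sender_enet):
        self.arp_table[sender_ip] = sender_enet     # step 3: fill table
        waiting, self.queue = self.queue, []
        for dest_ip, packet in waiting:
            enet = self.arp_table.get(dest_ip)      # step 4: translate
            if enet is None:
                self.queue.append((dest_ip, packet))
            else:
                self.sent.append((enet, packet))    # step 5: transmit
```

Once the table has been filled, later packets to the same IP address
are transmitted immediately without a new ARP request.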

Each computer has a separate ARP table for each of its Ethernet
interfaces. If the target computer does not exist, there will be no
ARP response and no entry in the ARP table. IP will discard outgoing
IP packets sent to that address. The upper layer protocols can't
tell the difference between a broken Ethernet and the absence of a
computer with the target IP address.

Some implementations of IP and ARP don't queue the IP packet while
waiting for the ARP response. Instead the IP packet is discarded and
the recovery from the IP packet loss is left to the TCP module or the
UDP network application. This recovery is performed by time-out and
retransmission. The retransmitted message is successfully sent out
onto the network because the first copy of the message has already
caused the ARP table to be filled.

5. Internet Protocol

The IP module is central to internet technology and the essence of IP
is its route table. IP uses this in-memory table to make all
decisions about routing an IP packet. The content of the route table
is defined by the network administrator. Mistakes block
communication.

To understand how a route table is used is to understand
internetworking. This understanding is necessary for the successful
administration and maintenance of an IP network.

The route table is best understood by first having an overview of
routing, then learning about IP network addresses, and then looking
at the details.

5.1 Direct Routing

The figure below is of a tiny internet with 3 computers: A, B, and C.
Each computer has the same TCP/IP protocol stack as in Figure 1.
Each computer's Ethernet interface has its own Ethernet address.
Each computer has an IP address assigned to the IP interface by the
network manager, who also has assigned an IP network number to the
Ethernet.

                      A      B      C
                      |      |      |
                    --o------o------o--
                    Ethernet 1
                    IP network "development"

                   Figure 6.  One IP Network

When A sends an IP packet to B, the IP header contains A's IP address
as the source IP address, and the Ethernet header contains A's
Ethernet address as the source Ethernet address. Also, the IP header
contains B's IP address as the destination IP address and the
Ethernet header contains B's Ethernet address as the destination
Ethernet address.

            ----------------------------------------
            |address            source  destination|
            ----------------------------------------
            |IP header          A       B          |
            |Ethernet header    A       B          |
            ----------------------------------------
   TABLE 5.  Addresses in an Ethernet frame for an IP packet
                          from A to B

For this simple case, IP is overhead because IP adds little to
the service offered by Ethernet. However, IP does add cost: the
extra CPU processing and network bandwidth to generate, transmit, and
parse the IP header.

When B's IP module receives the IP packet from A, it checks the
destination IP address against its own, looking for a match, then it
passes the datagram to the upper-level protocol.

This communication between A and B uses direct routing.

5.2 Indirect Routing

The figure below is a more realistic view of an internet. It is
composed of 3 Ethernets and 3 IP networks connected by an IP-router
called computer D. Each IP network has 4 computers; each computer
has its own IP address and Ethernet address.

      A      B      C      ----D----      E      F      G
      |      |      |      |   |   |      |      |      |
    --o------o------o------o-  |  -o------o------o------o--
    Ethernet 1                 |  Ethernet 2
    IP network "development"   |  IP network "accounting"
                               |
                               |
                               |     H      I      J
                               |     |      |      |
                             --o-----o------o------o--
                              Ethernet 3
                              IP network "factory"

           Figure 7.  Three IP Networks; One internet

Except for computer D, each computer has a TCP/IP protocol stack like
that in Figure 1. Computer D is the IP-router; it is connected to
all 3 networks and therefore has 3 IP addresses and 3 Ethernet
addresses. Computer D has a TCP/IP protocol stack similar to that in
Figure 3, except that it has 3 ARP modules and 3 Ethernet drivers
instead of 2. Please note that computer D has only one IP module.

The network manager has assigned a unique number, called an IP
network number, to each of the Ethernets. The IP network numbers are
not shown in this diagram, just the network names.

When computer A sends an IP packet to computer B, the process is
identical to the single network example above. Any communication
between computers located on a single IP network matches the direct
routing example discussed previously.

When computer D and A communicate, it is direct communication. When
computer D and E communicate, it is direct communication. When
computer D and H communicate, it is direct communication. This is
because each of these pairs of computers is on the same IP network.

However, when computer A communicates with a computer on the far side
of the IP-router, communication is no longer direct. A must use D to
forward the IP packet to the next IP network. This communication is
called "indirect".

This routing of IP packets is done by IP modules and happens
transparently to TCP, UDP, and the network applications.

If A sends an IP packet to E, the source IP address and the source
Ethernet address are A's. The destination IP address is E's, but
because A's IP module sends the IP packet to D for forwarding, the
destination Ethernet address is D's.

            ----------------------------------------
            |address            source  destination|
            ----------------------------------------
            |IP header          A       E          |
            |Ethernet header    A       D          |
            ----------------------------------------
   TABLE 6.  Addresses in an Ethernet frame for an IP packet
                     from A to E (before D)

D's IP module receives the IP packet and upon examining the
destination IP address, says "This is not my IP address," and sends
the IP packet directly to E.

            ----------------------------------------
            |address            source  destination|
            ----------------------------------------
            |IP header          A       E          |
            |Ethernet header    D       E          |
            ----------------------------------------
   TABLE 7.  Addresses in an Ethernet frame for an IP packet
                     from A to E (after D)

In summary, for direct communication, both the source IP address and
the source Ethernet address are the sender's, and the destination IP
address and the destination Ethernet address are the recipient's. For
indirect communication, the IP addresses and Ethernet addresses do not
pair up in this way.
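
The address pairings of Tables 5, 6, and 7 can be reproduced with a
small sketch. Computers are represented by their letters, and the
function names frame_addresses and forwarded_addresses are
hypothetical.

```python
def frame_addresses(source, destination, router=None):
    """Return (IP header, Ethernet header) as (source, destination) pairs.

    router is None for direct routing; otherwise it names the IP-router
    to which the sender hands the IP packet.
    """
    ip_header = (source, destination)
    next_hop = destination if router is None else router
    ethernet_header = (source, next_hop)
    return ip_header, ethernet_header


def forwarded_addresses(ip_header, router):
    """Addresses after the router sends the IP packet to its destination."""
    return ip_header, (router, ip_header[1])
```

The IP header is the same before and after the router; only the
Ethernet header changes from one Ethernet to the next.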

This example internet is a very simple one. Real networks are often
complicated by many factors, resulting in multiple IP-routers and
several types of physical networks. This example internet might have
come about because the network manager wanted to split a large
Ethernet in order to localize Ethernet broadcast traffic.

5.3 IP Module Routing Rules

This overview of routing has shown what happens, but not how it
happens. Now let's examine the rules, or algorithm, used by the IP
module.

 For an outgoing IP packet, entering IP from an upper layer, IP must
 decide whether to send the IP packet directly or indirectly, and IP
 must choose a lower network interface.  These choices are made by
 consulting the route table.

 For an incoming IP packet, entering IP from a lower interface, IP
 must decide whether to forward the IP packet or pass it to an upper
 layer.  If the IP packet is being forwarded, it is treated as an
 outgoing IP packet.

 When an incoming IP packet arrives it is never forwarded back out
 through the same network interface.

These decisions are made before the IP packet is handed to the lower
interface and before the ARP table is consulted.
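The rules for an incoming IP packet might be sketched as follows. The function handle_incoming and the lookup callback are hypothetical names chosen for this example. The addresses are those that delta has in the examples later in this section. Discarding a packet that would leave through the interface it arrived on is one reading of the third rule, not something the text spells out.

```python
def handle_incoming(dest_ip, arrived_on, my_addresses, choose_interface):
    """Decide what the IP module does with one incoming IP packet.

    choose_interface is a stand-in for the route table lookup: it takes
    a destination IP address and returns an interface number, or None
    when no route matches.
    """
    if dest_ip in my_addresses:
        return ("deliver", None)           # pass to the upper layer
    out = choose_interface(dest_ip)        # forwarded: treat as outgoing
    if out is None or out == arrived_on:   # no route, or same interface
        return ("discard", None)
    return ("forward", out)


# A router with three interfaces, one per class C network (assumed values).
DELTA_ADDRESSES = {"223.1.2.4", "223.1.3.1", "223.1.4.1"}
DELTA_INTERFACES = {"223.1.2": 1, "223.1.3": 3, "223.1.4": 2}


def delta_lookup(dest_ip):
    network = dest_ip.rsplit(".", 1)[0]    # class C: first three bytes
    return DELTA_INTERFACES.get(network)


print(handle_incoming("223.1.3.2", 1, DELTA_ADDRESSES, delta_lookup))
print(handle_incoming("223.1.2.4", 1, DELTA_ADDRESSES, delta_lookup))
print(handle_incoming("223.1.2.2", 1, DELTA_ADDRESSES, delta_lookup))
```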

5.4 IP Address

The network manager assigns IP addresses to computers according to
the IP network to which the computer is attached. One part of a 4-
byte IP address is the IP network number, the other part is the IP
computer number (or host number). For the computer in table 1, with
an IP address of 223.1.2.1, the network number is 223.1.2 and the
host number is number 1.

Which portion of the address is the network number and which is the
host number is defined by the upper bits in the 4-byte address. All
example IP addresses in this tutorial are of type class C, meaning
that the upper 3 bits indicate that 21 bits are the network number
and 8 bits are the host number. This allows 2,097,152 class C
networks with up to 254 hosts on each network.
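The arithmetic can be checked with a short Python sketch. The function name split_class_c is an assumption made for this example. The figure of 254 hosts follows from the usual convention, not stated above, that host numbers of all zeros and all ones are reserved.

```python
import ipaddress


def split_class_c(address):
    """Split a class C address into (network number, host number)."""
    value = int(ipaddress.IPv4Address(address))
    if value >> 29 != 0b110:               # upper 3 bits must be 110
        raise ValueError(f"{address} is not a class C address")
    network = value >> 8                   # 110 prefix plus 21 network bits
    host = value & 0xFF                    # low 8 bits
    network_text = ".".join(str((network >> shift) & 0xFF)
                            for shift in (16, 8, 0))
    return network_text, host


CLASS_C_NETWORKS = 2 ** 21                 # 21 bits of network number
HOSTS_PER_NETWORK = 2 ** 8 - 2             # all-zeros and all-ones reserved

print(split_class_c("223.1.2.1"))
print(CLASS_C_NETWORKS, HOSTS_PER_NETWORK)
```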

The IP address space is administered by the NIC (Network Information
Center). All internets that are connected to the single world-wide
Internet must use network numbers assigned by the NIC. If you are
setting up your own internet and you are not intending to connect it
to the Internet, you should still obtain your network numbers from
the NIC. If you pick your own number, you run the risk of confusion
and chaos in the eventuality that your internet is connected to
another internet.

5.5 Names

People refer to computers by names, not numbers. A computer called
alpha might have the IP address of 223.1.2.1. For small networks,
this name-to-address translation data is often kept on each computer
in the "hosts" file. For larger networks, this translation data file
is stored on a server and accessed across the network when needed. A
few lines from that file might look like this:

223.1.2.1 alpha
223.1.2.2 beta
223.1.2.3 gamma
223.1.2.4 delta
223.1.3.2 epsilon
223.1.4.2 iota

The IP address is the first column and the computer name is the
second column.

In most cases, you can install identical "hosts" files on all
computers. You may notice that "delta" has only one entry in this
file even though it has 3 IP addresses. Delta can be reached with
any of its IP addresses; it does not matter which one is used. When
delta receives an IP packet and looks at the destination address, it
will recognize any of its own IP addresses.

IP networks are also given names. If you have 3 IP networks, your
"networks" file for documenting these names might look something like
this:

223.1.2 development
223.1.3 accounting
223.1.4 factory

The IP network number is in the first column and its name is in the
second column.

From this example you can see that alpha is computer number 1 on the
development network, beta is computer number 2 on the development
network and so on. You might also say that alpha is development.1,
beta is development.2, and so on.

The above hosts file is adequate for the users, but the network
manager will probably replace the line for delta with:

223.1.2.4 devnetrouter delta
223.1.3.1 accnetrouter
223.1.4.1 facnetrouter

These three new lines for the hosts file give each of delta's IP
addresses a meaningful name. In fact, the first IP address listed
has 2 names; "delta" and "devnetrouter" are synonyms. In practice
"delta" is the general-purpose name of the computer and the other 3
names are only used when administering the IP route table.

These files are used by network administration commands and network
applications to provide meaningful names. They are not required for
operation of an internet, but they do make it easier for us.
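A name lookup over this file format can be sketched in Python. The function parse_hosts is a hypothetical helper, not a system call, and the handling of "#" comments is an assumption based on the common UNIX hosts file convention.

```python
def parse_hosts(text):
    """Build a name -> IP address table from hosts-file text.

    Each line holds an IP address followed by one or more names;
    every name on the line is a synonym for that address.
    """
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop comments, split columns
        if len(fields) < 2:
            continue                             # blank or malformed line
        address, names = fields[0], fields[1:]
        for name in names:
            table[name] = address
    return table


HOSTS = """\
223.1.2.1 alpha
223.1.2.2 beta
223.1.2.4 devnetrouter delta
"""

lookup = parse_hosts(HOSTS)
print(lookup["alpha"], lookup["delta"], lookup["devnetrouter"])
```

Because "delta" and "devnetrouter" share a line, both names resolve to the same address.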

5.6 IP Route Table

How does IP know which lower network interface to use when sending
out an IP packet? IP looks it up in the route table using a search
key of the IP network number extracted from the IP destination
address.

The route table contains one row for each route. The primary columns
in the route table are: IP network number, direct/indirect flag,
router IP address, and interface number. This table is referred to
by IP for each outgoing IP packet.

On most computers the route table can be modified with the "route"
command. The content of the route table is defined by the network
manager, because the network manager assigns the IP addresses to the
computers.

5.7 Direct Routing Details

To explain how the route table is used, let us revisit in detail the
routing situations we have reviewed previously.

                    ---------         ---------
                    | alpha |         | beta  |
                    |    1  |         |  1    |
                    ---------         ---------
                         |               |
                 --------o---------------o-
                  Ethernet 1
                  IP network "development"

           Figure 8.  Close-up View of One IP Network

The route table inside alpha looks like this:

 --------------------------------------------------------------
 |network      direct/indirect flag  router   interface number|
 --------------------------------------------------------------
 |development  direct                (blank)  1               |
 --------------------------------------------------------------
              TABLE 8.  Example Simple Route Table

This view can be seen on some UNIX systems with the "netstat -r"
command. With this simple network, all computers have identical
routing tables.

For discussion, the table is printed again without the network number
translated to its network name.

 --------------------------------------------------------------
 |network      direct/indirect flag  router   interface number|
 --------------------------------------------------------------
 |223.1.2      direct                (blank)  1               |
 --------------------------------------------------------------
       TABLE 9.  Example Simple Route Table with Numbers

5.8 Direct Scenario

Alpha is sending an IP packet to beta. The IP packet is in alpha's
IP module and the destination IP address is beta or 223.1.2.2. IP
extracts the network portion of this IP address and scans the first
column of the table looking for a match. With this network a match
is found on the first entry.

The other information in this entry indicates that computers on this
network can be reached directly through interface number 1. An ARP
table translation is done on beta's IP address, and then the Ethernet
frame is sent directly to beta via interface number 1.

If an application tries to send data to an IP address that is not on
the development network, IP will be unable to find a match in the
route table. IP then discards the IP packet. Some computers provide
a "Network not reachable" error message.
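The lookup described in this scenario can be sketched in Python. The Route type and the functions network_of and next_hop are names invented for this example. The sketch assumes class C addresses, so the network number is simply the first three bytes.

```python
from typing import NamedTuple, Optional


class Route(NamedTuple):
    network: str            # IP network number, e.g. "223.1.2"
    direct: bool            # the direct/indirect flag
    router: Optional[str]   # router IP address, None for direct routes
    interface: int          # lower interface number


ROUTES = [Route("223.1.2", True, None, 1)]   # alpha's table from Table 9


def network_of(address):
    """Class C assumption: the network number is the first three bytes."""
    return address.rsplit(".", 1)[0]


def next_hop(routes, dest_ip):
    """Return (IP address to ARP for, interface), or None if unreachable."""
    for route in routes:
        if route.network == network_of(dest_ip):
            hop = dest_ip if route.direct else route.router
            return hop, route.interface
    return None             # caller discards: "Network not reachable"


print(next_hop(ROUTES, "223.1.2.2"))
print(next_hop(ROUTES, "223.1.3.2"))
```

The first element of the result is the address handed to ARP; for a direct route it is the destination itself.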

5.9 Indirect Routing Details

Now, let's take a closer look at the more complicated routing
scenario that we examined previously.

      ---------           ---------           ---------
      | alpha |           | delta |           |epsilon|
      |    1  |           |1  2  3|           |   1   |
      ---------           ---------           ---------
           |               |  |  |                |
   --------o---------------o- | -o----------------o--------
    Ethernet 1                |     Ethernet 2
    IP network "development"  |     IP network "accounting"
                              |
                              |     --------
                              |     | iota |
                              |     |  1   |
                              |     --------
                              |        |
                            --o--------o--------
                                Ethernet 3
                                IP network "factory"

         Figure 9.  Close-up View of Three IP Networks

The route table inside alpha looks like this:


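 -------------------------------------------------------------------
 |network      direct/indirect flag  router        interface number|
 -------------------------------------------------------------------
 |development  direct                (blank)       1               |
 |accounting   indirect              devnetrouter  1               |
 |factory      indirect              devnetrouter  1               |
 -------------------------------------------------------------------
                   TABLE 10.  Alpha Route Table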

For discussion the table is printed again using numbers instead of
names.


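 -------------------------------------------------------------------
 |network      direct/indirect flag  router        interface number|
 -------------------------------------------------------------------
 |223.1.2      direct                (blank)       1               |
 |223.1.3      indirect              223.1.2.4     1               |
 |223.1.4      indirect              223.1.2.4     1               |
 -------------------------------------------------------------------
            TABLE 11.  Alpha Route Table with Numbers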

The router in alpha's route table is the IP address of delta's
connection to the development network.

5.10 Indirect Scenario

Alpha is sending an IP packet to epsilon. The IP packet is in
alpha's IP module and the destination IP address is epsilon
(223.1.3.2). IP extracts the network portion of this IP address
(223.1.3) and scans the first column of the table looking for a
match. A match is found on the second entry.

This entry indicates that computers on the 223.1.3 network can be
reached through the IP-router devnetrouter. Alpha's IP module then
does an ARP table translation for devnetrouter's IP address and sends
the IP packet directly to devnetrouter through alpha's interface
number 1. The IP packet still contains the destination address of
epsilon.

The IP packet arrives at delta's development network interface and is
passed up to delta's IP module. The destination IP address is
examined and because it does not match any of delta's own IP
addresses, delta decides to forward the IP packet.

Delta's IP module extracts the network portion of the destination IP
address (223.1.3) and scans its route table for a matching network
field. Delta's route table looks like this:


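 --------------------------------------------------------------
 |network      direct/indirect flag  router   interface number|
 --------------------------------------------------------------
 |development  direct                (blank)  1               |
 |accounting   direct                (blank)  3               |
 |factory      direct                (blank)  2               |
 --------------------------------------------------------------
                 TABLE 12.  Delta's Route Table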

Below is delta's table printed again, without the translation to
names.


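 --------------------------------------------------------------
 |network      direct/indirect flag  router   interface number|
 --------------------------------------------------------------
 |223.1.2      direct                (blank)  1               |
 |223.1.3      direct                (blank)  3               |
 |223.1.4      direct                (blank)  2               |
 --------------------------------------------------------------
          TABLE 13.  Delta's Route Table with Numbers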

The match is found on the second entry. IP then sends the IP packet
directly to epsilon through interface number 3. The IP packet
contains the IP destination address of epsilon and the Ethernet
destination address of epsilon.

The IP packet arrives at epsilon and is passed up to epsilon's IP
module. The destination IP address is examined and found to match
with epsilon's IP address, so the IP packet is passed to the upper
protocol layer.
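The whole journey from alpha to epsilon can be traced with a small Python sketch built from the numeric route tables (Tables 11 and 13). The names OWNER, TABLES, and trace are illustrative assumptions. Each hop is recorded as (sending computer, IP address passed to ARP, interface number).

```python
# Route tables keyed by class C network number:
# (direct/indirect flag, router IP address, interface number)
ALPHA = {
    "223.1.2": ("direct", None, 1),
    "223.1.3": ("indirect", "223.1.2.4", 1),
    "223.1.4": ("indirect", "223.1.2.4", 1),
}
DELTA = {
    "223.1.2": ("direct", None, 1),
    "223.1.3": ("direct", None, 3),
    "223.1.4": ("direct", None, 2),
}
TABLES = {"alpha": ALPHA, "delta": DELTA}

# Which computer answers for each IP address used in the scenario.
OWNER = {
    "223.1.2.1": "alpha",
    "223.1.2.4": "delta",
    "223.1.3.1": "delta",
    "223.1.4.1": "delta",
    "223.1.3.2": "epsilon",
}


def trace(source, dest_ip):
    """Follow an IP packet hop by hop until it reaches dest_ip's owner."""
    hops = []
    here = source
    while OWNER[dest_ip] != here:
        network = dest_ip.rsplit(".", 1)[0]
        flag, router, interface = TABLES[here][network]
        target_ip = dest_ip if flag == "direct" else router
        hops.append((here, target_ip, interface))
        here = OWNER[target_ip]          # the frame arrives at this computer
    return hops


for hop in trace("alpha", "223.1.3.2"):
    print(hop)
```

Each computer consults only its own table; no single table describes the full path.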

5.11 Routing Summary

When an IP packet travels through a large internet it may go through
many IP-routers before it reaches its destination. The path it takes
is not determined by a central source but is a result of consulting
each of the routing tables used in the journey. Each computer
defines only the next hop in the journey and relies on that computer
to send the IP packet on its way.

5.12 Managing the Routes

Maintaining correct routing tables on all computers in a large
internet is a difficult task; network configuration is being modified
constantly by the network managers to meet changing needs. Mistakes
in routing tables can block communication in ways that are
excruciatingly tedious to diagnose.

Keeping a simple network configuration goes a long way towards making
a reliable internet. For instance, the most straightforward method
of assigning IP networks to Ethernet is to assign a single IP network
number to each Ethernet.

Help is also available from certain protocols and network
applications. ICMP (Internet Control Message Protocol) can report
some routing problems. For small networks the route table is filled
manually on each computer by the network administrator. For larger
networks the network administrator automates this manual operation
with a routing protocol to distribute routes throughout a network.

When a computer is moved from one IP network to another, its IP
address must change. When a computer is removed from an IP network
its old address becomes invalid. These changes require frequent
updates to the "hosts" file. This flat file can become difficult to
maintain for even medium-size networks. The Domain Name System helps
solve these problems.

6. User Datagram Protocol

UDP is one of the two main protocols to reside on top of IP. It
offers service to the user's network applications. Example network
applications that use UDP are: Network File System (NFS) and Simple
Network Management Protocol (SNMP). The service is little more than
an interface to IP.

UDP is a connectionless datagram delivery service that does not
guarantee delivery. UDP does not maintain an end-to-end connection
with the remote UDP module; it merely pushes the datagram out on the
net and accepts incoming datagrams off the net.

UDP adds two values to what is provided by IP. One is the
multiplexing of information between applications based on port
number. The other is a checksum to check the integrity of the data.

6.1 Ports

How does a client on one computer reach the server on another?

The path of communication between an application and UDP is through
UDP ports. These ports are numbered, beginning with zero. An
application that is offering service (the server) waits for messages
to come in on a specific port dedicated to that service. The server
waits patiently for any client to request service.

For instance, the SNMP server, called an SNMP agent, always waits on
port 161. There can be only one SNMP agent per computer because
there is only one UDP port number 161. This port number is well
known; it is a fixed number, an internet assigned number. If an SNMP
client wants service, it sends its request to port number 161 of UDP
on the destination computer.

When an application sends data out through UDP it arrives at the far
end as a single unit. For example, if an application does 5 writes
to the UDP port, the application at the far end will do 5 reads from
the UDP port. Also, the size of each write matches the size of each
read.

UDP preserves the message boundary defined by the application. It
never joins two application messages together, or divides a single
application message into parts.
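Message boundaries can be observed with Python's standard socket module. This sketch assumes a working loopback interface, where datagrams are not normally lost; on a real network any of the five could go missing, since UDP does not guarantee delivery.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
receiver.settimeout(2.0)            # do not hang forever if one is lost

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

messages = [b"one", b"two", b"three", b"four", b"five"]
for message in messages:            # 5 writes ...
    sender.sendto(message, receiver.getsockname())

# ... are matched by 5 reads, each returning exactly one message.
received = [receiver.recvfrom(2048)[0] for _ in messages]

sender.close()
receiver.close()
print(received)
```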

6.2 Checksum

An incoming IP packet with an IP header type field indicating "UDP"
is passed up to the UDP module by IP. When the UDP module receives
the UDP datagram from IP it examines the UDP checksum. If the
checksum is zero, it means that checksum was not calculated by the
sender and can be ignored. Thus the sending computer's UDP module
may or may not generate checksums. If Ethernet is the only network
between the 2 UDP modules communicating, then you may not need
checksumming. However, it is recommended that checksum generation
always be enabled because at some point in the future a route table
change may send the data across less reliable media.

If the checksum is valid (or zero), the destination port number is
examined and if an application is bound to that port, an application
message is queued for the application to read. Otherwise the UDP
datagram is discarded. If the incoming UDP datagrams arrive faster
than the application can read them and if the queue fills to a
maximum value, UDP datagrams are discarded by UDP. UDP will continue
to discard UDP datagrams until there is space in the queue.
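The receive-side check can be sketched in Python. The arithmetic is the standard 16-bit ones' complement Internet checksum, which this tutorial does not itself describe. A real UDP checksum also covers a pseudo-header taken from the IP header, which is omitted here, so the function names and the "covered bytes" argument are simplifying assumptions.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_field(covered: bytes) -> int:
    """Value the sender places in the checksum field.

    Zero is reserved to mean "no checksum", so a computed zero is
    transmitted as all ones instead.
    """
    return internet_checksum(covered) or 0xFFFF


def accept(covered: bytes, field: int) -> bool:
    """Receiver's decision: zero means the sender did not checksum."""
    if field == 0:
        return True
    return checksum_field(covered) == field


data = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(data)))
print(accept(data, 0), accept(data, checksum_field(data)))
```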

7. Transmission Control Protocol

TCP provides a different service than UDP. TCP offers a connection-
oriented byte stream, instead of a connectionless datagram delivery
service. TCP guarantees delivery, whereas UDP does not.

TCP is used by network applications that require guaranteed delivery
and cannot be bothered with doing time-outs and retransmissions. The
two most typical network applications that use TCP are File Transfer
Protocol (FTP) and TELNET. Other popular TCP network
applications include X-Window System, rcp (remote copy), and the r-
series commands. TCP's greater capability is not without cost: it
requires more CPU and network bandwidth. The internals of the TCP
module are much more complicated than those in a UDP module.

Similar to UDP, network applications connect to TCP ports. Well-
defined port numbers are dedicated to specific applications. For
instance, the TELNET server uses port number 23. The TELNET client
can find the server simply by connecting to port 23 of TCP on the
specified computer.

When the application first starts using TCP, the TCP module on the
client's computer and the TCP module on the server's computer start
communicating with each other. These two end-point TCP modules
contain state information that defines a virtual circuit. This
virtual circuit consumes resources in both TCP end-points. The
virtual circuit is full duplex; data can go in both directions
simultaneously. The application writes data to the TCP port, the
data traverses the network and is read by the application at the far
end.

TCP packetizes the byte stream at will; it does not retain the
boundaries between writes. For example, if an application does 5
writes to the TCP port, the application at the far end might do 10
reads to get all the data. Or it might get all the data with a
single read. There is no correlation between the number and size of
writes at one end to the number and size of reads at the other end.
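The loss of write boundaries can be observed over a loopback TCP connection using Python's standard socket module. The 4-byte read size is an arbitrary choice made for this sketch to force several reads; the number of reads actually needed is not predictable in general.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()

writes = [b"one", b"two", b"three", b"four", b"five"]
for chunk in writes:                # 5 writes at one end
    client.sendall(chunk)
client.close()                      # end of stream lets the reader stop

reads = []
while True:
    data = server.recv(4)           # deliberately small read buffer
    if not data:
        break                       # the far end closed the connection
    reads.append(data)

server.close()
listener.close()
print(len(reads), b"".join(reads))
```

Only the concatenated byte stream is guaranteed to match; how the bytes are grouped into reads is up to TCP.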

TCP is a sliding window protocol with time-out and retransmits.
Outgoing data must be acknowledged by the far-end TCP.
Acknowledgements can be piggybacked on data. Both receiving ends can
flow control the far end, thus preventing a buffer overrun.

As with all sliding window protocols, the protocol has a window size.
The window size determines the amount of data that can be transmitted
before an acknowledgement is required. For TCP, this amount is not a
number of TCP segments but a number of bytes.
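Counting the window in bytes can be illustrated with a one-line calculation. The function name usable_window and its parameters are assumptions made for this sketch, which ignores everything else a real TCP does.

```python
def usable_window(window_size, last_byte_sent, last_byte_acked):
    """Bytes that may still be sent before an acknowledgement is needed."""
    in_flight = last_byte_sent - last_byte_acked   # sent but unacknowledged
    return max(0, window_size - in_flight)


# 2000 bytes are unacknowledged, so 2096 of a 4096-byte window remain.
print(usable_window(4096, last_byte_sent=3000, last_byte_acked=1000))
# The window is full: the sender must wait for an acknowledgement.
print(usable_window(4096, last_byte_sent=5096, last_byte_acked=1000))
```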

8. Network Applications

Why do both TCP and UDP exist, instead of just one or the other?

They supply different services. Most applications are implemented to
use only one or the other. You, the programmer, choose the protocol
that best meets your needs. If you need a reliable stream delivery
service, TCP might be best. If you need a datagram service, UDP
might be best. If you need efficiency over long-haul circuits, TCP
might be best. If you need efficiency over fast networks with short
latency, UDP might be best. If your needs do not fall nicely into
these categories, then the "best" choice is unclear. However,
applications can make up for deficiencies in the choice. For
instance if you choose UDP and you need reliability, then the
application must provide reliability. If you choose TCP and you need
a record oriented service, then the application must insert markers
in the byte stream to delimit records.
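One possible way to delimit records in a byte stream is to prefix each record with its length. The 2-byte big-endian length prefix and the names frame and unframe are choices made for this sketch; other markers, such as a terminating newline, would serve equally well.

```python
import struct


def frame(record: bytes) -> bytes:
    """Prefix a record with its length as a 2-byte big-endian integer."""
    return struct.pack("!H", len(record)) + record


def unframe(stream: bytes):
    """Split received bytes into complete records plus leftover bytes."""
    records, offset = [], 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from("!H", stream, offset)
        if offset + 2 + length > len(stream):
            break                        # record incomplete: wait for more
        records.append(stream[offset + 2:offset + 2 + length])
        offset += 2 + length
    return records, stream[offset:]


# Two whole records and the first 3 bytes of a third, as TCP might deliver.
stream = frame(b"alpha") + frame(b"beta") + frame(b"gamma")[:3]
records, leftover = unframe(stream)
print(records, leftover)
```

The leftover bytes are kept and prepended to whatever the next read returns.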

What network applications are available?

There are far too many to list. The number is growing continually.
Some of the applications have existed since the beginning of internet
technology: TELNET and FTP. Others are relatively new: X-Windows and
SNMP. The following is a brief description of the applications
mentioned in this tutorial.

8.1 TELNET

TELNET provides a remote login capability on TCP. The operation and
appearance are similar to keyboard dialing through a telephone switch.
On the command line the user types "telnet delta" and receives a
login prompt from the computer called "delta".

TELNET works well; it is an old application and has widespread
interoperability. Implementations of TELNET usually work between
different operating systems. For instance, a TELNET client may be on
VAX/VMS and the server on UNIX System V.

8.2 FTP

File Transfer Protocol (FTP), as old as TELNET, also uses TCP and has
widespread interoperability. The operation and appearance are as if
you TELNETed to the remote computer. But instead of typing your
usual commands, you have to make do with a short list of commands for
directory listings and the like. FTP commands allow you to copy
files between computers.

8.3 rsh

Remote shell (rsh or remsh) is one of an entire family of remote UNIX
style commands. The UNIX copy command, cp, becomes rcp. The UNIX
"who is logged in" command, who, becomes rwho. The list continues
and is referred to collectively as the "r" series commands or the
"r*" (r star) commands.

The r* commands mainly work between UNIX systems and are designed for
interaction between trusted hosts. Little consideration is given to
security, but they provide a convenient user environment.

To execute the "cc file.c" command on a remote computer called delta,
type "rsh delta cc file.c". To copy the "file.c" file to delta, type
"rcp file.c delta:". To login to delta, type "rlogin delta", and if
you administered the computers in a certain way, you will not be
challenged with a password prompt.

8.4 NFS

Network File System, first developed by Sun Microsystems Inc, uses
UDP and is excellent for mounting UNIX file systems on multiple
computers. A diskless workstation can access its server's hard disk
as if the disk were local to the workstation. A single disk copy of
a database on mainframe "alpha" can also be used by mainframe "beta"
if the database's file system is NFS mounted on "beta".

NFS adds significant load to a network and has poor utility across
slow links, but the benefits are strong. The NFS client is
implemented in the kernel, allowing all applications and commands to
use the NFS mounted disk as if it were local disk.

8.5 SNMP

Simple Network Management Protocol (SNMP) uses UDP and is designed
for use by central network management stations. It is a well known
fact that if given enough data, a network manager can detect and
diagnose network problems. The central station uses SNMP to collect
this data from other computers on the network. SNMP defines the
format for the data; it is left to the central station or network
manager to interpret the data.

8.6 X-Window

The X Window System uses the X Window protocol on TCP to draw windows
on a workstation's bitmap display. X Window is much more than a
utility for drawing windows; it is an entire philosophy for designing a
user interface.

9. Other Information

Much information about internet technology was not included in this
tutorial. This section lists information that is considered the next
level of detail for the reader who wishes to learn more.

 o administration commands: arp, route, and netstat
 o ARP: permanent entry, publish entry, time-out entry, spoofing
 o IP route table: host entry, default gateway, subnets
 o IP: time-to-live counter, fragmentation, ICMP
 o RIP, routing loops
 o Domain Name System

10. References

[1] Comer, D., "Internetworking with TCP/IP Principles, Protocols,
and Architecture", Prentice Hall, Englewood Cliffs, New Jersey,
U.S.A., 1988.

[2] Feinler, E., et al, DDN Protocol Handbook, Volume 2 and 3, DDN
Network Information Center, SRI International, 333 Ravenswood
Avenue, Room EJ291, Menlo Park, California, U.S.A., 1985.

[3] Spider Systems, Ltd., "Packets and Protocols", Spider Systems
Ltd., Stanwell Street, Edinburgh, U.K. EH6 5NG, 1990.

11. Relation to other RFCs

This RFC is a tutorial and it does not UPDATE or OBSOLETE any other
RFC.

12. Security Considerations

There are security considerations within the TCP/IP protocol suite.
To some people these considerations are serious problems, to others
they are not; it depends on the user requirements.

This tutorial does not discuss these issues, but if you want to learn
more you should start with the topic of ARP-spoofing, then use the
"Security Considerations" section of RFC 1122 to lead you to more
information.

13. Authors' Addresses

Theodore John Socolofsky
Spider Systems Limited
Spider Park
Stanwell Street
Edinburgh EH6 5NG
United Kingdom

Phone:
from UK 031-554-9424
from USA 011-44-31-554-9424
Fax:
from UK 031-554-0649
from USA 011-44-31-554-0649

EMail: TEDS@SPIDER.CO.UK

Claudia Jeanne Kale
12 Gosford Place
Edinburgh EH6 4BJ
United Kingdom

Phone:
from UK 031-554-7432
from USA 011-44-31-554-7432

EMail: CLAUDIAK@SPIDER.CO.UK

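Status of this Memo

This document is an RFC on the TCP/IP protocol suite. It focuses on the specific steps by which an IP datagram is sent from a source host to a destination host through a router. It does not define any new Internet standard. Distribution of this RFC (Request for Comments 1180) is unlimited.

Table of Contents

  • 01 Introduction
  • 02 TCP/IP Overview
  • 03 Ethernet
  • 04 ARP
  • 05 IP
  • 06 UDP
  • 07 TCP
  • 08 Network Applications
  • 09 Other Information
  • 10 References
  • 11 Relation to other RFCs
  • 12 Security Considerations
  • 13 Authors' Addresses

01 Introduction

This tutorial covers only the most prominent features of TCP/IP, the ones that form the essence of the technology. It omits the history of TCP/IP's development, its business case, its future roadmap, and comparisons with the ISO/OSI reference model. In fact, many technical details are omitted as well. What remains is the bare minimum that professionals working in a TCP/IP environment must understand: system administrators, programmers, and network managers.

The examples in this tutorial come from the UNIX environment, but TCP/IP itself is general.

Note: the purpose of this tutorial is to explain, not to define. If you have any doubt about its description of a protocol, consult the relevant RFC and treat the actual standard as authoritative.

The next section is an overview of TCP/IP, followed by a detailed explanation of each component.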

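02 TCP/IP Overview

The term "TCP/IP" often means anything and everything related to TCP and IP, including things that strictly speaking fall outside them. It may refer to other protocols, to applications, and even to the network medium. Examples are the UDP, ARP, and ICMP protocols and the TELNET, FTP, and rcp applications. A more accurate term is "internet technology". A network that uses internet technology is called an "internet".

02.01 Basic Structure

To understand this technology you must first understand the logical structure in the following figure.

▲ Figure 1

This is the layered logical structure of the network protocols inside one host on an internet. Every host that communicates using internet technology has this logical structure, and it is this structure that determines how the host behaves on the internet. The boxes represent the processing of data as it passes through the host, and the lines connecting the boxes show the path the data takes. The horizontal line at the bottom represents the physical cable of the network, "o" is the transceiver, "*" is the IP address, and "@" is the MAC address. Understanding this logical structure is the key to understanding internet technology; it is referred to throughout this tutorial.

02.02 Terminology

The name of a unit of data flowing through an internet depends on where it sits in the protocol stack. In short: on the Ethernet, below the network interface, it is called an Ethernet frame; between the Ethernet driver and the IP module it is called an IP packet; between the IP module and the UDP module it is called a UDP datagram; between the IP module and the TCP module it is called a TCP segment (more commonly, a TCP message); and in a network application it is called an application message.

These definitions are imperfect, and actual definitions vary from one publication to the next. More specific definitions can be found in RFC 1122, section 1.3.3.

In this tutorial, a driver is software that communicates directly with the network interface hardware. A module is software that communicates with a driver, with network applications, or with another module.

The terms driver, module, Ethernet frame, IP packet, UDP datagram, TCP message, and application message are used where appropriate throughout this tutorial.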

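02.03 Flow of Data

Let's follow the data as it flows down through the protocol stack shown in Figure 1. For an application that uses TCP (Transmission Control Protocol), data passes between the application and the TCP module. For an application that uses UDP (User Datagram Protocol), data passes between the application and the UDP module. FTP (File Transfer Protocol) is a typical application that uses TCP; its protocol stack, from top to bottom, is FTP/TCP/IP/ENET. SNMP (Simple Network Management Protocol) is a common application that uses UDP; its protocol stack, from top to bottom, is SNMP/UDP/IP/ENET.

The TCP module, the UDP module, and the Ethernet driver are n-to-1 multiplexers: a multiplexer combines many inputs into one output. They are also 1-to-n de-multiplexers: they switch one input to one of many outputs according to the type field in the protocol header.

▲ Figure 2

When an Ethernet frame comes up through the Ethernet driver, it can be passed upward to either the ARP module or the IP module. The value of the type field in the Ethernet frame determines which of the two receives it.

When an IP packet comes up through the IP module, it can be passed upward to either TCP or UDP. The value of the protocol field in the IP header determines which of the two receives it.

When a UDP datagram comes up through the UDP module, it is passed to a network application based on the value of the port field in the UDP header. When a TCP message comes up through the TCP module, it is passed to a network application based on the value of the port field in the TCP header.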
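The upward path can be sketched as a chain of table lookups in Python. The numeric values are the standard assigned numbers (Ethernet type 0x0800 for IP and 0x0806 for ARP; IP protocol 6 for TCP and 17 for UDP), which are not quoted in the text above. The function demultiplex and the "discard" result for unknown values are assumptions made for this example.

```python
ETHERNET_TYPE = {0x0800: "IP", 0x0806: "ARP"}   # Ethernet type field
IP_PROTOCOL = {6: "TCP", 17: "UDP"}             # IP header protocol field


def demultiplex(ether_type, ip_protocol=None, dest_port=None):
    """Return the modules a unit of data visits on its way up the stack."""
    path = ["ENET"]
    module = ETHERNET_TYPE.get(ether_type)
    if module is None:
        return path + ["discard"]               # unknown type: assumption
    path.append(module)
    if module == "IP":
        transport = IP_PROTOCOL.get(ip_protocol)
        if transport is None:
            return path + ["discard"]
        path.append(transport)
        path.append(f"port {dest_port}")        # selects the application
    return path


print(demultiplex(0x0800, ip_protocol=17, dest_port=161))   # SNMP over UDP
print(demultiplex(0x0800, ip_protocol=6, dest_port=23))     # TELNET over TCP
print(demultiplex(0x0806))                                  # ARP
```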

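Multiplexing on the way down is simple to perform, because from each starting point there is only one downward path. Each protocol module adds its own header as the data passes through, so that the destination host can de-multiplex the data on the way back up.

Whether it comes through TCP or UDP, data converges on the IP module and is then sent down to the network interface driver.

Although internet technology supports many different network media, Ethernet is the most widely used physical network under IP, which is why all examples in this tutorial use Ethernet. The host in Figure 1 has a single network connection. On an Ethernet, every network interface has a unique 6-byte MAC address, located at the lower interface of the Ethernet driver.

The host also has a 4-byte IP address, located at the lower interface of the IP module. Within one internet, every IP address is unique.

A running host always knows its own IP address and MAC address.

02.04 Two Network Interfaces

A host can be connected to two separate networks at the same time, as in Figure 3:

▲ Figure 3

Note that this host has two MAC addresses and two IP addresses.

The logical structure above shows that for a host with two network interfaces, the IP module is both an n-to-m multiplexer and an m-to-n de-multiplexer.

▲ Figure 4

The IP module performs this multiplexing in both directions, handling incoming and outgoing data alike. An IP module attached to more than one network is more complex than the one in our first example, because it can forward data from one network to another. Data can arrive on any network interface and be sent out on any other.

▲ Figure 5

The process of sending an IP packet from one network onto another is called forwarding an IP packet. A host dedicated to the task of forwarding IP packets is called an IP router.

As Figure 5 shows, a forwarded IP packet never reaches the TCP or UDP module on the IP router. Some IP routers do not have TCP or UDP modules at all.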

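02.05 IP Creates an Internet

The IP module is central to the success of internet technology. As data moves down the protocol stack, each module or driver adds its own header to it. As data moves up the stack toward the application, each module or driver strips the corresponding header. The IP header contains the IP address, which builds a single logical network out of many physical networks. This interconnection of physical networks is the source of the name "internet". A set of interconnected physical networks that limits the range of an IP packet is called an "internet".

02.06 Physical Network Independence

IP hides the underlying network hardware from network applications. If you invent a new physical network, you can put it into service simply by writing a new driver that connects it to the internet underneath IP. The network applications remain intact and are not made fragile by changes in hardware technology.

02.07 Interoperability

If two hosts on an internet can communicate with each other, they are said to "interoperate". If an implementation of internet technology on a host is good, it is said to have "interoperability". Users generally find this need well met, because nearly every host on the market supports internet technology. A host that lacks interoperability, and cannot have it added, will probably be hard to sell.

02.08 After the Overview

With this background in place, we still need to answer the following questions:

  • When sending out an IP packet, how is the destination MAC address determined?
  • When sending out an IP packet, how does IP know which of several lower network interfaces to use?
  • Why do both TCP and UDP exist, instead of just one or the other?
  • What network applications are available?

These questions are answered in the following sections, after Ethernet is introduced.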

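03 Ethernet

This section is a short review of Ethernet technology.

An Ethernet frame contains the destination MAC address, the source MAC address, a type field, and data.

A MAC address is 6 bytes long. Every network device has its own MAC address and listens for Ethernet frames with that destination address. All devices also listen for Ethernet frames with the destination address "FF-FF-FF-FF-FF-FF" (in hexadecimal), called the broadcast address.

Ethernet uses CSMA/CD (Carrier Sense and Multiple Access with Collision Detection). CSMA/CD means that all devices communicate on a single medium, so only one device can transmit at a time while the other devices receive. If two devices try to transmit at the same instant, the collision is detected, and both devices wait a random (but short) period before trying to transmit again.

03.01 A Human Analogy for CSMA/CD

A good analogy for Ethernet technology is a group of people talking in a small, completely dark room. In this analogy, the network medium is sound waves in the air instead of electrical signals on a cable.

Each person can hear what the others say (Carrier Sense). Everyone in the room has equal ability to talk (Multiple Access), but nobody gives long, rambling speeches because they are all polite. Anyone who is impolite is asked to leave the room (that is, the device is taken off the network).

No one interrupts while another person is speaking. But if two people start speaking at the same instant, each of them notices, because each hears something they did not say (Collision Detection). When the two notice this condition, they stop and wait a moment; then one of them starts talking again. The other hears the talking and waits for the first to finish before beginning their own speech.

Each person in the room has a unique name (a unique MAC address) to avoid confusion. Every time someone talks, they preface the message with the name of the person they are talking to and their own name (the destination and source MAC addresses), for example: "Hello Jane, this is Jack, blah blah blah...". If the speaker wants to talk to everyone, they say "Hello everyone (the broadcast address), this is Jack, blah blah blah...".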

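04 ARP

When sending out an IP packet, how is the destination MAC address determined?

ARP (Address Resolution Protocol) translates IP addresses to MAC addresses. The translation is done only for outgoing IP packets, because that is when the IP header and the Ethernet header are created.

04.01 ARP Table for Address Translation

The translation is a table lookup. The table, called the ARP table, is stored in the memory of each host. It has one column for IP addresses and one column for MAC addresses. Translating an IP address to a MAC address means searching the table for the destination IP address and reading off the corresponding MAC address. The following is a simplified ARP table:

▲ Table 1

By convention, the 4-byte IP address in the first column is written in decimal with a period between bytes. The 6-byte MAC address in the second column is written in hexadecimal with a minus sign or colon between bytes.

The ARP table is necessary because the IP address and the MAC address are selected independently; no algorithm or formula can translate one into the other. The IP address is assigned by the network manager according to the host's location on the internet. When the host moves to a different part of the internet, its IP address must change. The MAC address is selected by the hardware manufacturer from the Ethernet address space licensed to that manufacturer. When the hardware interface changes, the MAC address changes.

04.02 Typical Translation Scenario

Consider ordinary communication between hosts on a network, such as a TELNET session to a remote host. The TELNET message is passed to TCP, and TCP passes the resulting TCP message to IP. Throughout this process, TELNET, TCP, and IP all know the destination IP address. At this point the IP packet has been constructed and is ready to be handed to the Ethernet driver, but first the destination MAC address must be determined.

The ARP table is used to look up the MAC address that corresponds to the destination IP address.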

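04.03 ARP Request/Response Pair

But how does the ARP table get filled in the first place? The answer is that the ARP module fills it automatically, on an as-needed basis.

Two things happen when the ARP table cannot be used to translate an IP address:

  1. An ARP request packet is sent to every host on the network using the broadcast address.
  2. The outgoing IP packet is queued.

Every other host receives the broadcast frame, examines its type field, and passes the packet to its ARP module. The ARP request says, in effect: "If your IP address matches this target IP address, please tell me your MAC address." An ARP request packet looks something like this:

▲ Table 2

Each ARP module checks whether its own IP address matches the target IP address in the ARP request. If it does, the module responds: "Yes, that target IP address is mine; here is my MAC address." The ARP response packet has the same structure as the request, with the sender and target fields swapped:

▲ Table 3

The host that sent the broadcast receives the response, examines its type field, and passes the packet to its ARP module, which adds an entry to the ARP table based on the contents of the response.

The updated ARP table now looks like this:

▲ Table 4

04.04 Scenario Continued

Within milliseconds, the ARP request/response pair automatically inserts the new IP-to-MAC mapping into the ARP table. Recall from step 2 above that the outgoing IP packet was queued. Next, the IP address is translated to a MAC address by looking it up in the ARP table, and the Ethernet frame is then passed to the network interface. For a new translation, therefore, steps 3, 4, and 5 follow on the sending host:

  1. An ARP request packet is sent to every host on the network using the broadcast address.
  2. The outgoing IP packet is queued.
  3. The ARP response arrives and the sending host updates its ARP table.
  4. The destination address of the queued IP packet is translated using the ARP table.
  5. The Ethernet frame is passed to the network interface.

In summary, when ARP cannot complete a translation, the IP packet is queued, the ARP request/response pair quickly updates the ARP table, and the translation of the queued IP packet is then completed.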
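The five steps can be sketched as a small Python class. ArpModule, its method names, and the MAC address used in the usage lines are illustrative assumptions; the list named sent merely records what would be handed to the network interface.

```python
BROADCAST = "FF-FF-FF-FF-FF-FF"


class ArpModule:
    def __init__(self):
        self.table = {}      # IP address -> MAC address
        self.pending = {}    # IP address -> queued outgoing IP packets
        self.sent = []       # (destination MAC, payload) handed to the driver

    def send(self, dest_ip, packet):
        mac = self.table.get(dest_ip)
        if mac is not None:
            self.sent.append((mac, packet))        # translation succeeded
            return
        if dest_ip not in self.pending:            # step 1: broadcast once
            self.sent.append((BROADCAST, ("ARP request", dest_ip)))
        self.pending.setdefault(dest_ip, []).append(packet)   # step 2

    def on_reply(self, ip, mac):
        self.table[ip] = mac                       # step 3: update the table
        for packet in self.pending.pop(ip, []):    # step 4: translate queue
            self.sent.append((mac, packet))        # step 5: hand to driver


arp = ArpModule()
arp.send("223.1.2.2", "packet 1")                  # no entry yet: queued
arp.on_reply("223.1.2.2", "08-00-28-00-38-A9")     # placeholder MAC address
arp.send("223.1.2.2", "packet 2")                  # entry exists: sent at once
for frame in arp.sent:
    print(frame)
```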

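Each host has a separate ARP table for each of its network interfaces. If the target host does not exist, there will be no ARP response and no entry is added to the ARP table. IP then discards outgoing IP packets sent to that address. The upper-layer protocols cannot tell the difference between a broken Ethernet and the absence of a host with the target IP address.

Some implementations of IP and ARP do not queue the IP packet while waiting for the ARP response. Instead they discard the IP packet, and recovery is left to the TCP module or to the network application using UDP. Recovery is performed by time-out and retransmission. The retransmitted message is sent successfully, because the first copy of the message already triggered the broadcast and the ARP request/response pair has since filled in the ARP table.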

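05 IP (Internet Protocol)

The IP module is central to internet technology, and the essence of IP is its route table. The IP module uses this in-memory table to decide how to route each IP packet. The content of the route table is defined by the network manager; mistakes in it block communication.

To understand how a route table is used is to understand how an internet works. This understanding is necessary for successfully administering and maintaining a network.

We study the route table in three steps: first an overview of routing, then IP addresses, and finally the details.

05.01 Direct Routing

The figure below shows a small internet with 3 hosts: A, B, and C. Each uses the TCP/IP protocol stack of Figure 1. Each host's network interface has its own MAC address, and the network manager has assigned each host an IP address and has assigned the network an IP network number.

▲ Figure 6

When host A sends an IP packet to host B, the IP header contains A's IP address as the source and B's IP address as the destination, and the Ethernet header contains A's MAC address as the source and B's MAC address as the destination.

▲ Table 5

In this simple case IP is pure overhead: it adds little to the service that Ethernet already provides, yet generating, transmitting, and parsing the IP header costs extra CPU time and network bandwidth.

When B's IP module receives the IP packet from A, it checks whether the destination IP address matches its own. If it does, it passes the data to the upper-level protocol.

This communication between A and B uses direct routing.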

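05.02 Indirect Routing

The figure below is a more realistic view of an internet. It is made up of 3 Ethernets and 3 IP networks, connected by an IP router called host D. Each IP network has 4 hosts, and each host has its own IP address and MAC address.

▲ Figure 7

Every host except D has the TCP/IP protocol stack shown in Figure 1. D is the IP router: it is connected to all 3 networks, so it has 3 IP addresses and 3 MAC addresses. Its protocol stack is similar to the one in Figure 3, except that it has 3 Ethernet interfaces and 3 ARP modules. Note that D has only one IP module.

The network administrator has assigned each Ethernet a unique number, called the IP network number. The figure shows only the network names, not the network numbers.

When host A sends an IP packet to host B, the process is identical to the direct routing example above. Communication between any two hosts on the same IP network follows that direct routing example.

Communication between D and A, between D and E, and between D and H is also direct, because each of these pairs shares an IP network.

When host A communicates with a host on the far side of the IP router, however, the communication is no longer direct. A must use D to forward the IP packet to the next IP network. This is called indirect routing.

IP routing is done by the IP modules and is transparent to TCP, UDP, and the network applications.

When host A sends an IP packet to host E, the source IP address and source MAC address are A's, and the destination IP address is E's. Because the packet must be forwarded by D, however, the destination MAC address is D's.

▲ Table 6

D's IP module receives the IP packet, examines the destination IP address, finds that it is not one of its own, and sends the packet directly to E.

▲ Table 7

In summary, for direct communication the source IP address and source MAC address both belong to the sender, and the destination IP address and destination MAC address both belong to the recipient. For indirect communication the IP and MAC addresses do not pair up in this way.

This example internet is very simple. Real networks are complicated by many factors, resulting in multiple IP routers and several types of physical network. An internet like this one might arise when a network administrator splits a large Ethernet into smaller ones to localize Ethernet broadcast traffic.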

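05.03 IP Module Routing Rules

The previous sections showed what happens during routing in outline; this section gives the rules, or algorithm, that the IP module follows.

  • For an outgoing IP packet arriving from an upper layer, IP must decide whether to send the packet directly or indirectly, and must choose a lower network interface. Both choices are made by consulting the route table.
  • For an incoming IP packet arriving from a lower interface, IP must decide whether to forward the packet or pass it to an upper layer. A packet being forwarded is treated as an outgoing IP packet.
  • An incoming IP packet is never forwarded back out through the network interface on which it arrived.

These decisions are made before the IP packet is handed to the lower interface and before the ARP table is consulted.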
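
The rules for an incoming packet can be sketched as a single function. Everything here is illustrative: `handle_incoming` and `choose_interface` are hypothetical names, `choose_interface` merely stands in for the route table lookup, and the interface numbers are assumptions.

```python
def handle_incoming(dst_ip, my_addresses, arrival_interface, choose_interface):
    """Decide what to do with an incoming IP packet.

    choose_interface stands in for the route table lookup: it maps a
    destination IP address to an outgoing interface number, or None.
    """
    if dst_ip in my_addresses:
        return "deliver to upper layer"
    out_interface = choose_interface(dst_ip)
    if out_interface is None:
        return "discard: no route"
    if out_interface == arrival_interface:
        return "discard: would go back out the arrival interface"
    return f"forward via interface {out_interface}"


# Assumed interface numbers for a router attached to two networks.
interfaces = {"223.1.2": 1, "223.1.3": 3}


def choose(dst_ip):
    return interfaces.get(dst_ip.rsplit(".", 1)[0])


router_addresses = {"223.1.2.4", "223.1.3.1"}
print(handle_incoming("223.1.3.2", router_addresses, 1, choose))
print(handle_incoming("223.1.2.4", router_addresses, 1, choose))
```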

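05.04 IP Address

The network administrator assigns IP addresses to hosts according to the IP network each host is attached to. The 4-byte IP address consists of a network number part and a host number part. In Table 1, for example, the host with IP address 223.1.2.1 has network number 223.1.2 and host number 1.

Which bits of the 4-byte address form the network number and which form the host number is determined by the upper bits of the first byte. All the example IP addresses in this tutorial are class C addresses: the upper 3 bits are 110, the next 21 bits are the network number, and the last 8 bits are the host number. This allows 2,097,152 class C networks with up to 254 hosts on each.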
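
The bit layout can be checked with a short sketch using Python's standard `ipaddress` module. The helper names `split_class_c` and `dotted_network` are hypothetical.

```python
import ipaddress


def split_class_c(address):
    """Split a class C address into (21-bit network number, host number)."""
    value = int(ipaddress.IPv4Address(address))
    if value >> 29 != 0b110:
        raise ValueError(f"{address} is not a class C address")
    network = (value >> 8) & 0x1FFFFF   # the 21 bits after the 110 prefix
    host = value & 0xFF                 # the last 8 bits
    return network, host


def dotted_network(address):
    """Return the network part in the dotted form used in this tutorial."""
    return address.rsplit(".", 1)[0]


network, host = split_class_c("223.1.2.1")
print(dotted_network("223.1.2.1"), host)

# 21 network bits give 2**21 networks; 8 host bits give 2**8 values, of
# which the all-zeros and all-ones host numbers are not assigned to hosts.
print(2 ** 21, 2 ** 8 - 2)
```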

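The IP address space is administered by the NIC (Network Information Center). Every internet connected to the single worldwide Internet must use network numbers assigned by the NIC. If you set up your own internet and do not intend to connect it to the Internet, you should still obtain your network numbers from the NIC: if you pick your own numbers, you run the risk of confusion and chaos should your internet ever be connected to another one.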

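05.05 Host Names and Network Names

People refer to hosts by name rather than by number. A host called alpha might have the IP address 223.1.2.1. On a small network this name-to-address mapping is usually kept in the hosts file on each host. On a larger network the file is stored on a server and accessed across the network when needed. A few lines from that file might look like this: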

223.1.2.1 alpha
223.1.2.2 beta
223.1.2.3 gamma
223.1.2.4 delta
223.1.3.2 epsilon
223.1.4.2 iota

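The IP address is the first column and the host name is the second column.

In most cases identical hosts files can be installed on every host. Notice that delta has only one entry in this file even though it has 3 IP addresses. Delta can be reached through any of its 3 IP addresses; it does not matter which one is used. When delta receives an IP packet, it compares the destination address against each of its own IP addresses.

IP networks are also given names. If you have 3 IP networks, the networks file that documents their names might look like this: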

223.1.2 development
223.1.3 accounting
223.1.4 factory

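The IP network number is the first column and the network name is the second column.

From this example you can see that alpha is host number 1 on the development network and beta is host number 2 on the development network; you might also call them development.1 and development.2.

The hosts file above is adequate for ordinary users, but the network administrator will probably replace the line for delta with three lines: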

223.1.2.4 devnetrouter delta
223.1.3.1 facnetrouter
223.1.4.1 accnetrouter

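These three new lines in the hosts file give each of delta's IP addresses a meaningful name. The first IP address in fact has two names, devnetrouter and delta, which are synonyms for the same host. In practice delta is the general-purpose name of the host, and the other 3 names are used only when administering the IP route table.

These files are used by network administration commands and network applications to present meaningful names. They are not required for an internet to operate, but they make life much easier for the people using it.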
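
A command that uses the hosts file might read it roughly as follows. This sketch combines the hosts lines shown above with the administrator's three lines for delta; `HOSTS_TEXT` and `parse_hosts` are hypothetical names.

```python
HOSTS_TEXT = """\
223.1.2.1 alpha
223.1.2.2 beta
223.1.2.3 gamma
223.1.2.4 devnetrouter delta
223.1.3.1 facnetrouter
223.1.4.1 accnetrouter
223.1.3.2 epsilon
223.1.4.2 iota
"""


def parse_hosts(text):
    """Return a dict mapping every host name and alias to its IP address."""
    names = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        address, *aliases = fields
        for name in aliases:
            names[name] = address
    return names


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["delta"], hosts["devnetrouter"])  # two names, one address
```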

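05.06 IP Route Table

How does IP know which lower network interface to use when sending out an IP packet? IP extracts the IP network number from the destination IP address and uses it as the search key into the route table.

The route table contains one row for each route. Its primary columns are: IP network number, direct/indirect flag, router IP address, and interface number. IP consults this table for every outgoing IP packet.

On most hosts the route table can be modified with the route command. The contents of the route table are defined by the network administrator, because the network administrator assigns the IP addresses to the hosts.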

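05.07 Direct Routing Details

To explain how the route table is used, let us revisit the earlier direct routing example in more detail.

▲ Figure 8

The route table inside alpha looks like this:

▲ Table 8

On some UNIX systems this view can be displayed with the netstat -r command. On a network this simple, all hosts have identical route tables.

For the purposes of discussion, here is the same table with the network name replaced by its network number:

▲ Table 9

05.08 Direct Scenario

Alpha is sending an IP packet to beta. The packet is in alpha's IP module and the destination IP address is beta, that is, 223.1.2.2. IP extracts the network portion of this address and scans the first column of the route table for a match. On this network the match is found at the first entry.

That entry also tells IP that hosts on this network can be reached directly through interface number 1. Alpha looks up beta's MAC address in the ARP table, and the Ethernet frame is sent directly to beta through interface number 1.

If an application tries to send data to an IP address that is not on the development network, IP finds no match in the route table and discards the IP packet. Some hosts report a "Network not reachable" error.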
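
The scan of the first column, including the no-match case, can be sketched as below. The single row is an assumption consistent with the scenario (one direct route through interface 1); the table image itself is not reproduced here, and `ALPHA_ROUTES`, `lookup`, and `NetworkNotReachable` are hypothetical names.

```python
class NetworkNotReachable(Exception):
    """Raised when no route table entry matches the destination."""


# One row per route:
# (network number, direct?, router IP address, interface number).
ALPHA_ROUTES = [
    ("223.1.2", True, None, 1),   # assumed row for the development network
]


def lookup(route_table, dst_ip):
    """Return the first route whose network number matches dst_ip."""
    network = dst_ip.rsplit(".", 1)[0]   # class C: the first three bytes
    for route in route_table:
        if route[0] == network:
            return route
    raise NetworkNotReachable(dst_ip)


print(lookup(ALPHA_ROUTES, "223.1.2.2"))
try:
    lookup(ALPHA_ROUTES, "223.1.3.2")
except NetworkNotReachable as error:
    print("Network not reachable:", error)
```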

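05.09 Indirect Routing Details

Now let us take a closer look at the more complicated indirect routing example from earlier.

▲ Figure 9

The route table inside alpha looks like this:

▲ Table 10

For the purposes of discussion, here is the same table with the network names replaced by their network numbers:

▲ Table 11

The router in alpha's route table is the IP address of delta's connection to the development network.

05.10 Indirect Scenario

Alpha is sending an IP packet to epsilon. The packet is in alpha's IP module and the destination IP address is 223.1.3.2. IP extracts the network portion of this address (223.1.3) and scans the first column of the route table for a match. The match is found at the second entry.

This entry indicates that hosts on the 223.1.3 network can be reached through the IP router devnetrouter. Alpha's IP module therefore looks up devnetrouter's MAC address in the ARP table and sends the IP packet directly to devnetrouter through alpha's interface number 1. The destination IP address in the packet is still epsilon's.

The IP packet arrives at delta's development network interface and is passed up to delta's IP module. IP examines the destination IP address and, because it does not match any of delta's own IP addresses, delta decides to forward the packet.

Delta's IP module extracts the network portion of the destination IP address and scans its own route table for a matching network. Delta's route table looks like this:

▲ Table 12

Here is the same table with the network names replaced by their network numbers:

▲ Table 13

The match is found at the second entry. Delta's IP module then sends the IP packet directly to epsilon through interface number 3. The packet now carries epsilon's IP address as the destination IP address and epsilon's MAC address as the destination MAC address.

The IP packet arrives at epsilon and is passed up to its IP module. The destination IP address matches epsilon's own, so the datagram is passed to the upper protocol layer.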
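
The two hops of this scenario can be traced with a small sketch. The rows below are assumptions reconstructed from the prose (which entry matches, which router, which interface), not copies of Tables 11 and 13; `ROUTES` and `next_hop` are hypothetical names.

```python
# Route rows: (network number, direct?, router IP address, interface number).
# Only the rows the scenario mentions are included; they are assumed values.
ROUTES = {
    "alpha": [("223.1.2", True, None, 1),
              ("223.1.3", False, "223.1.2.4", 1)],   # via devnetrouter
    "delta": [("223.1.2", True, None, 1),
              ("223.1.3", True, None, 3)],
}


def next_hop(host, dst_ip):
    """Return (IP address to resolve with ARP, interface number) for dst_ip."""
    network = dst_ip.rsplit(".", 1)[0]
    for net, direct, router, interface in ROUTES[host]:
        if net == network:
            # Direct: ARP for the destination. Indirect: ARP for the router.
            return (dst_ip if direct else router), interface
    raise LookupError(f"{host}: no route to {dst_ip}")


print(next_hop("alpha", "223.1.3.2"))  # first hop: delta's development address
print(next_hop("delta", "223.1.3.2"))  # second hop: epsilon itself
```

The destination IP address stays 223.1.3.2 on both hops; only the address handed to ARP changes.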

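05.11 Routing Summary

When an IP packet travels through a large internet, it may pass through many IP routers before reaching its destination. The path is not chosen by a central source; it is the result of consulting the route table of each router along the way. Each IP router decides only the next hop and relies on that hop to send the packet on its way in the same manner.

05.12 Managing the Routes

Keeping the route table correct on every host in a large internet is a difficult task: network administrators are constantly modifying the configuration to meet changing needs. Mistakes in the route tables block communication, and diagnosing them is excruciatingly tedious.

Keeping the network configuration simple goes a long way towards building a reliable internet. For instance, the most obvious way to assign IP networks to Ethernets is to give each Ethernet a single IP network number.

Certain protocols and network applications also help. ICMP (Internet Control Message Protocol) can report some routing problems. On small networks the route table on each host is filled in manually by the network administrator; on larger networks the administrator automates this with a routing protocol that distributes routes throughout the network.

When a host is moved from one IP network to another, its IP address must change. When a host is removed from an IP network, its old address becomes invalid. These changes require frequent updates to the hosts file, and this flat file becomes difficult to maintain even on medium-sized networks. The Domain Name System (DNS) helps solve these problems.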

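06 UDP (User Datagram Protocol)

UDP is one of the two main protocols that sit on top of IP. It offers service directly to network applications; examples of applications that use UDP are the Network File System (NFS) and the Simple Network Management Protocol (SNMP). The service is little more than an interface to IP.

UDP is a connectionless datagram delivery service and does not guarantee delivery. A UDP module does not maintain an end-to-end connection with the remote UDP module; it merely pushes outgoing datagrams onto the network and accepts incoming datagrams from it.

UDP adds two things to what IP provides. One is multiplexing of information between applications, based on port number. The other is a checksum for checking the integrity of the data.

06.01 Ports

How does a client on one host reach the server on another?
The path of communication between an application and UDP is through UDP ports. Ports are numbered starting from zero (0 to 65535). An application that offers a service (the server) waits for messages on the specific port dedicated to that service, patiently waiting for any client to send a request.

For instance, the SNMP server, called an SNMP agent, always waits on port 161. There can be only one SNMP agent per host because there is only one UDP port 161. This port number is well known: it is a fixed, internet-assigned number. When an SNMP client wants service, it sends its request to UDP port 161 on the destination host.

Data that an application sends through UDP arrives at the far end as a single unit. For example, if an application does 5 writes to a UDP port, the application at the far end does 5 reads from its UDP port, and the size of each read matches the size of the corresponding write.

UDP preserves the message boundaries defined by the application. It never joins two application messages together and never divides a single application message into parts.

06.02 Checksum

An incoming IP packet whose IP header type field indicates UDP is passed up to the UDP module by IP. When the UDP module receives the UDP datagram from IP, it examines the UDP checksum. A checksum of zero means that the sender did not calculate a checksum, and the check can be skipped. The sending host's UDP module may therefore generate checksums or not. If Ethernet is the only network between the two communicating UDP modules, you may not need checksumming. It is nevertheless recommended that checksum generation always be enabled, because at some point a route table change may send the data across less reliable media.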
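
The checksum arithmetic can be sketched as follows. UDP uses the 16-bit one's complement Internet checksum; in the real protocol it is computed over a pseudo-header, the UDP header, and the data, whereas this sketch simply takes the covered bytes as given (with the checksum field zeroed). The function names are hypothetical.

```python
import struct


def ones_complement_sum(data):
    """16-bit one's complement sum of data, padded with a zero byte if odd."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def checksum(data):
    """Checksum to transmit for data (checksum field zeroed beforehand)."""
    value = ~ones_complement_sum(data) & 0xFFFF
    return value or 0xFFFF   # zero is reserved for "no checksum computed"


def checksum_ok(data, received_checksum):
    """Accept if the sender skipped the checksum or if the value matches."""
    if received_checksum == 0:
        return True
    return checksum(data) == received_checksum


payload = bytes.fromhex("0001f203f4f5f6f7")
print(hex(checksum(payload)))
```

A received value of zero is accepted without checking, which mirrors the rule that a zero checksum means the sender did not compute one.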

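If the checksum is valid (or zero), UDP examines the destination port number. If an application is bound to that port, the application message is queued for the application to read; otherwise the UDP datagram is discarded. If incoming UDP datagrams arrive faster than the application can read them and the queue fills to its maximum size, UDP discards further datagrams until there is space in the queue.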
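
The port check and the bounded queue can be modelled like this. It is a toy sketch: `UdpDemux` and its methods are hypothetical, and the queue limit of 2 is chosen only to make the overflow easy to see.

```python
from collections import deque


class UdpDemux:
    """Toy model of UDP port demultiplexing with bounded receive queues."""

    def __init__(self, max_queue=2):
        self.max_queue = max_queue
        self.queues = {}          # bound port number -> deque of messages
        self.discarded = 0

    def bind(self, port):
        self.queues[port] = deque()

    def deliver(self, dst_port, message):
        queue = self.queues.get(dst_port)
        if queue is None or len(queue) >= self.max_queue:
            self.discarded += 1   # no application bound, or the queue is full
            return False
        queue.append(message)
        return True

    def read(self, port):
        return self.queues[port].popleft()


udp = UdpDemux(max_queue=2)
udp.bind(161)
results = [udp.deliver(161, m) for m in (b"a", b"b", b"c")]
print(results, udp.discarded)   # the third datagram finds the queue full
```

Once the application reads a message, there is space again and the next datagram is accepted.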

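07 TCP (Transmission Control Protocol)

TCP provides a different service from UDP. TCP offers a connection-oriented byte stream instead of a connectionless datagram delivery service, and it guarantees delivery, which UDP does not.

TCP is used by network applications that require guaranteed delivery and do not want to handle time-outs and retransmissions themselves. The two most typical applications that use TCP are FTP (File Transfer Protocol) and TELNET. Other popular TCP applications include the X Window System, rcp (remote copy), and the other r-series commands. TCP's greater capability is not without cost: it requires more CPU time and network bandwidth, and the internals of the TCP module are much more complicated than those of a UDP module.

As with UDP, network applications connect to TCP ports, and well-defined port numbers are dedicated to specific applications. For instance, the TELNET server uses port number 23. The TELNET client finds the server simply by connecting to TCP port 23 on the specified host.

When an application first starts using TCP, the TCP module on the client host and the TCP module on the server host start communicating with each other. These two end-point TCP modules hold the state information that defines a virtual circuit. The virtual circuit consumes resources in both end points. It is full duplex: data can flow in both directions simultaneously. The application writes data to the TCP port; the data crosses the network and is read by the application at the far end.

TCP packetizes the byte stream at will; it does not retain the boundaries between writes. For example, if an application does 5 writes to a TCP port, the application at the far end might need 10 reads to get all the data, or it might get all the data with a single read. There is no correlation between the number and size of writes at one end and the number and size of reads at the other.

TCP is a sliding window protocol with time-out and retransmission. Outgoing data must be acknowledged by the far-end TCP, and acknowledgements can be piggybacked on data. Both receiving ends can flow-control the far end, which prevents buffer overrun.

As with all sliding window protocols, TCP has a window size. The window size determines how much data can be transmitted before an acknowledgement is required. For TCP this amount is measured in bytes, not in TCP segments.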
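
The byte-based window can be illustrated with a little arithmetic. This is a simplified sketch that ignores sequence number wrap-around and everything else a real TCP does; `sendable_bytes` and the numbers used are hypothetical.

```python
def sendable_bytes(window_size, next_to_send, last_acked):
    """Bytes that may still be sent before an acknowledgement is required.

    next_to_send and last_acked are byte positions in the stream; every
    byte below last_acked has been acknowledged by the far end.
    """
    in_flight = next_to_send - last_acked   # sent but not yet acknowledged
    return max(0, window_size - in_flight)


print(sendable_bytes(4096, 3000, 0))     # 3000 bytes in flight
print(sendable_bytes(4096, 3000, 2000))  # the far end has acknowledged 2000
print(sendable_bytes(4096, 4096, 0))     # window full: wait for an ACK
```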

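08 Network Applications

Why do both TCP and UDP exist, instead of just one or the other?

They supply different services. Most applications are implemented to use only one of them, and you, the programmer, choose the protocol that best meets your needs. If you need a reliable stream delivery service, TCP might be best. If you need a datagram service, UDP might be best. If you need efficiency over long-haul circuits, TCP might be best. If you need efficiency over fast networks with short latency, UDP might be best. If your needs do not fall neatly into these categories, the best choice is unclear. Applications can, however, make up for deficiencies in the chosen protocol. If you choose UDP and need reliability, the application must provide it. If you choose TCP and need a record-oriented service, the application must insert markers in the byte stream to delimit records.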
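
One common way to delimit records in a byte stream is a length prefix. The sketch below is just one possible marker scheme, not something the tutorial prescribes; `frame` and `unframe` are hypothetical names, and the stream is simulated with byte strings rather than a real TCP connection.

```python
import struct


def frame(record):
    """Prefix a record with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(record)) + record


def unframe(buffer):
    """Split buffer into (complete records, leftover bytes)."""
    records = []
    while len(buffer) >= 4:
        (length,) = struct.unpack("!I", buffer[:4])
        if len(buffer) < 4 + length:
            break                        # record not fully received yet
        records.append(buffer[4:4 + length])
        buffer = buffer[4 + length:]
    return records, buffer


stream = frame(b"first") + frame(b"second record")
# The stream may arrive in arbitrary pieces, so cut it at an awkward place.
records, rest = unframe(stream[:7])
more, rest = unframe(rest + stream[7:])
print(records, more, rest)
```

The first call returns no records because only part of the first record has arrived; the leftover bytes are kept and combined with the next piece.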

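What network applications are available?

There are far too many to list, and the number keeps growing.
Some have existed since the beginning of internet technology: TELNET and FTP. Others are relatively new: X-Window and SNMP. The following is a brief description of the applications mentioned in this tutorial.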

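08.01 TELNET

TELNET provides remote login over TCP. Its operation and appearance are similar to keyboard dialing through a telephone switch. On the command line the user types "telnet delta" and receives a login prompt from the host called delta.
TELNET is an old application, but it works well and is widely interoperable. Implementations usually work between different operating systems; for instance, a TELNET client may run on VAX/VMS while the server runs on UNIX System V.

08.02 FTP

The File Transfer Protocol (FTP) is as old as TELNET, also uses TCP, and is just as widely interoperable. Using it feels as if you had TELNETed to the remote host, except that instead of typing your usual commands you make do with a short list of FTP commands for directory listings and the like. FTP commands let you copy files between hosts.

08.03 rsh

Remote shell (rsh or remsh) is one of a whole family of remote UNIX-style commands. The UNIX copy command, cp, becomes rcp. The UNIX "who is logged in" command, who, becomes rwho. The list goes on, and the family is referred to collectively as the r-series commands or the r* (r star) commands.

The r* commands work mainly between UNIX systems and are designed for interaction between trusted hosts. Little consideration is given to security, but they provide a convenient user environment.

To run the command cc file.c on a remote host called delta, type rsh delta cc file.c. To copy file.c to delta, type rcp file.c delta:. To log in to delta, type rlogin delta; if the hosts are administered in a certain way, you will not even be asked for a password.

08.04 NFS

The Network File System (NFS), first developed by Sun Microsystems, uses UDP and is excellent for mounting UNIX file systems on multiple hosts. A diskless workstation can access its server's hard disk as if the disk were local to the workstation. A single disk copy of a database on host alpha can also be used by host beta, provided the database's file system is NFS-mounted on beta.

NFS adds significant load to a network and performs poorly across slow links, but the benefits are strong. The NFS client is implemented in the operating system kernel, so all applications and commands can use an NFS-mounted disk as if it were a local disk.

08.05 SNMP

The Simple Network Management Protocol (SNMP) uses UDP and is designed for use by central network management stations. It is well known that, given enough data, a network administrator can detect and diagnose network problems. The central station uses SNMP to collect this data from other hosts on the network. SNMP defines the format of the data; interpreting it is left to the central station or the network administrator.

08.06 X-Window

The X Window System uses the X-Window protocol over TCP to draw windows on a workstation's bitmap display. X-Window is much more than a utility for drawing windows; it is an entire philosophy for designing a user interface.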

09 Other Information

Much information about internet technology was left out of this tutorial. For readers who wish to learn more, the following list covers what is considered the next level of detail:

  • Administration commands: arp, route, netstat
  • ARP: permanent entry, publish entry, time-out entry, spoofing
  • IP route table: host entry, default gateway, subnets
  • IP: time-to-live (TTL) counter, fragmentation, ICMP
  • RIP, routing loops
  • Domain Name System

10 References

  • [1] Comer, D., "Internetworking with TCP/IP Principles, Protocols,
    and Architecture", Prentice Hall, Englewood Cliffs, New Jersey,
    U.S.A., 1988.
  • [2] Feinler, E., et al, DDN Protocol Handbook, Volume 2 and 3, DDN
    Network Information Center, SRI International, 333 Ravenswood
    Avenue, Room EJ291, Menlo Park, California, U.S.A., 1985.
  • [3] Spider Systems, Ltd., "Packets and Protocols", Spider Systems
    Ltd., Stanwell Street, Edinburgh, U.K. EH6 5NG, 1990.

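11 Relation to Other RFCs

This RFC is a tutorial; it does not update or obsolete any other RFC.

12 Security Considerations

There are security considerations within the TCP/IP protocol suite. To some people these are serious problems; to others they are not. It depends on the user's requirements.

This tutorial does not discuss these issues. If you want to learn more, start with the topic of ARP spoofing, then use the "Security Considerations" section of RFC 1122 to lead you to more information.

13 Authors' Addresses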

Theodore John Socolofsky
Spider Systems Limited
Spider Park
Stanwell Street
Edinburgh EH6 5NG
United Kingdom

Phone:

  • from UK 031-554-9424
  • from USA 011-44-31-554-9424

Fax:

  • from UK 031-554-0649
  • from USA 011-44-31-554-0649

EMail: TEDS@SPIDER.CO.UK

Claudia Jeanne Kale
12 Gosford Place
Edinburgh EH6 4BJ
United Kingdom

Phone:

  • from UK 031-554-7432
  • from USA 011-44-31-554-7432

EMail: CLAUDIAK@SPIDER.CO.UK
