Notes for Computer Network

- L1-1 还有补充
- L1-2 未完成
- L2-1 未完成
- L2-2 还有订正
- L2-3 未完成
- L2-4 几乎完成,残留问题已标注, 最后没讲完!!!
- 同步问题
- L2-5
- L4-1 🤬🤬
- L4-2 🤬🤬🤬🤬🤬🤬
Chapter1 Overview
Network Composition
Protocol: (Definition in Textbook P6)
- Protocol is a state machine.
- No protocol ensures 100% reliability.
Internet includes Computing devices, Communication links and Routers.
Three key parts:
Network Edge
Host/Endpoint System: (主机/端系统)
includes Client and ServerNetwork Core
mainly the switch
two model- client/server model
- P2P model
two switching
- circuit switching
- packet switching
The core concept of switching: Statistical Multiplexing
Access Networks
Endpoint system to Edge router.- Home: DSL, Cable(HFC), FTTH…
- Enterprise(and home): LAN: Ethernet, WiFi…
- Wide-area wireless: 3G, 4G, LTE, 5G…
Physical medium: Copper(UTP), Fiber, Radio, Satellite.
Guided & Unguided
Distinct the three parts:
From ISO-OSI to modern Internet Architecture

TCP/IP Protocol Stack
Layer encapsulation: Protocol headers
More detailed:
Network Performance
Delay
Transmission Delay (‘The toll station’)
Push data into the link:
Bandwidth ↑ , Transmission Delay ↓Propagation Delay (‘The car on the highway’)
Data run from an end to another
a graph to demonstrate:Queueing Delay
Packets wait for transmissionPersistent queueing causes packet loss.
Little’s Law:for length of queue(average), for arriving rate(average), for waiting time. Processing Delay
Negligible
Loss
We care about loss rate most.Throughput
Therefore(the one is bottleneck link)
Link Layer
Traditional: supported by adaptor(hardware) and driver(software)
New demand: GPU directed linked net interface ‘Hardwarize’
Different types of links
- Wired point-to-point links
- Wired multiple access links (LANs)
- Wireless links (WiFi)
Service
- Framing: encapsulate into frames and extract
- Link access
- point-to-point or broadcast
- Multiple Access Control(MAC): avoid the collision(different nodes speak simutaneously) no out-of-band
use MAC protocol:- Channel partitioning: Divide channel into pieces
- Taking turns
- Random access: “Recover” from collisions
- Reliable delivery
Flow Control: Ensuring the sender not overwhelm the receiver i.e. preventing buffer overflow
- Stop and Wait
good at large frames
- Sliding window
- the sender’s window is the safe region, between the sent(know) and the done(told by receiver)+capability
- the receiver’s window is between done+capability and accepted
the decode field of the frame No. is
- Stop and Wait
- Error Detection & correction: Data + EDC (Redundancy)
- Parity checking
detect: 1D, D+C: 2D (as below)
- Parity checking
LAN
Early: Token Ring
Ethernet:
- topology:
- early: bus broadcast
- modern: switch
Preamble: Where does a bit begin? Physical Layer copes with it.
- Framing: Where does a frame begin?
- sentinel bits (必考)
- topology:
Lab note
Mininet
sudo mn -h : help
sudo mn : Open CLI
> help
> nodes : display nodes
> net : display links
> dump : detailed info
> [node] [command] [args] : run command in a certain node
> [node] xterm : open a seperate commandline for [node]
> exit
sudo mn -c : clean up
the given exmaple could not run compatibly in Py3…Wireshark:
Quick CheckSwitchyard
Testcase syntax:
没完事上面的 (Lecture 3)
Link Access: MAC
Ideal:
- R bps for one
- R/M bps for M
- fully-decentralized
- Simple
Channel Partitioning
- TDMA(time division multiple access):
fixed time slot - FDMA(frequency division multiple access):
fixed frequency band - CDMA(code division multiple access):
Details in the future
Taking Turns
- Polling: Master & Slaves:
master asks slaves in turn- cons: polling overhead, latency, single failure
- use: Bluetooth: very weak(low-power) terminals
- Token passing:
- cons: token overhead, latency, single point failure.
- use: IBM Token Ring
Random Access
- Carrier Sense, Collision detection, Randomness
- ALOHA
ALOHA workflow 1
2
3
4
5
6
7
8
9
10
11+------------------------+
↓ |
Sender --+-- ACK --- fine |
| ↑
+-- !ACK -+- p = P --- Resend
|
+-- p = 1-P --- Wait
Enough repeatations : Abort
Receiver -- checkpass -- ACKWhat about ACK?
It is a simplified model, so we assume ACK is finished immediately and within the slot. (Once reach then accept)

effciency < 20%
slotted ALOHA
Time in uniform slots = frame transmission time (T0)
synchronized
also p=P retransmissioneffciency approach theoritical 37%
CSMA(Carrier Sense Multiple Access) (载波侦听)
still has collisions: for propogation delay- use: Ethernet
Nonpersistent CSMA (really deferential)
workflow 1
2
3
4
5Sender --- Listen --+-- Idle ---Send
↑ |
| +-- Busy --- Wait rand(t)
| ↓
+-----------------------+1-persistent CSMA (selfish)
workflow 1
2
3
4
5Sender --- Listen --+-- Idle ---Send
↑ |
| +-- Busy
| ↓
+-----------+cons: many waiting -> guarantee collision…
p-persistent CSMA
workflow 1
2
3
4
5
6
7+-------------------------------------+
↓ |
Sender --- Listen --+-- Idle --+-- p = P --- Send |
↑ | | |
| +-- Busy +-- p = 1-P --- Delay t
| ↓
+-----------+aim of p: avoid instability under heavy load
ideal value: $p=\frac{1}{Waitingstationssum}$
cons: usually gives really long delay…CSMA/CD(collison detection) (based on 1 persistent)
1
2
3
4
5
6
7
8+-- Wait rand(t) --- Send Jam & Abort
| |
Sender --- Listen --+-- Idle --- Send --- Collision? --- Yes
↑ | |
| +-- Busy No --- ACK
| ↓
+-----------+
Jam: All-1 signaldetect abnormal high voltage
- signal will attenuate: limit distance
- special judge: multiple ports activity
Minimum frame size: 2*propagation delay
, Otherwise, too late to stop transmission. (The system is not workable, but meaningless) - Therefore
1-persistent in IEEE 802.3 (Wired Ethernet)
collision handling: Exponential backoff
1 |
|

Performance
point-to-point
normalized time:
(relative) Ring
(必考)Slotted ALOHA
Every node:to send Pure ALOHA
(这里有大问题!)
Link layer discovery
what you only know initially: distinct MAC addr for each NIC
Protocols
- Discovery end-hosts
- bootstrap communication with remote
DHCP (Dynamic Host Configuration Protocol) (RFC 2131)
broadcast for DHCP server(in the past: a real person)’s help
- Procedure
- client -> broadcast:
DHCP_DISCOVER
Any helper here? - all DHCP servers -> client??存疑:
DHCP_OFFER
I’m here at IP<…> - client(choose one) -> client:
DHCP_REQUEST
I ask the helper at IP<…> to help - selected DHCP server:
DHCP_ACK
Let me tell you your configuration… - (client quit) client:
DHCP_RELEASE
I will go, unbind plz - soft quit: the binding will expire if the client do not renew/rebind it.
- client -> broadcast:

- Under failure
Now you want to ping someone with an IP, how could you know his MAC?
Why must be MAC? seem to useless in local net, but when remote MAC is for the destination, but MAC is for every next link.
Why? must what is link layer for
ARP (Address Resolution Protocol) (RFC 826)
Map IP address to MAC address (only on LAN)
Every host has a ARP table
- Procedure (Local)
- sender broadcast:
ARP_REQUEST
Who has IP<…>? I’m MAC<…> IP<…> (in MAC frame) - receiver:
ARP_REPLY
the dest is me… Hi MAC<src> My MAC is <…> (in MAC frame) [cache the { IP<src>: MAC<src> } ] - sender get it: [cache the { IP<dst>: MAC<dst> } ]
- sender broadcast:
A remote communication case
- Procedure
- client enter the LAN: DHCP get the configurations
- want to remote comm:
- use netmask to know it is a remote
- ARP get the MAC of first-hop router
- send to first-hop router…\

MAC for the next stop!
Philosophy of ARP & DHCP
- Broadcasting: Can use broadcast to make contact
- Scalable because of limited size?怎么理解
- Caching: remember the past for a while
- Store the information you learn to reduce overhead
- Soft state: eventually forget the past
- Associate a time-to-live field with the information
- … and either refresh or discard the information
- Key for robustness in the face of unpredictable change
Interconnection of LANs (still local)
- store & forward LAN frames
- exact bitwise copy
Bridge:(IEEE 802.1D)
Philosophy
- S&F: select the frames for neighbor LAN
Use the CSMA/CD to access LAN segment - Transparent: to stations, unaware of the bridge
- Plug & play, self-learning: use once plugged in (do not need much configuration)

Mechanisms
9.10
Frame broadcasting
- “broadcast storm” in extended LAN
How to solve?
Loop resolution
Spanning tree protocol (Perlman)
eliminate the loops!
make a spanning tree on the physical topology!- no configuration by ops and users
- self-healing
must make a spanning tree distributedly:
- Pick a root (agree on select the Min MAC)
- Prune (only keep the shortest path, break ties(a simple agreement: lower ID, any is OK))
Node aspect:
- All send Messages (Y, d, X)[This is actually named ‘BPDU’]: “I’m X, I elect Y with distance d from me”
- If receive message from a lower ID, update election
- If update election, send new message
- After the election, nodes determines which is the root port. (Since then do not forward the packets from them, but still listen (in case of BPDU))
Many many details:
When election ends? ususally after a constant time gap in which no new (could-make-change) BDPU is received.
What about the Synchro problem, do really need synchro? no one tell me…
how to choose port? store them all?

(必考)
- Robustness: After spanning tree is built, the root will still broadcast BPDU in certain frequency (eg. 2s), and all other nodes will flood them to all the ports including the blocked ones.
How to detect? what ambiguous!
- Flooding: forward to all the valid(on the tree) exports except for the import
- Philosophy: Waste, but learn from it! (bootstrap more precise/efficient forwarding)
- Example:
in spanning tree, I receive message of A from port p, so I know just forward to p when dst is A
cache the ‘knowledge’ with a certain timer
Address Learning
Address database:
mechanism: just as the timeout-switch lab
receive, learn;
know, forward; adapt the lab!!!!!!!!
do not know, flood;
Bridge do not know who is in his LAN originally!!
More than Bridges
Hubs: physical repeater (Not in Link Layer)
Physically star, logically bus, 2 lines per end (transmit and receive) (collion vulnerable)Bridges: inter-LAN (forward + addr learning)
2-4 ports for LANs
as aboveLayer2 Switches: inter-LAN/host (Bridge + collision free)
many many ports for LANs/hostsLayer3 Switches: Not for now, a router in nature, but not the same as router (not in Internet)
After switched
Not traditional view! All dedicated links.
- buffer and forward
- Ethernet Protocol on each link
- each link has its own collision domain
- full deplux: Seperate send and receive
- Switching: two forward process diff src and dst can happen simutaneously. (Multiplying)
- No change when adopting switch / adding more switches (scale easily)
- Station has dedicated capacity equal to original LAN (No summarize self understand)
- Interconnectable: switch – switch
- switch self-learning: like the former ones (必考)
drawbacks (in large scales) (not intepretale)
- Broadcast overload
- Lack of multiple paths
(Rubbish did not finish it!没完事)
Wireless
Philosophy
Wireless, Mobility(handoff, mobile change base station, some wired also needs)
Components
- wireless hosts:
user devices, stationary/mobile - base station:
- relay: connect wired and wireless
- cell towers, APs…
- wireless link:
- typically mobile - base station
- also used in backbone link, saving cable cost
- MAC
- different rates and dis…
Model
- Infrastucture mode: base – mobile
like master&slave
save mobiles’ power - Ad-hoc mode: hosts self-organization, route among themselves

Characteristics
Attenuation:
FSPL: Free Space Path Loss
d: distance, f: frequency, c: speed of light
Reflection, diffraction, absorption, terrain contours (urban, rural, vegetation), humidity
SNR: Signal-to-Noise Ratio
BER: Bit Error Rate
Curve:
Same physical layer: Power↑, SNR↑, BER↓
Dynamic SNR ↑↓ 👉 adapt Physical layer
Multipath propagation
Self-interference
Interference from other sources
e.g. 2.4GHz-phone-motors-microwave
- Broadcast medium
- Anybody in proximity can hear and interfere
- Cannot receive while transmitting
- Our own (or nearby) transmission is deafening our receiver
Half-duplex - Recently may full duplex
- Our own (or nearby) transmission is deafening our receiver
- transmission always not intact
- Multiple wireless senders and receivers create many problems
Mechanism WiFi - IEEE 802.11 Wireless LANs
Architecture
- station:device with IEEE 802.11 MAC and phy layer
- AP(Access Point): wired – wireless
- BSS(Basic Service Set): a cell coordinated by AP
- ESS(Extended Service Set): multiple BSSs interconnected by DS(Distribution System)(switches, wired network, wireless network)
ESS appears like a single logical LAN
Portals for Internet access.


对比有线帧和无线帧
Association
- 802.11b: spectrum divided into 11 channels
AP admin (?) : choose channel.🤬
host associate with an AP - Scanning
- Passive
- Positive
Then DHCP!
- Passive
MAC
- Hidden terminal problem:
A & C do not know each other so they keep twittering. - **Solution: 4 Frame Exchange**
- Exposed terminal problem: A sends to B, C sends to D, A cannot send to B
thinking inference but actually not.
- Signal fading:
A & C can not hear each other with inference of B
🤬Really Ambiguous Terms:
Actually according to Wikipedia , Medium Access Control, Media Access Control, Multiple Access Control, MAC are exactly the same thing!


3-level IFS
We use shorter IFS for high priority so that the high-order one can steal the turn at the gap between the lower ones.
- SIFS (Short Inter Frame Space)
- Shortest gap (IFS), highest priority
- “all immediate responses”:
- LLC PDU
- Poll response: as below
- ACK/CTS/RTS
- PIFS (point coordination function IFS)
- Mid-length IFS
- Used by the centralized controller in PCF scheme when issuing polls
- DIFS (distributed coordination function IFS)
- Longest IFS, for asynchronous frames contending for access ??
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
DCF(Distributed Coordination Function)
CSMA/CA
- but not CD
- difficult to detect collision for fading (can not distinguish from noise and own trans)
- can detect all hidden terminal…
- very much error, so use ACK
PCF(Point Coordination Function)
- use centralized controller to polling master (point coordinator)
PIFS(seize the gap, lock out all asynchronous frames) - for time-sensitive traffic (QoS)
- always AP be the point coordinator
- has a polling list
- poll stations in round-robin
- the IFS is shown below 👇
- a superframe is divided into CFP and CP
- CFP for PCF & CP for DCF
- theoretically, CFP can take up all the superframe, however practically 30%-50%(GPT)
🤬P38: 如何理解下面那个if..if, 间隔怎么选取?图上和ppt不一致
🤬P39: 如何理解第三条
LLC和MAC???????注意
区别MAC和MAC!!!!

🤬 超帧长度不是用beacon确定的吗,这算什么??
CSMA/CA:
No CD:
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Network Layer
every host/router has network layer
Function
encapsulate segment into datagram, and decapsulate it
Switching/Routing
determine routes(the forwarding table), shortest path…(multiple nodes)Forwarding
Move packet from input to output(within a single node)(determined by routing)
Error handling, queuing and schedulingConnection Setup
in some network arch, especially in virtual circuit switching
connection in network layer: between 2 hosts and intervening routers
connection in transport layer: between 2 processes
Service Model
- for individual packet:
- guarantee delivery
- guarantee + limited delay
- for stream:
- in-order
- minimum bandwidth
- limited of variety of inter-packet spacing
Aborted: ATM
Constant Bit Rate (CBR), Variable Bit Rate (VBR), Available Bit Rate (ABR) and Unspecified Bit Rate (UBR)
Good service, but too expensive!
Survivor: Internet
Only best-effort!
IP routers
infrastructure of teh Internet
Router capacity
has many types(Core, edge, small business)
Architecture:
ta plane.(actually lower in the protocol stack)
The lower plane is control plane.(actually higher in the protocol stack)
Input
(不确定)
Tasks
receive pkts
update IP header
Lookup:
challenge is speed!
<range, port> not <an IP, port> too much!
aggregated IPs
Longest prefix matching rule(enable precise control of routing)- high efficiency: tree data structure!
a 01 Trie!(必考:按表画树)
modern: hardwarize again: TCM 3-state chip! ASICs
- high efficiency: tree data structure!
queuing: if packet arrive faster than forwarding rate, then wait
Output

🤬这个和后面四层的那个差不多吧
Packet Classification
Simple: No classification, FIFO buffer, Drop-tail(when full, drop incoming packet)
Classify IP packets:
with Src/Dst IP, TCP Port, TOS, Type of protocol
some law issues on packet classification and schedule (Internet neutral Law)
Scheduler
must fast!
Priority scheduler: first serve nobility
Round-robin scheduler: each queue in turn
- Fair queuing (FQ): round-robin for packets of different size (fair in size or fair in sum)
- Weighted fair queueing (WFQ): serve proportional to weight
(See details at the Transport Layer Part)
Switching fabric
connect input to outputs
switching rate: often measured as $k\times input/outputlinerate
(line rate is single port rate)
- via Memory
control by a CPU, load & store, 2 memory access each datagram.
too slow! - via bus
share a bus
Bus contention: bus bandwidth will limit the switching speed - via Mesh
interconnected network like Banyan network…
initially designed for processor.
advanced design: fragment into fixed-len cells and switch
VC
VC network: Aborted by Internet but others may use it, like military
- Virtual circuit networks:
service provided on flow of pkts.
packet, but preserve resources for you
provide network-layer connection
(ATM, X.25…) - Datagram networks:
service provided on individual packets
network-layer connectionless

Connection setup
end hosts and intervening routers establish a path for virtual connection
use routing to find a suitable path
VC Implementation
- Path from source to dest
- VC number for each link along the path
- entries in forwarding table
- packet carries VC number but not dest address
- VC is changed at each hop
- switch on the path maintain state for each passing connection:
forwarding table list the new VC number for the incoming VC number
<Input port, Incoming VC, Output port, Outgoing VC>
Compare: datagram network forwarding table: <Dst IP address, Subnet Mask, output port> - Link/switch resources can be allocated to VC
Dedicated resources means predictable quality of service🤬这就不是三层了二层吧y用的属于甚至都是switch
Control:signaling protocols信令协议
Addressing
- Physical network address (Layer2)
- Inter-network address (Layer3)
unique address for all the end system and router - Application address (Layer4)
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Scope
- Global:
non-amibiguity through the world
a router may have more than one - Network attachment address
unique address for each interface in a network (Layer2) - Port address
just unique in a single host/router
Mode
- Unicast
- Broadcast
- Multicast
- Anycast
Point to some dst, just go to 1,
determined by routing

IP Protocol
- Internetworking: the nets could have different archtecture

- IP layer entity resides on each host and router
- Provides connectionless service (i.e. datagram mechanism)
Operations in General
Routing
- maintain static/dynamic routing tables
- adopt different policies (based on different algorithm)
- Problem: datagram may loop indefinitely
🤬为什么是TCPneed? ppt2-15
14 source routing???
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Fragmentation & Reassembly
packet length must fit the MTU(Maximum Transmission Unit)
- fragment at host and router
- at host: try to figure out min of MTU on the path to avoid fragmentation
- at router: if too large, fragment(host can not 100% avoid fragmentation)
- reassembly only at host
- router can not reassemble, for not all the fragments may pass a certain router
- may fail if some fragments get lost
- set the timer when the first fragment arrives
- if not all fragments arrive until timeout, discard all
Error Control
Due to best-effort, should offer some error control
Router should inform the src if the packet us discarded (checksum failure, TTL expired…) (Use ICMP
)
so src could inform higher layer protocol, so that higher layer protocol could offer better service quality.
Flow Control
- Allows routers to limit rate of incoming data
- Router discards packets when buffer is full
- May send source quench packets to sending host
- Using ICMP (refer to Router-assisted congestion control)
IP Function (According to Packet Format)
Header
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Max length of datagram:
Fragmentation (total length, identification, flags, fragment offset)
each link has its own MTU, router may fragment the datagram
only be reassembled at the destination
Only could be deliver to the higher layer when all the fragments arrive, for the higher layer protocol header will not be duplicated in each fragment. We can only parse the higher layer protocol information when all the frag
- identification of the originated datagram (so that we could group the fragments and reassemble them)
<Src addr, Dst addr, Upper layer, Data Unit ID> - Data length field: only the data part
- Offset: Unit: 8Byte
More flag
: indicate if the last fragment
Procedure
- prepare a enough buffer
- receive fragments and insert them to proper position
- from Offset 0 to a false-
More flag
datagram - reassemble them
Addressing (src address, dst address)
An example: Distribute address
netid is unique can administered by several organizations (ARIN,RIPE,APNIC…)
hostid is assigned within designated organizations
Network sum | Addr sum per nwk | ||
---|---|---|---|
typeA | All allocated | ||
typeB | All allocated | ||
typeC |
‘-2’: all-0 for network addr, all-1 for broadcast
127.0.0.1 loopback addr. “localhost”
Example: Subnet from inside and outside
Subnet和Subnet Mask略去了,没时间了,待补档
CIDR(Classless Inter-Domain Routing)
ISP can be regard as a set of subnets
Hierarchy: ISP -> subnets -> hosts
Hierarchy addressing supports route aggregation: more efficient routing broadcast
IP Suite
NAT(Network Address Translation)
translate when interfaces with broader Internet
enable inside and outside use different IP addr sets
Purpose
- act as firewall (only from inside to outside, hide internal IP)
- save IP addresses
- Isolate ISP/Organization change (just change the NAT)
Types
- Static NAT: 1:1 mapping (for server)
- Dynamic NAT: pool of public IP, assign for private (for client PCs in Intranet)
- Single-Address NAT: all private IP map to one public IP
Overloading/Masquerading/Network Address Port Translation (NAPT)🤬
1:1
Single Address (most common)
NAT identify the inside hosts’ packets by 4-layer address(port)
NAT那里的Skype部分,什么东西。🤬
NAT hinders P2P
Relaying (in Skype)
P9
ICMP(Internet Control Message Protocol, RFC792) (4 layer)
- transfer err and ctrl msgs
- encapsulated in datagram
- protocol type = 1
- not reliable
Format


ICMP types: maybe do not need to recite… ⏩ppt P13
some formats…
Use
Ping
- test dest reachability
- calculate round-trip time
- count hops (TTL)
traceroute:
- send UDP packets with increasing TTL(1,2,3…)
- reveal the nodes by TTL expired message
- suffer from dynamic routing
Path MTU
- send packets with DF(do not fragment) bit set
- router send back parameter unintelligible if too large
- decrease size and try until no error
- suffer from dynamic routing (main change during/after your test)
Mobile IP
mobile device’s IP pretend to be unchanged.
P24
Triangle:
P26
function
- Discovery
Tunnel : IP packet in IP packet?
IPv6
Motivation
- address exhaustion
- poor processing speed of header
- poor QoS support
- No fragmentation at routers
- ???
RFCs
4 to 6 Evolution

Routing
building routing tables for datagram networks
choose path and build routing tables for VC networks??
Characteristics
- efficiency
- resilience
- stability
- …
Design ELements (5)
Performance criteria

- minimum hop
- least cost
- what is cost?
- Minimum delay: queue length
- Largest throughput: 1/throughput
- what is cost?
Decision time & place
- time
- for each packet - datagram network
- when configure VC - VC network
- place(3 sorts)
- centralized coordination
- source routing
- distributed(by switches)
Network info source & info update timing
- source
- Local information
- Adjacent switches
- All switches in the network
- timing
- Update periodically
- Upon major changes in switches or links
- Fixed (manual configuration)
Strategy
Central(static)
- single fixed route for each src-dest
- least-cost algo
- reconfig when major change
🤬这个具体是什么?是P12的图吗?
Distributed: flooding
do not need any info
just flood the packet
- duplicates: many receive many copies
- cycle problem: may cause storm
- handle: hop count
- handle: hop count
Features
- very robust
- at least one packet will take the best route
can set up VC - visit all switch (useful to distribute information)
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
P15
Distributed: random
- do not need network info
- select one outgoing path for retransmission
- random / round & robin / probability calc
- suitable for strongly-connected network
- always not optimal
Distributed: Adaptive
- Used by almost all packet switching networks
- change as conditions on the network change
- Requires network info
Tradeoff: high quality info – overhead - Aid in congestion control???
An Isolated Example (only use local info)
Least Cost Algos (Use global info)
as below
Classic Algo
*Prim & Kruskal
Dijkstra
26页图即可
- complexity:
best - may happen oscillation:
cost = amount of carried traffic🤬如何解决震动问题?
Bellman-Ford
再细看
What if when link coat change?
“count to infinity” problem(这个存在于Dji上吗)
poisened reverse:
需要详细整理一下
Comparison
39页图怎么理解?
两种都是静态动态杰克
Scaled routing
flat routing is not practical
Hierarchical Routing
before 7 ..

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Peering Center is used to connect ISPs.
ASP’s AS could connect to Tier-1 ISP’s AS.
IGP(Interior Gateway Protocol): Intra-AS
- focus on performance
- AS-specific
RIP(Routing Information Protocol): distance vector(必考)
p13
#hop
router has Layer4/5 info in Ctrl Plane(for router’s own communication)
OSPF(Open Shortest Path First): link state(必考)
- each router can see the whole network
calculate the shortest path with Dijkstra
make a routing table: < dest, next hop >
Feature
- Hierarchical OSPF
AS is divided into areas(32-bit ID)
EGP(Exterior Gateway Protocol): Inter-AS
- focus on policy
11 的俩图
BGP(Border Gateway Protocol)
support AS’s autonomy and privacy
relationships
A –customer–> B
A –provider–> B
A <–peer–> B
AS advertise its best routes for IP/IP-prefix
BGP not picking shortest-path routes
best route based on policy (maybe least-cost)Selective route ad
AS could choose whether to ad its route to dest
Details
Sessions
- iBGP
Distribute external routes internally
Multicasting (必考)

- Duplicate packets & unicast v.s. multicast
D&U → a waste of bandwidth! - Multicast address → class D in IPv4
- Application: Multimedia, teleconference, distributed computing, etc.
make a Mcast minimal spanning tree and only duplicate only along the spanning tree (hard, NPC, no scalability)
Router’s responsibility:
- learn group membership
- identify links
- establish routing state
A concept: Mcast group
- the edge router is responsible for joining the Mcast group.
translate the IP or not(Unicast/Mcast) according to IGMP ProtocolP15/16
IGMP
P18/19
Operations
- Hosts
- Send reports to routers to subscribe to (join) and unsubscribe from (unjoin) multicast group
- Host need not explicitly unjoin group when leaving ???
- Routers
- Sends query info at regular intervals
- Host belonging to a mcast group must reply to query
P21-26
he
- On receipt, hosts start random timers (0~0s) for each multicast group they’re in.
- when count down, report Membership Report(TTL=1) , others hear so do not need report in this group (stop their timers).
- Routers hear all, if a group has no report, time out it.
v2
v3
- support Group-and-source specific query: The receiver can claim that it only has interest in specific Mcast source.
note 27 with 28!
before 34?????????
Mcast Routing
Group-shared tree and Source-based tree
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Choose
- PIM: Protocol Independent Multicast
- Not dependent on any specific underlying unicast routing algorithm (works with all)
- Sparse mode
- Group-shared tree, use center-based approach
- Group members widely dispersed, bandwidth not plentiful
- Dense mode
- Flood and prune: source-based tree, reverse path forwarding (Nearly same as DVMRP)
- group members densely packed, bandwidth more plentiful
Group-shared
MST with constraints(NPC)
NPC never works on Internet!
P42
- ..
- choose a router as center (it is also a problem to choose a good center (best:NPC)!)
Other routers to join:
- choose a router as center (it is also a problem to choose a good center (best:NPC)!)
- Edge router sends unicast join-msg addressed to center router
- join-msg processed by intermediate routers and forwarded towards center
- join-msg either hits existing tree branch for this center, or arrives at center
- Path taken by join-msg becomes new branch of tree for this router
Source-based
- Shortest path tree, reverse path *, use with(?) OSPF(Dijkstra)
- First make a broadcast tree:
P39
Now it is a broadcast tree. - Then pruning: The no-interest router report Prune upstream (if has no downstream group member) (How to wait?🤬)
- First make a broadcast tree:
DVMRP & MOSPF(buyaoqiu?)
Multicast routing based on shortest-path unicast protocol(undertand again)
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Application Layer Mcast
The Fairy of IP Mcast: …
So we implement Mcast in the application layer.
Transport Layer
peer-to-peer communication
routers have Layer4/5, however they do not talk with hosts.
break up messages to segments / reassemble segments to messages
motivations
- IP could not tell which application
- IP only provides best-effort service
- delay, loss, duplication, out-of-order
- traffic volume control
Role
Communication between processes(Compulsory)
- mux&demux from/to application processes
- Use Ports
- Multiplexing (Mux)
- Gather and combining data chunks from different applications → network layer
- Demultiplexing (Demux)
- Delivering correct data to corresponding sockets from a multiplexed stream
- Multiplexing (Mux)
provide common end-to-end service(optional)
- reliable, in-order, well-paced…
- facing diverse requirements of applications
socket: software interface (abstraction) to the transport layer for a process
Ports: 16-bit number
- well-known ports: 0-1023 (eg. 80 for webpage)
- registered ports: 1024-49151
- dynamic ports: 49152-65535
OS stores mapping between sockets and ports
- For UDP ports (SOCK_DGRAM)
- OS manage: <HostIP, Port>
socket
- OS manage: <HostIP, Port>
- For TCP ports (SOCK_STREAM)
- OS manage <HostIP, HostPort, RemoteIP, RemotePort>
socket
- OS manage <HostIP, HostPort, RemoteIP, RemotePort>
- For UDP ports (SOCK_DGRAM)

Demultiplexing
host uses IP addresses & port numbers to direct segment to appropriate socket
Connectionless demultiplexing
UDP like, all ports could exchange with a socket.
Only take dst socket into account when demultiplexing.
Connection-oriented demux
- like TCP
- socket id: <SrcIP, SrcPort, DstIP, DstPort>
- server host may support many simultaneous TCP sockets:
- web servers have different sockets for each connecting client
non-persistent HTTP will have different socket for each request

or like this (🤬有什么区别吗。。。)
Reliable transport
All bad things can happen…
- cheaksum -> detect corruption
- ACK & NACK -> detect loss
- Sequence number -> identify packets
- retransmission -> recover loss/corruption
- flow control -> prevent overflow
- timeoout -> decide when to retransmit
- forward error correction
- network encoding -> repair errors
Corruption handle
Handle:
Start from: Detect error: ACK/NACK
ACK still get corrupt: ACK/NACK with sequential number
packet get lost and so that no ACK
ACK get lost and cause timeout
Still have problem: no packet loss, delay cause duplicate.
early days: no formal methods! constant packups
Flow control (General Basis)
Stop-and-wait protocol
sender spin-wait ACK/NACK
or until TIMEOUT

network protocol significantly limits use of physical resources!
Pipeline protocols
sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts
- increasing range of sequence numbers
- buffering at sender and/or receiver


When n-pipeline’s n is proper, it can increase the utilization by n times! (until 1 ideally)
Sliding window: compare with the one above!🤬
Sliding window often called “packets in flight”
Acknowledgement
Cumulative ACKs: next in-order expected
Selective ACKs:
ACK individually acknowledges correctly received packet
Resending
Go-back-N(GBN)
Discards out-of-order packets (pkts other than
) cumulative acknowledgements
Sender sets timer for
if timeout, retransmit
better in low error rate, waste bandwidth
better for hardware, and only hardware can handle high-speed!
complex protocol and data structure: not a system way!
Selective-Repeat(SR)
treat the packets individually, but window front edge is determined by the smallest unACKed ID.
retransmit only packet
on timeout complex book-keeping: timer per packet
low speed, high error rate, low and expensive bandwidth: 5G
UDP(User Datagram Protocol) RFC 768
lightweight, provides the minimal service
- Best effort(may lost or out-of-order)
- connectionless, each segment is independent
- use in streaming multimedia apps, DNS, SNMP
Features
- no connection establishment
- simple, small header
- no congestion control
Format

Checksum
split as 16-bit integers, add them up, then take the 1’s complement
No error detected does not mean No error!
checksum is optional (source port is optional too)
Note: checksum是check哪些,见后面的TCP
💱Frontier:UDP is reviving due to oligopoly of Internet companies.
UDP+X protocol is used in these companies to provide better service.
UDP+QUIC is used in Google’s Chrome.
TCP(Transmission Control Protocol) RFC 793
必须记下包头格式
- Reliable, In-order and Byte stream(assume there is an incoming byte stream)
- Almost all the techs: checksum, seqnumber, sliding window, cumulative ack, out-of-order buffer, fast retrans, timeout estimate…
TCP functions (compared to header domains if exists)
必须掌握这个结构!
🤬大端存储还是小端存储?
Mux and DMux (Src and Dst Port)
Nothing much to say.
Content Reliable
Pseudo header + header + overhead: A blog
Order Reliable (Seq Number and ACK)
Early days: “Stream of bytes” service offer service for TTYs
Seq number is ‘Byte(not packet) offset’ (TCP is a byte stream)
pay attention that TCP is full-duplex, so each pkt should have both seqno and ACK domain. There is no implicit ACK pkt, but exchange the ACK knowledge in back-and0-forth data packets(piggybacked). You should not be misled by the diagram above.
TCP segment length:
1
2
3
4|------------------------MTU------------------------------|
|---------------IP data-------------------|-----IP Hdr----|
|------TCP data (segment)-----|--TCP Hdr--|
|-------------MSS-------------|So $MSS = MTU - len(IP
Hdr) - len(TCPHdr)$
MTU = 1500B for Ethernet
TCP >= 20BTCP Datagram Sent time:
注意加一减一,ACK是上一个这些事🤬
at receiver, user can choose GBN-like or SR-like(drop/buffer out-of-order pkts). The TCP protocol does not restrict it.
- Adopt cumulative + SR-like bitwise(for example)
ACK the smallest seqno unACKed packet, buffer the out-of-order packets. 🤬ppt上各种GBNlike,在哪呢?那不是cumu导致的吗,和GBN有什么关系?混为一谈! - Delay ACK:
- receiver will not ACK a in-order pkt, but wait another in-order pkt(for 500ms).(Otherwise, the acknowledgement actually never ‘accumulate’ for you always ACK a packet at once).
- If receiving a out-of-order packet, send dupACK at once.
- If the in-order packet fill the gap(unACKed holes in the sequence due to out-of-order) and is the first unACKed pkt, ACK at once.(If it fills the gap but not the lowest one, follow rule2)
- Deduction: receiver will send duplicate ACKs when a packet is lost
sender will know- packet is lost
- link is still alive(receive receive some other packets)
- Introduces fast retransmit: duplicate ACKs trigger early retransmission
TCP will retransmit the pkt the dupACKs point to when 3 duplicate ACKs are received - And a single timer for timeout(only time the first sent but unACKed pkt, namely reset it each time sender pkt LHS moves) retrans the first unACKed pkt 🤬如果就重传一个,那还有什么可说的书P164(namely first pkt in the window) on timeout.
How do we pick a timeout value?so should be **a bit longer** than (proportional to) RTT So how to measure RTT? See below:
measure RTT
- measure RTT: exp weighted average of sampled RTT
Notice: Usually the sample is taken every unretransed ACK (GPT) - Problem: Ambiguous measurement(tell ACK and retrans-ACK)
- Algos: as below
- Karn/Partridge Algorithm
- handle ambiguous: ignore the packet as long as it is retransed
- Insensitive to RTT variation 🤬
- handle ambiguous: ignore the packet as long as it is retransed
- Jacobson/Karels Algorithm
- take the variance of RTT into account: use DevRTT (exp average of Deviation)
- take the variance of RTT into account: use DevRTT (exp average of Deviation)
Header Length (Hdrlen)
Number of 4-byte words
HdrLen == 5 if no optional regions.
Control (Flags)

Connection Establishment
2-way handshake
2 SYNs to establish- retrans lost SYN, ignore duplicate SYN once connected
- Problem: slipped segments from old connections
- handle: set a new initial seq number(ISN) far from the previous for a new connection, SYN is in SYN i+1 form (pay attention to +1)
- Obsolete SYN
The A->B SYN and B->A SYN have different seqno. This may be used to implement Full-duplex comm.
- handle: 3-way
3-way handshake
🤬为什么非要加1? 0号包怎么你了! 而且3号包一定要seq number吗((3)seqno多少?GPT说三号包seqno=ISN_A+1)
Hens and Eggs Problem:
- SYN is lost: network/server is busy
- so no SYN-ACK
- so need to retransmits SYN on timeout
- but we do not know how to set timeout! (No any RTT)
handle:
- Default: 3 secs (RFC 1122 & 2988) (some use 6s)
- Sometimes shorter in data center
A SYN loss in Web scene
- User clicks on a hypertext link
- Browser creates a socket and does a “connect”
- “connect” triggers OS to transmit a SYN
- SYN is lost…
- 3-6 seconds of delay(very long)
- User may become impatient and retry
- triggers an “abort”
- Browser creates a new socket and another “connect”
- Can be effective in some cases
Connection Tear Down
Normal termination
🤬这里还是存在问题,当B对最后一个剩余任务发出ACK,那么如果它立即发出FIN,则可能会这个ACK没有被A收到, 从而引发发送端的TimeOut,但发送端在没TimeOut时候就已经收到了FIN(1/2 RTT) 那么如何处理?
TIME_WAIT相关(GPT)
等待足够的时间以允许所有数据包在网络中老化:
TIME_WAIT 状态持续约 2 倍的最大报文段生存时间(Maximum Segment Lifetime, MSL)。MSL 是任何给定的分组被认为在网络中自然消失前能存活的最长时间,通常这个时间设定是2分钟。因此,TIME_WAIT 通常持续 4 分钟(2MSL)。这一过程确保老的重复数据包不会在同一 TCP 端口上的新连接中被误认为是新的数据包。

Can combine FIN and ACK if no pkts left.
Abrupt termination
State Machine
🤬需要掌握到什么程度?

Flow control (Advertised Window)
So far the window size is constant (buffer size).
Left is sender, and right is receiver.
if consumer is slow, buffer will be full.
- straightforward handle: stop sending ACK
- however sender could not tell loss/flow ctrl (so not practical)
🤬简化模型吧,现实中如果来源不止一个,怎么回复RWND?
- handle: Credit scheme
receiver setsRWND
(advertised window) in ACKs
sender ensures in flight <=RWND
Left is receiver, right is sender.Additional window from
(W is RWND, AN is ACK number) - Benefit:
Decouple flow control from ACK
Each octet has a sequence number - lost ACK/CREDIT little harm: resync the protocol in future pkts
(for the ‘right pointer’(AN+W) of the window will never go left; and the sender always follow the newest one; and the new ACK/CREDIT does not depend on the old ones.)
- Benefit:
Example of credit
Problem: credit allocation deadlock
—-[ ]—-→ : closing rcv-window —-[ ]—x→ : to reopen, but lost - A thinks window is closed, B thinks it is open and wait
Handle: Use Window timer
if expired, send something (could retrans last segment)
The term is calledPersist Timer
only be set when the permitted window left is 0.Deadlock: due to piggybacked ACK, when RWND=0, sender wait for receiver, but receiver have nothing to send to sender, so it is stuck.
Handle: sender still send 1 byte when RWND = 0
🤬这地方没解决。什么时候ACK?这两个deadlock问题分别来自ppt和教材,但是方案有冲突,GPT给出的是混合体(如下)?另外丢失ACK的问题在不是等于0,而是sender用光了剩下的时候也存在啊?
TCP的Persist Timer用于处理零窗口大小(Zero Window)的情况,即当接收方通告的窗口大小为零时,发送方将无法发送更多数据。这种情况下,发送方需要知道何时可以再次开始发送数据,但由于窗口大小为零,它不能发送任何新的数据来探测窗口是否已经变大。Persist Timer就是为了解决这个问题而设计的。
Persist Timer启动的情况如下:
接收方通告窗口大小为零:当发送方收到一个窗口更新,指示接收方的窗口大小为零时,它将停止发送数据,并启动Persist Timer。
定期窗口探测:在Persist Timer超时后,发送方会发送一个小的探测报文(通常是1字节的数据),这个探测报文被称为窗口探测(Window Probe)。接收方收到窗口探测后,即使其窗口仍然为零,也应该响应,告诉发送方当前的窗口大小。
窗口大小更新:如果接收方的窗口大小从零变为非零,它会发送一个窗口更新给发送方。发送方收到这个更新后,会停止Persist Timer,并根据新的窗口大小恢复数据传输。
Congestion Control
TCP initially has no congestion control, then causes congestion collapse in 1986.
Jacobson’s fix
Adapt the window size in response to congestion
- pragmatic and effective
- required no changes to routers and apps
- extensively researched and improved upon
Congestion in a simple modeled view.
Real-world problem (much more complicated)
3 must-answered questions in Congestion.
- how to detect congestion.
- who take care of congestion
- how to handle congestion
❌Start from no care: Many packets drop(like UDP)
❌Reservations: negotiate to pre-arrange bandwidth before sending pkts
Low utilization❌Pricing: do not drop high-bidders (capitalism!)
requires payment model✔️Dynamic:
- hosts infer congestionand adjust
- network reports and hosts adjust
- or combine the two above
- generality is proved to be powerful
- not presume business model, traffic features, app requirements… therefore it is universal
- assume good citizenship
- simple to implement but suboptimal, messy dynamics(means hard to predict behaviour in real use)
TCP congestion control overall:
$sendingrate \approx \frac{WindowSize}{RTT}sizerate$
Congestion Window | FlowControl Window |
---|---|
CWND | RWND |
without overflowing routers | without overflowing receiver |
computed by sender | determined by recver and reported to sender |
We assume
in the course
so we do not care about Flow control when discussing Congestion control
In this course we talk about CWND in unit of MSS
Real implementation CWND is in unit of bytes
3 key issues when trying to control congestion
- Discovering the available (bottleneck) bandwidth
- Adjusting to variations in bandwidth
- Sharing bandwidth between flows
Detect congestion:
- packet delays: too much noise (delay often varies greatly) ❌
- router tells: too expensive ❌
- packet loss: ✔️
Fail-safe (But still has non-congestion loss, like in wireless)
no more detection overhead (already has to detect for reliable delivery)
Just like the what we talked above:- Duplicate ACKs: isolated loss
- Still getting ACKs
- Timeout: much more serious
- Not enough dupacks
- Must have suffered several losses
- Duplicate ACKs: isolated loss
will adjust rate differently for each case
Adjust rate:
How to adapt sender?
General: ACK new data then rate↑, loss then rate↓
- Discovering available bottle bandwidth: Slow start
start slow for safety but ramp up quickly for efficiency! - Adjusting to variations in bandwidth: Congestion avoidance
- increase rate slowly and focus on stability
- AIMD(Additive Increase Multiplicative Decrease)
Slow start
Start with CWND = 1
increase exponentially (
) until first loss Math:
for each ACK
So per RTT??? therefore (🤬有点奇怪,这里指数不对吧,当pipeline的时候,增长应该是连续的而不是离散的) 这样看还是离散的,到后期才会变成e指数
When
, stop slow start and enter congestion avoidance. or there is a loss
Collision Avoidance
- focus on maintenance. (when CWND>ssthresh)
- Additive increase
for each ACK
- Math:
therefore (Additive increase) - ‘CWND is increased by one only if all segments in a CWND have been acknowledged’
When loss happens
- Multiplicative decrease
- On 3 dupACKs (no matter in CA or SS)
- directly go to Collision Avoidance
- On timeout event
- Initiate Slow Start
- On 3 dupACKs (no matter in CA or SS)
A diagram to show SS,CA and Timeout
ssthresh 的初值是什么??
‘Sawtooth’
Problem: CA is too slow when recovering from isolated loss.
Please refer to the example
Solution: Fast recovery
- On 3 dupACKs (no matter in CA or SS)
- go to Fast recovery
- In fast recovery
for each dup ACK
- exit fast recovery
When receiving a new ACK(directly go to CA)


🤬这个也没涉及Fast-recovery啊…
Share bandwidth
Efficiency (fully utilize)& Fairness
Now prove AIMD to a proper choice.(compared with AIAD,MIAD,MIMD)
A model to evaluate the performance.
given a source state, we could calculate all the following states on the graph.
AIAD: does not converge to fairness
MIMD: does not converge to fairness
AIMD: converge to fairness
MIAD: does not converge to fairness and efficiency
State machine
It is never implemented in hardware…
for it is too complicated to support high frequency…
TCP flavors
- TCP-Tahoe
on 3 dupACK
s and timeout - TCP-Reno
on 3 dupACK
son timeout - TCP-NewReno
TCP-Reno + Fast recovery - TCP-SACK
Use selctive ACK but not cumulative ACK
All the TCP in the Internet corporates well, for they are “generally the same”
- increase CWND on good news
- decrease CWND on bad news
TCP Performance Metrics & Problems

Packet drop rate,
Throughput,
- if a flow has higher RTT, it will have low throughput. unfair!
- Implication for High-speed TCP:
not good for Large and Fat pipe: high bandwidth and high delay🤬怎么算的?
- Too high demand to under-layer robustness
it is impossible to transmit so much data without error! - Too long time from a drop to reach peak performance
- Handle:
- past a threshold speed then increase CWND (additive constant depend on CWND)
- Multiple connections(make many streams, Works up till know, like IDM)
- Router-assisted
- Too high demand to under-layer robustness
Old days: Best way for Long&Fat Pipe: Disks in Trucks!
- Given the speed swings from W/2 - W
so why dont we just start average it statistically?
Equation-based congestion control
ignore all the events, just set the transmission rate to a result of function, keep monitoring the drop rate and adjust the rate accordingly. - Other loss reasons(like wireless)
TCP cannot tell congestion/corrpution loss - Not friendly for short flows
most flows are short flow.- Short flows never leave slow start
- Too few packets to trigger dupACKs
so isolated loss may also need to waiting for timeout and severely impact the flow completion time
- Every flow try to overshoots capacity before get a drop.
So a large flow will dominate the small ones. all the small ones have to bear a large delay.
means that delays are large, and are large for everyone (opposite to the basic idea of multiplexing) - host control, so easy to cheat
- increase CWND faster
- use large init CWND
- use multiple connections
It is hard to supervise the network of cheating with only control on the host side.
So…👇
Router-assisted Congestion Control
Compared to router based control, host based is more like a hack. Now we introduce the router’s help and it can solve many problems.
Choke Packet
From router/destination to source
Backpressure
for Propagation time > transmission time case(long distance or high speed link), from router to source is not effective
Warning Bit
Explicit Congestion Notification(ECN)
a bit in IP header, set by congested router
- if data pkt has ECN bit set, receiver sends ACK with ECN bit set
- routers can set ECN bit with diverse strategies
trade-off between utilization and delay (more precise strategy means more delay) - ECN semantics can be like loss
end host can react to ECN as if it saw a loss
Advantages:
- distinguish corruption with congestion;
- Can serve as an early indicator of congestion to avoid delays
- Easy (easier) to incrementally deploy
Today: (RFC3168) using ToS/DSCP bits in the IP header, Common in datacenters
- For business: Whenever I get an ECN bit set(it means still want to send if keep receiving ECN), I will be charged.
RED (Random Early Discard)
Handle TCP global synchronization problem
Traffic burst simultaneously fills queues so pkts lost, TCP connections enter slow start simultaneously, so traffic drops, network underutilized, and burst and underutilize and burst…
RED: Randomly discard pkts before buffer becomes completely full (over a certain threshold)
calculate
: discard : queue the packet : probobality discard
Fairness:
- Routers classify packets into “flows” (assume flows are TCP connections)
- Each flow has its own FIFO queue in router
- Router serves flows fairly
- when line is free, take packet from next flow in a fair order
Shen me ta ma de jiao zuo fairness!
Max-Min fairness
- Calculate the average bandwidth requirement
- offer full service to the flow under average
- Remove the served flow
- Loop: jmp to 1, until no fully-served flow
🔱If you don’t get full demand, no one gets more than you
When all packets are the same size, it equals round-robin
🤬如何不serve all?以个数为单位??包不一样大为什么不行?
cope with packets with different sizes
Ideal: bitwise fairness (but no practical)
approximate it
Fair Queuing
- compute when the last bit of each packet leaves the router if flows are served bit-by-bit (share the bandwidth equally)
- serve packets in the increasing order of their deadlines
WFQ: more general, flows have different bandwidth share.
today some form of WFQ implemented in almost all routers.
pros:
- cheating flow cannot occupy more bandwidth
- doesn’t have RTT-based unfairness (
unfairness) - Flows can pick any rate adjustment scheme they want 🤬怎么理解?
cons:
- more complex than FIFO, per flow queue, per packet bookkeeping.
Today’s routers use more coarse-grained WFQ
FQ does not eliminate congestion, but manages congestion.
A case
Congestion still exists, but links are used more efficiently.
round-table discussion
Fairness have different standard. flow sum, congest hop sum…
Flow have different definition. TCP connection, src-dst pair, src itself…
reduce congestion: router can add ‘fair share(n.)’ to packet to inform end-host what rate they should send at… (still need policing) (Any end-host automatic behaviour needs policing!)
Multimedia
This part of content seem to be deleted in the Eighth Version

Analog → Digital Signal
- Audio
- telephone: 8K Samples/sec
Example quantization: 1 B/Sample
CD: 44K Samples - Bit rates
CD 1.411Mbps
MP3: 96,128,160 kbps
Internet Telephony: 5.3 kbps↑
- telephone: 8K Samples/sec
- Video
video: sequence of imgs, image: bit map
coding using redundancy: spatial and temporal- spatial: neighbor same RGB
- temporal: not change RGB between frames in the same place
- MPEG1 for CD-ROM 1.5Mbps
MPEG2 for DVD 3-6Mbps
MPEG4 for Internet videos <1Mbps
Services
- Streaming, stored audio/video
- streaming: can begin playout before downloading entire file
- stored (at server): can transmit faster than audio/video will be rendered (could be stored/buffered at client)
- e.g. YouTube, Netflix, Hulu
- Conversational voice/video over IP
- interactive nature of human-to-human conversation limits delay tolerance
- e.g., Skype
- Streaming live audio, video
- e.g., live sporting event (⚽)


- challenges: variable nwk delay(jitter), interactivity(pause, fast-fwd, rewind, jump), packet loss and retrans
- client-side buffering and playout delay:
compensate for network-added delay, delay jitter

(CBR
means constant bit rate)
if keeping
Trade-off: Long-short pre-cache time
too much wait – video lag
Streaming
with UDP 🤬RTP (RFC 2316)
server sends at constant rate
- (ususally
) - sender don’t care the congestion rate
- 2-5s short playout delay
- error recovery: application-level, time permitting🤬
- but always blocked by firewalls nowadays.
with HTTP

- via HTTP GET🤬
- send at maximum possible rate under TCP
- fill rate fluctuates due to TCP congestion control, retransmissions (in-order delivery)
- larger playout delay: smooth TCP delivery rate🤬
- HTTP/TCP passes more easily through firewalls
with DASH(Dynamic Adaptive Streaming over HTTP)
server:
- divides video file into multiple chunks
- each chunk stored, encoded at different rates
- manifest file(.mpd): provides different URLs for different chunks
could offer diff coding rates, diff URLs for each chunk.
client:
- periodically measures server-to-client bandwidth
- consulting manifest and requests one chunk at a time
- choose maximum coding rate sustainable given current bandwidth
- can alter the choice at different time according to bandwidth
“intelligence” at client: client determines
- when to request chunk (so no buffer starvation/overflow)
- what encoding rate to request (quality according to bandwidth available)
- where to request chunk (can request from URL server that is “close” to client or has high available bandwidth)
Voice-over-IP (VoIP)
VoIP end-end-delay requirement (maintain “conversational” aspect)
- higher delays noticeable, interactivity
- <150msec: good ; > 400msec: bad (max tolerable)
- includes app-level (packetization,playout) & network delays
session initialization???
value-added services: call forwarding, screening (means filtering the calls, like blacklist), recording
emergency services:911
Loss
- network loss: IP datagram lost due to network congestion
- delay loss: IP datagram arrives too late
- delays: processing, queueing in network; endsystem (sender, receiver) delays
- tolerance: between
and
(depending on voice encoding, loss concealment…)
A case: Skype
Problem: Alice,Bob behind NATs
NAT denies outside→inside connection
Handle: relay
A,B maintain open connection with their SNs (servo)
callee’s SN connect callee by initially initiated connection
Proprietary application-layer protocol 🤬?
QoS (Quality of Service)
- differentiated new traffic in Internet
- low bandwidth & high addtional value: Secret call, Multiplayer video game, email…
- high bandwidth & low addtional value: file download, web videos…
- within QoS
ISA(Integrated Services Architecture)
a distinguishable stream of IP packets → a flow
- With the same QOS parameters
- ID: <SrcIP, DstIP, port, protocol type>
- Unidirectional, can be multicast
Architecture
ISA Functions on a router
- Routing Algorithm
- Link cost based on QOS parameters, not just delay
- Routing/forwarding based on which class of QoS the flow belongs to
- Queuing discipline
- Priority queuing
- Multiple queues, taking account of different flow requirements
- Discard policy
- Selective not just the new ones
resource reservation
- Providing QoS for individual app sessions
- routers maintain state info of allocated resources for each session (like a virtual circuit)
- could deny the request if can not guarantee
with in a intermediate router:
soft state: state which has a lifespan and regularly be refreshed/aged
Scope of the request:reservation request should be propagated
to all sender hostsdisadvantage:
- High cost: soft state for each flow, not scalable
- too complicated architecture
- RSVP will reduce to best-effort if any router does not support it in the path
- few service levels are defined, not flexible
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
DS (Differentiated Services) (RFC 2475)
- divide traffic into classes, and treat them differently
- simple func in network core but complex in edge routers
Type of Service
field in IPv4 /Traffic Class
field in IPv6- define a SLA(Service Level Agreement) between customer and ISP before the service
- APP select DS, and traffic with same DS will be treated consistently
PHB??
DS Domain

- portion of Internet with a certain set of DS policies
- provided by a single ISP or union of ISPs
- ISP configures domain edge routers
- Customer may be hosts or edge routers in other domain (as above)
- Ongoing measure of performance provided for each class (ensure the service quality)
- when handle traffic from neighbor DS domain, choose the most fit class in self policy set for the traffic.
DS Architecture
for Edge router
Per-flow (each traffic flow will be check individually) traffic management
Marking: classify the flows into different QoSs (set the header
TOS
/TC
field)Policing:
distinguish packets with in-profile and out-profile: monitor if the flow obeys the SLA, if true then in-profile otherwise then out-profile- monitor average rate, peak rate and burst size
- Shaping: shape the traffic to in-profile
- Leaky Bucket
departure: fixed-rated traffic
may drop pkts when bucket is full
- Leaky Bucket
🤬都整形了,还要标记干什么?
Marking and policing are not seperate process, when you police the flow (in/out profile) you may mark them to tell other routers.(so-called intra-class marking)
for Internal router
- Per-class traffic management (do not distinguish the pkts from different flows in the same class)
- Buffering and scheduling based on marks(header)
- in-profile will be preferred, out-profile traffic has low priority or be dropped.
Network Security
The conceptual model
Attack overview
The victim of attacks: Web browser/server/banking server,client/DNS server/Routers..
- Active easy to detect, but hard to prevent
- Passive can be prevent, but hard to detect
Cryptography: handle interception
Encryption key ->
, Decryption key ->
- Symmetric key: sender key = receiver key
- Public key crypto:
public, secret
Requirements
- Even attacker knows your crypto algo, he could not decrypt it.
- Even attacker knows some ‘cipher txt - plain txt’ pair, he could not work out the key
- Even attacker knows many cipher txts, he could not work out the key / get the plaintxt
- Should not be easily decrypt by Brute Force
Key distribution is a problem(safely get the keys), shown below.
Attacks
- CipherTxt-only attack:
- Brute force try all the keys
- Cryptanalysis:
try to deduce PlainTxt/key by analyzing the nature of crypto algo or general characteristics of PlainTxt/CipherTxt
- Known-PlainTxt attack:
- attacker knows some PlainTxt-CipherTxt pairs to deduce algo/key
- Chosen-PlainTxt attack:
- attacker can choose some PlainTxt and get the CipherTxt in some way to deduce algo/key
Methods
⛓️:这些算法能考吗
Symmetric Key Crypto

Traditional ones
- Substitution: alphabet replacement
- Caesar cipher: shift the alphabet by k
- key space size: 26
- attack: BF
- Monoalphabetic cipher: just map the alphabet to a mess up one(no repeated)
- key space size: 26!
- attack: frequency analysis(letter/word/letter pair in specific language)
- Vigenere cipher: do Caesar cipher for each letter in the PlainTxt, the shift value is determined by each letter in the key in cycle
- a letter in PlainTxt can be mapped into different letters in CipherTxt (most k possibilities)
- key space size:
- attack: try to find the
, and attack each group as Caesar cipher
- Caesar cipher: shift the alphabet by k
- Transposition (Permutation) methods:
- Rail Fence cipher: write the PlainTxt in row and read it in column
- key space size: at most
- attack: BF
- key space size: at most
- Row-Column Cipher: write the PlainTxt in row and read it in column(column sum = key length, so that each letter of the key corresponds a column), the read order of column is the alphabet order of the key
- attack: can just try.
- pure transposition can be easily attacked by Known-PlainTxt attack
- Rail Fence cipher: write the PlainTxt in row and read it in column
Block Cipher
Scheme: Feistel Cipher
allow encryption and decryption to be done with the same algorithm (hardware/software implementation)
- DES: Data Encryption Standard
- key space size:
- attack: no backdoor, just BF
not so slow for modern comupters(let alone the specially designed computer) so useless now.
- key space size:
- TDES:Triple DES
- 3 keys, 3 rounds
- key space size:
- EDE(Encrypt-Decrypt-Encrypt) or EEE(Encrypt-Encrypt-Encrypt)
- too slow, block size small
- AES: Advanced Encryption Standard
- 128 bits block, 128/192/256 bits key
- 10/12/14 rounds(for 128/192/256 bits key)
- key space size:
/ /
⛓️ AES Details
Public Key Crypto
based on mathematical algorithms
Asymmetric key: sender key
each one has a pair of keys:
- public key known to all
- private key known by self

Requirements
- algo should be public
- given algo/PUBkey, attacker could not work out PRVkey
- given the cipherTxt should not work out the PlainTxt
- given CipherTxt-PlainTxt pair and PUBkey should not work out PRVkey
RSA
message is bits, and bits can be seen as a number
- Pick 2 large prime numbers
and (a large one at least 1024 bits each, we take 1024 as an example) (Euler’s totient function) Note that this is the foundation of RSA. Given
, it is hard to work out . The algorithm of working out depends on the factorization of .(factorization has no solution up to now(though it is not proved to be ). And N is a large number( ), so attacker can not work out with Brute Force.
However, due toand are prime numbers, so that the key generation could calculate directly with the formula above.
So the core isand . If and are somehow known by attacker, then he could replay this process to break the RSA. - Pick
so that and is coprime with ( ) e is relatively small, usually 65537, so it not so hard to check if
is coprime with - Compute
so that ( is the modular multiplicative inverse of mod “模逆元”) We could use the Extended Euclidean Algorithm to work out
.
It’s complexity is, namely about 2048 for our example. It is not so slow. - so that
and - then we utilize
Proof: (A brief one, only for
, a full one ) (Euler’s theorem)
due to,
then - So
Encrypt:, and send
Decrypt:Msg
must be smaller than , otherwise, you can only get in the context of modular arithmetic.
Some considerations
and should be about the same size and should be unlated should be relatively small(easy to encrypt) should be large enough to prevent the attack(Bruteforce)
reverse of RSA: the owner could authenticate himself.
- append a signature to the message
- receiver could verify the signature by
- verify it is from A for only A knows
In practice: session keys
- RSA is really slow compared to symmetric key crypto
so…act as a secure key exchange method
- A and B exchange session key
with RSA (establish a secure connection) - A and B communicate using symmetric key crypto with symmetric key
Authentication: handle modification & Fabrication
🤬首先这几个分工写的对吗?其次MAC也不能解决Fabrication啊
- v1.0: Declare who u r
Easy to cheat. - v2.0: Declare who u r + your IP
Also easy to cheat for professional. - v3.0: Declare who u r + your IP + shared pwd
Playback Attack: record the request and play it again - v3.1: Declare who u r + your IP + encrypted shared pwd
Playback Attack still works - v4.0:
— Request–→,
←– disposable Random(i.e. nonce) —
—–→
How to exchange shared symmetric key - v5.0: use nonce + public key
— Request –→
←– Nonce—
—–→
←– ask for pub key —
—–→
then B computeknow it is Alice evidence: only Alice know
v5.0 still has a hole: man-in-the-middle attack
How to? : just because only Alice knows
, so Bob can not tell if the keys are Alice’s or not.
man-in-the-middle attack can be conducted silently
胡想:在mim攻击中 delay may have relation with size?(没意义,本来也是有关的)
Role
to ensure
- come from the right person
- have not changed message
By
- creating an authenticator
- send the message with the authenticator
- receiver verify the authenticator
handling
- Source impersonation / spoofing
- Message injection / modification
- Message re-sequencing / replaying
MAC

notice distinguishing ‘secret’(n.) and ‘secret’(adj.) key
I will refer to secret key as skey.
Notice that skey is a kind of secret(n.)
When it comes to secret(n.), I usefor short.
Usually 64-256 bits
can be sent in clear text’
MAC is
Operablity
- $fixed
length = f_{MAC}(anylength)$ - Easy to compute
- $fixed
Security:
- One-way: hard to find
- Weak Collision Resistance: hard to find
so that - Strong Collision Resistance: hard to find any
- One-way: hard to find
Use Crypto func(with skey) upon
- CBC-MAC (Cipher Block Chaining Message Authentication Code)
(n bits each)
encryption algo( is the skey of Crypto func) 什么🤬 (Initialization Vector) random n-bit block for i.e. first and last block
- CBC-MAC (Cipher Block Chaining Message Authentication Code)
Use Hash func upon
and 这里应该写错了padding bits 最多可以有512bits🤬
- MD5
The given demonstration and the diagram on Wikipedia are poor, so I make it on my own.Holes have been found in MD5.
Xiayun Wang found the high-performance algo of MD5 collision.
Wikipedia - SHA-1
- 如何把secretkey引入Hash func HMAC🤬
- MD5
Digital Signature
🤬just as reverse: RSA
A sign the document and others can prove A made the document
digital signature is Signed MAC (MAC with linked person)
🤬还是不能解决pizza prank啊
🤬所谓的MAC 和 Digital Signature,最主要的区别就是,MAC用的是对称加密,Digital S 用的是非对称加密吧
Key Distribution: handle Fabrication
2 Aims:
- Sender and receiver should share common secret
- Others should know A’s
is actually A’s
against aim1:
Record-and-playback attack
man-in-middle is a kind of Record-and-replay attack(as above)
normal record-and-playback can be prevented by nonce/timestamp
🤬这也不是重放攻击啊
against aim2:
“pizza prank”
the attacker could encrypt the request with his own
Diffie-Hellman Key Xchange(考): to share secret
A public large prime
🤬DH还是会被Pizza prank攻击吧
Trusted Certification Authority: to share with help of CA

2 kinds of keys:
- Session key: just used for a session
- Permanent key: used for a long time(for distributing session keys)
CA determine the validity of the sender and receiver, and provide the session key to them.
- security service module(SSM): working at the host, like an agent for application layer, offering end-to-end encryption and obtain the key.
- The Needham-Schroeder Protocol
keep the forward secrecy: (this part is not a subsection of Trusted Certification Authority)
means that though the key is leaked, the previous messages are still secure.
use one-time session key
- prepare a session key
- encrypt
with of the receiver - encrypt
with - send the encrypted
and - receiver decrypt to get the
with his , then decrypt the with
please refer to
🤬这里用的非对称加密,用DH呢?
Public Key Certificate: to verify the owner of public key
the key owner will register his claim to the CA, and CA will issue a certificate to the owner.
just like a signature of the key, but this signature is not encrypt with owners’ own PRVkey, but a ‘could-not-be-pretended’ CA ‘s PRVkey.
Everyone should be able to find the real CA to check the certificate.
CA could be government or some else.
Content of a certificate:
- serial number
- info about certificate owner(algorithm, key value)
- info about issuer
- valid dates
- digital signature from the issuer(what to be check)
- fingerprint/thumbprint(hash value to check the integrity of the certificate)
Real case: Secure e-mail
Confidentiality: Use session key
Integrity: use MAC(🤬HMAC?)
Combine them!
Then Alice & Bob achieve Confidentiality + Integrity (only this two)
SSL
⛓️整个P3,为什么
https的s
4 🐒
SSL is a sub layer between TCP and Apps
愤怒🤬PGP是什么东西,这翻译,陈鸣老登还有脸把这书出版?
PGP just like the email system above.
直跳13.。。
Application Layer
🐒1-4
Process: program running within a host
hjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Architectures
CS(Client-Server)
Client:
- Start as required then init contact with server first
- Host may have dynamic IP addresses
- e.g. Web: client implemented in browser; Email: in mail reader
Server:
- Run as daemon (always-on)
- Provides requested service to Client
- Host has permanent IP address
- e.g. Web server sends requested Web page, mail server delivers Email
P2P(Peer-to-Peer)
- No always-on server
- free from big-brother
- Arbitrary end systems directly communicate
- peers request service from other peers, provide service in return to other peers
- self scalability - new peers bring new service capacity, as well as new service demands
- Peers are intermittently connected and change IP addresses
- Highly scalable but difficult to manage
- Examples: Gnutella, BitTorrent, Skype
Skype
- centralized server for authentication
- decentralized P2P for communication
App: DNS (Domain Name Service)
- Translates human-readable names to IP addresses
- Distributed database implemented in hierarchy of DNS servers
- Load balancing: multiple IP addresses for one name
Goal
- Uniqueness: no conflicts
- Scalability
- Distributed, autonomous administration
- Ability to update my own (machines’) names
- Don’t have to track everybody’s updates
- Highly available
- Lookups are fast
- Perfect consistency is a non-goal
Hierarchy
- Hierarchical namespace
A tree structureTop Level Domains(if top domain is not country region, it’s the US’s!)
name is the leaf-to-root path
depth of the tree is arbitrary(<128) - Hierarchical administration
- Hierarchical server
- TLD(Top-level domain) server
- local DNS server
helper the user
Reliability
- Replicated DNS servers (primary/secondary)
- Name service available if at least one replica is up
- Queries can be load-balanced between replicas
- Usually, UDP used for queries
- Reliability, if needed, must be implemented on UDP
- Try alternate servers on timeout
- Exponential backoff when retrying same server
- Same identifier for all queries(when have multiple DNS server)
- Don’ t care which server responds
Fast lookup
caching
Performing all these queries takes time
Up to1-second latency before starting download
Caching can greatly reduce overhead
The top-level servers very rarely change
Popular sites (e.g., www.cnn.com ) visited often
Local DNS server often has the information cached
How DNS caching works
DNS servers cache responses to queries
Responses include a “time to live” (TTL) field
Server deletes cached entry after TTL expires
Attacks
DDoS Attack
Redirect Attack
- man-in-middle: intercept queries.
App: Email
- One of most heavily used apps on Internet
- SMTP(Simple Mail Transfer Protocol): simple text messages
- MIME(Multi-purpose Internet Mail Extension): other types of data
- POP: Post Office Protocol
- Msg retrieval from server, including authorization and download
IMAP: Internet Mail Access Protocol
- Manipulation of stored msgs on server
User Agent
- Composing, editing, reading mail messages
- e.g. Eudora, Outlook, Foxmail, Netscape Messenger
- Outgoing, incoming mail messages stored on server
Mail Servers (Host)
- Mailbox contains incoming mail messages for user
- Message queue of outgoing mail messages
- SMTP protocol between mail servers to send mail messages
- Post title:Notes for Computer Network
- Post author:Jackcui
- Create time:2024-02-26 13:14:44
- Post link:https://jackcuii.github.io/2024/02/26/ComputerNetwork/
- Copyright Notice:All articles in this blog are licensed under BY-NC-SA unless stating additionally.