Notes for Computer Network

L1-1 还有补充
L1-2 未完成
L2-1 未完成
L2-2 还有订正
L2-3 未完成
L2-4 几乎完成，残留问题已标注，最后没讲完！！！
- 同步问题
L2-5
L4-1 🤬🤬
L4-2 🤬🤬🤬🤬🤬🤬

Chapter1 Overview

Network Composition

Protocol: (Definition in Textbook P6)

Protocol is a state machine.
No protocol ensures 100% reliability.

Internet includes Computing devices, Communication links and Routers.

Three key parts:

Network Edge
Host/Endpoint System: (主机/端系统)
includes Client and Server
Network Core
mainly the switch
two model
- client/server model
- P2P model
two switching
- circuit switching
- packet switching
The core concept of switching: Statistical Multiplexing
Access Networks
Endpoint system to Edge router.
- Home: DSL, Cable(HFC), FTTH…
- Enterprise(and home): LAN: Ethernet, WiFi…
- Wide-area wireless: 3G, 4G, LTE, 5G…
Physical medium: Copper(UTP), Fiber, Radio, Satellite.
Guided & Unguided

Distinct the three parts:

From ISO-OSI to modern Internet Architecture

TCP/IP Protocol Stack

Layer encapsulation: Protocol headers

More detailed:

Network Performance

Delay
- Transmission Delay (‘The toll station’)
  Push data into the link:
  
  Bandwidth ↑ , Transmission Delay ↓
- Propagation Delay (‘The car on the highway’)
  Data run from an end to another
  a graph to demonstrate:
- Queueing Delay
  Packets wait for transmission
  
  Persistent queueing causes packet loss.
  Little’s Law:
  for length of queue(average), for arriving rate(average), for waiting time.
- Processing Delay
  Negligible
Loss
We care about loss rate most.
Throughput

Therefore
(the one is bottleneck link)

Link Layer

Traditional: supported by adaptor(hardware) and driver(software)
New demand: GPU directed linked net interface ‘Hardwarize’

Different types of links

Wired point-to-point links
Wired multiple access links (LANs)
Wireless links (WiFi)

Service

Framing: encapsulate into frames and extract
Link access
- point-to-point or broadcast
- Multiple Access Control(MAC): avoid the collision(different nodes speak simutaneously) no out-of-band
  use MAC protocol:
  - Channel partitioning: Divide channel into pieces
  - Taking turns
  - Random access: “Recover” from collisions
Reliable delivery
- Flow Control: Ensuring the sender not overwhelm the receiver i.e. preventing buffer overflow
  - Stop and Wait good at large frames
  - Sliding window
    - the sender’s window is the safe region, between the sent(know) and the done(told by receiver)+capability
    - the receiver’s window is between done+capability and accepted
  the decode field of the frame No. is
Error Detection & correction: Data + EDC (Redundancy)
- Parity checking
  detect: 1D, D+C: 2D (as below)

LAN

Early: Token Ring
Ethernet:
- topology:
  - early: bus broadcast
  - modern: switch
Preamble： Where does a bit begin? Physical Layer copes with it.
- Framing: Where does a frame begin?
  - sentinel bits (必考)

Lab note

Mininet
sudo mn -h : help
sudo mn : Open CLI
> help
> nodes : display nodes
> net : display links
> dump : detailed info
> [node] [command] [args] : run command in a certain node
> [node] xterm : open a seperate commandline for [node]
> exit
sudo mn -c : clean up
the given exmaple could not run compatibly in Py3…
Wireshark:
Quick Check
Switchyard
Testcase syntax:

没完事上面的 (Lecture 3)

Link Access: MAC

Ideal:

R bps for one
R/M bps for M
fully-decentralized
Simple

Channel Partitioning

TDMA(time division multiple access):
fixed time slot
FDMA(frequency division multiple access):
fixed frequency band
CDMA(code division multiple access):
Details in the future

Taking Turns

Polling: Master & Slaves:
master asks slaves in turn
- cons: polling overhead, latency, single failure
- use: Bluetooth: very weak(low-power) terminals
Token passing:
- cons: token overhead, latency, single point failure.
- use: IBM Token Ring

Random Access

Carrier Sense, Collision detection, Randomness

ALOHA

ALOHA workflow           +------------------------+
           ↓                        |
  Sender --+-- ACK --- fine         |
           |                        ↑
           +-- !ACK -+- p = P --- Resend
                     |
                     +-- p = 1-P --- Wait
  
Enough repeatations : Abort

  Receiver -- checkpass -- ACK

What about ACK?
It is a simplified model, so we assume ACK is finished immediately and within the slot. (Once reach then accept)

effciency < 20%

slotted ALOHA
Time in uniform slots = frame transmission time (T0)
synchronized
also p=P retransmission

effciency approach theoritical 37%
CSMA(Carrier Sense Multiple Access) (载波侦听)
still has collisions: for propogation delay
- use: Ethernet

Nonpersistent CSMA (really deferential)

workflowSender --- Listen --+-- Idle ---Send
             ↑      |
             |      +-- Busy --- Wait rand(t)
             |                       ↓
             +-----------------------+   

1-persistent CSMA (selfish)

workflowSender --- Listen --+-- Idle ---Send
             ↑      |
             |      +-- Busy
             |           ↓
             +-----------+   

cons: many waiting -> guarantee collision…

p-persistent CSMA

workflow             +-------------------------------------+
             ↓                                     |
Sender --- Listen --+-- Idle --+-- p = P --- Send  |
             ↑      |          |                   |
             |      +-- Busy   +-- p = 1-P --- Delay t 
             |           ↓
             +-----------+   

aim of p: avoid instability under heavy load
ideal value: $p=\frac{1}{Waiting~~stations~~sum}$
cons: usually gives really long delay…

CSMA/CD(collison detection) (based on 1 persistent)

                                   +-- Wait rand(t) --- Send Jam & Abort
                                   |                      |
Sender --- Listen --+-- Idle --- Send --- Collision? --- Yes
              ↑     |                        |
              |     +-- Busy                No --- ACK
              |           ↓
              +-----------+   
Jam: All-1 signal

detect abnormal high voltage
- signal will attenuate: limit distance
- special judge: multiple ports activity

Minimum frame size: 2*propagation delay

,
Otherwise, too late to stop transmission. (The system is not workable, but meaningless)
Therefore

1-persistent in IEEE 802.3 (Wired Ethernet)

collision handling: Exponential backoff

IEEE 802.3 Workflow
                                       report error---------------- 16th time？
                                                                        |
                                  ...                                  ...
                                   |                                    |
                                   +-- Wait rand(0~2^10t) ---------- 12th time？
                                   |                                    |
                                   +-- Wait rand(0~2^10t) ---------- 11th time？
                                   |                                    |
                                  ...                                  ...
                                   |                                    |
                                   +-- Wait rand(0~8t) ------------ Third time?
                                   |                                    |
                                   +-- Wait rand(0~4t) --  ------- Second time?
                                   |                                    |
                                   +-- Wait rand(0~2t) -------------- Again?
                                   |                                    |
Sender --- Listen --+-- Idle --- Send --- Collision? --- Yes --- Send Jam & Abort
              ↑     |                        |
              |     +-- Busy                No --- ACK
              |           ↓
              +-----------+   
Jam: All-1 signal

Performance

point-to-point
normalized time: (relative)
Ring
(必考)
Slotted ALOHA
Every node: to send
Pure ALOHA

(这里有大问题!)

Link layer discovery

what you only know initially: distinct MAC addr for each NIC

Protocols

Discovery end-hosts
bootstrap communication with remote

DHCP (Dynamic Host Configuration Protocol) (RFC 2131)

broadcast for DHCP server(in the past: a real person)’s help
tell you local IP, netmask, DNS, IP of first-hop router.

Procedure
- client -> broadcast: DHCP_DISCOVER Any helper here?
- all DHCP servers -> client??存疑: DHCP_OFFER I’m here at IP<…>
- client(choose one) -> client: DHCP_REQUEST I ask the helper at IP<…> to help
- selected DHCP server: DHCP_ACK Let me tell you your configuration…
- (client quit) client：DHCP_RELEASE I will go, unbind plz
- soft quit: the binding will expire if the client do not renew/rebind it.

Under failure

Now you want to ping someone with an IP, how could you know his MAC?
Why must be MAC? seem to useless in local net, but when remote MAC is for the destination, but MAC is for every next link.
Why? must what is link layer for

ARP (Address Resolution Protocol) (RFC 826)

Map IP address to MAC address (only on LAN)
Every host has a ARP table

Procedure (Local)
- sender broadcast: ARP_REQUEST Who has IP<…>? I’m MAC<…> IP<…> (in MAC frame)
- receiver: ARP_REPLY the dest is me… Hi MAC<src> My MAC is <…> (in MAC frame) [cache the { IP<src>: MAC<src> } ]
- sender get it: [cache the { IP<dst>: MAC<dst> } ]

A remote communication case

Procedure
- client enter the LAN: DHCP get the configurations
- want to remote comm:
  - use netmask to know it is a remote
  - ARP get the MAC of first-hop router
- send to first-hop router…\

MAC for the next stop!

Philosophy of ARP & DHCP

Broadcasting: Can use broadcast to make contact
- Scalable because of limited size？怎么理解
Caching: remember the past for a while
Store the information you learn to reduce overhead
Soft state: eventually forget the past
Associate a time-to-live field with the information
… and either refresh or discard the information
Key for robustness in the face of unpredictable change

Interconnection of LANs (still local)

store & forward LAN frames
exact bitwise copy

Bridge:(IEEE 802.1D)

Philosophy

S&F: select the frames for neighbor LAN
Use the CSMA/CD to access LAN segment
Transparent: to stations, unaware of the bridge
Plug & play, self-learning: use once plugged in (do not need much configuration)

Mechanisms

9.10

Frame broadcasting

“broadcast storm” in extended LAN

How to solve?

Loop resolution

Spanning tree protocol (Perlman)
eliminate the loops!
make a spanning tree on the physical topology!
- no configuration by ops and users
- self-healing
must make a spanning tree distributedly:
1. Pick a root (agree on select the Min MAC)
2. Prune (only keep the shortest path, break ties(a simple agreement: lower ID, any is OK))
Node aspect:
- All send Messages (Y, d, X)[This is actually named ‘BPDU’]: “I’m X, I elect Y with distance d from me”
- If receive message from a lower ID, update election
- If update election, send new message
- After the election, nodes determines which is the root port. (Since then do not forward the packets from them, but still listen (in case of BPDU))
  
  Many many details:
  When election ends? ususally after a constant time gap in which no new (could-make-change) BDPU is received.

What about the Synchro problem, do really need synchro? no one tell me…
how to choose port? store them all?

(必考)

Robustness: After spanning tree is built, the root will still broadcast BPDU in certain frequency (eg. 2s), and all other nodes will flood them to all the ports including the blocked ones.
How to detect? what ambiguous!

Flooding: forward to all the valid(on the tree) exports except for the import
- Philosophy: Waste, but learn from it! (bootstrap more precise/efficient forwarding)
- Example:
  in spanning tree, I receive message of A from port p, so I know just forward to p when dst is A
  cache the ‘knowledge’ with a certain timer

Address Learning

Address database:
mechanism: just as the timeout-switch lab
receive, learn;
know, forward; adapt the lab!!!!!!!!
do not know, flood;

Bridge do not know who is in his LAN originally!!

More than Bridges

Hubs: physical repeater (Not in Link Layer)
Physically star, logically bus, 2 lines per end (transmit and receive) (collion vulnerable)
Bridges: inter-LAN (forward + addr learning)
2-4 ports for LANs
as above
Layer2 Switches: inter-LAN/host (Bridge + collision free)
many many ports for LANs/hosts
Layer3 Switches: Not for now, a router in nature, but not the same as router (not in Internet)

After switched

Not traditional view! All dedicated links.

buffer and forward
Ethernet Protocol on each link
each link has its own collision domain
full deplux: Seperate send and receive
Switching: two forward process diff src and dst can happen simutaneously. (Multiplying)
No change when adopting switch / adding more switches (scale easily)
Station has dedicated capacity equal to original LAN (No summarize self understand)
Interconnectable: switch – switch
switch self-learning: like the former ones (必考) drawbacks (in large scales) (not intepretale)
Broadcast overload
Lack of multiple paths

（Rubbish did not finish it！没完事）

Wireless

Philosophy

Wireless, Mobility(handoff, mobile change base station, some wired also needs)

Components

wireless hosts:
user devices, stationary/mobile
base station:
- relay: connect wired and wireless
- cell towers, APs…
wireless link:
- typically mobile - base station
- also used in backbone link, saving cable cost
- MAC
- different rates and dis…

Model

Infrastucture mode: base – mobile
like master&slave
save mobiles’ power
Ad-hoc mode: hosts self-organization, route among themselves

Characteristics

Attenuation:

FSPL: Free Space Path Loss

d: distance, f: frequency, c: speed of light
Reflection, diffraction, absorption, terrain contours (urban, rural, vegetation), humidity

SNR: Signal-to-Noise Ratio
BER: Bit Error Rate

Curve:
Same physical layer: Power↑, SNR↑, BER↓
Dynamic SNR ↑↓ 👉 adapt Physical layer

Multipath propagation

Self-interference

Interference from other sources

e.g. 2.4GHz-phone-motors-microwave

Broadcast medium
- Anybody in proximity can hear and interfere
Cannot receive while transmitting
- Our own (or nearby) transmission is deafening our receiver Half-duplex
- Recently may full duplex
transmission always not intact
Multiple wireless senders and receivers create many problems

Mechanism WiFi - IEEE 802.11 Wireless LANs

Architecture

station：device with IEEE 802.11 MAC and phy layer
AP(Access Point): wired – wireless
BSS(Basic Service Set): a cell coordinated by AP
ESS(Extended Service Set): multiple BSSs interconnected by DS(Distribution System)(switches, wired network, wireless network)
ESS appears like a single logical LAN
Portals for Internet access.

对比有线帧和无线帧

Association

802.11b: spectrum divided into 11 channels
AP admin (？) : choose channel.🤬
host associate with an AP
Scanning
- Passive
- Positive Then DHCP!

MAC

Hidden terminal problem: A & C do not know each other so they keep twittering. - **Solution: 4 Frame Exchange**
Exposed terminal problem: A sends to B, C sends to D, A cannot send to B thinking inference but actually not.
Signal fading: A & C can not hear each other with inference of B

🤬Really Ambiguous Terms：
Actually according to Wikipedia , Medium Access Control, Media Access Control, Multiple Access Control, MAC are exactly the same thing!

3-level IFS
We use shorter IFS for high priority so that the high-order one can steal the turn at the gap between the lower ones.

SIFS (Short Inter Frame Space)
- Shortest gap (IFS), highest priority
- “all immediate responses”:
  - LLC PDU
  - Poll response: as below
  - ACK/CTS/RTS
PIFS (point coordination function IFS)
- Mid-length IFS
- Used by the centralized controller in PCF scheme when issuing polls
DIFS (distributed coordination function IFS)
- Longest IFS, for asynchronous frames contending for access ??

JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm

DCF(Distributed Coordination Function)

CSMA/CA

but not CD
- difficult to detect collision for fading (can not distinguish from noise and own trans)
- can detect all hidden terminal…
very much error, so use ACK

PCF(Point Coordination Function)

use centralized controller to polling master (point coordinator)
PIFS(seize the gap, lock out all asynchronous frames)
for time-sensitive traffic (QoS)
always AP be the point coordinator
- has a polling list
- poll stations in round-robin
- the IFS is shown below 👇
- a superframe is divided into CFP and CP
  - CFP for PCF & CP for DCF
  - theoretically, CFP can take up all the superframe, however practically 30%-50%(GPT)

🤬P38: 如何理解下面那个if..if, 间隔怎么选取？图上和ppt不一致
🤬P39: 如何理解第三条

A extended description

LLC和MAC？？？？？？？注意
区别MAC和MAC！！！！

🤬 超帧长度不是用beacon确定的吗，这算什么？？

CSMA/CA:
No CD:

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Network Layer

every host/router has network layer

Function

encapsulate segment into datagram, and decapsulate it
Switching/Routing
determine routes(the forwarding table), shortest path…(multiple nodes)
Forwarding
Move packet from input to output(within a single node)(determined by routing)
Error handling, queuing and scheduling
Connection Setup
in some network arch, especially in virtual circuit switching
connection in network layer: between 2 hosts and intervening routers
connection in transport layer: between 2 processes

Service Model

for individual packet:
- guarantee delivery
- guarantee + limited delay
for stream:
- in-order
- minimum bandwidth
- limited of variety of inter-packet spacing

Aborted: ATM
Constant Bit Rate (CBR), Variable Bit Rate (VBR), Available Bit Rate (ABR) and Unspecified Bit Rate (UBR)

Good service, but too expensive!

Survivor: Internet
Only best-effort!

IP routers

infrastructure of teh Internet
Router capacity
Number of external router “ports”
= Speed (“line rate”) of a port
has many types(Core, edge, small business)

Architecture:

ta plane.(actually lower in the protocol stack)
The lower plane is control plane.(actually higher in the protocol stack)

Input

（不确定）

Tasks

receive pkts
update IP header
Lookup:
challenge is speed!
<range, port> not <an IP, port> too much!
aggregated IPs
Longest prefix matching rule(enable precise control of routing)
- high efficiency: tree data structure!
  a 01 Trie! (必考:按表画树)
modern: hardwarize again: TCM 3-state chip! ASICs
queuing: if packet arrive faster than forwarding rate, then wait

Output

🤬这个和后面四层的那个差不多吧

Packet Classification

Simple: No classification, FIFO buffer, Drop-tail(when full, drop incoming packet)
Classify IP packets:
with Src/Dst IP, TCP Port, TOS, Type of protocol
some law issues on packet classification and schedule (Internet neutral Law)

Scheduler

must fast!
Priority scheduler: first serve nobility
Round-robin scheduler: each queue in turn
- Fair queuing (FQ): round-robin for packets of different size (fair in size or fair in sum)
- Weighted fair queueing (WFQ): serve proportional to weight

(See details at the Transport Layer Part)

Switching fabric

connect input to outputs

switching rate: often measured as $k\times input/output~~line~~raten\times line~rate$)
(line rate is single port rate)

via Memory
control by a CPU, load & store, 2 memory access each datagram.
too slow!
via bus
share a bus
Bus contention: bus bandwidth will limit the switching speed
via Mesh
interconnected network like Banyan network…
initially designed for processor.
advanced design: fragment into fixed-len cells and switch

VC

VC network: Aborted by Internet but others may use it, like military

Virtual circuit networks:
service provided on flow of pkts.
packet, but preserve resources for you
provide network-layer connection
(ATM, X.25…)
Datagram networks:
service provided on individual packets
network-layer connectionless

Connection setup

end hosts and intervening routers establish a path for virtual connection
use routing to find a suitable path

VC Implementation

Path from source to dest
VC number for each link along the path
entries in forwarding table
packet carries VC number but not dest address
VC is changed at each hop
switch on the path maintain state for each passing connection:
forwarding table list the new VC number for the incoming VC number
<Input port, Incoming VC, Output port, Outgoing VC>
Compare: datagram network forwarding table: <Dst IP address, Subnet Mask, output port>
Link/switch resources can be allocated to VC
Dedicated resources means predictable quality of service 🤬这就不是三层了二层吧y用的属于甚至都是switch

Control：signaling protocols信令协议

Addressing

Physical network address (Layer2)
Inter-network address (Layer3)
unique address for all the end system and router
Application address (Layer4)

Global:
non-amibiguity through the world
a router may have more than one
Network attachment address
unique address for each interface in a network (Layer2)
Port address
just unique in a single host/router

Mode

Unicast
Broadcast
Multicast
Anycast
Point to some dst, just go to 1,
determined by routing

IP Protocol

Internetworking: the nets could have different archtecture

IP layer entity resides on each host and router
Provides connectionless service (i.e. datagram mechanism)

Operations in General

Routing

maintain static/dynamic routing tables
adopt different policies (based on different algorithm)
Problem: datagram may loop indefinitely
🤬为什么是TCPneed? ppt2-15

14 source routing？？？

Fragmentation & Reassembly

packet length must fit the MTU(Maximum Transmission Unit)

fragment at host and router
- at host: try to figure out min of MTU on the path to avoid fragmentation
- at router: if too large, fragment(host can not 100% avoid fragmentation)
reassembly only at host
- router can not reassemble, for not all the fragments may pass a certain router
may fail if some fragments get lost
- set the timer when the first fragment arrives
- if not all fragments arrive until timeout, discard all

Error Control

Due to best-effort, should offer some error control
Router should inform the src if the packet us discarded (checksum failure, TTL expired…) (Use ICMP)
so src could inform higher layer protocol, so that higher layer protocol could offer better service quality.

Flow Control

Allows routers to limit rate of incoming data
Router discards packets when buffer is full
- May send source quench packets to sending host
- Using ICMP (refer to Router-assisted congestion control)

IP Function (According to Packet Format)

Header

Fragmentation (total length, identification, flags, fragment offset)

each link has its own MTU, router may fragment the datagram
only be reassembled at the destination

Only could be deliver to the higher layer when all the fragments arrive, for the higher layer protocol header will not be duplicated in each fragment. We can only parse the higher layer protocol information when all the frag

identification of the originated datagram (so that we could group the fragments and reassemble them)
<Src addr, Dst addr, Upper layer, Data Unit ID>
Data length field: only the data part
Offset: Unit: 8Byte
More flag: indicate if the last fragment

Procedure

prepare a enough buffer
receive fragments and insert them to proper position
from Offset 0 to a false-More flag datagram
reassemble them

Addressing (src address, dst address)

An example: Distribute address

netid is unique can administered by several organizations (ARIN,RIPE,APNIC…)
hostid is assigned within designated organizations

	Network sum
typeA	127 reserved for loopback	All allocated
typeB	()	All allocated
typeC	()

‘-2’: all-0 for network addr, all-1 for broadcast
127.0.0.1 loopback addr. “localhost”

Example: Subnet from inside and outside

Subnet和Subnet Mask略去了，没时间了，待补档

CIDR(Classless Inter-Domain Routing)

ISP can be regard as a set of subnets
Hierarchy: ISP -> subnets -> hosts
Hierarchy addressing supports route aggregation: more efficient routing broadcast

IP Suite

NAT(Network Address Translation)

translate when interfaces with broader Internet
enable inside and outside use different IP addr sets

Purpose

act as firewall (only from inside to outside, hide internal IP)
save IP addresses
Isolate ISP/Organization change (just change the NAT)

Types

Static NAT: 1:1 mapping (for server)
Dynamic NAT: pool of public IP, assign for private (for client PCs in Intranet)
Single-Address NAT: all private IP map to one public IP
Overloading/Masquerading/Network Address Port Translation (NAPT)🤬

1:1

Single Address (most common)

NAT identify the inside hosts’ packets by 4-layer address(port)

NAT那里的Skype部分，什么东西。🤬

NAT hinders P2P

Relaying (in Skype)
P9

ICMP(Internet Control Message Protocol, RFC792) (4 layer)

transfer err and ctrl msgs
encapsulated in datagram
- protocol type = 1
- not reliable

Format

ICMP types: maybe do not need to recite… ⏩ppt P13
some formats…

Use

Ping
- test dest reachability
- calculate round-trip time
- count hops (TTL)
traceroute:
- send UDP packets with increasing TTL(1,2,3…)
- reveal the nodes by TTL expired message
- suffer from dynamic routing
Path MTU
- send packets with DF(do not fragment) bit set
- router send back parameter unintelligible if too large
- decrease size and try until no error
- suffer from dynamic routing (main change during/after your test)

Mobile IP

mobile device’s IP pretend to be unchanged.
P24
Triangle:
P26

function

Discovery

Tunnel : IP packet in IP packet?

IPv6

Motivation

address exhaustion
poor processing speed of header
poor QoS support
No fragmentation at routers
???

RFCs

4 to 6 Evolution

Routing

building routing tables for datagram networks
choose path and build routing tables for VC networks??

Characteristics

efficiency
resilience
stability
…

Design ELements (5)

Performance criteria

minimum hop
least cost
- what is cost?
  - Minimum delay: queue length
  - Largest throughput: 1/throughput

Decision time & place

time
- for each packet - datagram network
- when configure VC - VC network
place(3 sorts)
- centralized coordination
- source routing
- distributed(by switches)

Network info source & info update timing

source
- Local information
- Adjacent switches
- All switches in the network
timing
- Update periodically
- Upon major changes in switches or links
- Fixed (manual configuration)

Strategy

Central(static)

single fixed route for each src-dest
least-cost algo
reconfig when major change

🤬这个具体是什么？是P12的图吗？

Distributed: flooding

do not need any info
just flood the packet

duplicates: many receive many copies
cycle problem: may cause storm
- handle: hop count

Features

very robust
at least one packet will take the best route
can set up VC
visit all switch (useful to distribute information)

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
P15

Distributed: random

do not need network info
select one outgoing path for retransmission
random / round & robin / probability calc
suitable for strongly-connected network
always not optimal

is the cost factor of the link i (can be trans rate, , or …)

Distributed: Adaptive

Used by almost all packet switching networks
change as conditions on the network change
Requires network info
Tradeoff: high quality info – overhead
Aid in congestion control???

An Isolated Example (only use local info)

Least Cost Algos (Use global info)
as below

Classic Algo

*Prim & Kruskal

Dijkstra

26页图即可

complexity: best
may happen oscillation:
cost = amount of carried traffic 🤬如何解决震动问题?

Bellman-Ford

再细看

What if when link coat change?

“count to infinity” problem（这个存在于Dji上吗）
poisened reverse:
需要详细整理一下

Comparison

39页图怎么理解？

两种都是静态动态杰克

Scaled routing

flat routing is not practical

Hierarchical Routing

before 7 ..

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Peering Center is used to connect ISPs.
ASP’s AS could connect to Tier-1 ISP’s AS.

IGP(Interior Gateway Protocol): Intra-AS

focus on performance
AS-specific

RIP(Routing Information Protocol): distance vector(必考)

p13
#hop

router has Layer4/5 info in Ctrl Plane(for router’s own communication)

OSPF(Open Shortest Path First): link state（必考）

each router can see the whole network
calculate the shortest path with Dijkstra
make a routing table: < dest, next hop >

Feature

Hierarchical OSPF
AS is divided into areas(32-bit ID)

EGP(Exterior Gateway Protocol): Inter-AS

focus on policy

11 的俩图

BGP(Border Gateway Protocol)

support AS’s autonomy and privacy

relationships

A –customer–> B
A –provider–> B
A <–peer–> B
AS advertise its best routes for IP/IP-prefix
BGP not picking shortest-path routes
best route based on policy (maybe least-cost)
Selective route ad
AS could choose whether to ad its route to dest

Details

Sessions

iBGP
Distribute external routes internally

Multicasting (必考)

Duplicate packets & unicast v.s. multicast
D&U → a waste of bandwidth!
Multicast address → class D in IPv4
Application: Multimedia, teleconference, distributed computing, etc.

make a Mcast minimal spanning tree and only duplicate only along the spanning tree (hard, NPC, no scalability)

Router’s responsibility:

learn group membership
identify links
establish routing state

A concept: Mcast group

the edge router is responsible for joining the Mcast group.
translate the IP or not(Unicast/Mcast) according to IGMP Protocol P15/16

IGMP

P18/19

Operations

Hosts
- Send reports to routers to subscribe to (join) and unsubscribe from (unjoin) multicast group
- Host need not explicitly unjoin group when leaving ???
Routers
Sends query info at regular intervals
Host belonging to a mcast group must reply to query

P21-26

he
- On receipt, hosts start random timers (0~0s) for each multicast group they’re in.
- when count down, report Membership Report(TTL=1) , others hear so do not need report in this group (stop their timers).
- Routers hear all, if a group has no report, time out it.
v2
v3
- support Group-and-source specific query: The receiver can claim that it only has interest in specific Mcast source.

note 27 with 28!
before 34?????????

Mcast Routing

Group-shared tree and Source-based tree

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Choose

PIM: Protocol Independent Multicast
Not dependent on any specific underlying unicast routing algorithm (works with all)
Sparse mode
Group-shared tree, use center-based approach
Group members widely dispersed, bandwidth not plentiful
Dense mode
Flood and prune: source-based tree, reverse path forwarding (Nearly same as DVMRP)
group members densely packed, bandwidth more plentiful

Group-shared

MST with constraints（NPC）

NPC never works on Internet!

P42

..
- choose a router as center (it is also a problem to choose a good center (best:NPC)!)
  Other routers to join:
Edge router sends unicast join-msg addressed to center router
join-msg processed by intermediate routers and forwarded towards center
join-msg either hits existing tree branch for this center, or arrives at center
Path taken by join-msg becomes new branch of tree for this router

Source-based

Shortest path tree, reverse path *, use with(?) OSPF(Dijkstra)
- First make a broadcast tree: P39
  Now it is a broadcast tree.
- Then pruning: The no-interest router report Prune upstream (if has no downstream group member) (How to wait?🤬)

DVMRP & MOSPF(buyaoqiu?)

Multicast routing based on shortest-path unicast protocol(undertand again)
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Application Layer Mcast

The Fairy of IP Mcast: …
So we implement Mcast in the application layer.

Transport Layer

peer-to-peer communication
routers have Layer4/5, however they do not talk with hosts.

break up messages to segments / reassemble segments to messages

motivations

IP could not tell which application
IP only provides best-effort service
- delay, loss, duplication, out-of-order
- traffic volume control

Role

Communication between processes(Compulsory)
- mux&demux from/to application processes
- Use Ports
  - Multiplexing (Mux)
    - Gather and combining data chunks from different applications → network layer
  - Demultiplexing (Demux)
    - Delivering correct data to corresponding sockets from a multiplexed stream
provide common end-to-end service(optional)
- reliable, in-order, well-paced…
- facing diverse requirements of applications
socket: software interface (abstraction) to the transport layer for a process
Ports: 16-bit number
- well-known ports: 0-1023 (eg. 80 for webpage)
- registered ports: 1024-49151
- dynamic ports: 49152-65535
OS stores mapping between sockets and ports
- For UDP ports (SOCK_DGRAM)
  - OS manage: <HostIP, Port> socket
- For TCP ports (SOCK_STREAM)
  - OS manage <HostIP, HostPort, RemoteIP, RemotePort> socket

Demultiplexing

host uses IP addresses & port numbers to direct segment to appropriate socket

Connectionless demultiplexing

UDP like, all ports could exchange with a socket.
Only take dst socket into account when demultiplexing.

Connection-oriented demux

like TCP
socket id: <SrcIP, SrcPort, DstIP, DstPort>
server host may support many simultaneous TCP sockets:
web servers have different sockets for each connecting client

non-persistent HTTP will have different socket for each request

or like this (🤬有什么区别吗。。。)

Reliable transport

All bad things can happen…

cheaksum -> detect corruption
ACK & NACK -> detect loss
Sequence number -> identify packets
retransmission -> recover loss/corruption
flow control -> prevent overflow
timeoout -> decide when to retransmit
forward error correction
network encoding -> repair errors

Corruption handle

Handle:

Start from: Detect error: ACK/NACK
ACK still get corrupt: ACK/NACK with sequential number
packet get lost and so that no ACK
ACK get lost and cause timeout
Still have problem: no packet loss, delay cause duplicate.

early days: no formal methods! constant packups

Flow control (General Basis)

Stop-and-wait protocol

sender spin-wait ACK/NACK or until TIMEOUT

network protocol significantly limits use of physical resources!

Pipeline protocols

sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts

increasing range of sequence numbers
buffering at sender and/or receiver

When n-pipeline’s n is proper, it can increase the utilization by n times! (until 1 ideally)

Sliding window: compare with the one above!🤬

Sliding window often called “packets in flight”

Acknowledgement

Cumulative ACKs: next in-order expected
Selective ACKs:
ACK individually acknowledges correctly received packet

Resending

Go-back-N(GBN)
- Discards out-of-order packets (pkts other than )
- cumulative acknowledgements
- Sender sets timer for
- if timeout, retransmit
- better in low error rate, waste bandwidth
  
  better for hardware, and only hardware can handle high-speed!
  complex protocol and data structure: not a system way!
Selective-Repeat(SR)
- treat the packets individually, but window front edge is determined by the smallest unACKed ID.
- retransmit only packet on timeout
- complex book-keeping: timer per packet
- low speed, high error rate, low and expensive bandwidth: 5G

UDP(User Datagram Protocol) RFC 768

lightweight, provides the minimal service

Best effort(may lost or out-of-order)
connectionless, each segment is independent
use in streaming multimedia apps, DNS, SNMP

Features

no connection establishment
simple, small header
no congestion control

Format

Checksum

split as 16-bit integers, add them up, then take the 1’s complement

No error detected does not mean No error!
checksum is optional (source port is optional too)

Note: checksum是check哪些，见后面的TCP

💱Frontier：UDP is reviving due to oligopoly of Internet companies.
UDP+X protocol is used in these companies to provide better service.
UDP+QUIC is used in Google’s Chrome.

TCP(Transmission Control Protocol) RFC 793

必须记下包头格式

Reliable, In-order and Byte stream(assume there is an incoming byte stream)
Almost all the techs: checksum, seqnumber, sliding window, cumulative ack, out-of-order buffer, fast retrans, timeout estimate…

TCP functions (compared to header domains if exists)

必须掌握这个结构！
🤬大端存储还是小端存储？

Mux and DMux (Src and Dst Port)

Nothing much to say.

Content Reliable

Pseudo header + header + overhead: A blog

Order Reliable (Seq Number and ACK)

Early days: “Stream of bytes” service offer service for TTYs
Seq number is ‘Byte(not packet) offset’ (TCP is a byte stream)

pay attention that TCP is full-duplex, so each pkt should have both seqno and ACK domain. There is no implicit ACK pkt, but exchange the ACK knowledge in back-and0-forth data packets(piggybacked). You should not be misled by the diagram above.

TCP segment length:

|------------------------MTU------------------------------|
|---------------IP data-------------------|-----IP Hdr----|
|------TCP data (segment)-----|--TCP Hdr--|
|-------------MSS-------------|

So $MSS = MTU - len(IP~~Hdr) - len(TCP~~Hdr)$
MTU = 1500B for Ethernet
TCP >= 20B

TCP Datagram Sent time:

注意加一减一，ACK是上一个这些事🤬

at receiver, user can choose GBN-like or SR-like(drop/buffer out-of-order pkts). The TCP protocol does not restrict it.

Adopt cumulative + SR-like bitwise（for example）
ACK the smallest seqno unACKed packet, buffer the out-of-order packets. 🤬ppt上各种GBNlike,在哪呢？那不是cumu导致的吗，和GBN有什么关系？混为一谈！
Delay ACK:
1. receiver will not ACK a in-order pkt, but wait another in-order pkt(for 500ms).(Otherwise, the acknowledgement actually never ‘accumulate’ for you always ACK a packet at once).
2. If receiving a out-of-order packet, send dupACK at once.
3. If the in-order packet fill the gap(unACKed holes in the sequence due to out-of-order) and is the first unACKed pkt, ACK at once.(If it fills the gap but not the lowest one, follow rule2)
Deduction: receiver will send duplicate ACKs when a packet is lost
sender will know
1. packet is lost
2. link is still alive(receive receive some other packets)
- Introduces fast retransmit: duplicate ACKs trigger early retransmission
  TCP will retransmit the pkt the dupACKs point to when 3 duplicate ACKs are received
- And a single timer for timeout(only time the first sent but unACKed pkt, namely reset it each time sender pkt LHS moves) retrans the first unACKed pkt 🤬如果就重传一个，那还有什么可说的书P164(namely first pkt in the window) on timeout.
  How do we pick a timeout value? so should be **a bit longer** than (proportional to) RTT So how to measure RTT? See below:

measure RTT

measure RTT: exp weighted average of sampled RTT
Notice: Usually the sample is taken every unretransed ACK (GPT)
Problem: Ambiguous measurement(tell ACK and retrans-ACK)
Algos: as below
Karn/Partridge Algorithm
- handle ambiguous: ignore the packet as long as it is retransed
- Insensitive to RTT variation 🤬
Jacobson/Karels Algorithm
- take the variance of RTT into account: use DevRTT (exp average of Deviation)

Header Length (Hdrlen)

Number of 4-byte words
HdrLen == 5 if no optional regions.

Control (Flags)

Connection Establishment

2-way handshake
2 SYNs to establish
- retrans lost SYN, ignore duplicate SYN once connected
- Problem: slipped segments from old connections
- handle: set a new initial seq number(ISN) far from the previous for a new connection, SYN is in SYN i+1 form (pay attention to +1)
- Obsolete SYN
  
  The A->B SYN and B->A SYN have different seqno. This may be used to implement Full-duplex comm.
- handle: 3-way
3-way handshake

🤬为什么非要加1？ 0号包怎么你了！而且3号包一定要seq number吗（(3)seqno多少？GPT说三号包seqno=ISN_A+1）
Hens and Eggs Problem:
- SYN is lost: network/server is busy
- so no SYN-ACK
- so need to retransmits SYN on timeout
- but we do not know how to set timeout! (No any RTT)
handle:
- Default: 3 secs (RFC 1122 & 2988) (some use 6s)
- Sometimes shorter in data center

A SYN loss in Web scene

User clicks on a hypertext link
- Browser creates a socket and does a “connect”
- “connect” triggers OS to transmit a SYN
SYN is lost…
- 3-6 seconds of delay(very long)
- User may become impatient and retry
triggers an “abort”
- Browser creates a new socket and another “connect”
- Can be effective in some cases

Connection Tear Down

Normal termination

🤬这里还是存在问题，当B对最后一个剩余任务发出ACK，那么如果它立即发出FIN，则可能会这个ACK没有被A收到, 从而引发发送端的TimeOut，但发送端在没TimeOut时候就已经收到了FIN(1/2 RTT) 那么如何处理？

TIME_WAIT相关（GPT）
等待足够的时间以允许所有数据包在网络中老化:
TIME_WAIT 状态持续约 2 倍的最大报文段生存时间（Maximum Segment Lifetime, MSL）。MSL 是任何给定的分组被认为在网络中自然消失前能存活的最长时间，通常这个时间设定是2分钟。因此，TIME_WAIT 通常持续 4 分钟（2MSL）。这一过程确保老的重复数据包不会在同一 TCP 端口上的新连接中被误认为是新的数据包。

Can combine FIN and ACK if no pkts left.
Abrupt termination

State Machine
🤬需要掌握到什么程度？

Flow control (Advertised Window)

So far the window size is constant (buffer size).
Left is sender, and right is receiver.

if consumer is slow, buffer will be full.

straightforward handle: stop sending ACK
- however sender could not tell loss/flow ctrl (so not practical)

🤬简化模型吧，现实中如果来源不止一个，怎么回复RWND?

handle: Credit scheme
receiver sets RWND(advertised window) in ACKs
sender ensures in flight <= RWND
Left is receiver, right is sender. Additional window from (W is RWND, AN is ACK number)
- Benefit:
  Decouple flow control from ACK
  Each octet has a sequence number
- lost ACK/CREDIT little harm: resync the protocol in future pkts
  (for the ‘right pointer’(AN+W) of the window will never go left; and the sender always follow the newest one; and the new ACK/CREDIT does not depend on the old ones.)

Example of credit

Problem: credit allocation deadlock
- —-[]—-→ : closing rcv-window
- —-[]—x→ : to reopen, but lost
- A thinks window is closed, B thinks it is open and wait
Handle: Use Window timer
if expired, send something (could retrans last segment)
The term is called Persist Timer only be set when the permitted window left is 0.
Deadlock: due to piggybacked ACK, when RWND=0, sender wait for receiver, but receiver have nothing to send to sender, so it is stuck.
Handle: sender still send 1 byte when RWND = 0

🤬这地方没解决。什么时候ACK？这两个deadlock问题分别来自ppt和教材，但是方案有冲突，GPT给出的是混合体（如下）？另外丢失ACK的问题在不是等于0，而是sender用光了剩下的时候也存在啊？

TCP的Persist Timer用于处理零窗口大小（Zero Window）的情况，即当接收方通告的窗口大小为零时，发送方将无法发送更多数据。这种情况下，发送方需要知道何时可以再次开始发送数据，但由于窗口大小为零，它不能发送任何新的数据来探测窗口是否已经变大。Persist Timer就是为了解决这个问题而设计的。

Persist Timer启动的情况如下：

接收方通告窗口大小为零：当发送方收到一个窗口更新，指示接收方的窗口大小为零时，它将停止发送数据，并启动Persist Timer。
定期窗口探测：在Persist Timer超时后，发送方会发送一个小的探测报文（通常是1字节的数据），这个探测报文被称为窗口探测（Window Probe）。接收方收到窗口探测后，即使其窗口仍然为零，也应该响应，告诉发送方当前的窗口大小。
窗口大小更新：如果接收方的窗口大小从零变为非零，它会发送一个窗口更新给发送方。发送方收到这个更新后，会停止Persist Timer，并根据新的窗口大小恢复数据传输。

Congestion Control

TCP initially has no congestion control, then causes congestion collapse in 1986.

Jacobson’s fix
Adapt the window size in response to congestion

pragmatic and effective
required no changes to routers and apps
extensively researched and improved upon

Congestion in a simple modeled view.

Real-world problem (much more complicated)

3 must-answered questions in Congestion.
- how to detect congestion.
- who take care of congestion
- how to handle congestion

❌Start from no care: Many packets drop(like UDP)
❌Reservations: negotiate to pre-arrange bandwidth before sending pkts
Low utilization
❌Pricing: do not drop high-bidders (capitalism!)
requires payment model
✔️Dynamic:
1. hosts infer congestionand adjust
2. network reports and hosts adjust
3. or combine the two above
- generality is proved to be powerful
  - not presume business model, traffic features, app requirements… therefore it is universal
  - assume good citizenship
- simple to implement but suboptimal, messy dynamics(means hard to predict behaviour in real use)

TCP congestion control overall:
$sending~~rate \approx \frac{Window~~Size}{RTT}Window~~sizesending~~rate$

Congestion Window	FlowControl Window
CWND	RWND
without overflowing routers	without overflowing receiver
computed by sender	determined by recver and reported to sender

We assume in the course
so we do not care about Flow control when discussing Congestion control

In this course we talk about CWND in unit of MSS
Real implementation CWND is in unit of bytes

3 key issues when trying to control congestion

Discovering the available (bottleneck) bandwidth
Adjusting to variations in bandwidth
Sharing bandwidth between flows

Detect congestion:

packet delays: too much noise (delay often varies greatly) ❌
router tells: too expensive ❌
packet loss: ✔️
Fail-safe (But still has non-congestion loss, like in wireless)
no more detection overhead (already has to detect for reliable delivery)
Just like the what we talked above:
- Duplicate ACKs: isolated loss
  - Still getting ACKs
- Timeout: much more serious
  - Not enough dupacks
  - Must have suffered several losses

will adjust rate differently for each case

Adjust rate:

How to adapt sender?
General: ACK new data then rate↑, loss then rate↓

Discovering available bottle bandwidth: Slow start
start slow for safety but ramp up quickly for efficiency!
Adjusting to variations in bandwidth: Congestion avoidance
- increase rate slowly and focus on stability
- AIMD(Additive Increase Multiplicative Decrease)

Slow start

Start with CWND = 1
increase exponentially () until first loss
Math: for each ACK
So per RTT ???
therefore （🤬有点奇怪，这里指数不对吧，当pipeline的时候，增长应该是连续的而不是离散的）

这样看还是离散的，到后期才会变成e指数
When , stop slow start and enter congestion avoidance.
or there is a loss

Collision Avoidance

focus on maintenance. (when CWND>ssthresh)
Additive increase
- for each ACK
- Math: therefore (Additive increase)
- ‘CWND is increased by one only if all segments in a CWND have been acknowledged’

When loss happens

Multiplicative decrease
- On 3 dupACKs (no matter in CA or SS)
  - directly go to Collision Avoidance
- On timeout event
  - Initiate Slow Start

A diagram to show SS,CA and Timeout

ssthresh 的初值是什么？？
‘Sawtooth’

Problem: CA is too slow when recovering from isolated loss.
Please refer to the example

Solution: Fast recovery

On 3 dupACKs (no matter in CA or SS)
- go to Fast recovery
In fast recovery
for each dupACK
exit fast recovery
When receiving a new ACK
(directly go to CA)

🤬这个也没涉及Fast-recovery啊…

Efficiency (fully utilize)& Fairness
Now prove AIMD to a proper choice.(compared with AIAD,MIAD,MIMD)
A model to evaluate the performance.
given a source state, we could calculate all the following states on the graph.

AIAD: does not converge to fairness

MIMD: does not converge to fairness

AIMD: converge to fairness

MIAD: does not converge to fairness and efficiency

State machine

It is never implemented in hardware…
for it is too complicated to support high frequency…

TCP flavors

TCP-Tahoe
on 3dupACKs and timeout
TCP-Reno
on 3dupACKs
on timeout
TCP-NewReno
TCP-Reno + Fast recovery
TCP-SACK
Use selctive ACK but not cumulative ACK

All the TCP in the Internet corporates well, for they are “generally the same”

increase CWND on good news
decrease CWND on bad news

TCP Performance Metrics & Problems

Packet drop rate,
Throughput,

if a flow has higher RTT, it will have low throughput. unfair!
Implication for High-speed TCP:
not good for Large and Fat pipe: high bandwidth and high delay 🤬怎么算的？
- Too high demand to under-layer robustness
  it is impossible to transmit so much data without error!
- Too long time from a drop to reach peak performance
- Handle:
  - past a threshold speed then increase CWND (additive constant depend on CWND)
  - Multiple connections(make many streams, Works up till know, like IDM)
  - Router-assisted

Old days: Best way for Long&Fat Pipe: Disks in Trucks!

Given the speed swings from W/2 - W
so why dont we just start average it statistically?
Equation-based congestion control
ignore all the events, just set the transmission rate to a result of function , keep monitoring the drop rate and adjust the rate accordingly.
Other loss reasons(like wireless)
TCP cannot tell congestion/corrpution loss
Not friendly for short flows
most flows are short flow.
- Short flows never leave slow start
- Too few packets to trigger dupACKs
  so isolated loss may also need to waiting for timeout and severely impact the flow completion time
Every flow try to overshoots capacity before get a drop.
So a large flow will dominate the small ones. all the small ones have to bear a large delay.
means that delays are large, and are large for everyone (opposite to the basic idea of multiplexing)
host control, so easy to cheat
- increase CWND faster
- use large init CWND
- use multiple connections

It is hard to supervise the network of cheating with only control on the host side.
So…👇

Router-assisted Congestion Control

Compared to router based control, host based is more like a hack. Now we introduce the router’s help and it can solve many problems.

Choke Packet

From router/destination to source

Backpressure

for Propagation time > transmission time case(long distance or high speed link), from router to source is not effective

Warning Bit

Explicit Congestion Notification(ECN)
a bit in IP header, set by congested router

if data pkt has ECN bit set, receiver sends ACK with ECN bit set
routers can set ECN bit with diverse strategies
trade-off between utilization and delay (more precise strategy means more delay)
ECN semantics can be like loss
end host can react to ECN as if it saw a loss

Advantages:

distinguish corruption with congestion;
Can serve as an early indicator of congestion to avoid delays
Easy (easier) to incrementally deploy

Today: (RFC3168) using ToS/DSCP bits in the IP header, Common in datacenters
For business: Whenever I get an ECN bit set(it means still want to send if keep receiving ECN), I will be charged.

RED (Random Early Discard)

Handle TCP global synchronization problem

Traffic burst simultaneously fills queues so pkts lost, TCP connections enter slow start simultaneously, so traffic drops, network underutilized, and burst and underutilize and burst…
RED: Randomly discard pkts before buffer becomes completely full (over a certain threshold)

calculate

: discard
: queue the packet
: probobality discard

Fairness:

Routers classify packets into “flows” (assume flows are TCP connections)
Each flow has its own FIFO queue in router
Router serves flows fairly
when line is free, take packet from next flow in a fair order

Shen me ta ma de jiao zuo fairness!

Max-Min fairness

Calculate the average bandwidth requirement
offer full service to the flow under average
Remove the served flow
Loop: jmp to 1, until no fully-served flow

🔱If you don’t get full demand, no one gets more than you
When all packets are the same size, it equals round-robin
🤬如何不serve all?以个数为单位？？包不一样大为什么不行？

cope with packets with different sizes
Ideal: bitwise fairness (but no practical)
approximate it

Fair Queuing

compute when the last bit of each packet leaves the router if flows are served bit-by-bit (share the bandwidth equally)
serve packets in the increasing order of their deadlines

WFQ: more general, flows have different bandwidth share.
today some form of WFQ implemented in almost all routers.

pros：

cheating flow cannot occupy more bandwidth
doesn’t have RTT-based unfairness ( unfairness)
Flows can pick any rate adjustment scheme they want 🤬怎么理解?

cons:

more complex than FIFO, per flow queue, per packet bookkeeping.
Today’s routers use more coarse-grained WFQ

FQ does not eliminate congestion, but manages congestion.
A case

Congestion still exists, but links are used more efficiently.

round-table discussion
Fairness have different standard. flow sum, congest hop sum…
Flow have different definition. TCP connection, src-dst pair, src itself…
reduce congestion: router can add ‘fair share(n.)’ to packet to inform end-host what rate they should send at… (still need policing) (Any end-host automatic behaviour needs policing!)

Multimedia

This part of content seem to be deleted in the Eighth Version

Analog → Digital Signal

Audio
- telephone: 8K Samples/sec
  Example quantization: 1 B/Sample
  CD: 44K Samples
- Bit rates
  CD 1.411Mbps
  MP3: 96,128,160 kbps
  Internet Telephony: 5.3 kbps↑
Video
video: sequence of imgs, image: bit map
coding using redundancy: spatial and temporal
- spatial: neighbor same RGB
- temporal: not change RGB between frames in the same place
- MPEG1 for CD-ROM 1.5Mbps
  MPEG2 for DVD 3-6Mbps
  MPEG4 for Internet videos <1Mbps

Services

Streaming, stored audio/video
- streaming: can begin playout before downloading entire file
- stored (at server): can transmit faster than audio/video will be rendered (could be stored/buffered at client)
- e.g. YouTube, Netflix, Hulu
Conversational voice/video over IP
- interactive nature of human-to-human conversation limits delay tolerance
- e.g., Skype
Streaming live audio, video
- e.g., live sporting event (⚽)

challenges: variable nwk delay(jitter), interactivity(pause, fast-fwd, rewind, jump), packet loss and retrans
client-side buffering and playout delay:
compensate for network-added delay, delay jitter

(CBR means constant bit rate)
if keeping , buffer will run out
Trade-off: Long-short pre-cache time
too much wait – video lag

Streaming

with UDP 🤬RTP (RFC 2316)

server sends at constant rate

(ususally )
sender don’t care the congestion rate
2-5s short playout delay
error recovery: application-level, time permitting🤬
but always blocked by firewalls nowadays.

with HTTP

via HTTP GET🤬
send at maximum possible rate under TCP
fill rate fluctuates due to TCP congestion control, retransmissions (in-order delivery)
larger playout delay: smooth TCP delivery rate🤬
HTTP/TCP passes more easily through firewalls

with DASH(Dynamic Adaptive Streaming over HTTP)

server:

divides video file into multiple chunks
each chunk stored, encoded at different rates
manifest file(.mpd): provides different URLs for different chunks

could offer diff coding rates, diff URLs for each chunk.

client:

periodically measures server-to-client bandwidth
consulting manifest and requests one chunk at a time
- choose maximum coding rate sustainable given current bandwidth
- can alter the choice at different time according to bandwidth

“intelligence” at client: client determines

when to request chunk (so no buffer starvation/overflow)
what encoding rate to request (quality according to bandwidth available)
where to request chunk (can request from URL server that is “close” to client or has high available bandwidth)

Voice-over-IP (VoIP)

VoIP end-end-delay requirement (maintain “conversational” aspect)
- higher delays noticeable, interactivity
- <150msec: good ; > 400msec: bad (max tolerable)
- includes app-level (packetization,playout) & network delays
session initialization???
value-added services: call forwarding, screening (means filtering the calls, like blacklist), recording
emergency services:911
Loss
- network loss: IP datagram lost due to network congestion
- delay loss: IP datagram arrives too late
  - delays: processing, queueing in network; endsystem (sender, receiver) delays
- tolerance: between and
  (depending on voice encoding, loss concealment…)

A case: Skype

Problem: Alice,Bob behind NATs
NAT denies outside→inside connection
Handle: relay
A,B maintain open connection with their SNs (servo)
callee’s SN connect callee by initially initiated connection

Proprietary application-layer protocol 🤬？

QoS (Quality of Service)

differentiated new traffic in Internet
- low bandwidth & high addtional value: Secret call, Multiplayer video game, email…
- high bandwidth & low addtional value: file download, web videos…
within QoS

ISA(Integrated Services Architecture)

a distinguishable stream of IP packets → a flow

With the same QOS parameters
ID: <SrcIP, DstIP, port, protocol type>
Unidirectional, can be multicast

Architecture

ISA Functions on a router

Routing Algorithm
- Link cost based on QOS parameters, not just delay
- Routing/forwarding based on which class of QoS the flow belongs to
Queuing discipline
- Priority queuing
- Multiple queues, taking account of different flow requirements
Discard policy
- Selective not just the new ones

resource reservation

Providing QoS for individual app sessions
routers maintain state info of allocated resources for each session (like a virtual circuit)
could deny the request if can not guarantee

with in a intermediate router:

soft state: state which has a lifespan and regularly be refreshed/aged

Scope of the request：reservation request should be propagated
to all sender hosts
disadvantage:
- High cost: soft state for each flow, not scalable
- too complicated architecture
  - RSVP will reduce to best-effort if any router does not support it in the path
- few service levels are defined, not flexible

DS (Differentiated Services) (RFC 2475)

divide traffic into classes, and treat them differently
simple func in network core but complex in edge routers
Type of Service field in IPv4 / Traffic Class field in IPv6
define a SLA(Service Level Agreement) between customer and ISP before the service
APP select DS, and traffic with same DS will be treated consistently

PHB??

DS Domain

portion of Internet with a certain set of DS policies
provided by a single ISP or union of ISPs
ISP configures domain edge routers
Customer may be hosts or edge routers in other domain (as above)
Ongoing measure of performance provided for each class (ensure the service quality)
when handle traffic from neighbor DS domain, choose the most fit class in self policy set for the traffic.

DS Architecture

for Edge router

Per-flow (each traffic flow will be check individually) traffic management
Marking: classify the flows into different QoSs (set the header TOS/TC field)
Policing:
distinguish packets with in-profile and out-profile: monitor if the flow obeys the SLA, if true then in-profile otherwise then out-profile
- monitor average rate, peak rate and burst size
- Shaping: shape the traffic to in-profile
  - Leaky Bucket
    departure: fixed-rated traffic
    may drop pkts when bucket is full

🤬都整形了，还要标记干什么？

Marking and policing are not seperate process, when you police the flow (in/out profile) you may mark them to tell other routers.(so-called intra-class marking)
for Internal router

Per-class traffic management (do not distinguish the pkts from different flows in the same class)
Buffering and scheduling based on marks(header)
in-profile will be preferred, out-profile traffic has low priority or be dropped.

Network Security

The conceptual model

Attack overview

The victim of attacks: Web browser/server/banking server,client/DNS server/Routers..

Active easy to detect, but hard to prevent
Passive can be prevent, but hard to detect

Cryptography: handle interception

Encryption key -> , Decryption key ->

Symmetric key: sender key = receiver key
Public key crypto: public, secret

Requirements

Even attacker knows your crypto algo, he could not decrypt it.
Even attacker knows some ‘cipher txt - plain txt’ pair, he could not work out the key
Even attacker knows many cipher txts, he could not work out the key / get the plaintxt
Should not be easily decrypt by Brute Force

Key distribution is a problem(safely get the keys), shown below.

Attacks

CipherTxt-only attack:
- Brute force try all the keys
- Cryptanalysis:
  try to deduce PlainTxt/key by analyzing the nature of crypto algo or general characteristics of PlainTxt/CipherTxt
Known-PlainTxt attack:
- attacker knows some PlainTxt-CipherTxt pairs to deduce algo/key
Chosen-PlainTxt attack:
- attacker can choose some PlainTxt and get the CipherTxt in some way to deduce algo/key

Methods

⛓️：这些算法能考吗

Symmetric Key Crypto

Traditional ones

Substitution: alphabet replacement
- Caesar cipher: shift the alphabet by k
  - key space size: 26
  - attack: BF
- Monoalphabetic cipher: just map the alphabet to a mess up one(no repeated)
  - key space size: 26!
  - attack: frequency analysis(letter/word/letter pair in specific language)
- Vigenere cipher: do Caesar cipher for each letter in the PlainTxt, the shift value is determined by each letter in the key in cycle
  - a letter in PlainTxt can be mapped into different letters in CipherTxt (most k possibilities)
  - key space size:
  - attack: try to find the , and attack each group as Caesar cipher
Transposition (Permutation) methods:
- Rail Fence cipher: write the PlainTxt in row and read it in column
  - key space size: at most
  - attack: BF
- Row-Column Cipher: write the PlainTxt in row and read it in column(column sum = key length, so that each letter of the key corresponds a column), the read order of column is the alphabet order of the key
  - attack: can just try.
- pure transposition can be easily attacked by Known-PlainTxt attack

Block Cipher

Scheme: Feistel Cipher
allow encryption and decryption to be done with the same algorithm (hardware/software implementation)

DES: Data Encryption Standard
- key space size:
- attack: no backdoor, just BF
  not so slow for modern comupters(let alone the specially designed computer) so useless now.
TDES：Triple DES
- 3 keys, 3 rounds
- key space size:
- EDE(Encrypt-Decrypt-Encrypt) or EEE(Encrypt-Encrypt-Encrypt)
- too slow, block size small
AES: Advanced Encryption Standard
- 128 bits block, 128/192/256 bits key
- 10/12/14 rounds(for 128/192/256 bits key)
- key space size: //
  ⛓️ AES Details

Public Key Crypto

based on mathematical algorithms
Asymmetric key: sender key receiver key
each one has a pair of keys:

public key known to all
private key known by self

and

Requirements

algo should be public
given algo/PUBkey, attacker could not work out PRVkey
given the cipherTxt should not work out the PlainTxt
given CipherTxt-PlainTxt pair and PUBkey should not work out PRVkey

RSA

message is bits, and bits can be seen as a number

Pick 2 large prime numbers and
(a large one at least 1024 bits each, we take 1024 as an example)
(Euler’s totient function)

Note that this is the foundation of RSA. Given , it is hard to work out . The algorithm of working out depends on the factorization of .(factorization has no solution up to now(though it is not proved to be ). And N is a large number(), so attacker can not work out with Brute Force.
However, due to and are prime numbers, so that the key generation could calculate directly with the formula above.
So the core is and . If and are somehow known by attacker, then he could replay this process to break the RSA.
Pick so that and is coprime with （）

e is relatively small, usually 65537, so it not so hard to check if is coprime with
Compute so that （ is the modular multiplicative inverse of mod “模逆元”）

We could use the Extended Euclidean Algorithm to work out .
It’s complexity is , namely about 2048 for our example. It is not so slow.
so that and
then we utilize

Proof: (A brief one, only for , a full one )
(Euler’s theorem)
due to ,

then
So
Encrypt: , and send
Decrypt:

Msg must be smaller than , otherwise, you can only get in the context of modular arithmetic.

Some considerations

and should be about the same size
and should be unlated
should be relatively small(easy to encrypt)
should be large enough to prevent the attack(Bruteforce)

reverse of RSA: the owner could authenticate himself.

append a signature to the message
receiver could verify the signature by
verify it is from A for only A knows

In practice: session keys

RSA is really slow compared to symmetric key crypto

so…act as a secure key exchange method

A and B exchange session key with RSA (establish a secure connection)
A and B communicate using symmetric key crypto with symmetric key

Authentication: handle modification & Fabrication

🤬首先这几个分工写的对吗？其次MAC也不能解决Fabrication啊

v1.0: Declare who u r
Easy to cheat.
v2.0: Declare who u r + your IP
Also easy to cheat for professional.
v3.0: Declare who u r + your IP + shared pwd
Playback Attack: record the request and play it again
v3.1: Declare who u r + your IP + encrypted shared pwd
Playback Attack still works
v4.0:
— Request–→，
←– disposable Random (i.e. nonce) —
— –→
How to exchange shared symmetric key
v5.0: use nonce + public key
— Request –→
←– Nonce —
— –→
←– ask for pub key —
— –→
then B compute know it is Alice

evidence: only Alice know

v5.0 still has a hole: man-in-the-middle attack

How to? : just because only Alice knows , so Bob can not tell if the keys are Alice’s or not.

man-in-the-middle attack can be conducted silently
胡想：在mim攻击中 delay may have relation with size?（没意义，本来也是有关的）

Role

to ensure

come from the right person
have not changed message

creating an authenticator
send the message with the authenticator
receiver verify the authenticator

handling

Source impersonation / spoofing
Message injection / modification
Message re-sequencing / replaying

MAC

notice distinguishing ‘secret’(n.) and ‘secret’(adj.) key
I will refer to secret key as skey.
Notice that skey is a kind of secret(n.)
When it comes to secret(n.), I use for short.

Usually 64-256 bits
can be sent in clear text’
MAC is
Operablity
- $fixed~~length = f_{MAC}(any~~length)$
- Easy to compute
Security:
- One-way: hard to find
- Weak Collision Resistance: hard to find so that
- Strong Collision Resistance: hard to find any
Use Crypto func(with skey) upon
- CBC-MAC (Cipher Block Chaining Message Authentication Code)
  (n bits each)
  encryption algo ( is the skey of Crypto func) 什么🤬
  (Initialization Vector) random n-bit block
  for
  
  i.e. first and last block
Use Hash func upon and

这里应该写错了padding bits 最多可以有512bits🤬
- MD5
  The given demonstration and the diagram on Wikipedia are poor, so I make it on my own. Holes have been found in MD5.
  Xiayun Wang found the high-performance algo of MD5 collision.
  Wikipedia
- SHA-1
- 如何把secretkey引入Hash func HMAC🤬

Digital Signature

🤬just as reverse: RSA
A sign the document and others can prove A made the document
digital signature is Signed MAC (MAC with linked person)
🤬还是不能解决pizza prank啊

🤬所谓的MAC 和 Digital Signature，最主要的区别就是，MAC用的是对称加密，Digital S 用的是非对称加密吧

Key Distribution: handle Fabrication

2 Aims:

Sender and receiver should share common secret
Others should know A’s is actually A’s

against aim1:

Record-and-playback attack

man-in-middle is a kind of Record-and-replay attack(as above)
normal record-and-playback can be prevented by nonce/timestamp
🤬这也不是重放攻击啊

against aim2:
“pizza prank”
the attacker could encrypt the request with his own and declare his is A’s.

A public large prime and generator (actually is OK [] for is prime).(是模循环加群，所以P为质数时就是1~p-1,且除了1都是生成元)

🤬DH还是会被Pizza prank攻击吧

2 kinds of keys:

Session key: just used for a session
Permanent key: used for a long time(for distributing session keys)

CA determine the validity of the sender and receiver, and provide the session key to them.

security service module(SSM): working at the host, like an agent for application layer, offering end-to-end encryption and obtain the key.
The Needham-Schroeder Protocol

keep the forward secrecy: (this part is not a subsection of Trusted Certification Authority)
means that though the key is leaked, the previous messages are still secure.
use one-time session key

prepare a session key

encrypt with of the receiver

encrypt with

send the encrypted and

receiver decrypt to get the with his , then decrypt the with
please refer to
🤬这里用的非对称加密，用DH呢？

Public Key Certificate: to verify the owner of public key

the key owner will register his claim to the CA, and CA will issue a certificate to the owner.

just like a signature of the key, but this signature is not encrypt with owners’ own PRVkey, but a ‘could-not-be-pretended’ CA ‘s PRVkey.
Everyone should be able to find the real CA to check the certificate.
CA could be government or some else.

Content of a certificate:

serial number
info about certificate owner(algorithm, key value)
info about issuer
valid dates
digital signature from the issuer(what to be check)
fingerprint/thumbprint(hash value to check the integrity of the certificate)

Real case: Secure e-mail

Confidentiality: Use session key

Integrity: use MAC(🤬HMAC?)

Combine them!

Then Alice & Bob achieve Confidentiality + Integrity (only this two)

SSL

⛓️整个P3,为什么
https的s
4 🐒
SSL is a sub layer between TCP and Apps

愤怒🤬PGP是什么东西,这翻译,陈鸣老登还有脸把这书出版?
PGP just like the email system above.

直跳13.。。

Application Layer

🐒1-4

Process: program running within a host

hjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Architectures

CS(Client-Server)

Client:

Start as required then init contact with server first
Host may have dynamic IP addresses
e.g. Web: client implemented in browser; Email: in mail reader

Server:

Run as daemon (always-on)
Provides requested service to Client
Host has permanent IP address
e.g. Web server sends requested Web page, mail server delivers Email

P2P(Peer-to-Peer)

No always-on server
free from big-brother
Arbitrary end systems directly communicate
peers request service from other peers, provide service in return to other peers
self scalability - new peers bring new service capacity, as well as new service demands
Peers are intermittently connected and change IP addresses
Highly scalable but difficult to manage
Examples: Gnutella, BitTorrent, Skype

Skype

centralized server for authentication
decentralized P2P for communication

App: DNS (Domain Name Service)

Translates human-readable names to IP addresses
Distributed database implemented in hierarchy of DNS servers
Load balancing: multiple IP addresses for one name

Goal

Uniqueness: no conflicts
Scalability
Distributed, autonomous administration
Ability to update my own (machines’) names
Don’t have to track everybody’s updates
Highly available
Lookups are fast
Perfect consistency is a non-goal

Hierarchy

Hierarchical namespace
A tree structure Top Level Domains(if top domain is not country region, it’s the US’s!)
name is the leaf-to-root path
depth of the tree is arbitrary(<128)
Hierarchical administration
Hierarchical server
- TLD(Top-level domain) server
- local DNS server
  helper the user

Reliability

Replicated DNS servers (primary/secondary)
Name service available if at least one replica is up
Queries can be load-balanced between replicas
Usually, UDP used for queries
Reliability, if needed, must be implemented on UDP
Try alternate servers on timeout
Exponential backoff when retrying same server
Same identifier for all queries(when have multiple DNS server)
- Don’ t care which server responds

Fast lookup

caching
Performing all these queries takes time
Up to1-second latency before starting download
Caching can greatly reduce overhead
The top-level servers very rarely change
Popular sites (e.g., www.cnn.com ) visited often
Local DNS server often has the information cached
How DNS caching works
DNS servers cache responses to queries
Responses include a “time to live” (TTL) field
Server deletes cached entry after TTL expires

Attacks

DDoS Attack

Redirect Attack

man-in-middle: intercept queries.

App: Email

One of most heavily used apps on Internet
SMTP(Simple Mail Transfer Protocol): simple text messages
MIME(Multi-purpose Internet Mail Extension): other types of data
POP: Post Office Protocol
Msg retrieval from server, including authorization and download

IMAP: Internet Mail Access Protocol

Manipulation of stored msgs on server

User Agent

Composing, editing, reading mail messages
e.g. Eudora, Outlook, Foxmail, Netscape Messenger
Outgoing, incoming mail messages stored on server

Mail Servers (Host)

Mailbox contains incoming mail messages for user
Message queue of outgoing mail messages
SMTP protocol between mail servers to send mail messages