Notes for Computer Network

Jackcui NJU Loser
  • L1-1 还有补充
  • L1-2 未完成
  • L2-1 未完成
  • L2-2 还有订正
  • L2-3 未完成
  • L2-4 几乎完成,残留问题已标注, 最后没讲完!!!
    • 同步问题
  • L2-5
  • L4-1 🤬🤬
  • L4-2 🤬🤬🤬🤬🤬🤬

Chapter1 Overview

Network Composition

Protocol: (Definition in Textbook P6)

  • Protocol is a state machine.
  • No protocol ensures 100% reliability.

Internet includes Computing devices, Communication links and Routers.

Three key parts:

  • Network Edge
    Host/Endpoint System: (主机/端系统)
    includes Client and Server

  • Network Core
    mainly the switch
    two model

    • client/server model
    • P2P model

    two switching

    • circuit switching
    • packet switching

    The core concept of switching: Statistical Multiplexing

  • Access Networks
    Endpoint system to Edge router.

    • Home: DSL, Cable(HFC), FTTH…
    • Enterprise(and home): LAN: Ethernet, WiFi…
    • Wide-area wireless: 3G, 4G, LTE, 5G…

    Physical medium: Copper(UTP), Fiber, Radio, Satellite.
    Guided & Unguided

Distinct the three parts:

From ISO-OSI to modern Internet Architecture

TCP/IP Protocol Stack

Layer encapsulation: Protocol headers

More detailed:

Network Performance

  • Delay

    • Transmission Delay (‘The toll station’)
      Push data into the link:

      Bandwidth ↑ , Transmission Delay ↓

    • Propagation Delay (‘The car on the highway’)
      Data run from an end to another
      a graph to demonstrate:

    • Queueing Delay
      Packets wait for transmission

      Persistent queueing causes packet loss.
      Little’s Law:
      for length of queue(average), for arriving rate(average), for waiting time.

    • Processing Delay
      Negligible

  • Loss
    We care about loss rate most.

  • Throughput

    Therefore
    (the one is bottleneck link)

Traditional: supported by adaptor(hardware) and driver(software)
New demand: GPU directed linked net interface ‘Hardwarize’

Different types of links

  • Wired point-to-point links
  • Wired multiple access links (LANs)
  • Wireless links (WiFi)

Service

  • Framing: encapsulate into frames and extract
  • Link access
    • point-to-point or broadcast
    • Multiple Access Control(MAC): avoid the collision(different nodes speak simutaneously) no out-of-band
      use MAC protocol:
      • Channel partitioning: Divide channel into pieces
      • Taking turns
      • Random access: “Recover” from collisions
  • Reliable delivery
    • Flow Control: Ensuring the sender not overwhelm the receiver i.e. preventing buffer overflow

      • Stop and Wait good at large frames
      • Sliding window
        • the sender’s window is the safe region, between the sent(know) and the done(told by receiver)+capability
        • the receiver’s window is between done+capability and accepted

      the decode field of the frame No. is

  • Error Detection & correction: Data + EDC (Redundancy)
    • Parity checking
      detect: 1D, D+C: 2D (as below)

LAN

  • Early: Token Ring

  • Ethernet:

    • topology:
      • early: bus broadcast
      • modern: switch

    Preamble: Where does a bit begin? Physical Layer copes with it.

    • Framing: Where does a frame begin?
      • sentinel bits (必考)

Lab note

  • Mininet
    sudo mn -h : help
    sudo mn : Open CLI
    > help
    > nodes : display nodes
    > net : display links
    > dump : detailed info
    > [node] [command] [args] : run command in a certain node
    > [node] xterm : open a seperate commandline for [node]
    > exit
    sudo mn -c : clean up
    the given exmaple could not run compatibly in Py3…

  • Wireshark:
    Quick Check

  • Switchyard
    Testcase syntax:

没完事上面的 (Lecture 3)

Ideal:

  • R bps for one
  • R/M bps for M
  • fully-decentralized
  • Simple

Channel Partitioning

  • TDMA(time division multiple access):
    fixed time slot
  • FDMA(frequency division multiple access):
    fixed frequency band
  • CDMA(code division multiple access):
    Details in the future

Taking Turns

  • Polling: Master & Slaves:
    master asks slaves in turn
    • cons: polling overhead, latency, single failure
    • use: Bluetooth: very weak(low-power) terminals
  • Token passing:
    • cons: token overhead, latency, single point failure.
    • use: IBM Token Ring

Random Access

  • Carrier Sense, Collision detection, Randomness
  • ALOHA
    ALOHA workflow
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
               +------------------------+
    ↓ |
    Sender --+-- ACK --- fine |
    | ↑
    +-- !ACK -+- p = P --- Resend
    |
    +-- p = 1-P --- Wait

    Enough repeatations : Abort

    Receiver -- checkpass -- ACK

    What about ACK?
    It is a simplified model, so we assume ACK is finished immediately and within the slot. (Once reach then accept)

effciency < 20%

  • slotted ALOHA
    Time in uniform slots = frame transmission time (T0)
    synchronized
    also p=P retransmission

    effciency approach theoritical 37%

  • CSMA(Carrier Sense Multiple Access) (载波侦听)
    still has collisions: for propogation delay

    • use: Ethernet
  • Nonpersistent CSMA (really deferential)

    workflow
    1
    2
    3
    4
    5
    Sender --- Listen --+-- Idle ---Send
    ↑ |
    | +-- Busy --- Wait rand(t)
    | ↓
    +-----------------------+
  • 1-persistent CSMA (selfish)

    workflow
    1
    2
    3
    4
    5
    Sender --- Listen --+-- Idle ---Send
    ↑ |
    | +-- Busy
    | ↓
    +-----------+

    cons: many waiting -> guarantee collision…

  • p-persistent CSMA

    workflow
    1
    2
    3
    4
    5
    6
    7
                 +-------------------------------------+
    ↓ |
    Sender --- Listen --+-- Idle --+-- p = P --- Send |
    ↑ | | |
    | +-- Busy +-- p = 1-P --- Delay t
    | ↓
    +-----------+

    aim of p: avoid instability under heavy load
    ideal value: $p=\frac{1}{Waitingstationssum}$
    cons: usually gives really long delay…

  • CSMA/CD(collison detection) (based on 1 persistent)

    1
    2
    3
    4
    5
    6
    7
    8
                                       +-- Wait rand(t) --- Send Jam & Abort
    | |
    Sender --- Listen --+-- Idle --- Send --- Collision? --- Yes
    ↑ | |
    | +-- Busy No --- ACK
    | ↓
    +-----------+
    Jam: All-1 signal
  • detect abnormal high voltage

    • signal will attenuate: limit distance
    • special judge: multiple ports activity

Minimum frame size: 2*propagation delay

  • ,
  • Otherwise, too late to stop transmission. (The system is not workable, but meaningless)
  • Therefore
1-persistent in IEEE 802.3 (Wired Ethernet)

collision handling: Exponential backoff

IEEE 802.3 Workflow
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23

report error---------------- 16th time?
|
... ...
| |
+-- Wait rand(0~2^10t) ---------- 12th time?
| |
+-- Wait rand(0~2^10t) ---------- 11th time?
| |
... ...
| |
+-- Wait rand(0~8t) ------------ Third time?
| |
+-- Wait rand(0~4t) -- ------- Second time?
| |
+-- Wait rand(0~2t) -------------- Again?
| |
Sender --- Listen --+-- Idle --- Send --- Collision? --- Yes --- Send Jam & Abort
↑ | |
| +-- Busy No --- ACK
| ↓
+-----------+
Jam: All-1 signal

Performance

  • point-to-point

    normalized time: (relative)
  • Ring
    (必考)

  • Slotted ALOHA
    Every node: to send

  • Pure ALOHA

(这里有大问题!)

what you only know initially: distinct MAC addr for each NIC

Protocols

  • Discovery end-hosts
  • bootstrap communication with remote
DHCP (Dynamic Host Configuration Protocol) (RFC 2131)

broadcast for DHCP server(in the past: a real person)’s help
tell you local IP, netmask, DNS, IP of first-hop router.

  • Procedure
    • client -> broadcast: DHCP_DISCOVER Any helper here?
    • all DHCP servers -> client??存疑: DHCP_OFFER I’m here at IP<…>
    • client(choose one) -> client: DHCP_REQUEST I ask the helper at IP<…> to help
    • selected DHCP server: DHCP_ACK Let me tell you your configuration…
    • (client quit) client:DHCP_RELEASE I will go, unbind plz
    • soft quit: the binding will expire if the client do not renew/rebind it.
  • Under failure

Now you want to ping someone with an IP, how could you know his MAC?
Why must be MAC? seem to useless in local net, but when remote MAC is for the destination, but MAC is for every next link.
Why? must what is link layer for

ARP (Address Resolution Protocol) (RFC 826)

Map IP address to MAC address (only on LAN)
Every host has a ARP table

  • Procedure (Local)
    • sender broadcast: ARP_REQUEST Who has IP<…>? I’m MAC<…> IP<…> (in MAC frame)
    • receiver: ARP_REPLY the dest is me… Hi MAC<src> My MAC is <…> (in MAC frame) [cache the { IP<src>: MAC<src> } ]
    • sender get it: [cache the { IP<dst>: MAC<dst> } ]
A remote communication case
  • Procedure
    • client enter the LAN: DHCP get the configurations
    • want to remote comm:
      • use netmask to know it is a remote
      • ARP get the MAC of first-hop router
    • send to first-hop router…\

MAC for the next stop!

Philosophy of ARP & DHCP

  • Broadcasting: Can use broadcast to make contact
    • Scalable because of limited size?怎么理解
  • Caching: remember the past for a while
  • Store the information you learn to reduce overhead
  • Soft state: eventually forget the past
  • Associate a time-to-live field with the information
  • … and either refresh or discard the information
  • Key for robustness in the face of unpredictable change

Interconnection of LANs (still local)

  • store & forward LAN frames
  • exact bitwise copy

Bridge:(IEEE 802.1D)

Philosophy
  • S&F: select the frames for neighbor LAN
    Use the CSMA/CD to access LAN segment
  • Transparent: to stations, unaware of the bridge
  • Plug & play, self-learning: use once plugged in (do not need much configuration)
Mechanisms

9.10

Frame broadcasting
  • “broadcast storm” in extended LAN

How to solve?

Loop resolution
  • Spanning tree protocol (Perlman)
    eliminate the loops!
    make a spanning tree on the physical topology!

    • no configuration by ops and users
    • self-healing

    must make a spanning tree distributedly:

    1. Pick a root (agree on select the Min MAC)
    2. Prune (only keep the shortest path, break ties(a simple agreement: lower ID, any is OK))

    Node aspect:

    • All send Messages (Y, d, X)[This is actually named ‘BPDU’]: “I’m X, I elect Y with distance d from me”
    • If receive message from a lower ID, update election
    • If update election, send new message
    • After the election, nodes determines which is the root port. (Since then do not forward the packets from them, but still listen (in case of BPDU))

      Many many details:
      When election ends? ususally after a constant time gap in which no new (could-make-change) BDPU is received.

What about the Synchro problem, do really need synchro? no one tell me…
how to choose port? store them all?

(必考)

  • Robustness: After spanning tree is built, the root will still broadcast BPDU in certain frequency (eg. 2s), and all other nodes will flood them to all the ports including the blocked ones.

    How to detect? what ambiguous!


  • Flooding: forward to all the valid(on the tree) exports except for the import
    • Philosophy: Waste, but learn from it! (bootstrap more precise/efficient forwarding)
    • Example:
      in spanning tree, I receive message of A from port p, so I know just forward to p when dst is A
      cache the ‘knowledge’ with a certain timer
Address Learning

Address database:
mechanism: just as the timeout-switch lab
receive, learn;
know, forward; adapt the lab!!!!!!!!
do not know, flood;

Bridge do not know who is in his LAN originally!!

More than Bridges

  • Hubs: physical repeater (Not in Link Layer)
    Physically star, logically bus, 2 lines per end (transmit and receive) (collion vulnerable)

  • Bridges: inter-LAN (forward + addr learning)
    2-4 ports for LANs
    as above

  • Layer2 Switches: inter-LAN/host (Bridge + collision free)
    many many ports for LANs/hosts

  • Layer3 Switches: Not for now, a router in nature, but not the same as router (not in Internet)

After switched

Not traditional view! All dedicated links.

  • buffer and forward
  • Ethernet Protocol on each link
  • each link has its own collision domain
  • full deplux: Seperate send and receive
  • Switching: two forward process diff src and dst can happen simutaneously. (Multiplying)
  • No change when adopting switch / adding more switches (scale easily)
  • Station has dedicated capacity equal to original LAN (No summarize self understand)
  • Interconnectable: switch – switch
  • switch self-learning: like the former ones (必考) drawbacks (in large scales) (not intepretale)
  • Broadcast overload
  • Lack of multiple paths

(Rubbish did not finish it!没完事)

Wireless

Philosophy

Wireless, Mobility(handoff, mobile change base station, some wired also needs)

Components

  • wireless hosts:
    user devices, stationary/mobile
  • base station:
    • relay: connect wired and wireless
    • cell towers, APs…
  • wireless link:
    • typically mobile - base station
    • also used in backbone link, saving cable cost
    • MAC
    • different rates and dis…

Model

  • Infrastucture mode: base – mobile
    like master&slave
    save mobiles’ power
  • Ad-hoc mode: hosts self-organization, route among themselves

Characteristics

Attenuation:

FSPL: Free Space Path Loss

d: distance, f: frequency, c: speed of light
Reflection, diffraction, absorption, terrain contours (urban, rural, vegetation), humidity

SNR: Signal-to-Noise Ratio
BER: Bit Error Rate

  • Curve:

  • Same physical layer: Power↑, SNR↑, BER↓

  • Dynamic SNR ↑↓ 👉 adapt Physical layer

Multipath propagation

Self-interference

Interference from other sources

e.g. 2.4GHz-phone-motors-microwave

  • Broadcast medium
    • Anybody in proximity can hear and interfere
  • Cannot receive while transmitting
    • Our own (or nearby) transmission is deafening our receiver Half-duplex
    • Recently may full duplex
  • transmission always not intact
  • Multiple wireless senders and receivers create many problems

Mechanism WiFi - IEEE 802.11 Wireless LANs

Architecture
  • station:device with IEEE 802.11 MAC and phy layer
  • AP(Access Point): wired – wireless
  • BSS(Basic Service Set): a cell coordinated by AP
  • ESS(Extended Service Set): multiple BSSs interconnected by DS(Distribution System)(switches, wired network, wireless network)
    ESS appears like a single logical LAN
    Portals for Internet access.

对比有线帧和无线帧

Association
  • 802.11b: spectrum divided into 11 channels
    AP admin (?) : choose channel.🤬
    host associate with an AP
  • Scanning
    • Passive
    • Positive Then DHCP!
MAC
  • Hidden terminal problem: A & C do not know each other so they keep twittering. - **Solution: 4 Frame Exchange**
  • Exposed terminal problem: A sends to B, C sends to D, A cannot send to B thinking inference but actually not.
  • Signal fading: A & C can not hear each other with inference of B

🤬Really Ambiguous Terms
Actually according to Wikipedia , Medium Access Control, Media Access Control, Multiple Access Control, MAC are exactly the same thing!

3-level IFS
We use shorter IFS for high priority so that the high-order one can steal the turn at the gap between the lower ones.

  • SIFS (Short Inter Frame Space)
    • Shortest gap (IFS), highest priority
    • “all immediate responses”:
      • LLC PDU
      • Poll response: as below
      • ACK/CTS/RTS
  • PIFS (point coordination function IFS)
    • Mid-length IFS
    • Used by the centralized controller in PCF scheme when issuing polls
  • DIFS (distributed coordination function IFS)
    • Longest IFS, for asynchronous frames contending for access ??

JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm
JAMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMm

DCF(Distributed Coordination Function)

CSMA/CA

  • but not CD
    • difficult to detect collision for fading (can not distinguish from noise and own trans)
    • can detect all hidden terminal…
  • very much error, so use ACK

PCF(Point Coordination Function)

  • use centralized controller to polling master (point coordinator)
    PIFS(seize the gap, lock out all asynchronous frames)
  • for time-sensitive traffic (QoS)
  • always AP be the point coordinator
    • has a polling list
    • poll stations in round-robin
    • the IFS is shown below 👇
    • a superframe is divided into CFP and CP
      • CFP for PCF & CP for DCF
      • theoretically, CFP can take up all the superframe, however practically 30%-50%(GPT)

🤬P38: 如何理解下面那个if..if, 间隔怎么选取?图上和ppt不一致
🤬P39: 如何理解第三条

A extended description

LLC和MAC???????注意
区别MAC和MAC!!!!

🤬 超帧长度不是用beacon确定的吗,这算什么??

CSMA/CA:
No CD:

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Network Layer

every host/router has network layer

Function

  • encapsulate segment into datagram, and decapsulate it

  • Switching/Routing
    determine routes(the forwarding table), shortest path…(multiple nodes)

  • Forwarding
    Move packet from input to output(within a single node)(determined by routing)
    Error handling, queuing and scheduling

  • Connection Setup
    in some network arch, especially in virtual circuit switching
    connection in network layer: between 2 hosts and intervening routers
    connection in transport layer: between 2 processes

Service Model

  • for individual packet:
    • guarantee delivery
    • guarantee + limited delay
  • for stream:
    • in-order
    • minimum bandwidth
    • limited of variety of inter-packet spacing

Aborted: ATM
Constant Bit Rate (CBR), Variable Bit Rate (VBR), Available Bit Rate (ABR) and Unspecified Bit Rate (UBR)

Good service, but too expensive!

Survivor: Internet
Only best-effort!

IP routers

infrastructure of teh Internet
Router capacity
Number of external router “ports”
= Speed (“line rate”) of a port
has many types(Core, edge, small business)

Architecture:

ta plane.(actually lower in the protocol stack)
The lower plane is control plane.(actually higher in the protocol stack)

Input

(不确定)

Tasks

  • receive pkts

  • update IP header

  • Lookup:
    challenge is speed!
    <range, port> not <an IP, port> too much!
    aggregated IPs
    Longest prefix matching rule(enable precise control of routing)

    • high efficiency: tree data structure!
      a 01 Trie! (必考:按表画树)

    modern: hardwarize again: TCM 3-state chip! ASICs

  • queuing: if packet arrive faster than forwarding rate, then wait

Output

🤬这个和后面四层的那个差不多吧

Packet Classification

Simple: No classification, FIFO buffer, Drop-tail(when full, drop incoming packet)
Classify IP packets:
with Src/Dst IP, TCP Port, TOS, Type of protocol
some law issues on packet classification and schedule (Internet neutral Law)

Scheduler
  • must fast!

  • Priority scheduler: first serve nobility

  • Round-robin scheduler: each queue in turn

    • Fair queuing (FQ): round-robin for packets of different size (fair in size or fair in sum)
    • Weighted fair queueing (WFQ): serve proportional to weight

(See details at the Transport Layer Part)

Switching fabric

connect input to outputs

switching rate: often measured as $k\times input/outputlineraten\times line~rate$)
(line rate is single port rate)

  • via Memory
    control by a CPU, load & store, 2 memory access each datagram.
    too slow!
  • via bus
    share a bus
    Bus contention: bus bandwidth will limit the switching speed
  • via Mesh
    interconnected network like Banyan network…
    initially designed for processor.
    advanced design: fragment into fixed-len cells and switch

VC

VC network: Aborted by Internet but others may use it, like military

  • Virtual circuit networks:
    service provided on flow of pkts.
    packet, but preserve resources for you
    provide network-layer connection
    (ATM, X.25…)
  • Datagram networks:
    service provided on individual packets
    network-layer connectionless

Connection setup

end hosts and intervening routers establish a path for virtual connection
use routing to find a suitable path

VC Implementation

  • Path from source to dest
  • VC number for each link along the path
  • entries in forwarding table
  • packet carries VC number but not dest address
  • VC is changed at each hop
  • switch on the path maintain state for each passing connection:
    forwarding table list the new VC number for the incoming VC number
    <Input port, Incoming VC, Output port, Outgoing VC>
    Compare: datagram network forwarding table: <Dst IP address, Subnet Mask, output port>
  • Link/switch resources can be allocated to VC
    Dedicated resources means predictable quality of service 🤬这就不是三层了二层吧y用的属于甚至都是switch

Control:signaling protocols信令协议

Addressing

  • Physical network address (Layer2)
  • Inter-network address (Layer3)
    unique address for all the end system and router
  • Application address (Layer4)

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Scope

  • Global:
    non-amibiguity through the world
    a router may have more than one
  • Network attachment address
    unique address for each interface in a network (Layer2)
  • Port address
    just unique in a single host/router

Mode

  • Unicast
  • Broadcast
  • Multicast
  • Anycast
    Point to some dst, just go to 1,
    determined by routing

IP Protocol

  • Internetworking: the nets could have different archtecture
  • IP layer entity resides on each host and router
  • Provides connectionless service (i.e. datagram mechanism)

Operations in General

Routing
  • maintain static/dynamic routing tables
  • adopt different policies (based on different algorithm)
  • Problem: datagram may loop indefinitely
    🤬为什么是TCPneed? ppt2-15

14 source routing???

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Fragmentation & Reassembly

packet length must fit the MTU(Maximum Transmission Unit)

  • fragment at host and router
    • at host: try to figure out min of MTU on the path to avoid fragmentation
    • at router: if too large, fragment(host can not 100% avoid fragmentation)
  • reassembly only at host
    • router can not reassemble, for not all the fragments may pass a certain router
  • may fail if some fragments get lost
    • set the timer when the first fragment arrives
    • if not all fragments arrive until timeout, discard all
Error Control

Due to best-effort, should offer some error control
Router should inform the src if the packet us discarded (checksum failure, TTL expired…) (Use ICMP)
so src could inform higher layer protocol, so that higher layer protocol could offer better service quality.

Flow Control
  • Allows routers to limit rate of incoming data
  • Router discards packets when buffer is full
    • May send source quench packets to sending host
    • Using ICMP (refer to Router-assisted congestion control)

IP Function (According to Packet Format)

Header

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Max length of datagram:

Fragmentation (total length, identification, flags, fragment offset)

each link has its own MTU, router may fragment the datagram
only be reassembled at the destination

Only could be deliver to the higher layer when all the fragments arrive, for the higher layer protocol header will not be duplicated in each fragment. We can only parse the higher layer protocol information when all the frag

  • identification of the originated datagram (so that we could group the fragments and reassemble them)
    <Src addr, Dst addr, Upper layer, Data Unit ID>
  • Data length field: only the data part
  • Offset: Unit: 8Byte
  • More flag: indicate if the last fragment

Procedure

  • prepare a enough buffer
  • receive fragments and insert them to proper position
  • from Offset 0 to a false-More flag datagram
  • reassemble them
Addressing (src address, dst address)

An example: Distribute address

netid is unique can administered by several organizations (ARIN,RIPE,APNIC…)
hostid is assigned within designated organizations

Network sum Addr sum per nwk
typeA 127 reserved for loopback All allocated
typeB () All allocated
typeC ()

‘-2’: all-0 for network addr, all-1 for broadcast
127.0.0.1 loopback addr. “localhost”

Example: Subnet from inside and outside

Subnet和Subnet Mask略去了,没时间了,待补档

CIDR(Classless Inter-Domain Routing)

ISP can be regard as a set of subnets
Hierarchy: ISP -> subnets -> hosts
Hierarchy addressing supports route aggregation: more efficient routing broadcast

IP Suite

NAT(Network Address Translation)

translate when interfaces with broader Internet
enable inside and outside use different IP addr sets

Purpose
  • act as firewall (only from inside to outside, hide internal IP)
  • save IP addresses
  • Isolate ISP/Organization change (just change the NAT)
Types
  • Static NAT: 1:1 mapping (for server)
  • Dynamic NAT: pool of public IP, assign for private (for client PCs in Intranet)
  • Single-Address NAT: all private IP map to one public IP
    Overloading/Masquerading/Network Address Port Translation (NAPT)🤬

1:1

Single Address (most common)

NAT identify the inside hosts’ packets by 4-layer address(port)

NAT那里的Skype部分,什么东西。🤬

NAT hinders P2P

Relaying (in Skype)
P9

ICMP(Internet Control Message Protocol, RFC792) (4 layer)

  • transfer err and ctrl msgs
  • encapsulated in datagram
    • protocol type = 1
    • not reliable
Format
in

ICMP types: maybe do not need to recite… ⏩ppt P13
some formats…

Use
  • Ping

    • test dest reachability
    • calculate round-trip time
    • count hops (TTL)
  • traceroute:

    • send UDP packets with increasing TTL(1,2,3…)
    • reveal the nodes by TTL expired message
    • suffer from dynamic routing
  • Path MTU

    • send packets with DF(do not fragment) bit set
    • router send back parameter unintelligible if too large
    • decrease size and try until no error
    • suffer from dynamic routing (main change during/after your test)

Mobile IP

mobile device’s IP pretend to be unchanged.
P24
Triangle:
P26

function
  • Discovery

Tunnel : IP packet in IP packet?

IPv6

Motivation

  • address exhaustion
  • poor processing speed of header
  • poor QoS support
  • No fragmentation at routers
  • ???

RFCs

4 to 6 Evolution

Routing

building routing tables for datagram networks
choose path and build routing tables for VC networks??

Characteristics

  • efficiency
  • resilience
  • stability

Design ELements (5)

Performance criteria
  • minimum hop
  • least cost
    • what is cost?
      • Minimum delay: queue length
      • Largest throughput: 1/throughput
Decision time & place
  • time
    • for each packet - datagram network
    • when configure VC - VC network
  • place(3 sorts)
    • centralized coordination
    • source routing
    • distributed(by switches)
Network info source & info update timing
  • source
    • Local information
    • Adjacent switches
    • All switches in the network
  • timing
    • Update periodically
    • Upon major changes in switches or links
    • Fixed (manual configuration)

Strategy

Central(static)
  • single fixed route for each src-dest
  • least-cost algo
  • reconfig when major change

🤬这个具体是什么?是P12的图吗?

Distributed: flooding

do not need any info
just flood the packet

  • duplicates: many receive many copies
  • cycle problem: may cause storm
    • handle: hop count

Features

  • very robust
  • at least one packet will take the best route
    can set up VC
  • visit all switch (useful to distribute information)

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
P15

Distributed: random
  • do not need network info
  • select one outgoing path for retransmission
  • random / round & robin / probability calc
  • suitable for strongly-connected network
  • always not optimal


is the cost factor of the link i (can be trans rate, , or …)

Distributed: Adaptive
  • Used by almost all packet switching networks
  • change as conditions on the network change
  • Requires network info
    Tradeoff: high quality info – overhead
  • Aid in congestion control???

An Isolated Example (only use local info)

Least Cost Algos (Use global info)
as below

Classic Algo

*Prim & Kruskal

Dijkstra

26页图即可

  • complexity: best
  • may happen oscillation:
    cost = amount of carried traffic 🤬如何解决震动问题?
Bellman-Ford

再细看

“count to infinity” problem(这个存在于Dji上吗)
poisened reverse:
需要详细整理一下

Comparison

39页图怎么理解?

两种都是静态动态杰克


Scaled routing

flat routing is not practical

Hierarchical Routing

before 7 ..

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm
Peering Center is used to connect ISPs.
ASP’s AS could connect to Tier-1 ISP’s AS.

IGP(Interior Gateway Protocol): Intra-AS

  • focus on performance
  • AS-specific
RIP(Routing Information Protocol): distance vector(必考)

p13
#hop

router has Layer4/5 info in Ctrl Plane(for router’s own communication)

  • each router can see the whole network
    calculate the shortest path with Dijkstra
    make a routing table: < dest, next hop >
Feature
  • Hierarchical OSPF
    AS is divided into areas(32-bit ID)

EGP(Exterior Gateway Protocol): Inter-AS

  • focus on policy

11 的俩图

BGP(Border Gateway Protocol)

support AS’s autonomy and privacy

relationships

  • A –customer–> B

  • A –provider–> B

  • A <–peer–> B

  • AS advertise its best routes for IP/IP-prefix

  • BGP not picking shortest-path routes
    best route based on policy (maybe least-cost)

  • Selective route ad
    AS could choose whether to ad its route to dest

Details
Sessions
  • iBGP
    Distribute external routes internally

Multicasting (必考)

  • Duplicate packets & unicast v.s. multicast
    D&U → a waste of bandwidth!
  • Multicast address → class D in IPv4
  • Application: Multimedia, teleconference, distributed computing, etc.

make a Mcast minimal spanning tree and only duplicate only along the spanning tree (hard, NPC, no scalability)

Router’s responsibility:

  • learn group membership
  • identify links
  • establish routing state

A concept: Mcast group

  • the edge router is responsible for joining the Mcast group.
    translate the IP or not(Unicast/Mcast) according to IGMP Protocol P15/16

IGMP

P18/19

Operations
  • Hosts
    • Send reports to routers to subscribe to (join) and unsubscribe from (unjoin) multicast group
    • Host need not explicitly unjoin group when leaving ???
  • Routers
  • Sends query info at regular intervals
  • Host belonging to a mcast group must reply to query

P21-26

  • he

    • On receipt, hosts start random timers (0~0s) for each multicast group they’re in.
    • when count down, report Membership Report(TTL=1) , others hear so do not need report in this group (stop their timers).
    • Routers hear all, if a group has no report, time out it.
  • v2

  • v3

    • support Group-and-source specific query: The receiver can claim that it only has interest in specific Mcast source.

note 27 with 28!
before 34?????????

Mcast Routing

Group-shared tree and Source-based tree

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Choose
  • PIM: Protocol Independent Multicast
  • Not dependent on any specific underlying unicast routing algorithm (works with all)
  • Sparse mode
  • Group-shared tree, use center-based approach
  • Group members widely dispersed, bandwidth not plentiful
  • Dense mode
  • Flood and prune: source-based tree, reverse path forwarding (Nearly same as DVMRP)
  • group members densely packed, bandwidth more plentiful
Group-shared

MST with constraints(NPC)

NPC never works on Internet!

P42

  • ..
    • choose a router as center (it is also a problem to choose a good center (best:NPC)!)
      Other routers to join:
  • Edge router sends unicast join-msg addressed to center router
  • join-msg processed by intermediate routers and forwarded towards center
  • join-msg either hits existing tree branch for this center, or arrives at center
  • Path taken by join-msg becomes new branch of tree for this router
Source-based
  • Shortest path tree, reverse path *, use with(?) OSPF(Dijkstra)
    • First make a broadcast tree: P39
      Now it is a broadcast tree.
    • Then pruning: The no-interest router report Prune upstream (if has no downstream group member) (How to wait?🤬)

DVMRP & MOSPF(buyaoqiu?)

Multicast routing based on shortest-path unicast protocol(undertand again)
jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Application Layer Mcast

The Fairy of IP Mcast: …
So we implement Mcast in the application layer.

Transport Layer

peer-to-peer communication
routers have Layer4/5, however they do not talk with hosts.

break up messages to segments / reassemble segments to messages

motivations

  • IP could not tell which application
  • IP only provides best-effort service
    • delay, loss, duplication, out-of-order
    • traffic volume control

Role

  • Communication between processes(Compulsory)

    • mux&demux from/to application processes
    • Use Ports
      • Multiplexing (Mux)
        • Gather and combining data chunks from different applications → network layer
      • Demultiplexing (Demux)
        • Delivering correct data to corresponding sockets from a multiplexed stream
  • provide common end-to-end service(optional)

    • reliable, in-order, well-paced…
    • facing diverse requirements of applications
  • socket: software interface (abstraction) to the transport layer for a process

  • Ports: 16-bit number

    • well-known ports: 0-1023 (eg. 80 for webpage)
    • registered ports: 1024-49151
    • dynamic ports: 49152-65535
  • OS stores mapping between sockets and ports

    • For UDP ports (SOCK_DGRAM)
      • OS manage: <HostIP, Port> socket
    • For TCP ports (SOCK_STREAM)
      • OS manage <HostIP, HostPort, RemoteIP, RemotePort> socket

Demultiplexing

host uses IP addresses & port numbers to direct segment to appropriate socket

Connectionless demultiplexing

UDP like, all ports could exchange with a socket.
Only take dst socket into account when demultiplexing.

Connection-oriented demux

  • like TCP
  • socket id: <SrcIP, SrcPort, DstIP, DstPort>
  • server host may support many simultaneous TCP sockets:
  • web servers have different sockets for each connecting client

    non-persistent HTTP will have different socket for each request

or like this (🤬有什么区别吗。。。)

Reliable transport

All bad things can happen…

  • cheaksum -> detect corruption
  • ACK & NACK -> detect loss
  • Sequence number -> identify packets
  • retransmission -> recover loss/corruption
  • flow control -> prevent overflow
  • timeoout -> decide when to retransmit
  • forward error correction
  • network encoding -> repair errors

Corruption handle

Handle:

  • Start from: Detect error: ACK/NACK

  • ACK still get corrupt: ACK/NACK with sequential number

  • packet get lost and so that no ACK

  • ACK get lost and cause timeout

  • Still have problem: no packet loss, delay cause duplicate.

early days: no formal methods! constant packups

Flow control (General Basis)

Stop-and-wait protocol

sender spin-wait ACK/NACK or until TIMEOUT



network protocol significantly limits use of physical resources!

Pipeline protocols

sender allows multiple, “in-flight”, yet-to-be-acknowledged pkts

  • increasing range of sequence numbers
  • buffering at sender and/or receiver

When n-pipeline’s n is proper, it can increase the utilization by n times! (until 1 ideally)

Sliding window: compare with the one above!🤬

Sliding window often called “packets in flight”

Acknowledgement
  • Cumulative ACKs: next in-order expected

  • Selective ACKs:
    ACK individually acknowledges correctly received packet

Resending
  • Go-back-N(GBN)

    • Discards out-of-order packets (pkts other than )

    • cumulative acknowledgements

    • Sender sets timer for

    • if timeout, retransmit

    • better in low error rate, waste bandwidth

      better for hardware, and only hardware can handle high-speed!
      complex protocol and data structure: not a system way!

  • Selective-Repeat(SR)

    • treat the packets individually, but window front edge is determined by the smallest unACKed ID.

    • retransmit only packet on timeout

    • complex book-keeping: timer per packet

    • low speed, high error rate, low and expensive bandwidth: 5G

UDP(User Datagram Protocol) RFC 768

lightweight, provides the minimal service

  • Best effort(may lost or out-of-order)
  • connectionless, each segment is independent
  • use in streaming multimedia apps, DNS, SNMP

Features

  • no connection establishment
  • simple, small header
  • no congestion control

Format

Checksum

split as 16-bit integers, add them up, then take the 1’s complement

No error detected does not mean No error!
checksum is optional (source port is optional too)

Note: checksum是check哪些,见后面的TCP

💱Frontier:UDP is reviving due to oligopoly of Internet companies.
UDP+X protocol is used in these companies to provide better service.
UDP+QUIC is used in Google’s Chrome.

TCP(Transmission Control Protocol) RFC 793

必须记下包头格式

  • Reliable, In-order and Byte stream(assume there is an incoming byte stream)
  • Almost all the techs: checksum, seqnumber, sliding window, cumulative ack, out-of-order buffer, fast retrans, timeout estimate…

TCP functions (compared to header domains if exists)

必须掌握这个结构!
🤬大端存储还是小端存储?

Mux and DMux (Src and Dst Port)

Nothing much to say.

Content Reliable

Pseudo header + header + overhead: A blog

Order Reliable (Seq Number and ACK)
  • Early days: “Stream of bytes” service offer service for TTYs

  • Seq number is ‘Byte(not packet) offset’ (TCP is a byte stream)

    pay attention that TCP is full-duplex, so each pkt should have both seqno and ACK domain. There is no implicit ACK pkt, but exchange the ACK knowledge in back-and0-forth data packets(piggybacked). You should not be misled by the diagram above.

  • TCP segment length:

    1
    2
    3
    4
    |------------------------MTU------------------------------|
    |---------------IP data-------------------|-----IP Hdr----|
    |------TCP data (segment)-----|--TCP Hdr--|
    |-------------MSS-------------|

    So $MSS = MTU - len(IPHdr) - len(TCPHdr)$
    MTU = 1500B for Ethernet
    TCP >= 20B

  • TCP Datagram Sent time:

    注意加一减一,ACK是上一个这些事🤬

at receiver, user can choose GBN-like or SR-like(drop/buffer out-of-order pkts). The TCP protocol does not restrict it.

  • Adopt cumulative + SR-like bitwise(for example)
    ACK the smallest seqno unACKed packet, buffer the out-of-order packets. 🤬ppt上各种GBNlike,在哪呢?那不是cumu导致的吗,和GBN有什么关系?混为一谈!
  • Delay ACK:
    1. receiver will not ACK a in-order pkt, but wait another in-order pkt(for 500ms).(Otherwise, the acknowledgement actually never ‘accumulate’ for you always ACK a packet at once).
    2. If receiving a out-of-order packet, send dupACK at once.
    3. If the in-order packet fill the gap(unACKed holes in the sequence due to out-of-order) and is the first unACKed pkt, ACK at once.(If it fills the gap but not the lowest one, follow rule2)
  • Deduction: receiver will send duplicate ACKs when a packet is lost
    sender will know
    1. packet is lost
    2. link is still alive(receive receive some other packets)
    • Introduces fast retransmit: duplicate ACKs trigger early retransmission
      TCP will retransmit the pkt the dupACKs point to when 3 duplicate ACKs are received
    • And a single timer for timeout(only time the first sent but unACKed pkt, namely reset it each time sender pkt LHS moves) retrans the first unACKed pkt 🤬如果就重传一个,那还有什么可说的书P164(namely first pkt in the window) on timeout.
      How do we pick a timeout value? so should be **a bit longer** than (proportional to) RTT So how to measure RTT? See below:
measure RTT
  • measure RTT: exp weighted average of sampled RTT
    Notice: Usually the sample is taken every unretransed ACK (GPT)
  • Problem: Ambiguous measurement(tell ACK and retrans-ACK)
  • Algos: as below
  • Karn/Partridge Algorithm
    • handle ambiguous: ignore the packet as long as it is retransed
    • Insensitive to RTT variation 🤬
  • Jacobson/Karels Algorithm
    • take the variance of RTT into account: use DevRTT (exp average of Deviation)




Header Length (Hdrlen)

Number of 4-byte words
HdrLen == 5 if no optional regions.

Control (Flags)
Connection Establishment
  • 2-way handshake
    2 SYNs to establish

    • retrans lost SYN, ignore duplicate SYN once connected
    • Problem: slipped segments from old connections
    • handle: set a new initial seq number(ISN) far from the previous for a new connection, SYN is in SYN i+1 form (pay attention to +1)
    • Obsolete SYN

      The A->B SYN and B->A SYN have different seqno. This may be used to implement Full-duplex comm.

    • handle: 3-way
  • 3-way handshake

    🤬为什么非要加1? 0号包怎么你了! 而且3号包一定要seq number吗((3)seqno多少?GPT说三号包seqno=ISN_A+1)

  • Hens and Eggs Problem:

    • SYN is lost: network/server is busy
    • so no SYN-ACK
    • so need to retransmits SYN on timeout
    • but we do not know how to set timeout! (No any RTT)
  • handle:

    • Default: 3 secs (RFC 1122 & 2988) (some use 6s)
    • Sometimes shorter in data center

A SYN loss in Web scene

  • User clicks on a hypertext link
    • Browser creates a socket and does a “connect”
    • “connect” triggers OS to transmit a SYN
  • SYN is lost…
    • 3-6 seconds of delay(very long)
    • User may become impatient and retry
  • triggers an “abort”
    • Browser creates a new socket and another “connect”
    • Can be effective in some cases
Connection Tear Down

Normal termination

🤬这里还是存在问题,当B对最后一个剩余任务发出ACK,那么如果它立即发出FIN,则可能会这个ACK没有被A收到, 从而引发发送端的TimeOut,但发送端在没TimeOut时候就已经收到了FIN(1/2 RTT) 那么如何处理?

TIME_WAIT相关(GPT)
等待足够的时间以允许所有数据包在网络中老化:
TIME_WAIT 状态持续约 2 倍的最大报文段生存时间(Maximum Segment Lifetime, MSL)。MSL 是任何给定的分组被认为在网络中自然消失前能存活的最长时间,通常这个时间设定是2分钟。因此,TIME_WAIT 通常持续 4 分钟(2MSL)。这一过程确保老的重复数据包不会在同一 TCP 端口上的新连接中被误认为是新的数据包。

Can combine FIN and ACK if no pkts left.
Abrupt termination


State Machine
🤬需要掌握到什么程度?

Flow control (Advertised Window)

So far the window size is constant (buffer size).
Left is sender, and right is receiver.

if consumer is slow, buffer will be full.

  • straightforward handle: stop sending ACK
    • however sender could not tell loss/flow ctrl (so not practical)

🤬简化模型吧,现实中如果来源不止一个,怎么回复RWND?

  • handle: Credit scheme
    receiver sets RWND(advertised window) in ACKs
    sender ensures in flight <= RWND
    Left is receiver, right is sender. Additional window from (W is RWND, AN is ACK number)
    • Benefit:
      Decouple flow control from ACK
      Each octet has a sequence number
    • lost ACK/CREDIT little harm: resync the protocol in future pkts
      (for the ‘right pointer’(AN+W) of the window will never go left; and the sender always follow the newest one; and the new ACK/CREDIT does not depend on the old ones.)

Example of credit

  • Problem: credit allocation deadlock

    • —-[]—-→ : closing rcv-window
    • —-[]—x→ : to reopen, but lost
    • A thinks window is closed, B thinks it is open and wait
  • Handle: Use Window timer
    if expired, send something (could retrans last segment)
    The term is called Persist Timer only be set when the permitted window left is 0.

  • Deadlock: due to piggybacked ACK, when RWND=0, sender wait for receiver, but receiver have nothing to send to sender, so it is stuck.
    Handle: sender still send 1 byte when RWND = 0

🤬这地方没解决。什么时候ACK?这两个deadlock问题分别来自ppt和教材,但是方案有冲突,GPT给出的是混合体(如下)?另外丢失ACK的问题在不是等于0,而是sender用光了剩下的时候也存在啊?

TCP的Persist Timer用于处理零窗口大小(Zero Window)的情况,即当接收方通告的窗口大小为零时,发送方将无法发送更多数据。这种情况下,发送方需要知道何时可以再次开始发送数据,但由于窗口大小为零,它不能发送任何新的数据来探测窗口是否已经变大。Persist Timer就是为了解决这个问题而设计的。

Persist Timer启动的情况如下:

接收方通告窗口大小为零:当发送方收到一个窗口更新,指示接收方的窗口大小为零时,它将停止发送数据,并启动Persist Timer。
定期窗口探测:在Persist Timer超时后,发送方会发送一个小的探测报文(通常是1字节的数据),这个探测报文被称为窗口探测(Window Probe)。接收方收到窗口探测后,即使其窗口仍然为零,也应该响应,告诉发送方当前的窗口大小。
窗口大小更新:如果接收方的窗口大小从零变为非零,它会发送一个窗口更新给发送方。发送方收到这个更新后,会停止Persist Timer,并根据新的窗口大小恢复数据传输。

Congestion Control

TCP initially has no congestion control, then causes congestion collapse in 1986.

Jacobson’s fix
Adapt the window size in response to congestion

  • pragmatic and effective
  • required no changes to routers and apps
  • extensively researched and improved upon

Congestion in a simple modeled view.

Real-world problem (much more complicated)

3 must-answered questions in Congestion.
- how to detect congestion.
- who take care of congestion
- how to handle congestion

  • ❌Start from no care: Many packets drop(like UDP)

  • ❌Reservations: negotiate to pre-arrange bandwidth before sending pkts
    Low utilization

  • ❌Pricing: do not drop high-bidders (capitalism!)
    requires payment model

  • ✔️Dynamic:

    1. hosts infer congestionand adjust
    2. network reports and hosts adjust
    3. or combine the two above
    • generality is proved to be powerful
      • not presume business model, traffic features, app requirements… therefore it is universal
      • assume good citizenship
    • simple to implement but suboptimal, messy dynamics(means hard to predict behaviour in real use)

TCP congestion control overall:
$sendingrate \approx \frac{WindowSize}{RTT}Windowsizesendingrate$

Congestion Window FlowControl Window
CWND RWND
without overflowing routers without overflowing receiver
computed by sender determined by recver and reported to sender

We assume in the course
so we do not care about Flow control when discussing Congestion control

In this course we talk about CWND in unit of MSS
Real implementation CWND is in unit of bytes

3 key issues when trying to control congestion

  • Discovering the available (bottleneck) bandwidth
  • Adjusting to variations in bandwidth
  • Sharing bandwidth between flows
Detect congestion:
  • packet delays: too much noise (delay often varies greatly) ❌
  • router tells: too expensive ❌
  • packet loss: ✔️
    Fail-safe (But still has non-congestion loss, like in wireless)
    no more detection overhead (already has to detect for reliable delivery)
    Just like the what we talked above:
    • Duplicate ACKs: isolated loss
      • Still getting ACKs
    • Timeout: much more serious
      • Not enough dupacks
      • Must have suffered several losses

will adjust rate differently for each case

Adjust rate:

How to adapt sender?
General: ACK new data then rate↑, loss then rate↓

  • Discovering available bottle bandwidth: Slow start
    start slow for safety but ramp up quickly for efficiency!
  • Adjusting to variations in bandwidth: Congestion avoidance
    • increase rate slowly and focus on stability
    • AIMD(Additive Increase Multiplicative Decrease)

Slow start

  • Start with CWND = 1

  • increase exponentially () until first loss

  • Math: for each ACK
    So per RTT ???
    therefore (🤬有点奇怪,这里指数不对吧,当pipeline的时候,增长应该是连续的而不是离散的)

    这样看还是离散的,到后期才会变成e指数

  • When , stop slow start and enter congestion avoidance.

  • or there is a loss

Collision Avoidance

  • focus on maintenance. (when CWND>ssthresh)
  • Additive increase
    • for each ACK
    • Math: therefore (Additive increase)
    • ‘CWND is increased by one only if all segments in a CWND have been acknowledged’

When loss happens

  • Multiplicative decrease
    • On 3 dupACKs (no matter in CA or SS)
      • directly go to Collision Avoidance
    • On timeout event
      • Initiate Slow Start

A diagram to show SS,CA and Timeout

ssthresh 的初值是什么??
‘Sawtooth’

Problem: CA is too slow when recovering from isolated loss.
Please refer to the example

Solution: Fast recovery

  • On 3 dupACKs (no matter in CA or SS)
    • go to Fast recovery
  • In fast recovery
    for each dupACK
  • exit fast recovery
    When receiving a new ACK
    (directly go to CA)

🤬这个也没涉及Fast-recovery啊…

Share bandwidth

Efficiency (fully utilize)& Fairness
Now prove AIMD to a proper choice.(compared with AIAD,MIAD,MIMD)
A model to evaluate the performance.
given a source state, we could calculate all the following states on the graph.

AIAD: does not converge to fairness

MIMD: does not converge to fairness

AIMD: converge to fairness

MIAD: does not converge to fairness and efficiency

State machine

It is never implemented in hardware…
for it is too complicated to support high frequency…

TCP flavors

  • TCP-Tahoe
    on 3dupACKs and timeout
  • TCP-Reno
    on 3dupACKs
    on timeout
  • TCP-NewReno
    TCP-Reno + Fast recovery
  • TCP-SACK
    Use selctive ACK but not cumulative ACK

All the TCP in the Internet corporates well, for they are “generally the same”

  • increase CWND on good news
  • decrease CWND on bad news

TCP Performance Metrics & Problems


Packet drop rate,
Throughput,

  • if a flow has higher RTT, it will have low throughput. unfair!
  • Implication for High-speed TCP:
    not good for Large and Fat pipe: high bandwidth and high delay 🤬怎么算的?
    • Too high demand to under-layer robustness
      it is impossible to transmit so much data without error!
    • Too long time from a drop to reach peak performance
    • Handle:
      • past a threshold speed then increase CWND (additive constant depend on CWND)
      • Multiple connections(make many streams, Works up till know, like IDM)
      • Router-assisted

Old days: Best way for Long&Fat Pipe: Disks in Trucks!

  • Given the speed swings from W/2 - W
    so why dont we just start average it statistically?
    Equation-based congestion control
    ignore all the events, just set the transmission rate to a result of function , keep monitoring the drop rate and adjust the rate accordingly.
  • Other loss reasons(like wireless)
    TCP cannot tell congestion/corrpution loss
  • Not friendly for short flows
    most flows are short flow.
    • Short flows never leave slow start
    • Too few packets to trigger dupACKs
      so isolated loss may also need to waiting for timeout and severely impact the flow completion time
  • Every flow try to overshoots capacity before get a drop.
    So a large flow will dominate the small ones. all the small ones have to bear a large delay.
    means that delays are large, and are large for everyone (opposite to the basic idea of multiplexing)
  • host control, so easy to cheat
    • increase CWND faster
    • use large init CWND
    • use multiple connections

It is hard to supervise the network of cheating with only control on the host side.
So…👇

Router-assisted Congestion Control

Compared to router based control, host based is more like a hack. Now we introduce the router’s help and it can solve many problems.

Choke Packet

From router/destination to source

Backpressure

for Propagation time > transmission time case(long distance or high speed link), from router to source is not effective

Warning Bit

Explicit Congestion Notification(ECN)
a bit in IP header, set by congested router

  • if data pkt has ECN bit set, receiver sends ACK with ECN bit set
  • routers can set ECN bit with diverse strategies
    trade-off between utilization and delay (more precise strategy means more delay)
  • ECN semantics can be like loss
    end host can react to ECN as if it saw a loss

Advantages:

  • distinguish corruption with congestion;
  • Can serve as an early indicator of congestion to avoid delays
  • Easy (easier) to incrementally deploy

    Today: (RFC3168) using ToS/DSCP bits in the IP header, Common in datacenters

  • For business: Whenever I get an ECN bit set(it means still want to send if keep receiving ECN), I will be charged.

RED (Random Early Discard)

Handle TCP global synchronization problem

  • Traffic burst simultaneously fills queues so pkts lost, TCP connections enter slow start simultaneously, so traffic drops, network underutilized, and burst and underutilize and burst…

  • RED: Randomly discard pkts before buffer becomes completely full (over a certain threshold)

calculate

  • : discard
  • : queue the packet
  • : probobality discard

Fairness:

  • Routers classify packets into “flows” (assume flows are TCP connections)
  • Each flow has its own FIFO queue in router
  • Router serves flows fairly
  • when line is free, take packet from next flow in a fair order

Shen me ta ma de jiao zuo fairness!

Max-Min fairness
  1. Calculate the average bandwidth requirement
  2. offer full service to the flow under average
  3. Remove the served flow
  4. Loop: jmp to 1, until no fully-served flow

🔱If you don’t get full demand, no one gets more than you
When all packets are the same size, it equals round-robin
🤬如何不serve all?以个数为单位??包不一样大为什么不行?


cope with packets with different sizes
Ideal: bitwise fairness (but no practical)
approximate it

Fair Queuing
  • compute when the last bit of each packet leaves the router if flows are served bit-by-bit (share the bandwidth equally)
  • serve packets in the increasing order of their deadlines

WFQ: more general, flows have different bandwidth share.
today some form of WFQ implemented in almost all routers.

pros:

  • cheating flow cannot occupy more bandwidth
  • doesn’t have RTT-based unfairness ( unfairness)
  • Flows can pick any rate adjustment scheme they want 🤬怎么理解?

cons:

  • more complex than FIFO, per flow queue, per packet bookkeeping.
    Today’s routers use more coarse-grained WFQ

FQ does not eliminate congestion, but manages congestion.
A case

Congestion still exists, but links are used more efficiently.

round-table discussion
Fairness have different standard. flow sum, congest hop sum…
Flow have different definition. TCP connection, src-dst pair, src itself…
reduce congestion: router can add ‘fair share(n.)’ to packet to inform end-host what rate they should send at… (still need policing) (Any end-host automatic behaviour needs policing!)

Multimedia

This part of content seem to be deleted in the Eighth Version

Analog → Digital Signal

  • Audio
    • telephone: 8K Samples/sec
      Example quantization: 1 B/Sample
      CD: 44K Samples
    • Bit rates
      CD 1.411Mbps
      MP3: 96,128,160 kbps
      Internet Telephony: 5.3 kbps↑
  • Video
    video: sequence of imgs, image: bit map
    coding using redundancy: spatial and temporal
    • spatial: neighbor same RGB
    • temporal: not change RGB between frames in the same place
    • MPEG1 for CD-ROM 1.5Mbps
      MPEG2 for DVD 3-6Mbps
      MPEG4 for Internet videos <1Mbps

Services

  • Streaming, stored audio/video
    • streaming: can begin playout before downloading entire file
    • stored (at server): can transmit faster than audio/video will be rendered (could be stored/buffered at client)
    • e.g. YouTube, Netflix, Hulu
  • Conversational voice/video over IP
    • interactive nature of human-to-human conversation limits delay tolerance
    • e.g., Skype
  • Streaming live audio, video
    • e.g., live sporting event (⚽)
  • challenges: variable nwk delay(jitter), interactivity(pause, fast-fwd, rewind, jump), packet loss and retrans
  • client-side buffering and playout delay:
    compensate for network-added delay, delay jitter

(CBR means constant bit rate)
if keeping , buffer will run out
Trade-off: Long-short pre-cache time
too much wait – video lag

Streaming

with UDP 🤬RTP (RFC 2316)

server sends at constant rate

  • (ususally )
  • sender don’t care the congestion rate
  • 2-5s short playout delay
  • error recovery: application-level, time permitting🤬
  • but always blocked by firewalls nowadays.
with HTTP
  • via HTTP GET🤬
  • send at maximum possible rate under TCP
  • fill rate fluctuates due to TCP congestion control, retransmissions (in-order delivery)
  • larger playout delay: smooth TCP delivery rate🤬
  • HTTP/TCP passes more easily through firewalls
with DASH(Dynamic Adaptive Streaming over HTTP)

server:

  • divides video file into multiple chunks
  • each chunk stored, encoded at different rates
  • manifest file(.mpd): provides different URLs for different chunks

could offer diff coding rates, diff URLs for each chunk.

client:

  • periodically measures server-to-client bandwidth
  • consulting manifest and requests one chunk at a time
    • choose maximum coding rate sustainable given current bandwidth
    • can alter the choice at different time according to bandwidth

“intelligence” at client: client determines

  • when to request chunk (so no buffer starvation/overflow)
  • what encoding rate to request (quality according to bandwidth available)
  • where to request chunk (can request from URL server that is “close” to client or has high available bandwidth)

Voice-over-IP (VoIP)

  • VoIP end-end-delay requirement (maintain “conversational” aspect)

    • higher delays noticeable, interactivity
    • <150msec: good ; > 400msec: bad (max tolerable)
    • includes app-level (packetization,playout) & network delays
  • session initialization???

  • value-added services: call forwarding, screening (means filtering the calls, like blacklist), recording

  • emergency services:911

  • Loss

    • network loss: IP datagram lost due to network congestion
    • delay loss: IP datagram arrives too late
      • delays: processing, queueing in network; endsystem (sender, receiver) delays
    • tolerance: between and
      (depending on voice encoding, loss concealment…)

A case: Skype

Problem: Alice,Bob behind NATs
NAT denies outside→inside connection
Handle: relay
A,B maintain open connection with their SNs (servo)
callee’s SN connect callee by initially initiated connection

Proprietary application-layer protocol 🤬?

QoS (Quality of Service)

  • differentiated new traffic in Internet
    • low bandwidth & high addtional value: Secret call, Multiplayer video game, email…
    • high bandwidth & low addtional value: file download, web videos…
  • within QoS

ISA(Integrated Services Architecture)

a distinguishable stream of IP packets → a flow

  • With the same QOS parameters
  • ID: <SrcIP, DstIP, port, protocol type>
  • Unidirectional, can be multicast
Architecture

ISA Functions on a router

  • Routing Algorithm
    • Link cost based on QOS parameters, not just delay
    • Routing/forwarding based on which class of QoS the flow belongs to
  • Queuing discipline
    • Priority queuing
    • Multiple queues, taking account of different flow requirements
  • Discard policy
    • Selective not just the new ones
resource reservation
  • Providing QoS for individual app sessions
  • routers maintain state info of allocated resources for each session (like a virtual circuit)
  • could deny the request if can not guarantee

with in a intermediate router:

soft state: state which has a lifespan and regularly be refreshed/aged

  • Scope of the request:reservation request should be propagated
    to all sender hosts

  • disadvantage:

    • High cost: soft state for each flow, not scalable
    • too complicated architecture
      • RSVP will reduce to best-effort if any router does not support it in the path
    • few service levels are defined, not flexible

jammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

DS (Differentiated Services) (RFC 2475)

  • divide traffic into classes, and treat them differently
  • simple func in network core but complex in edge routers
  • Type of Service field in IPv4 / Traffic Class field in IPv6
  • define a SLA(Service Level Agreement) between customer and ISP before the service
  • APP select DS, and traffic with same DS will be treated consistently

PHB??

DS Domain
  • portion of Internet with a certain set of DS policies
  • provided by a single ISP or union of ISPs
  • ISP configures domain edge routers
  • Customer may be hosts or edge routers in other domain (as above)
  • Ongoing measure of performance provided for each class (ensure the service quality)
  • when handle traffic from neighbor DS domain, choose the most fit class in self policy set for the traffic.
DS Architecture

for Edge router

  • Per-flow (each traffic flow will be check individually) traffic management

  • Marking: classify the flows into different QoSs (set the header TOS/TC field)

  • Policing:
    distinguish packets with in-profile and out-profile: monitor if the flow obeys the SLA, if true then in-profile otherwise then out-profile

    • monitor average rate, peak rate and burst size
    • Shaping: shape the traffic to in-profile
      • Leaky Bucket
        departure: fixed-rated traffic
        may drop pkts when bucket is full

🤬都整形了,还要标记干什么?

Marking and policing are not seperate process, when you police the flow (in/out profile) you may mark them to tell other routers.(so-called intra-class marking)
for Internal router

  • Per-class traffic management (do not distinguish the pkts from different flows in the same class)
  • Buffering and scheduling based on marks(header)
  • in-profile will be preferred, out-profile traffic has low priority or be dropped.

Network Security

The conceptual model

Attack overview

The victim of attacks: Web browser/server/banking server,client/DNS server/Routers..

  • Active easy to detect, but hard to prevent
  • Passive can be prevent, but hard to detect

Cryptography: handle interception

Encryption key -> , Decryption key ->

  • Symmetric key: sender key = receiver key
  • Public key crypto: public, secret

Requirements

  • Even attacker knows your crypto algo, he could not decrypt it.
  • Even attacker knows some ‘cipher txt - plain txt’ pair, he could not work out the key
  • Even attacker knows many cipher txts, he could not work out the key / get the plaintxt
  • Should not be easily decrypt by Brute Force

Key distribution is a problem(safely get the keys), shown below.

Attacks

  • CipherTxt-only attack:
    • Brute force try all the keys
    • Cryptanalysis:
      try to deduce PlainTxt/key by analyzing the nature of crypto algo or general characteristics of PlainTxt/CipherTxt
  • Known-PlainTxt attack:
    • attacker knows some PlainTxt-CipherTxt pairs to deduce algo/key
  • Chosen-PlainTxt attack:
    • attacker can choose some PlainTxt and get the CipherTxt in some way to deduce algo/key

Methods

⛓️:这些算法能考吗

Symmetric Key Crypto
Traditional ones
  • Substitution: alphabet replacement
    • Caesar cipher: shift the alphabet by k
      • key space size: 26
      • attack: BF
    • Monoalphabetic cipher: just map the alphabet to a mess up one(no repeated)
      • key space size: 26!
      • attack: frequency analysis(letter/word/letter pair in specific language)
    • Vigenere cipher: do Caesar cipher for each letter in the PlainTxt, the shift value is determined by each letter in the key in cycle
      • a letter in PlainTxt can be mapped into different letters in CipherTxt (most k possibilities)
      • key space size:
      • attack: try to find the , and attack each group as Caesar cipher
  • Transposition (Permutation) methods:
    • Rail Fence cipher: write the PlainTxt in row and read it in column
      • key space size: at most
      • attack: BF
    • Row-Column Cipher: write the PlainTxt in row and read it in column(column sum = key length, so that each letter of the key corresponds a column), the read order of column is the alphabet order of the key
      • attack: can just try.
    • pure transposition can be easily attacked by Known-PlainTxt attack
Block Cipher

Scheme: Feistel Cipher
allow encryption and decryption to be done with the same algorithm (hardware/software implementation)

  • DES: Data Encryption Standard
    • key space size:
    • attack: no backdoor, just BF
      not so slow for modern comupters(let alone the specially designed computer) so useless now.
  • TDES:Triple DES
    • 3 keys, 3 rounds
    • key space size:
    • EDE(Encrypt-Decrypt-Encrypt) or EEE(Encrypt-Encrypt-Encrypt)
    • too slow, block size small
  • AES: Advanced Encryption Standard
    • 128 bits block, 128/192/256 bits key
    • 10/12/14 rounds(for 128/192/256 bits key)
    • key space size: //
      ⛓️ AES Details
Public Key Crypto

based on mathematical algorithms
Asymmetric key: sender key receiver key
each one has a pair of keys:

  • public key known to all
  • private key known by self

and

Requirements

  • algo should be public
  • given algo/PUBkey, attacker could not work out PRVkey
  • given the cipherTxt should not work out the PlainTxt
  • given CipherTxt-PlainTxt pair and PUBkey should not work out PRVkey
RSA





message is bits, and bits can be seen as a number

  • Pick 2 large prime numbers and
  • (a large one at least 1024 bits each, we take 1024 as an example)
  • (Euler’s totient function)

    Note that this is the foundation of RSA. Given , it is hard to work out . The algorithm of working out depends on the factorization of .(factorization has no solution up to now(though it is not proved to be ). And N is a large number(), so attacker can not work out with Brute Force.
    However, due to and are prime numbers, so that the key generation could calculate directly with the formula above.
    So the core is and . If and are somehow known by attacker, then he could replay this process to break the RSA.

  • Pick so that and is coprime with

    e is relatively small, usually 65537, so it not so hard to check if is coprime with

  • Compute so that is the modular multiplicative inverse of mod “模逆元”)

    We could use the Extended Euclidean Algorithm to work out .
    It’s complexity is , namely about 2048 for our example. It is not so slow.

  • so that and
  • then we utilize

    Proof: (A brief one, only for , a full one )
    (Euler’s theorem)
    due to ,

    then

  • So
    Encrypt: , and send
    Decrypt:

    Msg must be smaller than , otherwise, you can only get in the context of modular arithmetic.

Some considerations

  • and should be about the same size
  • and should be unlated
  • should be relatively small(easy to encrypt)
  • should be large enough to prevent the attack(Bruteforce)

reverse of RSA: the owner could authenticate himself.

  • append a signature to the message
  • receiver could verify the signature by
  • verify it is from A for only A knows

In practice: session keys

  • RSA is really slow compared to symmetric key crypto

so…act as a secure key exchange method

  • A and B exchange session key with RSA (establish a secure connection)
  • A and B communicate using symmetric key crypto with symmetric key

Authentication: handle modification & Fabrication

🤬首先这几个分工写的对吗?其次MAC也不能解决Fabrication啊

  • v1.0: Declare who u r
    Easy to cheat.
  • v2.0: Declare who u r + your IP
    Also easy to cheat for professional.
  • v3.0: Declare who u r + your IP + shared pwd
    Playback Attack: record the request and play it again
  • v3.1: Declare who u r + your IP + encrypted shared pwd
    Playback Attack still works
  • v4.0:
    — Request–→,
    ←– disposable Random (i.e. nonce) —
    –→
    How to exchange shared symmetric key
  • v5.0: use nonce + public key
    — Request –→
    ←– Nonce
    –→
    ←– ask for pub key —
    –→
    then B compute know it is Alice

    evidence: only Alice know

v5.0 still has a hole: man-in-the-middle attack

How to? : just because only Alice knows , so Bob can not tell if the keys are Alice’s or not.

man-in-the-middle attack can be conducted silently
胡想:在mim攻击中 delay may have relation with size?(没意义,本来也是有关的)

Role

to ensure

  • come from the right person
  • have not changed message

By

  • creating an authenticator
  • send the message with the authenticator
  • receiver verify the authenticator

handling

  • Source impersonation / spoofing
  • Message injection / modification
  • Message re-sequencing / replaying

MAC

notice distinguishing ‘secret’(n.) and ‘secret’(adj.) key
I will refer to secret key as skey.
Notice that skey is a kind of secret(n.)
When it comes to secret(n.), I use for short.

  • Usually 64-256 bits

  • can be sent in clear text’

  • MAC is

  • Operablity

    • $fixedlength = f_{MAC}(anylength)$
    • Easy to compute
  • Security:

    • One-way: hard to find
    • Weak Collision Resistance: hard to find so that
    • Strong Collision Resistance: hard to find any
  • Use Crypto func(with skey) upon

    • CBC-MAC (Cipher Block Chaining Message Authentication Code)
      (n bits each)
      encryption algo ( is the skey of Crypto func) 什么🤬
      (Initialization Vector) random n-bit block
      for

      i.e. first and last block
  • Use Hash func upon and

    这里应该写错了padding bits 最多可以有512bits🤬

    • MD5
      The given demonstration and the diagram on Wikipedia are poor, so I make it on my own. Holes have been found in MD5.
      Xiayun Wang found the high-performance algo of MD5 collision.
      Wikipedia
    • SHA-1
    • 如何把secretkey引入Hash func HMAC🤬

Digital Signature

🤬just as reverse: RSA
A sign the document and others can prove A made the document
digital signature is Signed MAC (MAC with linked person)
🤬还是不能解决pizza prank啊

🤬所谓的MAC 和 Digital Signature,最主要的区别就是,MAC用的是对称加密,Digital S 用的是非对称加密吧

Key Distribution: handle Fabrication

2 Aims:

  • Sender and receiver should share common secret
  • Others should know A’s is actually A’s

against aim1:

Record-and-playback attack


man-in-middle is a kind of Record-and-replay attack(as above)
normal record-and-playback can be prevented by nonce/timestamp
🤬这也不是重放攻击啊

against aim2:
“pizza prank”
the attacker could encrypt the request with his own and declare his is A’s.

Diffie-Hellman Key Xchange(考): to share secret

A public large prime and generator (actually is OK [] for is prime).(是模循环加群,所以P为质数时就是1~p-1,且除了1都是生成元)

🤬DH还是会被Pizza prank攻击吧

Trusted Certification Authority: to share with help of CA

2 kinds of keys:

  • Session key: just used for a session
  • Permanent key: used for a long time(for distributing session keys)

CA determine the validity of the sender and receiver, and provide the session key to them.

  • security service module(SSM): working at the host, like an agent for application layer, offering end-to-end encryption and obtain the key.
  • The Needham-Schroeder Protocol

keep the forward secrecy: (this part is not a subsection of Trusted Certification Authority)
means that though the key is leaked, the previous messages are still secure.
use one-time session key

  1. prepare a session key
  2. encrypt with of the receiver
  3. encrypt with
  4. send the encrypted and
  5. receiver decrypt to get the with his , then decrypt the with
    please refer to
    🤬这里用的非对称加密,用DH呢?

Public Key Certificate: to verify the owner of public key

the key owner will register his claim to the CA, and CA will issue a certificate to the owner.

just like a signature of the key, but this signature is not encrypt with owners’ own PRVkey, but a ‘could-not-be-pretended’ CA ‘s PRVkey.
Everyone should be able to find the real CA to check the certificate.
CA could be government or some else.

Content of a certificate:

  • serial number
  • info about certificate owner(algorithm, key value)
  • info about issuer
  • valid dates
  • digital signature from the issuer(what to be check)
  • fingerprint/thumbprint(hash value to check the integrity of the certificate)

Real case: Secure e-mail

Confidentiality: Use session key

Integrity: use MAC(🤬HMAC?)

Combine them!

Then Alice & Bob achieve Confidentiality + Integrity (only this two)

SSL

⛓️整个P3,为什么
https的s
4 🐒
SSL is a sub layer between TCP and Apps

愤怒🤬PGP是什么东西,这翻译,陈鸣老登还有脸把这书出版?
PGP just like the email system above.

直跳13.。。

Application Layer

🐒1-4

Process: program running within a host

hjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmjammmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm

Architectures

CS(Client-Server)

Client:

  • Start as required then init contact with server first
  • Host may have dynamic IP addresses
  • e.g. Web: client implemented in browser; Email: in mail reader

Server:

  • Run as daemon (always-on)
  • Provides requested service to Client
  • Host has permanent IP address
  • e.g. Web server sends requested Web page, mail server delivers Email

P2P(Peer-to-Peer)

  • No always-on server
  • free from big-brother
  • Arbitrary end systems directly communicate
  • peers request service from other peers, provide service in return to other peers
  • self scalability - new peers bring new service capacity, as well as new service demands
  • Peers are intermittently connected and change IP addresses
  • Highly scalable but difficult to manage
  • Examples: Gnutella, BitTorrent, Skype

Skype

  • centralized server for authentication
  • decentralized P2P for communication

App: DNS (Domain Name Service)

  • Translates human-readable names to IP addresses
  • Distributed database implemented in hierarchy of DNS servers
  • Load balancing: multiple IP addresses for one name

Goal

  • Uniqueness: no conflicts
  • Scalability
  • Distributed, autonomous administration
  • Ability to update my own (machines’) names
  • Don’t have to track everybody’s updates
  • Highly available
  • Lookups are fast
  • Perfect consistency is a non-goal

Hierarchy

  • Hierarchical namespace
    A tree structure Top Level Domains(if top domain is not country region, it’s the US’s!)
    name is the leaf-to-root path
    depth of the tree is arbitrary(<128)
  • Hierarchical administration
  • Hierarchical server
    • TLD(Top-level domain) server
    • local DNS server
      helper the user

Reliability

  • Replicated DNS servers (primary/secondary)
  • Name service available if at least one replica is up
  • Queries can be load-balanced between replicas
  • Usually, UDP used for queries
  • Reliability, if needed, must be implemented on UDP
  • Try alternate servers on timeout
  • Exponential backoff when retrying same server
  • Same identifier for all queries(when have multiple DNS server)
    • Don’ t care which server responds

Fast lookup

  • caching

  • Performing all these queries takes time

  • Up to1-second latency before starting download

  • Caching can greatly reduce overhead

  • The top-level servers very rarely change

  • Popular sites (e.g., www.cnn.com ) visited often

  • Local DNS server often has the information cached

  • How DNS caching works

  • DNS servers cache responses to queries

  • Responses include a “time to live” (TTL) field

  • Server deletes cached entry after TTL expires

Attacks

DDoS Attack

Redirect Attack

  • man-in-middle: intercept queries.

App: Email

  • One of most heavily used apps on Internet
  • SMTP(Simple Mail Transfer Protocol): simple text messages
  • MIME(Multi-purpose Internet Mail Extension): other types of data
  • POP: Post Office Protocol
  • Msg retrieval from server, including authorization and download

IMAP: Internet Mail Access Protocol

  • Manipulation of stored msgs on server

User Agent

  • Composing, editing, reading mail messages
  • e.g. Eudora, Outlook, Foxmail, Netscape Messenger
  • Outgoing, incoming mail messages stored on server

Mail Servers (Host)

  • Mailbox contains incoming mail messages for user
  • Message queue of outgoing mail messages
  • SMTP protocol between mail servers to send mail messages
  • Post title:Notes for Computer Network
  • Post author:Jackcui
  • Create time:2024-02-26 13:14:44
  • Post link:https://jackcuii.github.io/2024/02/26/ComputerNetwork/
  • Copyright Notice:All articles in this blog are licensed under BY-NC-SA unless stating additionally.
 Comments
On this page
Notes for Computer Network