In an effort to understand RIST retransmits a little better, I set up a single RIST stream of 20mbit (2,000 packets per second) with a 1 second RIST buffer (roughly 12 times the round trip time) from an OBE into an OBD, and looked at the RTP traffic on the decoder. The ping time between the encoder and decoder is about 83ms.
I then applied a massive 20% packet loss on the encoder with sudo tc qdisc del dev eno1 root; sudo tc qdisc add dev eno1 root netem loss 20%
On the decoder, I ran “tcpdump -nn -T rtp” and looked at the sequence numbers coming in, taking 65,000 lines (about 30 seconds) of output.
These are the packets coming in. The majority are in the right order: packets 0, 2, 4, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17 and so on. But some are missing. Manually counting the first 50 packets shows 9 missing, which is about what you'd expect from 20% random loss (10 in 50).
The decoder asks for these packets again, and here are some of the retransmits, arriving at 183ms instead of the expected 76ms, so about 107ms late.
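The manual count can be automated. Below is a minimal sketch that takes sequence numbers in arrival order and separates first-pass gaps from late (retransmitted) arrivals. The function name `find_gaps` is made up for illustration, extracting the sequence numbers from the tcpdump output is left out, and the sketch ignores the 16-bit RTP sequence wraparound, which a 65,000-line capture may well hit.

```python
def find_gaps(seqs):
    """Classify arrival-ordered sequence numbers (hypothetical helper).

    Returns (lost_first_pass, recovered, still_missing).
    Simplification: ignores 16-bit RTP sequence number wraparound.
    """
    highest = None          # highest sequence number seen so far
    missing = set()         # gaps not yet filled by a late arrival
    lost_first_pass = []    # every sequence number skipped on first pass
    recovered = []          # late arrivals that filled a gap
    for s in seqs:
        if highest is None:
            highest = s
        elif s > highest:
            # Anything between the previous highest and s was skipped.
            gap = range(highest + 1, s)
            missing.update(gap)
            lost_first_pass.extend(gap)
            highest = s
        elif s in missing:
            # Arrived out of order, after a higher number: a retransmit.
            missing.discard(s)
            recovered.append(s)
    return lost_first_pass, recovered, sorted(missing)


# The first arrivals from the capture, followed by their resends.
arrivals = [0, 2, 4, 6, 7, 8, 9, 10, 12, 13, 14, 15, 17, 1, 3, 5, 11, 16]
lost, recovered, still_missing = find_gaps(arrivals)
print(lost, recovered, still_missing)
```

Anything left in `still_missing` at the end of a capture is either still waiting on a retransmit or was never recovered.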
That suggests a timeline along these lines:
- 14:31:11 + 75ms: expected arrival of packets 1, 2
- 76ms: expected arrival of packets 3, 4
- 92ms: Packet 1 is MIA, ask for a retransmit
- 100ms: Packets 3, 5, 11, 16 are MIA, ask for a retransmit. Expected arrival of packets 49 and 50 at this stage.
- 134ms: Retransmit request for packet 1 arrives at encoder, packet sent again
- 142ms: Retransmit request for packets 3, 5, 11, 16 arrives at encoder, packets sent again
- 175ms: Packet 1 arrives on decoder
- 183ms: Packets 3, 5, 11, 16 arrive
- 191ms: Packet 31 resend arrives
- 199ms: Packet 49 resend arrives
- 297ms: Packet 33 resend arrives
- 413ms: Packet 41 resend arrives
So it takes about 25ms to realise a packet is missing and ask for the retransmit. It might really be only 20ms, with a 5ms delay at the far end before sending.
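The timeline is worked backwards from when each resend arrived. Here is a rough sketch of that arithmetic. It assumes the path is symmetric (half the RTT each way) and that the encoder resends instantly, neither of which was measured. The function name `reconstruct` is just for illustration.

```python
RTT_MS = 83.0  # ping time between encoder and decoder


def reconstruct(expected_ms, resend_arrival_ms, rtt_ms=RTT_MS):
    """Back out the request timing from a resend's arrival time.

    Assumes a symmetric path and zero processing delay at the encoder.
    Returns (request_sent_ms, request_at_encoder_ms, detection_delay_ms).
    """
    request_sent = resend_arrival_ms - rtt_ms        # one full round trip earlier
    request_at_encoder = request_sent + rtt_ms / 2   # half an RTT to get there
    detection_delay = request_sent - expected_ms     # time taken to notice the gap
    return request_sent, request_at_encoder, detection_delay


# Packet 3: expected at 76ms, resend arrived at 183ms.
print(reconstruct(76, 183))   # (100.0, 141.5, 24.0)
# Packet 1: expected at 75ms, resend arrived at 175ms.
print(reconstruct(75, 175))   # (92.0, 133.5, 17.0)
```

Any real delay at the encoder would come out of the detection figure, which is why the 25ms could just as easily be 20ms plus 5ms.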
So do we get all our retransmits? In this block we’re missing packets 1, 3, 5, 11, 16, 31, 33, 41 and 49.
Packet 1 arrived at 175ms. Packet 3 arrived at 183ms, followed by 5, 11 and 16 all in the same millisecond. Packet 31 arrived at 191ms and packet 49 at 199ms. But where are packets 33 and 41?
Well, I’d expect about 20% of those 9 retransmits to be lost as well, and this time that’s packets 33 and 41.
Packet 49 turned up at 199ms on its first retransmit, with the retransmit request sent at 116ms, 16ms after its expected arrival.
Packet 33 turned up at 297ms. I would guess that the first retransmit failed.
Packet 33 story:
- Expected 91.5ms, between packet 32 and 34, 1ms after packet 31
- Packets 31 and 33 were both re-requested at about 110ms, with the request arriving at the encoder at about 152ms.
- Encoder sent both packets out, but packet 33 was dropped. Packet 31 arrived on the first resend attempt at 191.7ms – 100 ms after it should have arrived (with an 83ms rtt and 15-20ms of overhead/buffer)
- Packet 33 arrived at t=297ms, another 105ms after packet 31. That must have been a second retransmit attempt.
As such, with RIST (on OBEs) it seems you need a minimum of RTT+20ms of buffer to recover a single packet loss, 2xRTT+40ms if the first retransmit is lost as well, and so on.
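That rule of thumb is easy to put into code. This is only a sketch of the linear extrapolation above, not anything taken from the RIST spec or the OBE implementation. The 20ms overhead is the figure observed here, and the function names are made up.

```python
def buffer_needed_ms(rtt_ms, failed_sends, overhead_ms=20):
    """Buffer needed to survive a number of consecutive failed sends.

    Each recovery round is assumed to cost one RTT plus a fixed overhead
    (detection plus encoder delay), as observed in the capture.
    """
    return failed_sends * (rtt_ms + overhead_ms)


def retransmits_that_fit(buffer_ms, rtt_ms, overhead_ms=20):
    """Number of whole retransmit rounds that fit inside the buffer."""
    return buffer_ms // (rtt_ms + overhead_ms)


print(buffer_needed_ms(83, 1))             # 103: original lost, first resend arrives
print(buffer_needed_ms(83, 2))             # 206: first resend lost too
print(retransmits_that_fit(1000, 83))      # 9 rounds in a 1 second buffer
print(retransmits_that_fit(1000, 83, 15))  # 10 if the overhead is nearer 15ms
```

With an 83ms RTT, the 1 second buffer leaves room for 9 or 10 retransmit rounds, depending on whether the overhead is closer to 20ms or 15ms.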
With 20% loss how many retransmits are we expecting? Well at 20mbit that’s 2000 packets a second, or 7,200,000 packets an hour.
1.44 million will be lost on first transmit and need a retransmit (20% of 7.2M), and 288k will be lost on first and second transmit (20% of 1.44M).
57,600 will fail even with 3 transmits, 2,304 with 5 transmits, and about 18 with 8 transmits. So you need about 10 transmits (each retransmit costing RTT plus 15-20ms of buffer) to average a completely clean line over an hour, 12 over the course of a day, and 16 over the course of a year.
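The arithmetic is just repeated multiplication by the loss rate. The sketch below assumes every send of a packet is lost independently with the same 20% probability, and that the retransmit requests themselves always get through (the netem loss was only applied on the encoder side). The function names are illustrative.

```python
PACKETS_PER_SECOND = 2000
LOSS = 0.2


def expected_unrecovered(packets, loss, transmits):
    """Expected number of packets lost on every one of `transmits` sends."""
    return packets * loss ** transmits


def transmits_for_clean(packets, loss):
    """Smallest number of sends giving fewer than one expected unrecovered packet."""
    n = 0
    while packets * loss ** n >= 1:
        n += 1
    return n


hour = PACKETS_PER_SECOND * 3600
day = hour * 24
year = day * 365

for n in (1, 2, 3, 5, 8):
    print(n, round(expected_unrecovered(hour, LOSS, n)))
# 1 -> 1440000, 2 -> 288000, 3 -> 57600, 5 -> 2304, 8 -> 18

print(transmits_for_clean(hour, LOSS))   # 10
print(transmits_for_clean(day, LOSS))    # 12
print(transmits_for_clean(year, LOSS))   # 16
```

These counts are total sends including the original, so 10 transmits means the first send plus 9 retransmits.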
On this particular experiment, the 1 second buffer should theoretically cope with about 9 or 10 transmits. That would mean about 5 unrecovered packets in 10 million with 9 transmits, or about 1 in 10 million with 10. That’s on average, of course.
NB: some of these numbers look very familiar to those of us from the 90s (1.44M floppy, 28.8k baud).