CS144: Introduction to Computer Networking Lab 1
You can get the whole series from here
First, you should carefully read the docs provided by the lab
The TCP sender is dividing its byte stream up into short segments so that they can fit inside a datagram. But the network might reorder these datagrams, or drop them, or deliver them more than once. The receiver must reassemble the segments into the contiguous stream of bytes that they started out as.
In this lab you’ll write the data structure that will be responsible for this reassembly: a
StreamReassembler
. It will receive substrings, consisting of a string of bytes, and the index of the first byte of that string within the larger stream. Each byte of the steam has its own unique index, starting from zero and counting upwards. TheStreamReassembler
will own a ByteStream for the output: as soon as the reassembler knows the next byte of the stream, it will write it into theByteStream
. The owner can access and read from theByteStream
whenever it wants.
And it provides the following public interface:
StreamReassembler(const size_t capacity)
: Construct aStreamReassembler
that will store up tocapacity
bytes.push_substring(const string &data, const size_t index, const bool eof)
: Receive a substring and write any newly contiguous bytes into the stream.ByteStream &stream_out()
: Access the reassembled ByteStream (Lab 0)size_t unaseembled_bytes() const
: The number of bytes in the substrings stored but not yet reassembled.
So, the requirement is obvious. We need to maintain a window. When the datagram comes, we should check whether we should accept the datagram. In order to achieve the functionality, we should consider the following things:
- Indicate the start index of the current window we want to accept.
- The capacity of the window.
So we need an array to hold the data which I named _stream
. And for this array, the logical index is _next_index
which means the next index we need. The logical index must exceed the actual length of the stream
but we could easily deal with that using _next_index % _capacity
. Also, we need a value _unassembly
to record the current number of bytes stored but not yet reassembled. And we need a way to know whether the array index is accessed. So maybe hash is a good idea, however, I think here I can only need an array called _dirty
to indicate whether the array index is accessed. Simple idea.
1 | class StreamReassembler { |
Constructor
For constructor, it is easy, we should initialize the _capacity
and _output
, and initialize the _stream
and _dirty
.
1 | StreamReassembler::StreamReassembler(const size_t capacity) : _output(capacity), _capacity(capacity) { |
Easy Functions
There are some extremely easy functions:
1 | size_t StreamReassembler::unassembled_bytes() const { return {_unassembly}; } |
Reassembler
Now we come to the most interesting part. We should consider the following situations:
- When the data is totally before the
_next_index
: we should reject. - When the data is totally after the
_next_index + _capacity
: we should reject. - When the data is inside the window: we should store the value to
_stream
, and sets the corresponding_dirty
totrue
. If we can find continuos range from_next_index
, we should write the data to the ByteStream and move the_next_index
pointer. - When the data is partially inside the window: do as above.
And next we should indicate whether we should signal the ByteStream that the input is end. The push_substring
function provides a boolean value eof
to indicate this string is the end. However, when should we accept this information. This should be a valid data, which means that we could write the data to the ByteStream. However, pay attention to the corner case: the data could be empty in this situation.
However, there are some trivial details. You could see the following code with descriptive comments.
1 | void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) { |