Stanford CS144 lab1


Lab1 stitching substrings into a byte stream

  • 概述
    • Lab1 将实现一个流重组器,也就是一个模块将小的字节片段(与TCP segments不一样,TCP segments是互联网上的IP数据包载荷)重新组装成一个连续的,正确的字节流。
    • Lab2 将实现一个TCP接受方,用来处理入端字节流,这将包括ack应答与流量控制的接受窗口。
    • Lab3 将实现TCP发送方,处理字节流的出端,当报文流失时,它应该怎么办以及何时重传。
    • Lab4 将之前的部分连接起来,得到一个可以用的TCP连接

TCP发送方将字节流分割成一些不超过1460字节的短分段。

  • 几个要点:

    • 任何报文段,只要它前面的字节被push进了ByteStream,就应该立即被push进入ByteStream。

    • 到来的seg会有重复的字节,所以需要手动的去重并组合。

    • Lab1 中实现的数据结构不应该存储有重叠的子字符串(seg)

    • 通过这个测试,可以得知first_unread并不是永远为0,在push_substring函数中通过first_unread = _output.bytes_read();获取当前first_unread。

    • 通过这个测试,可以精妙的判断程序对于处理eof的代码是否正确。

  • 主要思路:

    • 对于每个substring,遍历每个字节。
    • 用map标记对应index的字节是在first_unread + _capacity的范围内,也就是之前图中的红色区域与绿色区域。
    • 判断序号为index的字节是否可以放入ByteStream中。
    • 判断改字节是否是重叠部分且可以放入ByteStream中
    • 遍历结束候,还需要判断是否之后的unassembled区域中正好存在可以放入ByteStream中的字节
    • 另外需要特殊判断字节流是否达到eof。
点击查看代码
//stream_reassembler.cc
#include "stream_reassembler.hh"

// Dummy implementation of a stream reassembler.

// For Lab 1, please replace with a real implementation that passes the
// automated checks run by `make check_lab1`.

// You will need to add private members to the class declaration in `stream_reassembler.hh`

template 
void DUMMY_CODE(Targs &&... /* unused */) {}

using namespace std;

StreamReassembler::StreamReassembler(const size_t capacity) : _output(capacity), _capacity(capacity), first_unacceptable(capacity) {}

//! \details This function accepts a substring (aka a segment) of bytes,
//! possibly out-of-order, from the logical stream, and assembles any newly
//! contiguous substrings and writes them into the output stream in order.
void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) {
    size_t len = data.size();
    first_unread = _output.bytes_read();//注意first_unread并不是永远是0
    for(size_t i = 0; i < len ; ++i){
        /*每个分段的字符分为有重叠与没有重叠,其中没重叠的又分为可以立即push的与不可以立即push的
        */
       size_t cur = i + index;
        if(cur < first_unread + _capacity){//如果当前字节的index超出了或等于capacity,直接丢弃
            
            if(!m_exist[cur] || cur >= first_unassembled){
                if(!m_exist[cur]){
                    m_exist[cur] = true;//将对应index的存在位置位
                    m_char[cur] = data[i];//记录index对应字符
                    if(cur == first_unassembled){//当取data的字符时,注意这里需要将i与cur区分清楚,一定是data[i]
                        _output.write(data.substr(i, 1));//将data下标为i的字符传入ByteStream
                        ++first_unassembled;
                    }else{
                        ++unassembled;
                    }
                }else{
                    if(cur == first_unassembled){
                        _output.write(string(1,m_char[cur]));
                        ++first_unassembled;
                        --unassembled;
                    }
                }
            }       
        }
    }
    size_t cur = index + len;
    while(m_exist[cur] && cur == first_unassembled){//这里判断是否还有unassembled的字节
        _output.write(string(1,m_char[cur]));
        ++first_unassembled;
        --unassembled;
        ++cur;
    }
    if(eof == true && _eof == false && index + len <=  first_unread + _capacity)
         _eof = true;//如果最后一段字节提前到达,需要将_eof标记为true,而且一定得是这个段的最后一个字节push进了ByteStream
    if (empty() && _eof)//判断是否需要置位ByteStreams的eof信号
        _output.end_input();
}

size_t StreamReassembler::unassembled_bytes() const { return unassembled; }

bool StreamReassembler::empty() const { return unassembled == 0; }

//stream_reassembler.hh
#ifndef SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH
#define SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH

#include "byte_stream.hh"

#include 
#include 
#include 

//! \brief A class that assembles a series of excerpts from a byte stream (possibly out of order,
//! possibly overlapping) into an in-order byte stream.
class StreamReassembler {
  private:
    // Your code here -- add private members as necessary.
    // std::vector unassembled;
    ByteStream _output;  //!< The reassembled in-order byte stream
    size_t _capacity;    //!< The maximum number of bytes
    size_t first_unread = 0;
    size_t first_unassembled = 0;
    size_t first_unacceptable;
    size_t unassembled = 0;
    bool _eof = false;
    std::unordered_map m_exist = {};//使用map标记index是否存在
    std::unordered_map m_char = {};//这里记得需要初始化map
    // size_t unassembled_len;
  public:
    //! \brief Construct a `StreamReassembler` that will store up to `capacity` bytes.
    //! \note This capacity limits both the bytes that have been reassembled,
    //! and those that have not yet been reassembled.
    StreamReassembler(const size_t capacity);

    //! \brief Receive a substring and write any newly contiguous bytes into the stream.
    //!
    //! The StreamReassembler will stay within the memory limits of the `capacity`.
    //! Bytes that would exceed the capacity are silently discarded.
    //!
    //! \param data the substring
    //! \param index indicates the index (place in sequence) of the first byte in `data`
    //! \param eof the last byte of `data` will be the last byte in the entire stream
    void push_substring(const std::string &data, const uint64_t index, const bool eof);

    //! \name Access the reassembled byte stream
    //!@{
    const ByteStream &stream_out() const { return _output; }
    ByteStream &stream_out() { return _output; }
    //!@}

    //! The number of bytes in the substrings stored but not yet reassembled
    //!
    //! \note If the byte at a particular index has been pushed more than once, it
    //! should only be counted once for the purpose of this function.
    size_t unassembled_bytes() const;

    //! \brief Is the internal state empty (other than the output stream)?
    //! \returns `true` if no substrings are waiting to be assembled
    bool empty() const;
};

#endif  // SPONGE_LIBSPONGE_STREAM_REASSEMBLER_HH