Text extracted from the GSO patch for FreeBSD.
“The use of large frames makes network communication much less demanding for the CPU. Yet backward compatibility and slow links require the use of 1500-byte or smaller frames. Modern NICs with hardware TCP segmentation offloading (TSO) address this problem. However, a generic software version (GSO) provided by the OS is still worthwhile for paths with no suitable hardware, such as between virtual machines or with older or buggy NICs.
Much of the advantage of TSO comes from crossing the network stack only once per (large) segment instead of once per 1500-byte frame. GSO does the same both for segmentation (TCP) and fragmentation (UDP) by doing these operations as late as possible. Ideally, this could be done within the device driver, but that would require modifications to all drivers. A more convenient, similarly effective approach is to segment just before the packet is passed to the driver (in ether_output()).
Our preliminary implementation supports TCP and UDP on IPv4/IPv6; it only intercepts packets larger than the MTU (others are left unchanged), and only when GSO is marked as enabled for the interface.
Segments larger than the MTU are not split in tcp_output(), udp_output(), or ip_output(), but marked with a flag (contained in m_pkthdr.csum_flags), which is processed by ether_output() just before calling the device driver.
ether_output(), through gso_dispatch(), splits the large frame as needed, creating the per-segment headers and computing checksums in software when the hardware does not support checksum offload.”
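The mark-then-split flow described above can be illustrated with a small user-space sketch. It is not code from the patch: the struct, the flag value, and the function names (sketch_pkt, SKETCH_GSO_TCP, sketch_mark, sketch_split) are hypothetical stand-ins, and the sketch only models TCP over a fixed-size header, ignoring mbuf chains, UDP fragmentation, and checksums.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; the patch keeps its marker in m_pkthdr.csum_flags. */
#define SKETCH_GSO_TCP 0x1u

/* Hypothetical, simplified stand-in for a packet header. */
struct sketch_pkt {
    size_t   len;        /* headers + data, as handed down by the stack */
    size_t   hdr_len;    /* IP + TCP header bytes repeated in each segment */
    uint32_t seq;        /* TCP sequence number of the first data byte */
    uint32_t csum_flags; /* stand-in for m_pkthdr.csum_flags */
};

/*
 * Upper layers: instead of splitting an oversized packet, mark it.
 * Packets that fit the MTU, or interfaces with GSO disabled, are untouched.
 * Returns 1 if the packet was marked, 0 otherwise.
 */
static int
sketch_mark(struct sketch_pkt *p, size_t mtu, int gso_enabled)
{
    if (!gso_enabled || p->len <= mtu)
        return 0;
    p->csum_flags |= SKETCH_GSO_TCP;
    return 1;
}

/*
 * Just before the driver: split a marked packet into MTU-sized segments.
 * emit (may be NULL) receives each segment's sequence number and data length.
 * Returns the number of segments produced, or 0 if nothing was split.
 */
static size_t
sketch_split(const struct sketch_pkt *p, size_t mtu,
    void (*emit)(uint32_t seq, size_t data_len))
{
    size_t mss, data, off, n = 0;

    assert(p->len >= p->hdr_len);
    if (!(p->csum_flags & SKETCH_GSO_TCP) || mtu <= p->hdr_len)
        return 0;

    mss = mtu - p->hdr_len;      /* data bytes that fit in one frame */
    data = p->len - p->hdr_len;  /* total data bytes to carry */

    for (off = 0; off < data; off += mss) {
        size_t chunk = (data - off < mss) ? data - off : mss;

        /* Each segment's sequence number advances by the bytes already sent. */
        if (emit != NULL)
            emit(p->seq + (uint32_t)off, chunk);
        n++;
    }
    return n;
}
```

With these assumptions, a packet carrying 64,000 data bytes behind 40 bytes of headers on a 1500-byte MTU is marked once and split into 44 segments of at most 1460 data bytes each, so the upper layers are crossed once rather than 44 times.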