DenseNet (Densely Connected Convolutional Networks) mitigates the vanishing gradient problem by connecting each layer within a dense block to every subsequent layer in that block. The connections use channel-wise concatenation, in contrast to ResNet's skip connections, which use element-wise summation. This dense connectivity encourages feature reuse, so each layer can stay narrow and the network needs fewer parameters than a comparably deep conventional CNN. The post covers the architecture's key components: dense blocks, bottleneck layers (a 1×1 convolution followed by a 3×3 convolution), and transition layers that shrink the channel count by a compression factor. It then builds a full PyTorch implementation of DenseNet-121 (the DenseNet-BC variant) from scratch, with code for each component and tensor shape traces through the full forward pass.
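To make the effect of concatenation and compression concrete, the channel counts can be traced with plain Python before any PyTorch code is written. This is a minimal sketch, not the post's implementation: the function name `trace_channels` is hypothetical, and the defaults assume the commonly used DenseNet-121 configuration (block sizes 6, 12, 24, 16, growth rate 32, 64 stem channels, compression factor 0.5).

```python
def trace_channels(block_config=(6, 12, 24, 16), growth_rate=32,
                   init_channels=64, compression=0.5):
    """Return (stage, channels) pairs for a DenseNet-BC style network.

    Hypothetical helper for illustration; defaults assume the usual
    DenseNet-121 hyperparameters.
    """
    channels = init_channels
    trace = [("stem", channels)]
    for i, num_layers in enumerate(block_config, start=1):
        # Every layer in a dense block concatenates growth_rate new
        # feature maps onto its input, so the block adds
        # num_layers * growth_rate channels in total.
        channels += num_layers * growth_rate
        trace.append((f"dense_block_{i}", channels))
        # A transition layer follows every block except the last and
        # reduces the channel count by the compression factor.
        if i < len(block_config):
            channels = int(channels * compression)
            trace.append((f"transition_{i}", channels))
    return trace


if __name__ == "__main__":
    for stage, channels in trace_channels():
        print(f"{stage:>15}: {channels} channels")
```

Under these assumptions the first dense block grows 64 channels to 256, the first transition halves that to 128, and the last dense block ends at 1024 channels, which is the feature width fed to the classifier.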