This is a hands-on benchmarking report on the AMD Radeon RX 7900 XTX GPU for machine learning workloads as of mid-2023. The author tests TensorFlow with ROCm on a simple neural network training benchmark and on ResNet50, and finds performance well below theoretical expectations: the card barely beats the older 6900 XT despite having 2.6x more FP32 TFLOPS. A raw RDNA3 assembly benchmark via tinygrad confirms that the card can reach its advertised 61 TFLOPS peak, which points to software and driver issues rather than hardware limitations. Porting the TensorFlow benchmark to tinygrad yielded mixed results: it occasionally beat TensorFlow by ~10%, but performance was highly unstable across tinygrad versions. The conclusion is that AMD GPU support for ML is immature but improving, with official RDNA3 ROCm support planned for late 2023.
Table of contents
Simple TensorFlow Neural Network Training Benchmark
resnet50 via tf_cnn_benchmarks
Benchmarking 7900 XTX Raw FP32 FLOPS
Porting the TensorFlow benchmark to TinyGrad
Conclusion
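
For context on the 61 TFLOPS and 2.6x figures quoted in the summary, here is a rough back-of-the-envelope sketch of how peak FP32 throughput is usually derived from spec-sheet numbers. The stream processor counts and boost clocks below are assumptions taken from public spec sheets rather than from this report, and `peak_fp32_tflops` is a hypothetical helper written for illustration only.

```python
def peak_fp32_tflops(stream_processors, boost_clock_ghz, flops_per_sp_per_clock):
    """Theoretical peak FP32 throughput in TFLOPS.

    peak = stream processors * FLOPs per processor per clock * clock (GHz) / 1000
    """
    return stream_processors * flops_per_sp_per_clock * boost_clock_ghz / 1000.0


# Assumed spec-sheet values (not measured in this report).
# RDNA3 can dual-issue FMAs: 2 issues * 2 FLOPs per FMA = 4 FLOPs per clock.
rx_7900_xtx = peak_fp32_tflops(6144, 2.5, 4)

# RDNA2 issues one FMA per clock: 2 FLOPs per clock.
rx_6900_xt = peak_fp32_tflops(5120, 2.25, 2)

ratio = rx_7900_xtx / rx_6900_xt

print(f"RX 7900 XTX peak: {rx_7900_xtx:.2f} TFLOPS")
print(f"RX 6900 XT peak:  {rx_6900_xt:.2f} TFLOPS")
print(f"Ratio:            {ratio:.2f}x")
```

Under these assumptions the ratio comes out slightly above the 2.6x quoted in the summary. Note that the 7900 XTX figure depends on the dual-issue path actually being used, so real workloads only approach it when the compiler can pair instructions.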