A team built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. They compared multiple transformer architectures for codon-level language modeling, with CodonRoBERTa-large-v2 achieving the best results (perplexity of 4.10, Spearman CAI correlation of 0.40), outperforming the other architectures evaluated.
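
For context, here is a minimal sketch of how these two evaluation metrics are typically computed for a codon-level language model. The helper functions, sample numbers, and variable names below are illustrative assumptions, not the team's actual evaluation code: perplexity is taken as the exponential of the mean per-codon negative log-likelihood, and the CAI correlation is a Spearman rank correlation between per-sequence model scores and Codon Adaptation Index values.

```python
import numpy as np
from scipy.stats import spearmanr


def perplexity_from_nll(codon_nlls):
    """Perplexity = exp(mean negative log-likelihood per codon)."""
    return float(np.exp(np.mean(codon_nlls)))


def cai_correlation(model_scores, cai_values):
    """Spearman rank correlation between per-sequence model scores
    (e.g. mean log-likelihood) and reference CAI values."""
    rho, pvalue = spearmanr(model_scores, cai_values)
    return rho, pvalue


# Hypothetical evaluation data: per-codon NLLs for one sequence and
# per-sequence (score, CAI) pairs for a small test set.
codon_nlls = [1.35, 1.42, 1.48, 1.39]        # placeholder NLL values
print(f"perplexity: {perplexity_from_nll(codon_nlls):.2f}")

scores = [-1.2, -0.9, -1.5, -1.1, -0.8]      # mean log-likelihood per sequence
cai = [0.71, 0.80, 0.62, 0.74, 0.85]         # CAI from a reference codon usage table
rho, p = cai_correlation(scores, cai)
print(f"Spearman rho vs CAI: {rho:.2f} (p={p:.3f})")
```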