Forward-thinking discrete diffusion via a learned latent channel trained with truncated BPTT.
19 min read · May 13, 2026
2026 · discrete-diffusion masked-diffusion bptt language-models