Welcome to flame, a minimal and efficient framework built on torchtitan for training Flash Linear Attention (FLA) models with blazing efficiency.
fla and transformers
50B Tokens
300B Tokens
1T Tokens
Scaling from 90M parameters up to 7B parameters