- RWKV v5: Eagle 7B
Eagle 7B is trained on 1.1 trillion tokens across 100+ world languages (70% English, 15% multilang, 15% code).
- Built on the RWKV-v5 architecture (a linear transformer with 10-100x+ lower inference cost)
- Ranks as the world's greenest 7B model (per token)
- Outperforms all 7B-class models in multilingual benchmarks
- Approaches Falcon (1.5T), LLaMA2 (2T), and Mistral (>2T?) level performance in English evals
- Trades blows with MPT-7B (1T) in English evals
- All while being an "Attention-Free Transformer"

Eagle 7B models are provided for free by Recursal.AI for the beta period, until the end of March 2024. Find out more here. #rnn
by recursal · 10K context · $0/M input tkns · $0/M output tkns · 4.2M tokens this week

- RWKV v5 World 3B
RWKV is an RNN (recurrent neural network) with transformer-level performance. It aims to combine the best of RNNs and transformers - great performance, fast inference, low VRAM, fast training, "infinite" context length, and free sentence embedding. RWKV-5 is trained on 100+ world languages (70% English, 15% multilang, 15% code). RWKV 3B models are provided for free, by Recursal.AI, for the beta period. More details here. #rnn
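The claim above (RNN recurrence with transformer-level performance and low inference cost) can be illustrated with a minimal sketch of an RWKV-style WKV time-mixing recurrence. This is my own simplified, v4-style scalar-channel version for illustration only, not Recursal's implementation; the function name and the sign convention for the decay `w` are assumptions, and RWKV-v5 itself uses a richer matrix-valued state:

```python
import numpy as np

def wkv_recurrence(k, v, w, u):
    """Simplified RWKV-style time mixing. Each token updates a fixed-size
    running state, so per-token cost is O(1) in sequence length, unlike
    full attention's O(T) per token.
    k, v: (T, C) key/value sequences; w: (C,) positive decay; u: (C,) bonus."""
    T, C = k.shape
    out = np.zeros((T, C))
    # Running weighted numerator/denominator, kept stable via max-tracker p.
    num = np.zeros(C)
    den = np.zeros(C)
    p = np.full(C, -1e38)  # running max of the exponents seen so far
    for t in range(T):
        # Output for token t: past state plus a "bonus" term u for the current token.
        q = np.maximum(p, u + k[t])
        a = np.exp(p - q)
        b = np.exp(u + k[t] - q)
        out[t] = (a * num + b * v[t]) / (a * den + b)
        # Fold token t into the state, decaying the past by w.
        q = np.maximum(p - w, k[t])
        a = np.exp(p - w - q)
        b = np.exp(k[t] - q)
        num = a * num + b * v[t]
        den = a * den + b
        p = q
    return out
```

The point of the sketch is the "infinite context" / low-VRAM property mentioned above: `num`, `den`, and `p` are the entire memory of the sequence, so their size does not grow with context length.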
by rwkv · 10K context · $0/M input tkns · $0/M output tkns · 2M tokens this week

- RWKV v5 3B AI Town
This is an RWKV 3B model finetuned specifically for the AI Town project (RWKV is an RNN with transformer-level performance; see the RWKV v5 World 3B entry above for architecture details). RWKV 3B models are provided for free, by Recursal.AI, for the beta period. More details here. #rnn
by recursal · 10K context · $0/M input tkns · $0/M output tkns · 281K tokens this week