Blog

Async-GRPO - Open, Fast, and Performant

Async-GRPO - Open, Fast, and Performant

Written by Aldo Pareja, Mustafa Eyceoz Introducing Async-GRPO With the incr...

Sculpting Subspaces: How We Solved Continual Learning in Large Language Models

Sculpting Subspaces: How We Solved Continual Learning in Large Language Models

Authors: Nikhil Shivakumar Nayak, Krishnateja Killamsetty, Ligong Han, Abhish...

Update 3 - On Reasoning vs Inference-time scaling - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Update 3 - On Reasoning vs Inference-time scaling - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Written by Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Musta...

Update 2 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Update 2 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Written by Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Musta...

Update 1 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Update 1 - Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Written by Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Musta...

Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Lessons on Reproducing R1-like Reasoning in Small LLMs without using DeepSeek-R1-Zero (or its derivatives)

Written by Akash Srivastava, Isha Puri, Kai Xu, Shivchander Sudalairaj, Musta...