Unsloth and Training Hub: Lightning-fast LoRA and QLoRA fine-tuning
Training Hub v0.4.0 adds LoRA and QLoRA fine-tuning powered by Unsloth, enabling fast, cost-effective model adaptation with roughly 70% less VRAM than full fine-tuning.
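LoRA's memory savings come from freezing the base weights and training only a low-rank pair of adapter matrices. The sketch below (plain NumPy, not the Training Hub or Unsloth API; the dimensions are illustrative) shows why the trainable-parameter count collapses:

```python
import numpy as np

d, r = 4096, 16  # hidden size and LoRA rank (illustrative values)

# Frozen base weight: d*d parameters, never updated during fine-tuning.
W = np.zeros((d, d))

# Trainable low-rank adapters: B (d x r) and A (r x d).
A = np.random.randn(r, d) * 0.01
B = np.zeros((d, r))  # B starts at zero, so the adapter is a no-op initially

# Effective weight used in the forward pass: W + B @ A.
delta = B @ A
assert delta.shape == W.shape

full_params = d * d            # 16,777,216
lora_params = A.size + B.size  # 2 * d * r = 131,072
print(f"trainable fraction: {lora_params / full_params:.4%}")
```

With these dimensions, under 1% of the layer's parameters receive gradients, which is where the roughly 70% VRAM reduction (optimizer state and gradients for adapters only, plus 4-bit base weights in QLoRA's case) comes from.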
A collective of researchers and engineers from Red Hat & IBM building LLM toolkits you can use today.
Inference-time scaling for LLMs.
Synthetic data generation pipelines.
Post-training algorithms for LLMs.
Asynchronous GRPO for scalable reinforcement learning.
A method for skipping redundant attention blocks in language models.
Efficient training library for large language models up to 70B parameters on a single node.
Adaptive SVD-based continual learning method for LLMs.
Inference-time scaling with particle filtering.
State-of-the-art reward models for preference data generation and acceptance criteria.
KV cache quantization for scaling inference time.
Efficient messages-format SFT library for language models.
A four-step pathway for scaling LLM fine-tuning from local experimentation to production deployment using Training Hub, OpenShift AI, Kubeflow Trainer, and AI pipelines.
IBM Technology explores how synthetic data generation with SDG Hub enables smarter AI workflows.
📹 Nondeterminism in LLM Inference & Training-Rollout Mismatch
👤 Speaker: Xinheng Ding
📹 Evolutionary Arms Races Between LLMs in Core War
👤 Speaker: Akarsh Kumar
📹 Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
👤 Speaker: Qizheng Zhang