Synthetic Data Generation for Smarter AI Workflows
IBM Technology explores how synthetic data generation with SDG Hub enables smarter AI workflows.
A collective of researchers and engineers from Red Hat & IBM building LLM toolkits you can use today.
Inference-time scaling for LLMs.
Synthetic data generation pipelines.
Post-training algorithms for LLMs.
Asynchronous GRPO for scalable reinforcement learning.
A method for skipping redundant attention blocks in language models.
Efficient training library for large language models up to 70B parameters on a single node.
Adaptive SVD-based continual learning method for LLMs.
Inference-time scaling with particle filtering.
State-of-the-art reward models for preference data generation and acceptance criteria.
KV cache quantization for scaling inference-time compute.
Efficient messages-format SFT library for language models.
How SDG Hub enables teams to automatically create grounded evaluation datasets with question-answer-context triplets, transforming RAG tuning from intuition-driven to measurable.
A tutorial on using SDG Hub to turn a small amount of high-quality data into larger, useful datasets through automated pipelines that generate and validate synthetic data.
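The generate-then-validate pattern described in that tutorial can be sketched in plain Python. This is a minimal, illustrative sketch of the idea, not the actual SDG Hub API: the names `generate_variants`, `validate`, and `run_pipeline` are hypothetical, and the generator/validator here are deterministic stand-ins for what would normally be LLM-backed blocks.

```python
# Hypothetical sketch of a generate-then-validate synthetic data pipeline,
# in the spirit of the SDG Hub tutorial. All names are illustrative; the
# real library wires LLM-backed blocks into configurable flows.

seed_examples = [
    {"question": "What is KV cache quantization?",
     "answer": "Compressing cached attention keys and values to lower precision."},
]

def generate_variants(example, n=3):
    """Stand-in for an LLM-backed generator: emit n paraphrased copies of a seed."""
    return [
        {"question": f"(variant {i}) {example['question']}",
         "answer": example["answer"]}
        for i in range(n)
    ]

def validate(example):
    """Stand-in for an LLM judge or rule-based filter: keep only complete pairs."""
    return bool(example["question"].strip()) and bool(example["answer"].strip())

def run_pipeline(seeds, n_variants=3):
    """Expand each seed into candidates, then keep only those that pass validation."""
    candidates = [v for s in seeds for v in generate_variants(s, n_variants)]
    return [c for c in candidates if validate(c)]

dataset = run_pipeline(seed_examples)
print(len(dataset))  # 3 validated variants from 1 seed
```

The key design point the sketch captures is the separation of generation from validation: generated candidates are only admitted to the final dataset after passing an explicit quality check, which is what makes the expansion from a small seed set trustworthy.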
📹 Nondeterminism in LLM Inference & Training-Rollout Mismatch
👤 Speaker: Xinheng Ding
📹 Evolutionary Arms Races Between LLMs in Core War
👤 Speaker: Akarsh Kumar
📹 Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
👤 Speaker: Qizheng Zhang