API Reference

Complete reference for all public classes in SDG Hub. Every block, flow component, and connector is documented with its fields, methods, and type signatures.

Blocks / Base

BaseBlock

Bases: BaseModel, ABC
from sdg_hub.core.blocks import BaseBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typeOptional[str]--Block type (e.g., 'llm', 'transform', 'parser', 'filtering')
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

BaseBlock.from_configclassmethod
(cls, config: dict[str, Any]) -> 'BaseBlock'

Instantiate block from serialized config.

ParameterTypeDefault
configdict[str, Any]required
Returns BaseBlock
BaseBlock.generateabstract
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
BaseBlock.get_config
(self) -> dict[str, Any]

Return only constructor arguments for serialization.

Returns dict[str, Any]
BaseBlock.get_info
(self) -> dict[str, Any]

Return a high-level summary of block metadata and config.

Returns dict[str, Any]
BaseBlock.normalize_input_colsclassmethod
(cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
ParameterTypeDefault
vstr | list[str] | dict[str, Any] | Nonerequired
Returns list[str] | dict[str, Any]
BaseBlock.normalize_output_colsclassmethod
(cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
ParameterTypeDefault
vstr | list[str] | dict[str, Any] | Nonerequired
Returns list[str] | dict[str, Any]
Blocks / Registry

BlockRegistry

from sdg_hub.core.blocks import BlockRegistry

Registry for block classes with metadata and enhanced error handling.

Methods

BlockRegistry.categoriesclassmethod
(cls) -> list[str]

Get all available categories.

Returns list[str]
BlockRegistry.discover_blocksclassmethod
(cls) -> None

Print a Rich-formatted table of all available blocks.

BlockRegistry.list_blocksclassmethod
(cls, category: Optional[str] = None, *, grouped: bool = False, include_deprecated: bool = True) -> list[str] | dict[str, list[str]]

List registered blocks, optionally filtered by category.

ParameterTypeDefault
categoryOptional[str]--
groupedboolFalse
include_deprecatedboolTrue
Returns list[str] | dict[str, list[str]]
BlockRegistry.registerclassmethod
(cls, block_name: str, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None)

Register a block class with metadata.

ParameterTypeDefault
block_namestrrequired
categorystrrequired
descriptionstr""
deprecatedboolFalse
replacementOptional[str]--
Blocks / Registry

BlockMetadata

from sdg_hub.core.blocks.registry import BlockMetadata

Metadata for registered blocks.

Fields

NameTypeDefaultDescription
namestrrequired--
block_classtyperequired--
categorystrrequired--
descriptionstr""--
deprecatedboolFalse--
replacementOptional[str]----

Methods

BlockMetadata.__init__
(self, name: str, block_class: type, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None) -> None

Initialize self. See help(type(self)) for accurate signature.

ParameterTypeDefault
namestrrequired
block_classtyperequired
categorystrrequired
descriptionstr""
deprecatedboolFalse
replacementOptional[str]--
Blocks / LLM

LLMChatBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import LLMChatBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
modelOptional[str]--Model identifier in LiteLLM format
api_keyOptional[pydantic.types.SecretStr]--API key for the provider
api_baseOptional[str]--Base URL for the API
async_modeboolFalseWhether to use async processing
timeoutfloat120Request timeout in seconds
num_retriesint6Number of retry attempts (uses LiteLLM's built-in retry mechanism)
drop_paramsboolTrueWhether to drop unsupported parameters to prevent API errors
max_completion_tokensOptional[int]--Maximum completion tokens (used by newer models like GPT-5 instead of max_tokens). When set, max_tokens is automatically excluded.

Methods

LLMChatBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate responses from the LLM.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
LLMChatBlock.model_post_init
(self, _LLMChatBlock__context) -> None

Initialize after Pydantic validation.

ParameterTypeDefault
_LLMChatBlock__contextrequired
LLMChatBlock.validate_single_input_colclassmethod
(cls, v)

Ensure exactly one input column.

ParameterTypeDefault
vrequired
LLMChatBlock.validate_single_output_colclassmethod
(cls, v)

Ensure exactly one output column.

ParameterTypeDefault
vrequired
Blocks / LLM

PromptBuilderBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import PromptBuilderBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
prompt_config_pathstrrequiredPath to YAML file containing the Jinja template configuration
format_as_messagesboolTrueWhether to format output as chat messages
prompt_template_configOptional[prompt_builder_block.PromptTemplateConfig]--Loaded prompt template configuration
prompt_rendererOptional[prompt_builder_block.PromptRenderer]--Prompt renderer instance

Methods

PromptBuilderBlock.generate
(self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame

Generate formatted output for all samples.

ParameterTypeDefault
samplespandas.DataFramerequired
_kwargsAnyrequired
Returns pandas.DataFrame
PromptBuilderBlock.model_post_init
(self, _PromptBuilderBlock__context: Any) -> None

Initialize the block after Pydantic validation.

ParameterTypeDefault
_PromptBuilderBlock__contextAnyrequired
PromptBuilderBlock.validate_single_output_colclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / LLM

ChatMessage

Bases: BaseModel
from sdg_hub.core.blocks.llm.prompt_builder_block import ChatMessage

Pydantic model for chat messages with proper validation.

Fields

NameTypeDefaultDescription
roleLiteral[system, user, assistant, tool]required--
contentstrrequired--

Methods

ChatMessage.validate_content_not_emptyclassmethod
(cls, v: str) -> str

Ensure content is not empty or just whitespace.

ParameterTypeDefault
vstrrequired
Returns str
Blocks / LLM

MessageTemplate

Bases: BaseModel
from sdg_hub.core.blocks.llm.prompt_builder_block import MessageTemplate

Template for a chat message with Jinja2 template and original source.

Fields

NameTypeDefaultDescription
roleLiteral[system, user, assistant, tool]required--
content_templatejinja2.environment.Templaterequired--
original_sourcestrrequired--
Blocks / LLM

PromptTemplateConfig

from sdg_hub.core.blocks.llm.prompt_builder_block import PromptTemplateConfig

Self-contained class for loading and validating YAML prompt configurations.

Methods

PromptTemplateConfig.__init__
(self, config_path: str)

Initialize with path to YAML config file.

ParameterTypeDefault
config_pathstrrequired
PromptTemplateConfig.get_message_templates
(self) -> list[prompt_builder_block.MessageTemplate]

Return the compiled message templates.

Returns list[prompt_builder_block.MessageTemplate]
Blocks / LLM

PromptRenderer

from sdg_hub.core.blocks.llm.prompt_builder_block import PromptRenderer

Handles rendering of message templates with variable substitution.

Methods

PromptRenderer.__init__
(self, message_templates: list[prompt_builder_block.MessageTemplate])

Initialize with a list of message templates.

ParameterTypeDefault
message_templateslist[prompt_builder_block.MessageTemplate]required
PromptRenderer.get_required_variables
(self) -> set

Extract all required variables from message templates.

Returns set
PromptRenderer.render_messages
(self, template_vars: dict[str, Any]) -> list[prompt_builder_block.ChatMessage]

Render all message templates with the given variables.

ParameterTypeDefault
template_varsdict[str, Any]required
Returns list[prompt_builder_block.ChatMessage]
PromptRenderer.resolve_template_vars
(self, sample: dict[str, Any], input_cols) -> dict[str, Any]

Resolve template variables from dataset columns based on input_cols.

ParameterTypeDefault
sampledict[str, Any]required
input_colsrequired
Returns dict[str, Any]
Blocks / LLM

LLMResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import LLMResponseExtractorBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
extract_contentboolTrueWhether to extract 'content' field from responses.
extract_reasoning_contentboolFalseWhether to extract 'reasoning_content' field from responses.
extract_tool_callsboolFalseWhether to extract 'tool_calls' field from responses.
expand_listsboolTrueWhether to expand list inputs into individual rows (True) or preserve lists (False).
field_prefixstr""Prefix to add to output field names (e.g., 'llm_' results in 'llm_content', 'llm_reasoning_content').

Methods

LLMResponseExtractorBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
LLMResponseExtractorBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LLMResponseExtractorBlock.validate_extraction_configuration
(self)

Validate that at least one extraction field is enabled and pre-compute field names.

Blocks / Parsing

JSONParserBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import JSONParserBlock

Block for parsing JSON from text and expanding fields into columns.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parsing"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
field_prefixstr""Optional prefix to add to extracted column names
fix_trailing_commasboolTrueWhether to fix trailing commas in JSON (common LLM output issue)
extract_embeddedboolTrueWhether to extract JSON embedded in surrounding text
drop_inputboolFalseWhether to drop the input column after extraction

Methods

JSONParserBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with JSON fields expanded into columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
JSONParserBlock.validate_input_colsclassmethod
(cls, v: list[str]) -> list[str]

Validate that exactly one input column is specified.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Blocks / Parsing

RegexParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RegexParserBlock

Block for parsing text content using regex patterns.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parser"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
parser_cleanup_tagsOptional[list[str]]--Tags to remove from extracted content
parsing_patternstrrequired--

Methods

RegexParserBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Blocks / Parsing

TagParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import TagParserBlock

Block for parsing text content using start/end tags.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parser"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
parser_cleanup_tagsOptional[list[str]]--Tags to remove from extracted content
start_tagslist[str]requiredStart tags for extraction
end_tagslist[str]requiredEnd tags for extraction

Methods

TagParserBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
TagParserBlock.normalize_tagsclassmethod
(cls, v)
ParameterTypeDefault
vrequired
TagParserBlock.validate_tags
(self)
Blocks / Transform

DuplicateColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import DuplicateColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

DuplicateColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with duplicated columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
DuplicateColumnsBlock.model_post_init
(self, _DuplicateColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_DuplicateColumnsBlock__contextAnyrequired
DuplicateColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty dict.

ParameterTypeDefault
vrequired
Blocks / Transform

IndexBasedMapperBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import IndexBasedMapperBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
choice_mapdict[str, str]requiredDictionary mapping choice values to column names
choice_colslist[str]requiredList of column names containing choice values

Methods

IndexBasedMapperBlock.generate
(self, samples: pandas.DataFrame, **kwargs) -> pandas.DataFrame

Generate a new dataset with selected values.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsrequired
Returns pandas.DataFrame
IndexBasedMapperBlock.model_post_init
(self, _IndexBasedMapperBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_IndexBasedMapperBlock__contextAnyrequired
IndexBasedMapperBlock.validate_choice_cols_not_emptyclassmethod
(cls, v)

Validate that choice_cols is not empty.

ParameterTypeDefault
vrequired
IndexBasedMapperBlock.validate_choice_mapclassmethod
(cls, v)

Validate that choice_map is not empty.

ParameterTypeDefault
vrequired
IndexBasedMapperBlock.validate_input_output_consistency
(self)

Validate that choice_cols and output_cols have same length and consistency.

Blocks / Transform

JSONStructureBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import JSONStructureBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
ensure_json_serializableboolTrueWhether to ensure all values are JSON serializable
pretty_printboolFalseWhether to format JSON with indentation

Methods

JSONStructureBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with JSON structured output.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
JSONStructureBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / Transform

MeltColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import MeltColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

MeltColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a flattened dataset in long format.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
MeltColumnsBlock.model_post_init
(self, _MeltColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_MeltColumnsBlock__contextAnyrequired
MeltColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is not empty.

ParameterTypeDefault
vrequired
MeltColumnsBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly two output columns are specified.

ParameterTypeDefault
vrequired
Blocks / Transform

RenameColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RenameColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

RenameColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with renamed columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
RenameColumnsBlock.model_post_init
(self, _RenameColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_RenameColumnsBlock__contextAnyrequired
RenameColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty dict.

ParameterTypeDefault
vrequired
Blocks / Transform

RowMultiplierBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RowMultiplierBlock

Block for duplicating dataset rows.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
num_samplesintrequiredNumber of times to duplicate each row
shuffleboolFalseShuffle output rows after duplication
random_seedOptional[int]--Seed for reproducible shuffling

Methods

RowMultiplierBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with duplicated rows.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
Blocks / Transform

SamplerBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import SamplerBlock

Block for randomly sampling values from list columns or across rows.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
num_samplesint5Number of values to randomly sample
random_seedOptional[int]--Random seed for reproducibility
return_scalarboolFalseWhen num_samples=1, return scalar value instead of single-element list
sourceLiteral[cell, column]"cell"Sampling source: 'cell' samples from a list within each row; 'column' samples scalar values from the column across rows
exclude_selfboolTrueWhen source='column', exclude the current row's value from the pool
exclude_by_valueboolFalseWhen source='column' and exclude_self=True, exclude all pool entries matching the current row's value (not just the current index). Use after RowMultiplierBlock to avoid sampling duplicated copies of the same row.
replaceboolFalseSample with replacement (True) or without (False)
sample_rangeOptional[list[int]]--When source='column', restrict sampling pool to rows [start, end). Default None uses all rows.

Methods

SamplerBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with sampled values.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
SamplerBlock.validate_input_colsclassmethod
(cls, v: list[str]) -> list[str]

Validate that exactly one input column is specified.

ParameterTypeDefault
vlist[str]required
Returns list[str]
SamplerBlock.validate_mode_specific_params
(self) -> 'SamplerBlock'

Reject mode-irrelevant parameters set to non-default values.

Returns SamplerBlock
SamplerBlock.validate_num_samplesclassmethod
(cls, v: int) -> int

Validate that num_samples is at least 1.

ParameterTypeDefault
vintrequired
Returns int
SamplerBlock.validate_output_cols_for_source
(self) -> 'SamplerBlock'

Validate output_cols length based on source mode.

Returns SamplerBlock
SamplerBlock.validate_sample_rangeclassmethod
(cls, v: Optional[list[int]]) -> Optional[list[int]]

Validate sample_range is a valid [start, end) pair.

ParameterTypeDefault
vOptional[list[int]]required
Returns Optional[list[int]]
Blocks / Transform

TextConcatBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import TextConcatBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
separatorstr" "Separator to use between combined values

Methods

TextConcatBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with combined columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
TextConcatBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty list.

ParameterTypeDefault
vrequired
TextConcatBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / Transform

UniformColumnValueSetter

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import UniformColumnValueSetter

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
reduction_strategyLiteral[mode, min, max, mean, median]"mode"--

Methods

UniformColumnValueSetter.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
UniformColumnValueSetter.model_post_init
(self, _UniformColumnValueSetter__context: Any) -> None

Override this method to perform additional initialization after `__init__` and `model_construct`. This is useful if you want to do some validation that requires the entire model to be initialized.

ParameterTypeDefault
_UniformColumnValueSetter__contextAnyrequired
UniformColumnValueSetter.validate_input_cols_singleclassmethod
(cls, v)
ParameterTypeDefault
vrequired
Blocks / Filtering

ColumnValueFilterBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import ColumnValueFilterBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"filtering"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
filter_valueUnion[Any, list[Any]]requiredThe value(s) to filter by
operationstrrequiredString name of binary operator for comparison (e.g., 'eq', 'contains')
convert_dtypeOptional[str]--String name of type to convert filter column to ('float' or 'int')

Methods

ColumnValueFilterBlock.generate
(self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame

Generate filtered dataset based on specified conditions.

ParameterTypeDefault
samplespandas.DataFramerequired
_kwargsAnyrequired
Returns pandas.DataFrame
ColumnValueFilterBlock.model_post_init
(self, _ColumnValueFilterBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_ColumnValueFilterBlock__contextAnyrequired
ColumnValueFilterBlock.validate_convert_dtypeclassmethod
(cls, v)

Validate that convert_dtype is a supported type string.

ParameterTypeDefault
vrequired
ColumnValueFilterBlock.validate_input_cols_not_emptyclassmethod
(cls, v)

Validate that we have at least one input column.

ParameterTypeDefault
vrequired
ColumnValueFilterBlock.validate_operationclassmethod
(cls, v)

Validate that operation is a supported operation string.

ParameterTypeDefault
vrequired
Blocks / Agent

AgentBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import AgentBlock

Block for executing external agent frameworks on DataFrame rows.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"agent"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
agent_frameworkstrrequiredConnector name (e.g., 'langflow')
agent_urlstrrequiredAgent API endpoint URL
agent_api_keyOptional[str]--API key for authentication
timeoutfloat120Request timeout in seconds
max_retriesint3Maximum retry attempts
session_id_colOptional[str]--Column containing session IDs
async_modeboolFalseUse async execution for better throughput
max_concurrencyint10Maximum concurrent requests in async mode
connector_kwargsdict[str, Any][object Object]Extra keyword arguments passed to the connector constructor. Use for framework-specific settings like assistant_id for LangGraph.

Methods

AgentBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Process DataFrame rows through the agent.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
AgentBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Blocks / Agent

AgentResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import AgentResponseExtractorBlock

Block for extracting fields from standardized agent response objects.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"agent_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
extract_textboolTrueWhether to extract text content from responses.
extract_session_idboolFalseWhether to extract session_id from responses.
extract_tool_traceboolFalseWhether to extract the full tool call trace from agent responses. Collects all assistant messages with tool_calls and tool result messages into a structured trace list.
expand_listsboolTrueWhether to expand list inputs into individual rows (True) or preserve lists (False).
field_prefixstr""Prefix to add to output field names (e.g., 'agent_' results in 'agent_text').

Methods

AgentResponseExtractorBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
AgentResponseExtractorBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
AgentResponseExtractorBlock.validate_extraction_configuration
(self)

Validate that at least one extraction field is enabled and pre-compute field names.

Blocks / MCP

MCPAgentBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import MCPAgentBlock

LLM agent block that connects to remote MCP servers for tool use.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"mcp"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
mcp_server_urlstrrequiredURL of the remote MCP server
mcp_headersUnion[dict[str, str], NoneType]--HTTP headers for MCP server authentication
modelstrrequiredModel identifier in LiteLLM format
api_keyOptional[pydantic.types.SecretStr]--API key for the LLM provider
api_baseOptional[str]--Base URL for the LLM API
max_iterationsint10Maximum number of agentic loop iterations
system_promptOptional[str]--System prompt to prepend to conversations

Methods

MCPAgentBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate responses using LLM with MCP tools.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
MCPAgentBlock.model_post_init
(self, _MCPAgentBlock__context) -> None

Initialize after Pydantic validation.

ParameterTypeDefault
_MCPAgentBlock__contextrequired
MCPAgentBlock.validate_max_iterationsclassmethod
(cls, v)

Ensure max_iterations is positive.

ParameterTypeDefault
vrequired
MCPAgentBlock.validate_single_input_colclassmethod
(cls, v)

Ensure exactly one input column.

ParameterTypeDefault
vrequired
MCPAgentBlock.validate_single_output_colclassmethod
(cls, v)

Ensure exactly one output column.

ParameterTypeDefault
vrequired
Flow / Base

Flow

Bases: BaseModel
from sdg_hub import Flow

Pydantic-based flow for chaining data generation blocks.

Fields

NameTypeDefaultDescription
blockslist[BaseBlock]Ordered list of blocks to execute in the flow
metadataFlowMetadatarequiredFlow metadata including name, version, author, etc.

Methods

Flow.add_block
(self, block: BaseBlock) -> 'Flow'

Add a block to the flow, returning a new Flow instance.

ParameterTypeDefault
blockBaseBlockrequired
Returns Flow
Flow.dry_run
(self, dataset: Union[pandas.DataFrame, Dataset], sample_size: int = 2, runtime_params: Optional[dict[str, dict[str, Any]]] = None, max_concurrency: Optional[int] = None, enable_time_estimation: bool = False) -> dict[str, Any]

Perform a dry run of the flow with a subset of data.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
sample_sizeint2
runtime_paramsUnion[dict[str, dict[str, Any]], NoneType]--
max_concurrencyOptional[int]--
enable_time_estimationboolFalse
Returns dict[str, Any]
Flow.from_yamlclassmethod
(cls, yaml_path: str) -> 'Flow'

Load flow from YAML configuration file.

ParameterTypeDefault
yaml_pathstrrequired
Returns Flow
Flow.generate
(self, dataset: Union[pandas.DataFrame, Dataset], runtime_params: Optional[dict[str, dict[str, Any]]] = None, checkpoint_dir: Optional[str] = None, save_freq: Optional[int] = None, log_dir: Optional[str] = None, max_concurrency: Optional[int] = None) -> Union[pandas.DataFrame, Dataset]

Execute the flow blocks in sequence to generate data.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
runtime_paramsUnion[dict[str, dict[str, Any]], NoneType]--
checkpoint_dirOptional[str]--
save_freqOptional[int]--
log_dirOptional[str]--
max_concurrencyOptional[int]--
Returns Union[pandas.DataFrame, Dataset]
Flow.get_dataset_requirements
(self) -> Optional[DatasetRequirements]

Get the dataset requirements for this flow, or None if not defined.

Returns Optional[DatasetRequirements]
Flow.get_dataset_schema
(self) -> pandas.DataFrame

Get an empty DataFrame with the correct schema for this flow.

Returns pandas.DataFrame
Flow.get_default_model
(self) -> Optional[str]

Get the default recommended model for this flow, or None if unspecified.

Returns Optional[str]
Flow.get_info
(self) -> dict[str, Any]

Get information about the flow.

Returns dict[str, Any]
Flow.get_model_recommendations
(self) -> dict[str, Any]

Get model recommendations dict with 'default', 'compatible', 'experimental' keys.

Returns dict[str, Any]
Flow.is_agent_config_required
(self) -> bool

Check if agent configuration is required (True if flow has agent blocks).

Returns bool
Flow.is_agent_config_set
(self) -> bool

Check if agent configuration has been set or is not required.

Returns bool
Flow.is_model_config_required
(self) -> bool

Check if model configuration is required (True if flow has LLM blocks).

Returns bool
Flow.is_model_config_set
(self) -> bool

Check if model configuration has been set or is not required.

Returns bool
Flow.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Flow.print_info
(self) -> None

Print an interactive summary of the Flow in the console using rich.

Flow.reset_agent_config
(self) -> None

Reset agent configuration flag (useful for testing or reconfiguration).

Flow.reset_model_config
(self) -> None

Reset model configuration flag (useful for testing or reconfiguration).

Flow.set_agent_config
(self, agent_framework: Optional[str] = None, agent_url: Optional[str] = None, agent_api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None

Configure agent settings for agent blocks in this flow (in-place).

ParameterTypeDefault
agent_frameworkOptional[str]--
agent_urlOptional[str]--
agent_api_keyOptional[str]--
blocksOptional[list[str]]--
kwargsAnyrequired
Flow.set_model_config
(self, model: Optional[str] = None, api_base: Optional[str] = None, api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None

Configure model settings for LLM blocks in this flow (in-place).

ParameterTypeDefault
modelOptional[str]--
api_baseOptional[str]--
api_keyOptional[str]--
blocksOptional[list[str]]--
kwargsAnyrequired
Flow.to_yaml
(self, output_path: str) -> None

Save flow configuration to YAML file.

ParameterTypeDefault
output_pathstrrequired
Flow.validate_block_names_unique
(self) -> 'Flow'

Ensure all block names are unique within the flow.

Returns Flow
Flow.validate_blocksclassmethod
(cls, v: list[BaseBlock]) -> list[BaseBlock]

Validate that all blocks are BaseBlock instances.

ParameterTypeDefault
vlist[BaseBlock]required
Returns list[BaseBlock]
Flow.validate_dataset
(self, dataset: Union[pandas.DataFrame, Dataset]) -> list[str]

Validate dataset against flow requirements. Returns list of error messages.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
Returns list[str]
Flow / Registry

FlowRegistry

from sdg_hub import FlowRegistry

Registry for managing contributed flows.

Methods

FlowRegistry.discover_flowsclassmethod
(cls) -> None

Discover and display all flows in a formatted table.

FlowRegistry.get_flow_metadataclassmethod
(cls, flow_name: str) -> Optional[FlowMetadata]

Get metadata for a registered flow.

ParameterTypeDefault
flow_namestrrequired
Returns Optional[FlowMetadata]
FlowRegistry.get_flow_pathclassmethod
(cls, flow_name_or_id: str) -> Optional[str]

Get the path to a registered flow.

ParameterTypeDefault
flow_name_or_idstrrequired
Returns Optional[str]
FlowRegistry.get_flow_path_safeclassmethod
(cls, flow_name_or_id: str) -> str

Get the path to a registered flow with better error handling.

ParameterTypeDefault
flow_name_or_idstrrequired
Returns str
FlowRegistry.get_flows_by_categoryclassmethod
(cls) -> Dict[str, List[Dict[str, str]]]

Get flows organized by their primary tag.

Returns dict[str, list[dict[str, str]]]
FlowRegistry.list_flowsclassmethod
(cls) -> List[Dict[str, str]]

List all registered flows with their IDs.

Returns list[dict[str, str]]
FlowRegistry.register_search_pathclassmethod
(cls, path: str) -> None

Add a directory to search for flows.

ParameterTypeDefault
pathstrrequired
FlowRegistry.search_flowsclassmethod
(cls, tag: Optional[str] = None, author: Optional[str] = None) -> List[Dict[str, str]]

Search flows by criteria.

ParameterTypeDefault
tagOptional[str]--
authorOptional[str]--
Returns list[dict[str, str]]
Flow / Registry

FlowRegistryEntry

from sdg_hub.core.flow.registry import FlowRegistryEntry

Entry in the flow registry.

Fields

NameTypeDefaultDescription
pathstrrequired--
metadataFlowMetadatarequired--

Methods

FlowRegistryEntry.__init__
(self, path: str, metadata: FlowMetadata) -> None

Initialize self. See help(type(self)) for accurate signature.

ParameterTypeDefault
pathstrrequired
metadataFlowMetadatarequired
Flow / Metadata

FlowMetadata

Bases: BaseModel
from sdg_hub import FlowMetadata

Metadata for flow configuration and open source contributions.

Fields

NameTypeDefaultDescription
namestrrequiredHuman-readable name
idstr""Unique identifier for the flow, generated from name
descriptionstr""Detailed description
versionstr"1.0.0"Semantic version
authorstr""Author or contributor name
recommended_modelsOptional[RecommendedModels]--Simplified recommended models structure
tagslist[str]Tags for categorization and search
licensestr"Apache-2.0"License identifier
dataset_requirementsOptional[DatasetRequirements]--Requirements for input datasets
output_columnsOptional[list[str]]--Columns to keep in the final output. Original input columns are always preserved.

Methods

FlowMetadata.ensure_id
(self) -> 'FlowMetadata'

Ensure id is set.

Returns FlowMetadata
FlowMetadata.get_best_model
(self, available_models: Optional[list[str]] = None) -> Optional[str]

Get the best recommended model based on availability.

ParameterTypeDefault
available_modelsOptional[list[str]]--
Returns Optional[str]
FlowMetadata.validate_idclassmethod
(cls, v: str) -> str

Validate flow id.

ParameterTypeDefault
vstrrequired
Returns str
FlowMetadata.validate_output_columnsclassmethod
(cls, v: Optional[list[str]]) -> Optional[list[str]]

Validate and clean output columns.

ParameterTypeDefault
vOptional[list[str]]required
Returns Optional[list[str]]
FlowMetadata.validate_recommended_modelsclassmethod
(cls, v: Optional[RecommendedModels]) -> Optional[RecommendedModels]

Validate recommended models structure.

ParameterTypeDefault
vOptional[RecommendedModels]required
Returns Optional[RecommendedModels]
FlowMetadata.validate_tagsclassmethod
(cls, v: list[str]) -> list[str]

Validate and clean tags.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Flow / Metadata

RecommendedModels

Bases: BaseModel
from sdg_hub.core.flow.metadata import RecommendedModels

Simplified recommended models structure.

Fields

NameTypeDefaultDescription
defaultstrrequiredDefault model to use
compatiblelist[str]Compatible models
experimentallist[str]Experimental models

Methods

RecommendedModels.get_all_models
(self) -> list[str]

Get all models (default + compatible + experimental).

Returns list[str]
RecommendedModels.get_best_model
(self, available_models: Optional[list[str]] = None) -> Optional[str]

Get the best model based on availability.

ParameterTypeDefault
available_modelsOptional[list[str]]--
Returns Optional[str]
RecommendedModels.validate_defaultclassmethod
(cls, v: str) -> str

Validate default model name is not empty.

ParameterTypeDefault
vstrrequired
Returns str
RecommendedModels.validate_model_listsclassmethod
(cls, v: list[str]) -> list[str]

Validate model lists contain non-empty names.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Flow / Metadata

DatasetRequirements

Bases: BaseModel
from sdg_hub.core.flow.metadata import DatasetRequirements

Dataset requirements for flow execution.

Fields

NameTypeDefaultDescription
required_columnslist[str]Column names that must be present
optional_columnslist[str]Optional columns that can enhance performance
min_samplesint1Minimum number of samples required
max_samplesOptional[int]--Maximum number of samples to process
column_typesdict[str, str][object Object]Expected types for specific columns
descriptionstr""Human-readable description

Methods

DatasetRequirements.validate_column_namesclassmethod
(cls, v: list[str]) -> list[str]

Validate column names are not empty.

ParameterTypeDefault
vlist[str]required
Returns list[str]
DatasetRequirements.validate_dataset
(self, dataset_columns: list[str], dataset_size: int) -> list[str]

Validate a dataset against these requirements.

ParameterTypeDefault
dataset_columnslist[str]required
dataset_sizeintrequired
Returns list[str]
DatasetRequirements.validate_sample_limits
(self) -> 'DatasetRequirements'

Validate sample limits are consistent.

Returns DatasetRequirements
Flow / Metadata

ModelOption

Bases: BaseModel
from sdg_hub.core.flow.metadata import ModelOption

Represents a model option with compatibility level.

Fields

NameTypeDefaultDescription
namestrrequiredModel identifier
compatibilityModelCompatibility"compatible"Compatibility level with the flow

Methods

ModelOption.validate_nameclassmethod
(cls, v: str) -> str

Validate model name is not empty.

ParameterTypeDefault
vstrrequired
Returns str
Flow / Metadata

ModelCompatibility

Bases: Enum
from sdg_hub.core.flow.metadata import ModelCompatibility

Model compatibility levels.

Connectors / Base

BaseConnector

Bases: BaseModel, ABC
from sdg_hub.core.connectors.base import BaseConnector

Abstract base class for all connectors.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

BaseConnector.aexecute
(self, request: Any) -> Any

Execute an asynchronous request.

ParameterTypeDefault
requestAnyrequired
Returns Any
BaseConnector.executeabstract
(self, request: Any) -> Any

Execute a synchronous request.

ParameterTypeDefault
requestAnyrequired
Returns Any
Connectors / Base

ConnectorConfig

Bases: BaseModel
from sdg_hub.core.connectors.base import ConnectorConfig

Base configuration for all connectors.

Fields

NameTypeDefaultDescription
urlOptional[str]--Base URL for the service
api_keyOptional[str]--API key for authentication
timeoutfloat120Request timeout in seconds
max_retriesint3Maximum retry attempts
Connectors / Registry

ConnectorRegistry

from sdg_hub.core.connectors.registry import ConnectorRegistry

Global registry for connector classes.

Methods

ConnectorRegistry.clearclassmethod
(cls) -> 'None'

Clear all registered connectors. Primarily for testing.

Returns None
ConnectorRegistry.getclassmethod
(cls, name: 'str') -> 'type[BaseConnector]'

Get a connector class by name.

ParameterTypeDefault
namestrrequired
Returns type[BaseConnector]
ConnectorRegistry.list_allclassmethod
(cls) -> 'list[str]'

Get all registered connector names.

Returns list[str]
ConnectorRegistry.registerclassmethod
(cls, name: 'str')

Register a connector class.

ParameterTypeDefault
namestrrequired
Connectors / Agent

BaseAgentConnector

Bases: BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.base import BaseAgentConnector

Base class for agent framework connectors.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

BaseAgentConnector.asend
(self, request: mlflow.types.agent.ChatAgentRequest) -> mlflow.types.agent.ChatAgentResponse

Async send - convenience wrapper.

ParameterTypeDefault
requestmlflow.types.agent.ChatAgentRequestrequired
Returns mlflow.types.agent.ChatAgentResponse
BaseAgentConnector.build_requestabstract
(self, request: mlflow.types.agent.ChatAgentRequest) -> dict[str, Any]

Build framework-specific request payload.

ParameterTypeDefault
requestmlflow.types.agent.ChatAgentRequestrequired
Returns dict[str, Any]
BaseAgentConnector.execute
(self, request: dict[str, Any]) -> dict[str, Any]

Execute a request (BaseConnector interface).

ParameterTypeDefault
requestdict[str, Any]required
Returns dict[str, Any]
BaseAgentConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
BaseAgentConnector.parse_responseabstract
(self, response: dict[str, Any]) -> mlflow.types.agent.ChatAgentResponse

Parse and validate framework response.

ParameterTypeDefault
responsedict[str, Any]required
Returns mlflow.types.agent.ChatAgentResponse
BaseAgentConnector.send
(self, request: mlflow.types.agent.ChatAgentRequest, async_mode: bool = False)

Send a request to the agent.

ParameterTypeDefault
requestmlflow.types.agent.ChatAgentRequestrequired
async_modeboolFalse
Connectors / Agent

LangflowConnector

Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langflow import LangflowConnector

Connector for Langflow agent framework.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

LangflowConnector.build_request
(self, request: mlflow.types.agent.ChatAgentRequest) -> dict[str, Any]

Build Langflow-specific request payload.

ParameterTypeDefault
requestmlflow.types.agent.ChatAgentRequestrequired
Returns dict[str, Any]
LangflowConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LangflowConnector.parse_response
(self, response: dict[str, Any]) -> mlflow.types.agent.ChatAgentResponse

Parse Langflow response into ChatAgentResponse.

ParameterTypeDefault
responsedict[str, Any]required
Returns mlflow.types.agent.ChatAgentResponse
Connectors / Agent

LangGraphConnector

Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langgraph import LangGraphConnector

Connector for LangGraph agent framework.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration
assistant_idstr"agent"The assistant ID or graph name to run.
run_configdict[str, Any][object Object]Optional configuration dict passed in the run payload. Merged as the 'config' key in the LangGraph /runs/wait request. Use this to pass runtime parameters to the graph via 'configurable', e.g. ``{'configurable': {'model': 'gpt-4o'}}``.

Methods

LangGraphConnector.build_request
(self, request: mlflow.types.agent.ChatAgentRequest) -> dict[str, Any]

Build LangGraph run request payload.

ParameterTypeDefault
requestmlflow.types.agent.ChatAgentRequestrequired
Returns dict[str, Any]
LangGraphConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LangGraphConnector.parse_response
(self, response: dict[str, Any]) -> mlflow.types.agent.ChatAgentResponse

Parse LangGraph response into ChatAgentResponse.

ParameterTypeDefault
responsedict[str, Any]required
Returns mlflow.types.agent.ChatAgentResponse