API Reference
Complete reference for all public classes in SDG Hub. Every block, flow component, and connector is documented with its fields, methods, and type signatures.
BaseBlock
Bases: BaseModel, ABC
Import: from sdg_hub.core.blocks import BaseBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | Optional[str] | -- | Block type (e.g., 'llm', 'transform', 'parser', 'filtering') |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
Methods
BaseBlock.from_config (classmethod)
Signature: (cls, config: dict[str, Any]) -> 'BaseBlock'
Instantiate block from serialized config.
| Parameter | Type | Default |
|---|---|---|
config | dict[str, Any] | required |
Returns: BaseBlock

BaseBlock.generate (abstract)
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Subclass method to implement data generation logic.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

BaseBlock.get_config
Signature: (self) -> dict[str, Any]
Return only constructor arguments for serialization.

Returns: dict[str, Any]

BaseBlock.get_info
Signature: (self) -> dict[str, Any]
Return a high-level summary of block metadata and config.

Returns: dict[str, Any]

BaseBlock.normalize_input_cols (classmethod)
Signature: (cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
| Parameter | Type | Default |
|---|---|---|
v | Union[str, list[str], dict[str, Any], None] | required

Returns: list[str] | dict[str, Any]

BaseBlock.normalize_output_cols (classmethod)
Signature: (cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
| Parameter | Type | Default |
|---|---|---|
v | Union[str, list[str], dict[str, Any], None] | required

Returns: list[str] | dict[str, Any]

BlockRegistry
Import: from sdg_hub.core.blocks import BlockRegistry

Registry for block classes with metadata and enhanced error handling.
Methods
BlockRegistry.categories (classmethod)
Signature: (cls) -> list[str]
Get all available categories.

Returns: list[str]

BlockRegistry.discover_blocks (classmethod)
Signature: (cls) -> None
Print a Rich-formatted table of all available blocks.

BlockRegistry.list_blocks (classmethod)
Signature: (cls, category: Optional[str] = None, *, grouped: bool = False, include_deprecated: bool = True) -> list[str] | dict[str, list[str]]
List registered blocks, optionally filtered by category.
| Parameter | Type | Default |
|---|---|---|
category | Optional[str] | -- |
grouped | bool | False |
include_deprecated | bool | True |
Returns: list[str] | dict[str, list[str]]

BlockRegistry.register (classmethod)
Signature: (cls, block_name: str, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None)
Register a block class with metadata.
| Parameter | Type | Default |
|---|---|---|
block_name | str | required |
category | str | required |
description | str | "" |
deprecated | bool | False |
replacement | Optional[str] | -- |
BlockMetadata
Import: from sdg_hub.core.blocks.registry import BlockMetadata

Metadata for registered blocks.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
name | str | required | -- |
block_class | type | required | -- |
category | str | required | -- |
description | str | "" | -- |
deprecated | bool | False | -- |
replacement | Optional[str] | -- | -- |
Methods
BlockMetadata.__init__
Signature: (self, name: str, block_class: type, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None) -> None
Initialize the metadata record.
| Parameter | Type | Default |
|---|---|---|
name | str | required |
block_class | type | required |
category | str | required |
description | str | "" |
deprecated | bool | False |
replacement | Optional[str] | -- |
LLMChatBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import LLMChatBlock

Block for generating chat responses from an LLM, built on the standardized BaseBlock patterns with full Pydantic compatibility.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "llm" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
model | Optional[str] | -- | Model identifier in LiteLLM format |
api_key | Optional[pydantic.types.SecretStr] | -- | API key for the provider |
api_base | Optional[str] | -- | Base URL for the API |
async_mode | bool | False | Whether to use async processing |
timeout | float | 120 | Request timeout in seconds |
num_retries | int | 6 | Number of retry attempts (uses LiteLLM's built-in retry mechanism) |
drop_params | bool | True | Whether to drop unsupported parameters to prevent API errors |
max_completion_tokens | Optional[int] | -- | Maximum completion tokens (used by newer models like GPT-5 instead of max_tokens). When set, max_tokens is automatically excluded. |
Methods
LLMChatBlock.generate(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrameGenerate responses from the LLM.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

LLMChatBlock.model_post_init
Signature: (self, _LLMChatBlock__context) -> None
Initialize after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_LLMChatBlock__context | | required |
LLMChatBlock.validate_single_input_col (classmethod)
Signature: (cls, v)
Ensure exactly one input column.
| Parameter | Type | Default |
|---|---|---|
v | | required |
LLMChatBlock.validate_single_output_col (classmethod)
Signature: (cls, v)
Ensure exactly one output column.
| Parameter | Type | Default |
|---|---|---|
v | | required |
PromptBuilderBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import PromptBuilderBlock

Block for building prompts from a YAML-configured Jinja2 template, built on the standardized BaseBlock patterns with full Pydantic compatibility.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "llm_util" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
prompt_config_path | str | required | Path to YAML file containing the Jinja template configuration |
format_as_messages | bool | True | Whether to format output as chat messages |
prompt_template_config | Optional[prompt_builder_block.PromptTemplateConfig] | -- | Loaded prompt template configuration |
prompt_renderer | Optional[prompt_builder_block.PromptRenderer] | -- | Prompt renderer instance |
Methods
PromptBuilderBlock.generate
Signature: (self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame
Generate formatted output for all samples.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
_kwargs | Any | required |
Returns: pandas.DataFrame

PromptBuilderBlock.model_post_init
Signature: (self, _PromptBuilderBlock__context: Any) -> None
Initialize the block after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_PromptBuilderBlock__context | Any | required |
PromptBuilderBlock.validate_single_output_col (classmethod)
Signature: (cls, v)
Validate that exactly one output column is specified.
| Parameter | Type | Default |
|---|---|---|
v | | required |
ChatMessage
Bases: BaseModel
Import: from sdg_hub.core.blocks.llm.prompt_builder_block import ChatMessage

Pydantic model for chat messages with proper validation.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
role | Literal[system, user, assistant, tool] | required | -- |
content | str | required | -- |
Methods
ChatMessage.validate_content_not_empty (classmethod)
Signature: (cls, v: str) -> str
Ensure content is not empty or just whitespace.
| Parameter | Type | Default |
|---|---|---|
v | str | required |
Returns: str

MessageTemplate

Bases: BaseModel
Import: from sdg_hub.core.blocks.llm.prompt_builder_block import MessageTemplate

Template for a chat message with a compiled Jinja2 template and its original source.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
role | Literal[system, user, assistant, tool] | required | -- |
content_template | jinja2.environment.Template | required | -- |
original_source | str | required | -- |
PromptTemplateConfig
Import: from sdg_hub.core.blocks.llm.prompt_builder_block import PromptTemplateConfig

Self-contained class for loading and validating YAML prompt configurations.
Methods
PromptTemplateConfig.__init__
Signature: (self, config_path: str)
Initialize with path to YAML config file.
| Parameter | Type | Default |
|---|---|---|
config_path | str | required |
PromptTemplateConfig.get_message_templates
Signature: (self) -> list[prompt_builder_block.MessageTemplate]
Return the compiled message templates.

Returns: list[prompt_builder_block.MessageTemplate]

PromptRenderer

Import: from sdg_hub.core.blocks.llm.prompt_builder_block import PromptRenderer

Handles rendering of message templates with variable substitution.
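The resolve_template_vars step maps dataset columns to template variables. A plain-Python sketch of plausible semantics (illustrative only; in particular, treating the dict form of input_cols as column-to-variable renames is an assumption):

```python
def resolve_template_vars(sample: dict, input_cols) -> dict:
    """Illustrative re-implementation, not the library's code."""
    if isinstance(input_cols, dict):
        # dict form: dataset column name -> template variable name
        return {var: sample[col] for col, var in input_cols.items()}
    # list form: column names double as template variable names
    return {col: sample[col] for col in input_cols}

resolved = resolve_template_vars({"doc": "some text", "q": "why?"}, {"doc": "document"})
```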
Methods
PromptRenderer.__init__
Signature: (self, message_templates: list[prompt_builder_block.MessageTemplate])
Initialize with a list of message templates.
| Parameter | Type | Default |
|---|---|---|
message_templates | list[prompt_builder_block.MessageTemplate] | required |
PromptRenderer.get_required_variables
Signature: (self) -> set
Extract all required variables from message templates.

Returns: set

PromptRenderer.render_messages
Signature: (self, template_vars: dict[str, Any]) -> list[prompt_builder_block.ChatMessage]
Render all message templates with the given variables.
| Parameter | Type | Default |
|---|---|---|
template_vars | dict[str, Any] | required |
Returns: list[prompt_builder_block.ChatMessage]

PromptRenderer.resolve_template_vars
Signature: (self, sample: dict[str, Any], input_cols) -> dict[str, Any]
Resolve template variables from dataset columns based on input_cols.
| Parameter | Type | Default |
|---|---|---|
sample | dict[str, Any] | required |
input_cols | | required |
Returns: dict[str, Any]

LLMResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import LLMResponseExtractorBlock

Block for extracting fields (content, reasoning content, tool calls) from LLM response objects.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "llm_util" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
extract_content | bool | True | Whether to extract 'content' field from responses. |
extract_reasoning_content | bool | False | Whether to extract 'reasoning_content' field from responses. |
extract_tool_calls | bool | False | Whether to extract 'tool_calls' field from responses. |
expand_lists | bool | True | Whether to expand list inputs into individual rows (True) or preserve lists (False). |
field_prefix | str | "" | Prefix to add to output field names (e.g., 'llm_' results in 'llm_content', 'llm_reasoning_content'). |
Methods
LLMResponseExtractorBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Subclass method to implement data generation logic.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

LLMResponseExtractorBlock.model_post_init
Signature: (self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None

LLMResponseExtractorBlock.validate_extraction_configuration
Signature: (self)
Validate that at least one extraction field is enabled and pre-compute field names.
JSONParserBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import JSONParserBlock

Block for parsing JSON from text and expanding fields into columns.
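The extract_embedded and fix_trailing_commas options can be illustrated with plain stdlib code. This is a sketch of the behavior the flags describe, not the block's own implementation:

```python
import json
import re

def parse_llm_json(text: str) -> dict:
    """Illustrative sketch: pull an embedded JSON object out of prose and repair trailing commas."""
    # extract_embedded: grab the first {...} span from surrounding text
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        text = match.group(0)
    # fix_trailing_commas: drop commas before } or ] (common LLM output issue)
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)

parsed = parse_llm_json('Here you go: {"answer": "42", "tags": ["a", "b",],}')
```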
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "parsing" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
field_prefix | str | "" | Optional prefix to add to extracted column names |
fix_trailing_commas | bool | True | Whether to fix trailing commas in JSON (common LLM output issue) |
extract_embedded | bool | True | Whether to extract JSON embedded in surrounding text |
drop_input | bool | False | Whether to drop the input column after extraction |
Methods
JSONParserBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with JSON fields expanded into columns.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

JSONParserBlock.validate_input_cols (classmethod)
Signature: (cls, v: list[str]) -> list[str]
Validate that exactly one input column is specified.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]

RegexParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import RegexParserBlock

Block for parsing text content using regex patterns.
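A sketch of the underlying idea: a parsing_pattern with capture groups is applied to the raw text, then parser_cleanup_tags are stripped from the matches. The pattern and tags below are illustrative, not defaults of the block:

```python
import re

# illustrative pattern: one capture group for the content we want
pattern = r"<answer>(.*?)</answer>"
raw = "reasoning...<answer>42</answer> and <answer>7[END]</answer>"
matches = re.findall(pattern, raw, re.DOTALL)

# parser_cleanup_tags would then be removed from each extracted string
cleanup_tags = ["[END]"]
cleaned = []
for m in matches:
    for tag in cleanup_tags:
        m = m.replace(tag, "")
    cleaned.append(m.strip())
```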
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "parser" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
parser_cleanup_tags | Optional[list[str]] | -- | Tags to remove from extracted content |
parsing_pattern | str | required | -- |
Methods
RegexParserBlock.model_post_init
Signature: (self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None

TagParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import TagParserBlock

Block for parsing text content using start/end tags.
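A stdlib sketch of start/end-tag extraction (illustrative, not the block's own code; the whitespace stripping is an assumption):

```python
def extract_between(text: str, start_tag: str, end_tag: str) -> list:
    """Collect every span that sits between a start tag and the next end tag."""
    results, pos = [], 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        start += len(start_tag)
        end = text.find(end_tag, start)
        if end == -1:
            break
        results.append(text[start:end].strip())
        pos = end + len(end_tag)
    return results

questions = extract_between("<q> What? </q> filler <q>Why?</q>", "<q>", "</q>")
```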
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "parser" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
parser_cleanup_tags | Optional[list[str]] | -- | Tags to remove from extracted content |
start_tags | list[str] | required | Start tags for extraction |
end_tags | list[str] | required | End tags for extraction |
Methods
TagParserBlock.model_post_init
Signature: (self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None

TagParserBlock.normalize_tags (classmethod)
Signature: (cls, v)
| Parameter | Type | Default |
|---|---|---|
v | | required |
TagParserBlock.validate_tags
Signature: (self)

DuplicateColumnsBlock

Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import DuplicateColumnsBlock

Block for duplicating existing columns under new names, built on the standardized BaseBlock patterns with full Pydantic compatibility.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
Methods
DuplicateColumnsBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with duplicated columns.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

DuplicateColumnsBlock.model_post_init
Signature: (self, _DuplicateColumnsBlock__context: Any) -> None
Initialize derived attributes after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_DuplicateColumnsBlock__context | Any | required |
DuplicateColumnsBlock.validate_input_cols (classmethod)
Signature: (cls, v)
Validate that input_cols is a non-empty dict.
| Parameter | Type | Default |
|---|---|---|
v | | required |
IndexBasedMapperBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import IndexBasedMapperBlock

Block that, for each row, selects a value from one of several columns based on a per-row choice value and a choice-to-column mapping.
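The selection semantics can be sketched with plain pandas (data and column names are illustrative, not the block's own code):

```python
import pandas as pd

# each row's choice value names which mapped column to read
df = pd.DataFrame({
    "choice": ["short", "long"],
    "short_answer": ["yes", "no"],
    "long_answer": ["yes, because...", "no, because..."],
})
choice_map = {"short": "short_answer", "long": "long_answer"}
df["selected"] = [row[choice_map[row["choice"]]] for _, row in df.iterrows()]
```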
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
choice_map | dict[str, str] | required | Dictionary mapping choice values to column names |
choice_cols | list[str] | required | List of column names containing choice values |
Methods
IndexBasedMapperBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs) -> pandas.DataFrame
Generate a new dataset with selected values.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | | required |
Returns: pandas.DataFrame

IndexBasedMapperBlock.model_post_init
Signature: (self, _IndexBasedMapperBlock__context: Any) -> None
Initialize derived attributes after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_IndexBasedMapperBlock__context | Any | required |
IndexBasedMapperBlock.validate_choice_cols_not_empty (classmethod)
Signature: (cls, v)
Validate that choice_cols is not empty.
| Parameter | Type | Default |
|---|---|---|
v | | required |
IndexBasedMapperBlock.validate_choice_map (classmethod)
Signature: (cls, v)
Validate that choice_map is not empty.
| Parameter | Type | Default |
|---|---|---|
v | | required |
IndexBasedMapperBlock.validate_input_output_consistency
Signature: (self)
Validate that choice_cols and output_cols have the same length and are mutually consistent.
JSONStructureBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import JSONStructureBlock

Block for combining multiple input columns into a single JSON-structured output column.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
ensure_json_serializable | bool | True | Whether to ensure all values are JSON serializable |
pretty_print | bool | False | Whether to format JSON with indentation |
Methods
JSONStructureBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with JSON structured output.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

JSONStructureBlock.validate_output_cols (classmethod)
Signature: (cls, v)
Validate that exactly one output column is specified.
| Parameter | Type | Default |
|---|---|---|
v | | required |
MeltColumnsBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import MeltColumnsBlock

Block for melting multiple columns into a long-format dataset of variable/value pairs.
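The reshape is conceptually close to pandas.melt; a rough sketch (the output column names below are placeholders for the two required output_cols, and this is not the block's own code):

```python
import pandas as pd

wide = pd.DataFrame({"id": [1, 2], "q1": ["a", "b"], "q2": ["c", "d"]})
# each (row, melted column) pair becomes its own long-format row
long = wide.melt(id_vars=["id"], value_vars=["q1", "q2"],
                 var_name="variable", value_name="value")
```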
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
Methods
MeltColumnsBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a flattened dataset in long format.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

MeltColumnsBlock.model_post_init
Signature: (self, _MeltColumnsBlock__context: Any) -> None
Initialize derived attributes after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_MeltColumnsBlock__context | Any | required |
MeltColumnsBlock.validate_input_cols (classmethod)
Signature: (cls, v)
Validate that input_cols is not empty.
| Parameter | Type | Default |
|---|---|---|
v | | required |
MeltColumnsBlock.validate_output_cols (classmethod)
Signature: (cls, v)
Validate that exactly two output columns are specified.
| Parameter | Type | Default |
|---|---|---|
v | | required |
RenameColumnsBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import RenameColumnsBlock

Block for renaming dataset columns according to a mapping.
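The behavior corresponds to a pandas rename, with input_cols supplying the mapping. A sketch with illustrative names (not the block's own code):

```python
import pandas as pd

df = pd.DataFrame({"raw_output": ["..."], "doc": ["..."]})
# input_cols as a non-empty dict: existing name -> new name
mapping = {"raw_output": "response", "doc": "document"}
renamed = df.rename(columns=mapping)
```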
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
Methods
RenameColumnsBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with renamed columns.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

RenameColumnsBlock.model_post_init
Signature: (self, _RenameColumnsBlock__context: Any) -> None
Initialize derived attributes after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_RenameColumnsBlock__context | Any | required |
RenameColumnsBlock.validate_input_cols (classmethod)
Signature: (cls, v)
Validate that input_cols is a non-empty dict.
| Parameter | Type | Default |
|---|---|---|
v | | required |
RowMultiplierBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import RowMultiplierBlock

Block for duplicating dataset rows.
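The row duplication (and optional seeded shuffle) is roughly equivalent to this pandas sketch (illustrative, not the block's own code):

```python
import pandas as pd

df = pd.DataFrame({"prompt": ["p1", "p2"]})
num_samples = 3
# repeat each row num_samples times
out = df.loc[df.index.repeat(num_samples)].reset_index(drop=True)
# shuffle=True with random_seed for reproducibility
out = out.sample(frac=1, random_state=42).reset_index(drop=True)
```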
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
num_samples | int | required | Number of times to duplicate each row |
shuffle | bool | False | Shuffle output rows after duplication |
random_seed | Optional[int] | -- | Seed for reproducible shuffling |
Methods
RowMultiplierBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with duplicated rows.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

SamplerBlock

Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import SamplerBlock

Block for randomly sampling values from list columns.
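Per-row sampling with a fixed seed can be sketched with the stdlib random module (illustrative, not the block's own code):

```python
import random

rng = random.Random(7)  # random_seed gives reproducibility
row_value = ["fact1", "fact2", "fact3", "fact4", "fact5"]  # one list-column cell
num_samples = 2
sampled = rng.sample(row_value, num_samples)
# return_scalar=True would unwrap a single-element sample
result = sampled[0] if num_samples == 1 else sampled
```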
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
num_samples | int | 5 | Number of values to randomly sample from each list |
random_seed | Optional[int] | -- | Random seed for reproducibility |
return_scalar | bool | False | When num_samples=1, return scalar value instead of single-element list |
Methods
SamplerBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with sampled values.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

SamplerBlock.validate_input_cols (classmethod)
Signature: (cls, v: list[str]) -> list[str]
Validate that exactly one input column is specified.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]

SamplerBlock.validate_num_samples (classmethod)
Signature: (cls, v: int) -> int
Validate that num_samples is at least 1.
| Parameter | Type | Default |
|---|---|---|
v | int | required |
Returns: int

SamplerBlock.validate_output_cols (classmethod)
Signature: (cls, v: list[str]) -> list[str]
Validate that exactly one output column is specified.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]

TextConcatBlock

Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import TextConcatBlock

Block for concatenating multiple input columns into a single output column, joined by a separator.
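The concatenation is roughly a per-row join across the input columns. A pandas sketch with illustrative column names (not the block's own code):

```python
import pandas as pd

df = pd.DataFrame({"title": ["Doc A"], "body": ["Some text."]})
separator = "\n"  # the documented default separator
# join the listed columns row by row into one output column
df["document"] = df[["title", "body"]].agg(separator.join, axis=1)
```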
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
separator | str | "\n" | Separator to use between combined values
Methods
TextConcatBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate a dataset with combined columns.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

TextConcatBlock.validate_input_cols (classmethod)
Signature: (cls, v)
Validate that input_cols is a non-empty list.
| Parameter | Type | Default |
|---|---|---|
v | | required |
TextConcatBlock.validate_output_cols (classmethod)
Signature: (cls, v)
Validate that exactly one output column is specified.
| Parameter | Type | Default |
|---|---|---|
v | | required |
UniformColumnValueSetter
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import UniformColumnValueSetter

Block that replaces a column's values with one uniform value computed by a reduction strategy (mode, min, max, mean, or median).
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "transform" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
reduction_strategy | Literal[mode, min, max, mean, median] | "mode" | -- |
Methods
UniformColumnValueSetter.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Subclass method to implement data generation logic.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

UniformColumnValueSetter.model_post_init
Signature: (self, _UniformColumnValueSetter__context: Any) -> None
Override this method to perform additional initialization after `__init__` and `model_construct`. This is useful if you want to do some validation that requires the entire model to be initialized.
| Parameter | Type | Default |
|---|---|---|
_UniformColumnValueSetter__context | Any | required |
UniformColumnValueSetter.validate_input_cols_single (classmethod)
Signature: (cls, v)
| Parameter | Type | Default |
|---|---|---|
v | | required |
ColumnValueFilterBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import ColumnValueFilterBlock

Block for filtering dataset rows by comparing a column's values against the configured filter value(s).
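The filtering semantics (optional dtype conversion, then a named binary operator against the filter value) can be sketched with the operator module. The data and choices below are illustrative, not the block's own code:

```python
import operator

import pandas as pd

df = pd.DataFrame({"score": ["0.9", "0.2", "0.7"]})
# convert_dtype='float', operation='ge', filter_value=0.5
col = df["score"].astype(float)
filtered = df[operator.ge(col, 0.5)]
```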
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "filtering" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
filter_value | Union[Any, list[Any]] | required | The value(s) to filter by |
operation | str | required | String name of binary operator for comparison (e.g., 'eq', 'contains') |
convert_dtype | Optional[str] | -- | String name of type to convert filter column to ('float' or 'int') |
Methods
ColumnValueFilterBlock.generate
Signature: (self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame
Generate filtered dataset based on specified conditions.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
_kwargs | Any | required |
Returns: pandas.DataFrame

ColumnValueFilterBlock.model_post_init
Signature: (self, _ColumnValueFilterBlock__context: Any) -> None
Initialize derived attributes after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_ColumnValueFilterBlock__context | Any | required |
ColumnValueFilterBlock.validate_convert_dtype (classmethod)
Signature: (cls, v)
Validate that convert_dtype is a supported type string.
| Parameter | Type | Default |
|---|---|---|
v | | required |
ColumnValueFilterBlock.validate_input_cols_not_empty (classmethod)
Signature: (cls, v)
Validate that we have at least one input column.
| Parameter | Type | Default |
|---|---|---|
v | | required |
ColumnValueFilterBlock.validate_operation (classmethod)
Signature: (cls, v)
Validate that operation is a supported operation string.
| Parameter | Type | Default |
|---|---|---|
v | | required |
AgentBlock
Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import AgentBlock

Block for executing external agent frameworks on DataFrame rows.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "agent" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
agent_framework | str | required | Connector name (e.g., 'langflow') |
agent_url | str | required | Agent API endpoint URL |
agent_api_key | Optional[str] | -- | API key for authentication |
timeout | float | 120 | Request timeout in seconds |
max_retries | int | 3 | Maximum retry attempts |
session_id_col | Optional[str] | -- | Column containing session IDs |
async_mode | bool | False | Use async execution for better throughput |
max_concurrency | int | 10 | Maximum concurrent requests in async mode |
connector_kwargs | dict[str, Any] | {} | Extra keyword arguments passed to the connector constructor. Use for framework-specific settings like assistant_id for LangGraph.
Methods
AgentBlock.generate
Signature: (self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Process DataFrame rows through the agent.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame

AgentBlock.model_post_init
Signature: (self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None

AgentResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
Import: from sdg_hub.core.blocks import AgentResponseExtractorBlock

Block for extracting fields from agent framework response objects.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "agent_util" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
agent_framework | str | required | Agent framework whose response format to parse (e.g., 'langflow') |
extract_text | bool | True | Whether to extract text content from responses. |
extract_session_id | bool | False | Whether to extract session_id from responses. |
extract_tool_trace | bool | False | Whether to extract the full tool call trace from agent responses. For Langflow, this extracts the content_blocks 'Agent Steps' array containing structured tool_use entries (name, tool_input, output) and text entries (input/output messages). |
expand_lists | bool | True | Whether to expand list inputs into individual rows (True) or preserve lists (False). |
field_prefix | str | "" | Prefix to add to output field names (e.g., 'agent_' results in 'agent_text'). |
Methods
AgentResponseExtractorBlock.generate(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Subclass method to implement data generation logic.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame
AgentResponseExtractorBlock.model_post_init(self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None
AgentResponseExtractorBlock.validate_extraction_configuration(self)
Validate that at least one extraction field is enabled and pre-compute field names.
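Two of the documented behaviors above, `field_prefix` and `expand_lists`, can be sketched without the block itself. This is an illustrative mock of the described semantics, not a call into SDG Hub; the variable names are assumptions.

```python
# Sketch of documented AgentResponseExtractorBlock semantics:
# field_prefix shapes output field names, expand_lists turns a
# list-valued response into one row per element.
field_prefix = "agent_"
extracted = ["text", "session_id"]
output_fields = [field_prefix + name for name in extracted]

rows = [{"response": ["hello", "world"]}]
expanded = [
    {**row, "response": item}
    for row in rows
    for item in (row["response"] if isinstance(row["response"], list) else [row["response"]])
]
```

With `expand_lists=False` the list would instead be preserved as a single cell.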
MCPAgentBlock
Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import MCPAgentBlock
LLM agent block that connects to remote MCP servers for tool use.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
block_name | str | required | Unique identifier for this block instance |
block_type | str | "mcp" | -- |
input_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Input columns: str, list, or dict |
output_cols | Union[str, list[str], dict[str, Any], NoneType] | -- | Output columns: str, list, or dict |
mcp_server_url | str | required | URL of the remote MCP server |
mcp_headers | Union[dict[str, str], NoneType] | -- | HTTP headers for MCP server authentication |
model | str | required | Model identifier in LiteLLM format |
api_key | Optional[pydantic.types.SecretStr] | -- | API key for the LLM provider |
api_base | Optional[str] | -- | Base URL for the LLM API |
max_iterations | int | 10 | Maximum number of agentic loop iterations |
system_prompt | Optional[str] | -- | System prompt to prepend to conversations |
Methods
MCPAgentBlock.generate(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame
Generate responses using LLM with MCP tools.
| Parameter | Type | Default |
|---|---|---|
samples | pandas.DataFrame | required |
kwargs | Any | required |
Returns: pandas.DataFrame
MCPAgentBlock.model_post_init(self, _MCPAgentBlock__context) -> None
Initialize after Pydantic validation.
| Parameter | Type | Default |
|---|---|---|
_MCPAgentBlock__context | | required |
MCPAgentBlock.validate_max_iterations (classmethod)(cls, v)
Ensure max_iterations is positive.
| Parameter | Type | Default |
|---|---|---|
v | | required |
MCPAgentBlock.validate_single_input_col (classmethod)(cls, v)
Ensure exactly one input column.
| Parameter | Type | Default |
|---|---|---|
v | | required |
MCPAgentBlock.validate_single_output_col (classmethod)(cls, v)
Ensure exactly one output column.
| Parameter | Type | Default |
|---|---|---|
v | | required |
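The `max_iterations` field above bounds an agentic loop: call the LLM, execute any requested tool, feed the result back, and stop when the model answers or the budget runs out. A hedged sketch of that control flow, with toy stand-ins for the LLM and tool calls (the real block speaks LiteLLM and MCP):

```python
# Illustrative agentic loop, not the MCPAgentBlock implementation.
def agent_loop(call_llm, call_tool, prompt, max_iterations=10):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_iterations):
        reply = call_llm(messages)
        if reply.get("tool_call") is None:
            return reply["content"]  # final answer, no further tool use
        result = call_tool(reply["tool_call"])  # run the requested MCP tool
        messages.append({"role": "tool", "content": result})
    return None  # iteration budget exhausted

# Toy stand-ins: the first call requests a tool, the second answers.
calls = iter([{"tool_call": "lookup"}, {"tool_call": None, "content": "42"}])
answer = agent_loop(lambda m: next(calls), lambda t: "tool-result", "q")
```

This is why `validate_max_iterations` insists on a positive value: a zero budget would return before the model ever speaks.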
Flow
Bases: BaseModel
from sdg_hub import Flow
Pydantic-based flow for chaining data generation blocks.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
blocks | list[BaseBlock] | | Ordered list of blocks to execute in the flow |
metadata | FlowMetadata | required | Flow metadata including name, version, author, etc. |
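Conceptually, a flow runs each block's `generate()` in order, feeding each output DataFrame to the next block. A minimal sketch of that pipeline with toy blocks (these are stand-ins, not SDG Hub block classes):

```python
import pandas as pd

# Toy blocks mimicking the BaseBlock.generate contract:
# take a DataFrame, return a DataFrame with new/modified columns.
class UpperBlock:
    def generate(self, samples: pd.DataFrame) -> pd.DataFrame:
        return samples.assign(text=samples["text"].str.upper())

class LenBlock:
    def generate(self, samples: pd.DataFrame) -> pd.DataFrame:
        return samples.assign(length=samples["text"].str.len())

def run_flow(blocks, dataset: pd.DataFrame) -> pd.DataFrame:
    # The essence of Flow.generate: execute blocks in sequence.
    for block in blocks:
        dataset = block.generate(dataset)
    return dataset

out = run_flow([UpperBlock(), LenBlock()], pd.DataFrame({"text": ["hi"]}))
```

The real `Flow.generate` adds checkpointing, logging, and concurrency control on top of this sequencing.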
Methods
Flow.add_block(self, block: BaseBlock) -> 'Flow'
Add a block to the flow, returning a new Flow instance.
| Parameter | Type | Default |
|---|---|---|
block | BaseBlock | required |
Returns: Flow
Flow.dry_run(self, dataset: Union[pandas.DataFrame, Dataset], sample_size: int = 2, runtime_params: Optional[dict[str, dict[str, Any]]] = None, max_concurrency: Optional[int] = None, enable_time_estimation: bool = False) -> dict[str, Any]
Perform a dry run of the flow with a subset of data.
| Parameter | Type | Default |
|---|---|---|
dataset | Union[pandas.DataFrame, Dataset] | required |
sample_size | int | 2 |
runtime_params | Union[dict[str, dict[str, Any]], NoneType] | -- |
max_concurrency | Optional[int] | -- |
enable_time_estimation | bool | False |
Returns: dict[str, Any]
Flow.from_yaml (classmethod)(cls, yaml_path: str) -> 'Flow'
Load flow from YAML configuration file.
| Parameter | Type | Default |
|---|---|---|
yaml_path | str | required |
Returns: Flow
Flow.generate(self, dataset: Union[pandas.DataFrame, Dataset], runtime_params: Optional[dict[str, dict[str, Any]]] = None, checkpoint_dir: Optional[str] = None, save_freq: Optional[int] = None, log_dir: Optional[str] = None, max_concurrency: Optional[int] = None) -> Union[pandas.DataFrame, Dataset]
Execute the flow blocks in sequence to generate data.
| Parameter | Type | Default |
|---|---|---|
dataset | Union[pandas.DataFrame, Dataset] | required |
runtime_params | Union[dict[str, dict[str, Any]], NoneType] | -- |
checkpoint_dir | Optional[str] | -- |
save_freq | Optional[int] | -- |
log_dir | Optional[str] | -- |
max_concurrency | Optional[int] | -- |
Returns: Union[pandas.DataFrame, Dataset]
Flow.get_dataset_requirements(self) -> Optional[DatasetRequirements]
Get the dataset requirements for this flow, or None if not defined.
Returns: Optional[DatasetRequirements]
Flow.get_dataset_schema(self) -> pandas.DataFrame
Get an empty DataFrame with the correct schema for this flow.
Returns: pandas.DataFrame
Flow.get_default_model(self) -> Optional[str]
Get the default recommended model for this flow, or None if unspecified.
Returns: Optional[str]
Flow.get_info(self) -> dict[str, Any]
Get information about the flow.
Returns: dict[str, Any]
Flow.get_model_recommendations(self) -> dict[str, Any]
Get model recommendations dict with 'default', 'compatible', 'experimental' keys.
Returns: dict[str, Any]
Flow.is_agent_config_required(self) -> bool
Check if agent configuration is required (True if flow has agent blocks).
Returns: bool
Flow.is_agent_config_set(self) -> bool
Check if agent configuration has been set or is not required.
Returns: bool
Flow.is_model_config_required(self) -> bool
Check if model configuration is required (True if flow has LLM blocks).
Returns: bool
Flow.is_model_config_set(self) -> bool
Check if model configuration has been set or is not required.
Returns: bool
Flow.model_post_init(self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None
Flow.print_info(self) -> None
Print an interactive summary of the Flow in the console using rich.
Flow.reset_agent_config(self) -> None
Reset agent configuration flag (useful for testing or reconfiguration).
Flow.reset_model_config(self) -> None
Reset model configuration flag (useful for testing or reconfiguration).
Flow.set_agent_config(self, agent_framework: Optional[str] = None, agent_url: Optional[str] = None, agent_api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None
Configure agent settings for agent blocks in this flow (in-place).
| Parameter | Type | Default |
|---|---|---|
agent_framework | Optional[str] | -- |
agent_url | Optional[str] | -- |
agent_api_key | Optional[str] | -- |
blocks | Optional[list[str]] | -- |
kwargs | Any | required |
Flow.set_model_config(self, model: Optional[str] = None, api_base: Optional[str] = None, api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None
Configure model settings for LLM blocks in this flow (in-place).
| Parameter | Type | Default |
|---|---|---|
model | Optional[str] | -- |
api_base | Optional[str] | -- |
api_key | Optional[str] | -- |
blocks | Optional[list[str]] | -- |
kwargs | Any | required |
Flow.to_yaml(self, output_path: str) -> None
Save flow configuration to YAML file.
| Parameter | Type | Default |
|---|---|---|
output_path | str | required |
Flow.validate_block_names_unique(self) -> 'Flow'
Ensure all block names are unique within the flow.
Returns: Flow
Flow.validate_blocks (classmethod)(cls, v: list[BaseBlock]) -> list[BaseBlock]
Validate that all blocks are BaseBlock instances.
| Parameter | Type | Default |
|---|---|---|
v | list[BaseBlock] | required |
Returns: list[BaseBlock]
Flow.validate_dataset(self, dataset: Union[pandas.DataFrame, Dataset]) -> list[str]
Validate dataset against flow requirements. Returns list of error messages.
| Parameter | Type | Default |
|---|---|---|
dataset | Union[pandas.DataFrame, Dataset] | required |
Returns: list[str]
FlowRegistry
from sdg_hub import FlowRegistry
Registry for managing contributed flows.
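The lookup-and-filter pattern the registry exposes can be sketched with a plain dict. This toy mirrors the semantics of `list_flows`/`search_flows` described below; it is not the SDG Hub implementation.

```python
# Toy flow registry: flows indexed by name, searchable by tag or author,
# mimicking FlowRegistry.search_flows semantics.
registry = {
    "qa-gen": {"author": "alice", "tags": ["qa", "synthetic"]},
    "summarize": {"author": "bob", "tags": ["summarization"]},
}

def search_flows(tag=None, author=None):
    # None means "don't filter on this criterion".
    return [
        name
        for name, meta in registry.items()
        if (tag is None or tag in meta["tags"])
        and (author is None or meta["author"] == author)
    ]
```

In the real registry, `register_search_path` populates the index from YAML flow files on disk rather than from an in-memory dict.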
Methods
FlowRegistry.discover_flows (classmethod)(cls) -> None
Discover and display all flows in a formatted table.
FlowRegistry.get_flow_metadata (classmethod)(cls, flow_name: str) -> Optional[FlowMetadata]
Get metadata for a registered flow.
| Parameter | Type | Default |
|---|---|---|
flow_name | str | required |
Returns: Optional[FlowMetadata]
FlowRegistry.get_flow_path (classmethod)(cls, flow_name_or_id: str) -> Optional[str]
Get the path to a registered flow.
| Parameter | Type | Default |
|---|---|---|
flow_name_or_id | str | required |
Returns: Optional[str]
FlowRegistry.get_flow_path_safe (classmethod)(cls, flow_name_or_id: str) -> str
Get the path to a registered flow with better error handling.
| Parameter | Type | Default |
|---|---|---|
flow_name_or_id | str | required |
Returns: str
FlowRegistry.get_flows_by_category (classmethod)(cls) -> dict[str, list[dict[str, str]]]
Get flows organized by their primary tag.
Returns: dict[str, list[dict[str, str]]]
FlowRegistry.list_flows (classmethod)(cls) -> list[dict[str, str]]
List all registered flows with their IDs.
Returns: list[dict[str, str]]
FlowRegistry.register_search_path (classmethod)(cls, path: str) -> None
Add a directory to search for flows.
| Parameter | Type | Default |
|---|---|---|
path | str | required |
FlowRegistry.search_flows (classmethod)(cls, tag: Optional[str] = None, author: Optional[str] = None) -> list[dict[str, str]]
Search flows by criteria.
| Parameter | Type | Default |
|---|---|---|
tag | Optional[str] | -- |
author | Optional[str] | -- |
Returns: list[dict[str, str]]
FlowRegistryEntry
from sdg_hub.core.flow.registry import FlowRegistryEntry
Entry in the flow registry.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
path | str | required | -- |
metadata | FlowMetadata | required | -- |
Methods
FlowRegistryEntry.__init__(self, path: str, metadata: FlowMetadata) -> None
Initialize self. See help(type(self)) for accurate signature.
| Parameter | Type | Default |
|---|---|---|
path | str | required |
metadata | FlowMetadata | required |
FlowMetadata
Bases: BaseModel
from sdg_hub import FlowMetadata
Metadata for flow configuration and open source contributions.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
name | str | required | Human-readable name |
id | str | "" | Unique identifier for the flow, generated from name |
description | str | "" | Detailed description |
version | str | "1.0.0" | Semantic version |
author | str | "" | Author or contributor name |
recommended_models | Optional[RecommendedModels] | -- | Simplified recommended models structure |
tags | list[str] | | Tags for categorization and search |
license | str | "Apache-2.0" | License identifier |
dataset_requirements | Optional[DatasetRequirements] | -- | Requirements for input datasets |
output_columns | Optional[list[str]] | -- | Columns to keep in the final output. Original input columns are always preserved. |
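The `id` field above is "generated from name". The exact slug rule is not documented here, so the sketch below is an assumption: lowercase the name and collapse non-alphanumeric runs into hyphens.

```python
import re

# Hedged sketch of deriving a flow id from its name; the real
# FlowMetadata.ensure_id / validate_id rules may differ.
def slugify(name: str) -> str:
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

metadata = {
    "name": "Question Answer Generation",
    "version": "1.0.0",
    "license": "Apache-2.0",
}
metadata["id"] = slugify(metadata["name"])
```

Whatever the precise rule, `ensure_id` guarantees the field is populated before the metadata is used for registry lookups.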
Methods
FlowMetadata.ensure_id(self) -> 'FlowMetadata'
Ensure id is set.
Returns: FlowMetadata
FlowMetadata.get_best_model(self, available_models: Optional[list[str]] = None) -> Optional[str]
Get the best recommended model based on availability.
| Parameter | Type | Default |
|---|---|---|
available_models | Optional[list[str]] | -- |
Returns: Optional[str]
FlowMetadata.validate_id (classmethod)(cls, v: str) -> str
Validate flow id.
| Parameter | Type | Default |
|---|---|---|
v | str | required |
Returns: str
FlowMetadata.validate_output_columns (classmethod)(cls, v: Optional[list[str]]) -> Optional[list[str]]
Validate and clean output columns.
| Parameter | Type | Default |
|---|---|---|
v | Optional[list[str]] | required |
Returns: Optional[list[str]]
FlowMetadata.validate_recommended_models (classmethod)(cls, v: Optional[RecommendedModels]) -> Optional[RecommendedModels]
Validate recommended models structure.
| Parameter | Type | Default |
|---|---|---|
v | Optional[RecommendedModels] | required |
Returns: Optional[RecommendedModels]
FlowMetadata.validate_tags (classmethod)(cls, v: list[str]) -> list[str]
Validate and clean tags.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]
RecommendedModels
Bases: BaseModel
from sdg_hub.core.flow.metadata import RecommendedModels
Simplified recommended models structure.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
default | str | required | Default model to use |
compatible | list[str] | | Compatible models |
experimental | list[str] | | Experimental models |
Methods
RecommendedModels.get_all_models(self) -> list[str]
Get all models (default + compatible + experimental).
Returns: list[str]
RecommendedModels.get_best_model(self, available_models: Optional[list[str]] = None) -> Optional[str]
Get the best model based on availability.
| Parameter | Type | Default |
|---|---|---|
available_models | Optional[list[str]] | -- |
Returns: Optional[str]
RecommendedModels.validate_default (classmethod)(cls, v: str) -> str
Validate default model name is not empty.
| Parameter | Type | Default |
|---|---|---|
v | str | required |
Returns: str
RecommendedModels.validate_model_lists (classmethod)(cls, v: list[str]) -> list[str]
Validate model lists contain non-empty names.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]
DatasetRequirements
Bases: BaseModel
from sdg_hub.core.flow.metadata import DatasetRequirements
Dataset requirements for flow execution.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
required_columns | list[str] | | Column names that must be present |
optional_columns | list[str] | | Optional columns that can enhance performance |
min_samples | int | 1 | Minimum number of samples required |
max_samples | Optional[int] | -- | Maximum number of samples to process |
column_types | dict[str, str] | {} | Expected types for specific columns |
description | str | "" | Human-readable description |
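The validation described below (required columns present, enough samples) follows a collect-errors pattern rather than raising on the first problem. A hedged sketch of that logic; the exact error strings are assumptions:

```python
# Sketch of DatasetRequirements.validate_dataset-style checking:
# accumulate human-readable error messages instead of raising.
def validate_dataset(required_columns, min_samples, dataset_columns, dataset_size):
    errors = []
    for col in required_columns:
        if col not in dataset_columns:
            errors.append(f"missing required column: {col}")
    if dataset_size < min_samples:
        errors.append(f"need at least {min_samples} samples, got {dataset_size}")
    return errors  # empty list means the dataset is acceptable

errs = validate_dataset(["text"], 1, ["other"], 0)
```

Returning a list of messages lets `Flow.validate_dataset` report every problem at once before any generation starts.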
Methods
DatasetRequirements.validate_column_names (classmethod)(cls, v: list[str]) -> list[str]
Validate column names are not empty.
| Parameter | Type | Default |
|---|---|---|
v | list[str] | required |
Returns: list[str]
DatasetRequirements.validate_dataset(self, dataset_columns: list[str], dataset_size: int) -> list[str]
Validate a dataset against these requirements.
| Parameter | Type | Default |
|---|---|---|
dataset_columns | list[str] | required |
dataset_size | int | required |
Returns: list[str]
DatasetRequirements.validate_sample_limits(self) -> 'DatasetRequirements'
Validate sample limits are consistent.
Returns: DatasetRequirements
ModelOption
Bases: BaseModel
from sdg_hub.core.flow.metadata import ModelOption
Represents a model option with compatibility level.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
name | str | required | Model identifier |
compatibility | ModelCompatibility | "compatible" | Compatibility level with the flow |
Methods
ModelOption.validate_name (classmethod)(cls, v: str) -> str
Validate model name is not empty.
| Parameter | Type | Default |
|---|---|---|
v | str | required |
Returns: str
ModelCompatibility
Bases: Enum
from sdg_hub.core.flow.metadata import ModelCompatibility
Model compatibility levels.
BaseConnector
Bases: BaseModel, ABC
from sdg_hub.core.connectors.base import BaseConnector
Abstract base class for all connectors.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
config | ConnectorConfig | required | Connector configuration |
Methods
BaseConnector.aexecute(self, request: Any) -> Any
Execute an asynchronous request.
| Parameter | Type | Default |
|---|---|---|
request | Any | required |
Returns: Any
BaseConnector.execute (abstract)(self, request: Any) -> Any
Execute a synchronous request.
| Parameter | Type | Default |
|---|---|---|
request | Any | required |
Returns: Any
ConnectorConfig
Bases: BaseModel
from sdg_hub.core.connectors.base import ConnectorConfig
Base configuration for all connectors.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
url | Optional[str] | -- | Base URL for the service |
api_key | Optional[str] | -- | API key for authentication |
timeout | float | 120 | Request timeout in seconds |
max_retries | int | 3 | Maximum retry attempts |
ConnectorRegistry
from sdg_hub.core.connectors.registry import ConnectorRegistry
Global registry for connector classes.
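The register/get pair below follows the common decorator-registry pattern. A self-contained toy version showing that shape (not the SDG Hub implementation; error wording is an assumption):

```python
# Toy connector registry: register() is a decorator factory that stores
# the class under a name, get() looks it up with a clear error.
class Registry:
    _connectors = {}

    @classmethod
    def register(cls, name):
        def decorator(connector_cls):
            cls._connectors[name] = connector_cls
            return connector_cls  # class is returned unchanged
        return decorator

    @classmethod
    def get(cls, name):
        try:
            return cls._connectors[name]
        except KeyError:
            raise KeyError(f"unknown connector: {name!r}") from None

@Registry.register("demo")
class DemoConnector:
    pass
```

Because the decorator returns the class unchanged, registration adds no runtime cost to the connector itself.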
Methods
ConnectorRegistry.clear (classmethod)(cls) -> 'None'
Clear all registered connectors. Primarily for testing.
Returns: None
ConnectorRegistry.get (classmethod)(cls, name: 'str') -> 'type[BaseConnector]'
Get a connector class by name.
| Parameter | Type | Default |
|---|---|---|
name | str | required |
Returns: type[BaseConnector]
ConnectorRegistry.list_all (classmethod)(cls) -> 'list[str]'
Get all registered connector names.
Returns: list[str]
ConnectorRegistry.register (classmethod)(cls, name: 'str')
Register a connector class.
| Parameter | Type | Default |
|---|---|---|
name | str | required |
BaseAgentConnector
Bases: BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.base import BaseAgentConnector
Base class for agent framework connectors.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
config | ConnectorConfig | required | Connector configuration |
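The methods below form a template-method pipeline: `send` composes the abstract `build_request`, the transport-level `execute`, and `parse_response`. A hedged sketch of a subclass following that shape, with the HTTP call faked so the example is self-contained:

```python
# Illustrative BaseAgentConnector-style subclass; execute() fakes the
# HTTP round-trip that a real connector would perform.
class DemoAgentConnector:
    def build_request(self, messages, session_id):
        # Framework-specific payload from the last user message.
        return {"input_value": messages[-1]["content"], "session_id": session_id}

    def execute(self, request):
        # Stand-in for the HTTP call: echo the input, uppercased.
        return {"text": request["input_value"].upper(),
                "session_id": request["session_id"]}

    def parse_response(self, response):
        # Validate/normalize the framework response.
        return {"text": response["text"], "session_id": response["session_id"]}

    def send(self, messages, session_id):
        # The template method: build -> execute -> parse.
        return self.parse_response(self.execute(self.build_request(messages, session_id)))

reply = DemoAgentConnector().send([{"role": "user", "content": "hi"}], "s1")
```

Real subclasses such as `LangflowConnector` override exactly these three hooks, which is why only `build_request` and `parse_response` are abstract.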
Methods
BaseAgentConnector.asend(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]
Async send - convenience wrapper.
| Parameter | Type | Default |
|---|---|---|
messages | list[dict[str, Any]] | required |
session_id | str | required |
Returns: dict[str, Any]
BaseAgentConnector.build_request (abstract)(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]
Build framework-specific request payload.
| Parameter | Type | Default |
|---|---|---|
messages | list[dict[str, Any]] | required |
session_id | str | required |
Returns: dict[str, Any]
BaseAgentConnector.execute(self, request: dict[str, Any]) -> dict[str, Any]
Execute a request (BaseConnector interface).
| Parameter | Type | Default |
|---|---|---|
request | dict[str, Any] | required |
Returns: dict[str, Any]
BaseAgentConnector.extract_session_id (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract session ID from a framework response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
BaseAgentConnector.extract_text (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract text content from a framework response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
BaseAgentConnector.extract_tool_trace (classmethod)(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None
Extract tool call trace from a framework response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: list[dict[str, Any]] | None
BaseAgentConnector.model_post_init(self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None
BaseAgentConnector.parse_response (abstract)(self, response: dict[str, Any]) -> dict[str, Any]
Parse and validate framework response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: dict[str, Any]
BaseAgentConnector.send(self, messages: list[dict[str, Any]], session_id: str, async_mode: bool = False)
Send messages to the agent.
| Parameter | Type | Default |
|---|---|---|
messages | list[dict[str, Any]] | required |
session_id | str | required |
async_mode | bool | False |
LangflowConnector
Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langflow import LangflowConnector
Connector for Langflow agent framework.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
config | ConnectorConfig | required | Connector configuration |
Methods
LangflowConnector.build_request(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]
Build Langflow-specific request payload.
| Parameter | Type | Default |
|---|---|---|
messages | list[dict[str, Any]] | required |
session_id | str | required |
Returns: dict[str, Any]
LangflowConnector.extract_session_id (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract session ID from a Langflow response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
LangflowConnector.extract_text (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract text content from a Langflow response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
LangflowConnector.extract_tool_trace (classmethod)(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None
Extract tool call trace from Langflow content_blocks.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: list[dict[str, Any]] | None
LangflowConnector.model_post_init(self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None
LangflowConnector.parse_response(self, response: dict[str, Any]) -> dict[str, Any]
Parse Langflow response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: dict[str, Any]
LangGraphConnector
Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langgraph import LangGraphConnector
Connector for LangGraph agent framework.
Fields
| Name | Type | Default | Description |
|---|---|---|---|
config | ConnectorConfig | required | Connector configuration |
assistant_id | str | "agent" | The assistant ID or graph name to run. |
run_config | dict[str, Any] | {} | Optional configuration dict passed in the run payload. Merged as the 'config' key in the LangGraph /runs/wait request. Use this to pass runtime parameters to the graph via 'configurable', e.g. ``{'configurable': {'model': 'gpt-4o'}}``. |
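The `run_config` description above says it is merged as the `'config'` key of the run payload. A hedged sketch of what `build_request` plausibly produces; the payload field names beyond `config` follow the LangGraph Platform runs API but should be treated as assumptions here:

```python
# Sketch of a LangGraph /runs/wait payload with run_config merged under
# 'config', as the field description documents. Not the actual connector code.
def build_request(messages, assistant_id="agent", run_config=None):
    return {
        "assistant_id": assistant_id,          # graph/assistant to run
        "input": {"messages": messages},       # conversation state input
        "config": run_config or {},            # runtime parameters for the graph
    }

req = build_request(
    [{"role": "user", "content": "hi"}],
    run_config={"configurable": {"model": "gpt-4o"}},
)
```

The graph reads these runtime parameters from `config['configurable']`, so no code change is needed on the graph side to swap models per run.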
Methods
LangGraphConnector.build_request(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]
Build LangGraph run request payload.
| Parameter | Type | Default |
|---|---|---|
messages | list[dict[str, Any]] | required |
session_id | str | required |
Returns: dict[str, Any]
LangGraphConnector.extract_session_id (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract session ID from a LangGraph response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
LangGraphConnector.extract_text (classmethod)(cls, response: dict[str, Any]) -> str | None
Extract text from the last AI message in LangGraph state.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: str | None
LangGraphConnector.extract_tool_trace (classmethod)(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None
Extract tool call trace from LangGraph messages.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
Returns: list[dict[str, Any]] | None
LangGraphConnector.model_post_init(self: 'BaseModel', context: 'Any', /) -> 'None'
This function is meant to behave like a BaseModel method to initialize private attributes.
| Parameter | Type | Default |
|---|---|---|
context | Any | required |
Returns: None
LangGraphConnector.parse_response(self, response: dict[str, Any]) -> dict[str, Any]
Parse LangGraph response.
| Parameter | Type | Default |
|---|---|---|
response | dict[str, Any] | required |
dict[str, Any]