API Reference

Complete reference for all public classes in SDG Hub. Every block, flow component, and connector is documented with its fields, methods, and type signatures.

Blocks / Base

BaseBlock

Bases: BaseModel, ABC
from sdg_hub.core.blocks import BaseBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typeOptional[str]--Block type (e.g., 'llm', 'transform', 'parser', 'filtering')
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

BaseBlock.from_configclassmethod
(cls, config: dict[str, Any]) -> 'BaseBlock'

Instantiate block from serialized config.

ParameterTypeDefault
configdict[str, Any]required
Returns BaseBlock
BaseBlock.generateabstract
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
BaseBlock.get_config
(self) -> dict[str, Any]

Return only constructor arguments for serialization.

Returns dict[str, Any]
BaseBlock.get_info
(self) -> dict[str, Any]

Return a high-level summary of block metadata and config.

Returns dict[str, Any]
BaseBlock.normalize_input_colsclassmethod
(cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
ParameterTypeDefault
vstr | list[str] | dict[str, Any] | Nonerequired
Returns list[str] | dict[str, Any]
BaseBlock.normalize_output_colsclassmethod
(cls, v: str | list[str] | dict[str, Any] | None) -> list[str] | dict[str, Any]
ParameterTypeDefault
vstr | list[str] | dict[str, Any] | Nonerequired
Returns list[str] | dict[str, Any]
Blocks / Registry

BlockRegistry

from sdg_hub.core.blocks import BlockRegistry

Registry for block classes with metadata and enhanced error handling.

Methods

BlockRegistry.categoriesclassmethod
(cls) -> list[str]

Get all available categories.

Returns list[str]
BlockRegistry.discover_blocksclassmethod
(cls) -> None

Print a Rich-formatted table of all available blocks.

BlockRegistry.list_blocksclassmethod
(cls, category: Optional[str] = None, *, grouped: bool = False, include_deprecated: bool = True) -> list[str] | dict[str, list[str]]

List registered blocks, optionally filtered by category.

ParameterTypeDefault
categoryOptional[str]--
groupedboolFalse
include_deprecatedboolTrue
Returns list[str] | dict[str, list[str]]
BlockRegistry.registerclassmethod
(cls, block_name: str, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None)

Register a block class with metadata.

ParameterTypeDefault
block_namestrrequired
categorystrrequired
descriptionstr""
deprecatedboolFalse
replacementOptional[str]--
Blocks / Registry

BlockMetadata

from sdg_hub.core.blocks.registry import BlockMetadata

Metadata for registered blocks.

Fields

NameTypeDefaultDescription
namestrrequired--
block_classtyperequired--
categorystrrequired--
descriptionstr""--
deprecatedboolFalse--
replacementOptional[str]----

Methods

BlockMetadata.__init__
(self, name: str, block_class: type, category: str, description: str = '', deprecated: bool = False, replacement: Optional[str] = None) -> None

Initialize self. See help(type(self)) for accurate signature.

ParameterTypeDefault
namestrrequired
block_classtyperequired
categorystrrequired
descriptionstr""
deprecatedboolFalse
replacementOptional[str]--
Blocks / LLM

LLMChatBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import LLMChatBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
modelOptional[str]--Model identifier in LiteLLM format
api_keyOptional[pydantic.types.SecretStr]--API key for the provider
api_baseOptional[str]--Base URL for the API
async_modeboolFalseWhether to use async processing
timeoutfloat120Request timeout in seconds
num_retriesint6Number of retry attempts (uses LiteLLM's built-in retry mechanism)
drop_paramsboolTrueWhether to drop unsupported parameters to prevent API errors
max_completion_tokensOptional[int]--Maximum completion tokens (used by newer models like GPT-5 instead of max_tokens). When set, max_tokens is automatically excluded.

Methods

LLMChatBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate responses from the LLM.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
LLMChatBlock.model_post_init
(self, _LLMChatBlock__context) -> None

Initialize after Pydantic validation.

ParameterTypeDefault
_LLMChatBlock__contextrequired
LLMChatBlock.validate_single_input_colclassmethod
(cls, v)

Ensure exactly one input column.

ParameterTypeDefault
vrequired
LLMChatBlock.validate_single_output_colclassmethod
(cls, v)

Ensure exactly one output column.

ParameterTypeDefault
vrequired
Blocks / LLM

PromptBuilderBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import PromptBuilderBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
prompt_config_pathstrrequiredPath to YAML file containing the Jinja template configuration
format_as_messagesboolTrueWhether to format output as chat messages
prompt_template_configOptional[prompt_builder_block.PromptTemplateConfig]--Loaded prompt template configuration
prompt_rendererOptional[prompt_builder_block.PromptRenderer]--Prompt renderer instance

Methods

PromptBuilderBlock.generate
(self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame

Generate formatted output for all samples.

ParameterTypeDefault
samplespandas.DataFramerequired
_kwargsAnyrequired
Returns pandas.DataFrame
PromptBuilderBlock.model_post_init
(self, _PromptBuilderBlock__context: Any) -> None

Initialize the block after Pydantic validation.

ParameterTypeDefault
_PromptBuilderBlock__contextAnyrequired
PromptBuilderBlock.validate_single_output_colclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / LLM

ChatMessage

Bases: BaseModel
from sdg_hub.core.blocks.llm.prompt_builder_block import ChatMessage

Pydantic model for chat messages with proper validation.

Fields

NameTypeDefaultDescription
roleLiteral[system, user, assistant, tool]required--
contentstrrequired--

Methods

ChatMessage.validate_content_not_emptyclassmethod
(cls, v: str) -> str

Ensure content is not empty or just whitespace.

ParameterTypeDefault
vstrrequired
Returns str
Blocks / LLM

MessageTemplate

Bases: BaseModel
from sdg_hub.core.blocks.llm.prompt_builder_block import MessageTemplate

Template for a chat message with Jinja2 template and original source.

Fields

NameTypeDefaultDescription
roleLiteral[system, user, assistant, tool]required--
content_templatejinja2.environment.Templaterequired--
original_sourcestrrequired--
Blocks / LLM

PromptTemplateConfig

from sdg_hub.core.blocks.llm.prompt_builder_block import PromptTemplateConfig

Self-contained class for loading and validating YAML prompt configurations.

Methods

PromptTemplateConfig.__init__
(self, config_path: str)

Initialize with path to YAML config file.

ParameterTypeDefault
config_pathstrrequired
PromptTemplateConfig.get_message_templates
(self) -> list[prompt_builder_block.MessageTemplate]

Return the compiled message templates.

Returns list[prompt_builder_block.MessageTemplate]
Blocks / LLM

PromptRenderer

from sdg_hub.core.blocks.llm.prompt_builder_block import PromptRenderer

Handles rendering of message templates with variable substitution.

Methods

PromptRenderer.__init__
(self, message_templates: list[prompt_builder_block.MessageTemplate])

Initialize with a list of message templates.

ParameterTypeDefault
message_templateslist[prompt_builder_block.MessageTemplate]required
PromptRenderer.get_required_variables
(self) -> set

Extract all required variables from message templates.

Returns set
PromptRenderer.render_messages
(self, template_vars: dict[str, Any]) -> list[prompt_builder_block.ChatMessage]

Render all message templates with the given variables.

ParameterTypeDefault
template_varsdict[str, Any]required
Returns list[prompt_builder_block.ChatMessage]
PromptRenderer.resolve_template_vars
(self, sample: dict[str, Any], input_cols) -> dict[str, Any]

Resolve template variables from dataset columns based on input_cols.

ParameterTypeDefault
sampledict[str, Any]required
input_colsrequired
Returns dict[str, Any]
Blocks / LLM

LLMResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import LLMResponseExtractorBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"llm_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
extract_contentboolTrueWhether to extract 'content' field from responses.
extract_reasoning_contentboolFalseWhether to extract 'reasoning_content' field from responses.
extract_tool_callsboolFalseWhether to extract 'tool_calls' field from responses.
expand_listsboolTrueWhether to expand list inputs into individual rows (True) or preserve lists (False).
field_prefixstr""Prefix to add to output field names (e.g., 'llm_' results in 'llm_content', 'llm_reasoning_content').

Methods

LLMResponseExtractorBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
LLMResponseExtractorBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LLMResponseExtractorBlock.validate_extraction_configuration
(self)

Validate that at least one extraction field is enabled and pre-compute field names.

Blocks / Parsing

JSONParserBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import JSONParserBlock

Block for parsing JSON from text and expanding fields into columns.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parsing"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
field_prefixstr""Optional prefix to add to extracted column names
fix_trailing_commasboolTrueWhether to fix trailing commas in JSON (common LLM output issue)
extract_embeddedboolTrueWhether to extract JSON embedded in surrounding text
drop_inputboolFalseWhether to drop the input column after extraction

Methods

JSONParserBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with JSON fields expanded into columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
JSONParserBlock.validate_input_colsclassmethod
(cls, v: list[str]) -> list[str]

Validate that exactly one input column is specified.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Blocks / Parsing

RegexParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RegexParserBlock

Block for parsing text content using regex patterns.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parser"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
parser_cleanup_tagsOptional[list[str]]--Tags to remove from extracted content
parsing_patternstrrequired--

Methods

RegexParserBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Blocks / Parsing

TagParserBlock

Bases: BaseTextParserBlock, BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import TagParserBlock

Block for parsing text content using start/end tags.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"parser"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
parser_cleanup_tagsOptional[list[str]]--Tags to remove from extracted content
start_tagslist[str]requiredStart tags for extraction
end_tagslist[str]requiredEnd tags for extraction

Methods

TagParserBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
TagParserBlock.normalize_tagsclassmethod
(cls, v)
ParameterTypeDefault
vrequired
TagParserBlock.validate_tags
(self)
Blocks / Transform

DuplicateColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import DuplicateColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

DuplicateColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with duplicated columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
DuplicateColumnsBlock.model_post_init
(self, _DuplicateColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_DuplicateColumnsBlock__contextAnyrequired
DuplicateColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty dict.

ParameterTypeDefault
vrequired
Blocks / Transform

IndexBasedMapperBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import IndexBasedMapperBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
choice_mapdict[str, str]requiredDictionary mapping choice values to column names
choice_colslist[str]requiredList of column names containing choice values

Methods

IndexBasedMapperBlock.generate
(self, samples: pandas.DataFrame, **kwargs) -> pandas.DataFrame

Generate a new dataset with selected values.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsrequired
Returns pandas.DataFrame
IndexBasedMapperBlock.model_post_init
(self, _IndexBasedMapperBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_IndexBasedMapperBlock__contextAnyrequired
IndexBasedMapperBlock.validate_choice_cols_not_emptyclassmethod
(cls, v)

Validate that choice_cols is not empty.

ParameterTypeDefault
vrequired
IndexBasedMapperBlock.validate_choice_mapclassmethod
(cls, v)

Validate that choice_map is not empty.

ParameterTypeDefault
vrequired
IndexBasedMapperBlock.validate_input_output_consistency
(self)

Validate that choice_cols and output_cols have same length and consistency.

Blocks / Transform

JSONStructureBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import JSONStructureBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
ensure_json_serializableboolTrueWhether to ensure all values are JSON serializable
pretty_printboolFalseWhether to format JSON with indentation

Methods

JSONStructureBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with JSON structured output.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
JSONStructureBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / Transform

MeltColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import MeltColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

MeltColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a flattened dataset in long format.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
MeltColumnsBlock.model_post_init
(self, _MeltColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_MeltColumnsBlock__contextAnyrequired
MeltColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is not empty.

ParameterTypeDefault
vrequired
MeltColumnsBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly two output columns are specified.

ParameterTypeDefault
vrequired
Blocks / Transform

RenameColumnsBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RenameColumnsBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict

Methods

RenameColumnsBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with renamed columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
RenameColumnsBlock.model_post_init
(self, _RenameColumnsBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_RenameColumnsBlock__contextAnyrequired
RenameColumnsBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty dict.

ParameterTypeDefault
vrequired
Blocks / Transform

RowMultiplierBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import RowMultiplierBlock

Block for duplicating dataset rows.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
num_samplesintrequiredNumber of times to duplicate each row
shuffleboolFalseShuffle output rows after duplication
random_seedOptional[int]--Seed for reproducible shuffling

Methods

RowMultiplierBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with duplicated rows.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
Blocks / Transform

SamplerBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import SamplerBlock

Block for randomly sampling values from list columns.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
num_samplesint5Number of values to randomly sample from each list
random_seedOptional[int]--Random seed for reproducibility
return_scalarboolFalseWhen num_samples=1, return scalar value instead of single-element list

Methods

SamplerBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with sampled values.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
SamplerBlock.validate_input_colsclassmethod
(cls, v: list[str]) -> list[str]

Validate that exactly one input column is specified.

ParameterTypeDefault
vlist[str]required
Returns list[str]
SamplerBlock.validate_num_samplesclassmethod
(cls, v: int) -> int

Validate that num_samples is at least 1.

ParameterTypeDefault
vintrequired
Returns int
SamplerBlock.validate_output_colsclassmethod
(cls, v: list[str]) -> list[str]

Validate that exactly one output column is specified.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Blocks / Transform

TextConcatBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import TextConcatBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
separatorstr" "Separator to use between combined values

Methods

TextConcatBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate a dataset with combined columns.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
TextConcatBlock.validate_input_colsclassmethod
(cls, v)

Validate that input_cols is a non-empty list.

ParameterTypeDefault
vrequired
TextConcatBlock.validate_output_colsclassmethod
(cls, v)

Validate that exactly one output column is specified.

ParameterTypeDefault
vrequired
Blocks / Transform

UniformColumnValueSetter

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import UniformColumnValueSetter

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"transform"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
reduction_strategyLiteral[mode, min, max, mean, median]"mode"--

Methods

UniformColumnValueSetter.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
UniformColumnValueSetter.model_post_init
(self, _UniformColumnValueSetter__context: Any) -> None

Override this method to perform additional initialization after `__init__` and `model_construct`. This is useful if you want to do some validation that requires the entire model to be initialized.

ParameterTypeDefault
_UniformColumnValueSetter__contextAnyrequired
UniformColumnValueSetter.validate_input_cols_singleclassmethod
(cls, v)
ParameterTypeDefault
vrequired
Blocks / Filtering

ColumnValueFilterBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import ColumnValueFilterBlock

Base class for all blocks, with standardized patterns and full Pydantic compatibility.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"filtering"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
filter_valueUnion[Any, list[Any]]requiredThe value(s) to filter by
operationstrrequiredString name of binary operator for comparison (e.g., 'eq', 'contains')
convert_dtypeOptional[str]--String name of type to convert filter column to ('float' or 'int')

Methods

ColumnValueFilterBlock.generate
(self, samples: pandas.DataFrame, **_kwargs: Any) -> pandas.DataFrame

Generate filtered dataset based on specified conditions.

ParameterTypeDefault
samplespandas.DataFramerequired
_kwargsAnyrequired
Returns pandas.DataFrame
ColumnValueFilterBlock.model_post_init
(self, _ColumnValueFilterBlock__context: Any) -> None

Initialize derived attributes after Pydantic validation.

ParameterTypeDefault
_ColumnValueFilterBlock__contextAnyrequired
ColumnValueFilterBlock.validate_convert_dtypeclassmethod
(cls, v)

Validate that convert_dtype is a supported type string.

ParameterTypeDefault
vrequired
ColumnValueFilterBlock.validate_input_cols_not_emptyclassmethod
(cls, v)

Validate that we have at least one input column.

ParameterTypeDefault
vrequired
ColumnValueFilterBlock.validate_operationclassmethod
(cls, v)

Validate that operation is a supported operation string.

ParameterTypeDefault
vrequired
Blocks / Agent

AgentBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import AgentBlock

Block for executing external agent frameworks on DataFrame rows.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"agent"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
agent_frameworkstrrequiredConnector name (e.g., 'langflow')
agent_urlstrrequiredAgent API endpoint URL
agent_api_keyOptional[str]--API key for authentication
timeoutfloat120Request timeout in seconds
max_retriesint3Maximum retry attempts
session_id_colOptional[str]--Column containing session IDs
async_modeboolFalseUse async execution for better throughput
max_concurrencyint10Maximum concurrent requests in async mode
connector_kwargsdict[str, Any][object Object]Extra keyword arguments passed to the connector constructor. Use for framework-specific settings like assistant_id for LangGraph.

Methods

AgentBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Process DataFrame rows through the agent.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
AgentBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Blocks / Agent

AgentResponseExtractorBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import AgentResponseExtractorBlock

Block for extracting fields from agent framework response objects.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"agent_util"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
agent_frameworkstrrequiredAgent framework whose response format to parse (e.g., 'langflow')
extract_textboolTrueWhether to extract text content from responses.
extract_session_idboolFalseWhether to extract session_id from responses.
extract_tool_traceboolFalseWhether to extract the full tool call trace from agent responses. For Langflow, this extracts the content_blocks 'Agent Steps' array containing structured tool_use entries (name, tool_input, output) and text entries (input/output messages).
expand_listsboolTrueWhether to expand list inputs into individual rows (True) or preserve lists (False).
field_prefixstr""Prefix to add to output field names (e.g., 'agent_' results in 'agent_text').

Methods

AgentResponseExtractorBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Subclass method to implement data generation logic.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
AgentResponseExtractorBlock.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
AgentResponseExtractorBlock.validate_extraction_configuration
(self)

Validate that at least one extraction field is enabled and pre-compute field names.

Blocks / MCP

MCPAgentBlock

Bases: BaseBlock, BaseModel, ABC
from sdg_hub.core.blocks import MCPAgentBlock

LLM agent block that connects to remote MCP servers for tool use.

Fields

NameTypeDefaultDescription
block_namestrrequiredUnique identifier for this block instance
block_typestr"mcp"--
input_colsUnion[str, list[str], dict[str, Any], NoneType]--Input columns: str, list, or dict
output_colsUnion[str, list[str], dict[str, Any], NoneType]--Output columns: str, list, or dict
mcp_server_urlstrrequiredURL of the remote MCP server
mcp_headersUnion[dict[str, str], NoneType]--HTTP headers for MCP server authentication
modelstrrequiredModel identifier in LiteLLM format
api_keyOptional[pydantic.types.SecretStr]--API key for the LLM provider
api_baseOptional[str]--Base URL for the LLM API
max_iterationsint10Maximum number of agentic loop iterations
system_promptOptional[str]--System prompt to prepend to conversations

Methods

MCPAgentBlock.generate
(self, samples: pandas.DataFrame, **kwargs: Any) -> pandas.DataFrame

Generate responses using LLM with MCP tools.

ParameterTypeDefault
samplespandas.DataFramerequired
kwargsAnyrequired
Returns pandas.DataFrame
MCPAgentBlock.model_post_init
(self, _MCPAgentBlock__context) -> None

Initialize after Pydantic validation.

ParameterTypeDefault
_MCPAgentBlock__contextrequired
MCPAgentBlock.validate_max_iterationsclassmethod
(cls, v)

Ensure max_iterations is positive.

ParameterTypeDefault
vrequired
MCPAgentBlock.validate_single_input_colclassmethod
(cls, v)

Ensure exactly one input column.

ParameterTypeDefault
vrequired
MCPAgentBlock.validate_single_output_colclassmethod
(cls, v)

Ensure exactly one output column.

ParameterTypeDefault
vrequired
Flow / Base

Flow

Bases: BaseModel
from sdg_hub import Flow

Pydantic-based flow for chaining data generation blocks.

Fields

NameTypeDefaultDescription
blockslist[BaseBlock]Ordered list of blocks to execute in the flow
metadataFlowMetadatarequiredFlow metadata including name, version, author, etc.

Methods

Flow.add_block
(self, block: BaseBlock) -> 'Flow'

Add a block to the flow, returning a new Flow instance.

ParameterTypeDefault
blockBaseBlockrequired
Returns Flow
Flow.dry_run
(self, dataset: Union[pandas.DataFrame, Dataset], sample_size: int = 2, runtime_params: Optional[dict[str, dict[str, Any]]] = None, max_concurrency: Optional[int] = None, enable_time_estimation: bool = False) -> dict[str, Any]

Perform a dry run of the flow with a subset of data.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
sample_sizeint2
runtime_paramsUnion[dict[str, dict[str, Any]], NoneType]--
max_concurrencyOptional[int]--
enable_time_estimationboolFalse
Returns dict[str, Any]
Flow.from_yamlclassmethod
(cls, yaml_path: str) -> 'Flow'

Load flow from YAML configuration file.

ParameterTypeDefault
yaml_pathstrrequired
Returns Flow
Flow.generate
(self, dataset: Union[pandas.DataFrame, Dataset], runtime_params: Optional[dict[str, dict[str, Any]]] = None, checkpoint_dir: Optional[str] = None, save_freq: Optional[int] = None, log_dir: Optional[str] = None, max_concurrency: Optional[int] = None) -> Union[pandas.DataFrame, Dataset]

Execute the flow blocks in sequence to generate data.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
runtime_paramsUnion[dict[str, dict[str, Any]], NoneType]--
checkpoint_dirOptional[str]--
save_freqOptional[int]--
log_dirOptional[str]--
max_concurrencyOptional[int]--
Returns Union[pandas.DataFrame, Dataset]
Flow.get_dataset_requirements
(self) -> Optional[DatasetRequirements]

Get the dataset requirements for this flow, or None if not defined.

Returns Optional[DatasetRequirements]
Flow.get_dataset_schema
(self) -> pandas.DataFrame

Get an empty DataFrame with the correct schema for this flow.

Returns pandas.DataFrame
Flow.get_default_model
(self) -> Optional[str]

Get the default recommended model for this flow, or None if unspecified.

Returns Optional[str]
Flow.get_info
(self) -> dict[str, Any]

Get information about the flow.

Returns dict[str, Any]
Flow.get_model_recommendations
(self) -> dict[str, Any]

Get model recommendations dict with 'default', 'compatible', 'experimental' keys.

Returns dict[str, Any]
Flow.is_agent_config_required
(self) -> bool

Check if agent configuration is required (True if flow has agent blocks).

Returns bool
Flow.is_agent_config_set
(self) -> bool

Check if agent configuration has been set or is not required.

Returns bool
Flow.is_model_config_required
(self) -> bool

Check if model configuration is required (True if flow has LLM blocks).

Returns bool
Flow.is_model_config_set
(self) -> bool

Check if model configuration has been set or is not required.

Returns bool
Flow.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
Flow.print_info
(self) -> None

Print an interactive summary of the Flow in the console using rich.

Flow.reset_agent_config
(self) -> None

Reset agent configuration flag (useful for testing or reconfiguration).

Flow.reset_model_config
(self) -> None

Reset model configuration flag (useful for testing or reconfiguration).

Flow.set_agent_config
(self, agent_framework: Optional[str] = None, agent_url: Optional[str] = None, agent_api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None

Configure agent settings for agent blocks in this flow (in-place).

ParameterTypeDefault
agent_frameworkOptional[str]--
agent_urlOptional[str]--
agent_api_keyOptional[str]--
blocksOptional[list[str]]--
kwargsAnyrequired
Flow.set_model_config
(self, model: Optional[str] = None, api_base: Optional[str] = None, api_key: Optional[str] = None, blocks: Optional[list[str]] = None, **kwargs: Any) -> None

Configure model settings for LLM blocks in this flow (in-place).

ParameterTypeDefault
modelOptional[str]--
api_baseOptional[str]--
api_keyOptional[str]--
blocksOptional[list[str]]--
kwargsAnyrequired
Flow.to_yaml
(self, output_path: str) -> None

Save flow configuration to YAML file.

ParameterTypeDefault
output_pathstrrequired
Flow.validate_block_names_unique
(self) -> 'Flow'

Ensure all block names are unique within the flow.

Returns Flow
Flow.validate_blocksclassmethod
(cls, v: list[BaseBlock]) -> list[BaseBlock]

Validate that all blocks are BaseBlock instances.

ParameterTypeDefault
vlist[BaseBlock]required
Returns list[BaseBlock]
Flow.validate_dataset
(self, dataset: Union[pandas.DataFrame, Dataset]) -> list[str]

Validate dataset against flow requirements. Returns list of error messages.

ParameterTypeDefault
datasetUnion[pandas.DataFrame, Dataset]required
Returns list[str]
Flow / Registry

FlowRegistry

from sdg_hub import FlowRegistry

Registry for managing contributed flows.

Methods

FlowRegistry.discover_flowsclassmethod
(cls) -> None

Discover and display all flows in a formatted table.

FlowRegistry.get_flow_metadataclassmethod
(cls, flow_name: str) -> Optional[FlowMetadata]

Get metadata for a registered flow.

ParameterTypeDefault
flow_namestrrequired
Returns Optional[FlowMetadata]
FlowRegistry.get_flow_pathclassmethod
(cls, flow_name_or_id: str) -> Optional[str]

Get the path to a registered flow.

ParameterTypeDefault
flow_name_or_idstrrequired
Returns Optional[str]
FlowRegistry.get_flow_path_safeclassmethod
(cls, flow_name_or_id: str) -> str

Get the path to a registered flow with better error handling.

ParameterTypeDefault
flow_name_or_idstrrequired
Returns str
FlowRegistry.get_flows_by_categoryclassmethod
(cls) -> Dict[str, List[Dict[str, str]]]

Get flows organized by their primary tag.

Returns dict[str, list[dict[str, str]]]
FlowRegistry.list_flowsclassmethod
(cls) -> List[Dict[str, str]]

List all registered flows with their IDs.

Returns list[dict[str, str]]
FlowRegistry.register_search_pathclassmethod
(cls, path: str) -> None

Add a directory to search for flows.

ParameterTypeDefault
pathstrrequired
FlowRegistry.search_flowsclassmethod
(cls, tag: Optional[str] = None, author: Optional[str] = None) -> List[Dict[str, str]]

Search flows by criteria.

ParameterTypeDefault
tagOptional[str]--
authorOptional[str]--
Returns list[dict[str, str]]
Flow / Registry

FlowRegistryEntry

from sdg_hub.core.flow.registry import FlowRegistryEntry

Entry in the flow registry.

Fields

NameTypeDefaultDescription
pathstrrequired--
metadataFlowMetadatarequired--

Methods

FlowRegistryEntry.__init__
(self, path: str, metadata: FlowMetadata) -> None

Initialize self. See help(type(self)) for accurate signature.

ParameterTypeDefault
pathstrrequired
metadataFlowMetadatarequired
Flow / Metadata

FlowMetadata

Bases: BaseModel
from sdg_hub import FlowMetadata

Metadata for flow configuration and open source contributions.

Fields

NameTypeDefaultDescription
namestrrequiredHuman-readable name
idstr""Unique identifier for the flow, generated from name
descriptionstr""Detailed description
versionstr"1.0.0"Semantic version
authorstr""Author or contributor name
recommended_modelsOptional[RecommendedModels]--Simplified recommended models structure
tagslist[str]Tags for categorization and search
licensestr"Apache-2.0"License identifier
dataset_requirementsOptional[DatasetRequirements]--Requirements for input datasets
output_columnsOptional[list[str]]--Columns to keep in the final output. Original input columns are always preserved.

Methods

FlowMetadata.ensure_id
(self) -> 'FlowMetadata'

Ensure id is set.

Returns FlowMetadata
FlowMetadata.get_best_model
(self, available_models: Optional[list[str]] = None) -> Optional[str]

Get the best recommended model based on availability.

ParameterTypeDefault
available_modelsOptional[list[str]]--
Returns Optional[str]
FlowMetadata.validate_idclassmethod
(cls, v: str) -> str

Validate flow id.

ParameterTypeDefault
vstrrequired
Returns str
FlowMetadata.validate_output_columnsclassmethod
(cls, v: Optional[list[str]]) -> Optional[list[str]]

Validate and clean output columns.

ParameterTypeDefault
vOptional[list[str]]required
Returns Optional[list[str]]
FlowMetadata.validate_recommended_modelsclassmethod
(cls, v: Optional[RecommendedModels]) -> Optional[RecommendedModels]

Validate recommended models structure.

ParameterTypeDefault
vOptional[RecommendedModels]required
Returns Optional[RecommendedModels]
FlowMetadata.validate_tagsclassmethod
(cls, v: list[str]) -> list[str]

Validate and clean tags.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Flow / Metadata

RecommendedModels

Bases: BaseModel
from sdg_hub.core.flow.metadata import RecommendedModels

Simplified recommended models structure.

Fields

NameTypeDefaultDescription
defaultstrrequiredDefault model to use
compatiblelist[str]Compatible models
experimentallist[str]Experimental models

Methods

RecommendedModels.get_all_models
(self) -> list[str]

Get all models (default + compatible + experimental).

Returns list[str]
RecommendedModels.get_best_model
(self, available_models: Optional[list[str]] = None) -> Optional[str]

Get the best model based on availability.

ParameterTypeDefault
available_modelsOptional[list[str]]--
Returns Optional[str]
RecommendedModels.validate_defaultclassmethod
(cls, v: str) -> str

Validate default model name is not empty.

ParameterTypeDefault
vstrrequired
Returns str
RecommendedModels.validate_model_listsclassmethod
(cls, v: list[str]) -> list[str]

Validate model lists contain non-empty names.

ParameterTypeDefault
vlist[str]required
Returns list[str]
Flow / Metadata

DatasetRequirements

Bases: BaseModel
from sdg_hub.core.flow.metadata import DatasetRequirements

Dataset requirements for flow execution.

Fields

NameTypeDefaultDescription
required_columnslist[str]Column names that must be present
optional_columnslist[str]Optional columns that can enhance performance
min_samplesint1Minimum number of samples required
max_samplesOptional[int]--Maximum number of samples to process
column_typesdict[str, str][object Object]Expected types for specific columns
descriptionstr""Human-readable description

Methods

DatasetRequirements.validate_column_namesclassmethod
(cls, v: list[str]) -> list[str]

Validate column names are not empty.

ParameterTypeDefault
vlist[str]required
Returns list[str]
DatasetRequirements.validate_dataset
(self, dataset_columns: list[str], dataset_size: int) -> list[str]

Validate a dataset against these requirements.

ParameterTypeDefault
dataset_columnslist[str]required
dataset_sizeintrequired
Returns list[str]
DatasetRequirements.validate_sample_limits
(self) -> 'DatasetRequirements'

Validate sample limits are consistent.

Returns DatasetRequirements
Flow / Metadata

ModelOption

Bases: BaseModel
from sdg_hub.core.flow.metadata import ModelOption

Represents a model option with compatibility level.

Fields

NameTypeDefaultDescription
namestrrequiredModel identifier
compatibilityModelCompatibility"compatible"Compatibility level with the flow

Methods

ModelOption.validate_nameclassmethod
(cls, v: str) -> str

Validate model name is not empty.

ParameterTypeDefault
vstrrequired
Returns str
Flow / Metadata

ModelCompatibility

Bases: Enum
from sdg_hub.core.flow.metadata import ModelCompatibility

Model compatibility levels.

Connectors / Base

BaseConnector

Bases: BaseModel, ABC
from sdg_hub.core.connectors.base import BaseConnector

Abstract base class for all connectors.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

BaseConnector.aexecute
(self, request: Any) -> Any

Execute an asynchronous request.

ParameterTypeDefault
requestAnyrequired
Returns Any
BaseConnector.executeabstract
(self, request: Any) -> Any

Execute a synchronous request.

ParameterTypeDefault
requestAnyrequired
Returns Any
Connectors / Base

ConnectorConfig

Bases: BaseModel
from sdg_hub.core.connectors.base import ConnectorConfig

Base configuration for all connectors.

Fields

NameTypeDefaultDescription
urlOptional[str]--Base URL for the service
api_keyOptional[str]--API key for authentication
timeoutfloat120Request timeout in seconds
max_retriesint3Maximum retry attempts
Connectors / Registry

ConnectorRegistry

from sdg_hub.core.connectors.registry import ConnectorRegistry

Global registry for connector classes.

Methods

ConnectorRegistry.clearclassmethod
(cls) -> 'None'

Clear all registered connectors. Primarily for testing.

Returns None
ConnectorRegistry.getclassmethod
(cls, name: 'str') -> 'type[BaseConnector]'

Get a connector class by name.

ParameterTypeDefault
namestrrequired
Returns type[BaseConnector]
ConnectorRegistry.list_allclassmethod
(cls) -> 'list[str]'

Get all registered connector names.

Returns list[str]
ConnectorRegistry.registerclassmethod
(cls, name: 'str')

Register a connector class.

ParameterTypeDefault
namestrrequired
Connectors / Agent

BaseAgentConnector

Bases: BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.base import BaseAgentConnector

Base class for agent framework connectors.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

BaseAgentConnector.asend
(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]

Async send - convenience wrapper.

ParameterTypeDefault
messageslist[dict[str, Any]]required
session_idstrrequired
Returns dict[str, Any]
BaseAgentConnector.build_requestabstract
(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]

Build framework-specific request payload.

ParameterTypeDefault
messageslist[dict[str, Any]]required
session_idstrrequired
Returns dict[str, Any]
BaseAgentConnector.execute
(self, request: dict[str, Any]) -> dict[str, Any]

Execute a request (BaseConnector interface).

ParameterTypeDefault
requestdict[str, Any]required
Returns dict[str, Any]
BaseAgentConnector.extract_session_idclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract session ID from a framework response.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
BaseAgentConnector.extract_textclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract text content from a framework response.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
BaseAgentConnector.extract_tool_traceclassmethod
(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None

Extract tool call trace from a framework response.

ParameterTypeDefault
responsedict[str, Any]required
Returns list[dict[str, Any]] | None
BaseAgentConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
BaseAgentConnector.parse_responseabstract
(self, response: dict[str, Any]) -> dict[str, Any]

Parse and validate framework response.

ParameterTypeDefault
responsedict[str, Any]required
Returns dict[str, Any]
BaseAgentConnector.send
(self, messages: list[dict[str, Any]], session_id: str, async_mode: bool = False)

Send messages to the agent.

ParameterTypeDefault
messageslist[dict[str, Any]]required
session_idstrrequired
async_modeboolFalse
Connectors / Agent

LangflowConnector

Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langflow import LangflowConnector

Connector for Langflow agent framework.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration

Methods

LangflowConnector.build_request
(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]

Build Langflow-specific request payload.

ParameterTypeDefault
messageslist[dict[str, Any]]required
session_idstrrequired
Returns dict[str, Any]
LangflowConnector.extract_session_idclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract session ID from a Langflow response.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
LangflowConnector.extract_textclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract text content from a Langflow response.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
LangflowConnector.extract_tool_traceclassmethod
(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None

Extract tool call trace from Langflow content_blocks.

ParameterTypeDefault
responsedict[str, Any]required
Returns list[dict[str, Any]] | None
LangflowConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LangflowConnector.parse_response
(self, response: dict[str, Any]) -> dict[str, Any]

Parse Langflow response.

ParameterTypeDefault
responsedict[str, Any]required
Returns dict[str, Any]
Connectors / Agent

LangGraphConnector

Bases: BaseAgentConnector, BaseConnector, BaseModel, ABC
from sdg_hub.core.connectors.agent.langgraph import LangGraphConnector

Connector for LangGraph agent framework.

Fields

NameTypeDefaultDescription
configConnectorConfigrequiredConnector configuration
assistant_idstr"agent"The assistant ID or graph name to run.
run_configdict[str, Any][object Object]Optional configuration dict passed in the run payload. Merged as the 'config' key in the LangGraph /runs/wait request. Use this to pass runtime parameters to the graph via 'configurable', e.g. ``{'configurable': {'model': 'gpt-4o'}}``.

Methods

LangGraphConnector.build_request
(self, messages: list[dict[str, Any]], session_id: str) -> dict[str, Any]

Build LangGraph run request payload.

ParameterTypeDefault
messageslist[dict[str, Any]]required
session_idstrrequired
Returns dict[str, Any]
LangGraphConnector.extract_session_idclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract session ID from a LangGraph response.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
LangGraphConnector.extract_textclassmethod
(cls, response: dict[str, Any]) -> str | None

Extract text from the last AI message in LangGraph state.

ParameterTypeDefault
responsedict[str, Any]required
Returns str | None
LangGraphConnector.extract_tool_traceclassmethod
(cls, response: dict[str, Any]) -> list[dict[str, Any]] | None

Extract tool call trace from LangGraph messages.

ParameterTypeDefault
responsedict[str, Any]required
Returns list[dict[str, Any]] | None
LangGraphConnector.model_post_init
(self: 'BaseModel', context: 'Any', /) -> 'None'

This function is meant to behave like a BaseModel method to initialize private attributes.

ParameterTypeDefault
contextAnyrequired
Returns None
LangGraphConnector.parse_response
(self, response: dict[str, Any]) -> dict[str, Any]

Parse LangGraph response.

ParameterTypeDefault
responsedict[str, Any]required
Returns dict[str, Any]