Overview
Dagster provides a comprehensive Python API for building, testing, and deploying data pipelines. This reference documents all public APIs available in the dagster package.
Core Decorators
The most commonly used decorators for defining pipelines:

- @asset - Define software-defined assets that represent data products
- @op - Define operations that perform computation
- @job - Define jobs that orchestrate ops or assets
- @resource - Define reusable resources for sharing state and connections
Quick Start Example
API Categories
Assets
Define and materialize data assets with full lineage tracking.

Core APIs:
- @asset - Define a single asset
- @multi_asset - Define multiple assets from one function
- AssetSpec - Specify asset metadata without materialization logic
- AssetKey - Unique identifier for assets
- AssetDep - Express dependencies between assets
- AssetIn / AssetOut - Configure asset inputs and outputs
- AssetSelection - Select groups of assets
- SourceAsset - Reference external assets
- @observable_source_asset - Monitor external assets

Asset checks:
- @asset_check - Define data quality checks
- AssetCheckResult - Return check results
- AssetCheckSpec - Specify check configuration
- build_last_update_freshness_checks - Monitor data freshness
- build_column_schema_change_checks - Detect schema changes
- build_metadata_bounds_checks - Validate metadata bounds

Materialization:
- materialize() - Execute assets eagerly
- materialize_to_memory() - Execute and return results in memory
- MaterializeResult - Return from asset functions
Ops, Jobs & Graphs
Build computational graphs with ops and compose them into jobs.

Core APIs:
- @op - Define computational units
- @job - Define executable jobs
- @graph - Compose ops into reusable graphs
- @graph_asset / @graph_multi_asset - Turn graphs into assets
- OpDefinition / JobDefinition / GraphDefinition - Programmatic definitions
- In / Out / DynamicOut - Configure op inputs and outputs
- GraphIn / GraphOut - Configure graph boundaries
- Output / DynamicOutput - Return values from ops

Execution:
- execute_job() - Execute jobs programmatically
- JobExecutionResult / ExecuteInProcessResult - Inspect results
- DependencyDefinition - Define op dependencies
- NodeInvocation - Invoke ops with custom configuration
Resources & IO Managers
Share state, connections, and handle data persistence.

Resources:
- @resource - Define legacy resources
- ConfigurableResource - Define Pythonic resources with type safety
- ResourceParam - Annotate resource parameters
- ResourceDefinition - Programmatic resource definition
- build_resources() - Test resources in isolation

IO managers:
- IOManager - Handle asset and op output persistence
- @io_manager - Define IO managers
- ConfigurableIOManager - Pythonic IO manager base class
- UPathIOManager - Universal path IO manager for cloud storage
- FilesystemIOManager - Local filesystem IO manager
- InMemoryIOManager - Memory-based IO manager for testing
- InputManager - Load inputs independently

Built-in IO managers:
- fs_io_manager - Filesystem persistence
- mem_io_manager - In-memory persistence
- custom_path_fs_io_manager - Custom path filesystem persistence
Configuration
Type-safe configuration for resources, ops, and assets.

Pythonic Config:
- Config - Base class for op/asset config
- ConfigurableResource - Base class for resource config
- ResourceDependency - Declare resource dependencies

Config schema:
- Field - Define configuration fields
- Shape - Define nested configuration
- Selector - Choose one config option
- Permissive / PermissiveConfig - Allow arbitrary keys
- Map - Define key-value mappings
- EnvVar - Load from environment variables

Config types:
- String, Int, Float, Bool - Primitive types
- Array - List types
- Enum / EnumValue - Enumerated values
- Noneable - Optional values
- ScalarUnion - Union of scalar types
- Any / Nothing - Special types

Config sources:
- StringSource / IntSource / BoolSource - Load from environment
- config_from_files() - Load from YAML/JSON files
- config_from_yaml_strings() - Parse YAML strings
Partitions & Backfills
Handle time-series and dimensional data partitioning.

Partitions:
- DailyPartitionsDefinition - Daily time windows
- HourlyPartitionsDefinition - Hourly time windows
- WeeklyPartitionsDefinition - Weekly time windows
- MonthlyPartitionsDefinition - Monthly time windows
- StaticPartitionsDefinition - Fixed set of partitions
- DynamicPartitionsDefinition - Runtime-defined partitions
- MultiPartitionsDefinition - Multiple partition dimensions
- TimeWindow - Time range for a partition
- Partition - Individual partition

Partition mappings:
- IdentityPartitionMapping - 1:1 partition mapping
- TimeWindowPartitionMapping - Map time windows
- AllPartitionMapping - Depend on all upstream partitions
- LastPartitionMapping - Depend on most recent partition
- MultiPartitionMapping / DimensionPartitionMapping - Multi-dimensional mappings

Partitioned config:
- @partitioned_config - Generate partition-specific config
- @daily_partitioned_config / @hourly_partitioned_config - Time-based configs
- @static_partitioned_config / @dynamic_partitioned_config - Other configs

Backfills:
- BackfillPolicy - Control backfill behavior
- AddDynamicPartitionsRequest / DeleteDynamicPartitionsRequest - Manage dynamic partitions
Schedules & Sensors
Automate pipeline execution based on time or events.

Schedules:
- @schedule - Define time-based schedules
- ScheduleDefinition - Programmatic schedule definition
- ScheduleEvaluationContext - Access schedule context
- build_schedule_from_partitioned_job() - Auto-generate from partitions
- DefaultScheduleStatus - Control default enabled state

Sensors:
- @sensor - Define event-driven sensors
- @asset_sensor - Trigger on asset materializations
- @multi_asset_sensor - Trigger on multiple assets
- @run_status_sensor / @run_failure_sensor - React to run status
- SensorDefinition / AssetSensorDefinition - Programmatic definitions
- SensorEvaluationContext - Access sensor context
- SensorResult / RunRequest / SkipReason - Sensor return types

Declarative automation:
- AutomationCondition - Declarative automation rules
- AutoMaterializePolicy - Auto-materialize assets
- AutoMaterializeRule - Custom automation rules
- FreshnessPolicy - Keep data fresh
- build_sensor_for_freshness_checks() - Monitor freshness
Execution Context
Access runtime information within ops and assets.

Contexts:
- OpExecutionContext - Op execution context
- AssetExecutionContext - Asset execution context
- AssetCheckExecutionContext - Asset check execution context
- InputContext / OutputContext - IO manager contexts
- InitResourceContext - Resource initialization context
- HookContext - Hook execution context

Context builders for testing:
- build_op_context() - Create test op context
- build_asset_context() - Create test asset context
- build_asset_check_context() - Create test check context
- build_input_context() / build_output_context() - Create IO contexts
- build_init_resource_context() - Create resource context
Metadata & Events
Attach rich metadata to executions and emit events.

Metadata Values:
- MetadataValue - Base metadata type
- TextMetadataValue / MarkdownMetadataValue - Text content
- IntMetadataValue / FloatMetadataValue - Numeric values
- UrlMetadataValue / PathMetadataValue - Links and paths
- JsonMetadataValue - JSON data
- TableMetadataValue / TableSchemaMetadataValue - Tabular data
- DagsterAssetMetadataValue / DagsterRunMetadataValue - Cross-references

Table metadata:
- TableSchema / TableColumn - Define table structure
- TableColumnLineage / TableColumnDep - Column-level lineage
- TableRecord - Individual table rows

Events:
- AssetMaterialization - Record asset creation
- AssetObservation - Record asset observations
- ExpectationResult - Data quality expectations
- Output - Op output events
- Failure - Explicit failure
- RetryRequested - Request retry with backoff

Code references:
- with_source_code_references() - Attach code locations
- LocalFileCodeReference / UrlCodeReference - Reference types
- link_code_references_to_git() - Link to Git
Types & Type System
Define and validate data types.

Type System:
- DagsterType - Define custom types
- @usable_as_dagster_type - Make Python types usable
- PythonObjectDagsterType - Wrap Python types
- List, Dict, Set, Tuple, Optional - Collection types
- TypeCheck - Type checking results
- DagsterTypeLoader - Load types from config

Helpers:
- check_dagster_type() - Validate types
- make_python_type_usable_as_dagster_type() - Register types
Executors
Control how ops execute.

Built-in Executors:
- in_process_executor - Single-process execution
- multiprocess_executor - Multi-process execution
- multi_or_in_process_executor - Configurable executor

Custom executors:
- @executor - Define custom executors
- ExecutorDefinition - Programmatic executor definition
- Executor - Base executor class
- InitExecutorContext - Executor initialization context
Hooks
React to op success or failure.

Hook APIs:
- @success_hook - Run on op success
- @failure_hook - Run on op failure
- HookDefinition - Programmatic hook definition
- HookContext - Access hook context
- HookExecutionResult - Return from hooks
Loggers
Configure structured logging.

Built-in Loggers:
- colored_console_logger - Color-coded console output
- json_console_logger - JSON-formatted logs
- default_loggers - Standard logger set

Custom loggers:
- @logger - Define custom loggers
- LoggerDefinition - Programmatic logger definition
- InitLoggerContext - Logger initialization context
- get_dagster_logger() - Get logger instance
Storage & Persistence
Manage pipeline state and data storage.

Instance:
- DagsterInstance - Core Dagster instance
- instance_for_test() - Test instance

Runs and events:
- DagsterRun - Run metadata
- DagsterRunStatus - Run status enum
- RunRecord / RunsFilter - Query runs
- EventLogRecord / EventLogEntry - Event records

Files and state:
- FileHandle / LocalFileHandle - File references
- local_file_manager - File manager resource
- AssetValueLoader - Load asset values
- UPathDefsStateStorage - Store component state
Pipes
Execute external code with Dagster integration.

Core APIs:
- PipesSubprocessClient - Execute subprocesses
- PipesClient - Base client class
- PipesSession - Pipes execution session
- PipesExecutionResult - Execution results

Context injection and messages:
- PipesContextInjector - Inject Dagster context
- PipesMessageReader - Read messages from external process
- PipesEnvContextInjector - Pass context via environment
- PipesFileContextInjector / PipesTempFileContextInjector - Pass via files
- PipesBlobStoreMessageReader - Read from cloud storage
- open_pipes_session() - Context manager for sessions
Testing
Test pipelines in isolation.

Testing Utilities:
- build_op_context() - Mock op context
- build_asset_context() - Mock asset context
- build_sensor_context() - Mock sensor context
- build_schedule_context() - Mock schedule context
- instance_for_test() - Test Dagster instance
- materialize_to_memory() - Execute in memory
- validate_run_config() - Validate job configuration
Components
Build reusable component libraries.

Component Types:
- Component - Base component class
- StateBackedComponent - Stateful components
- FunctionComponent - Function-based components
- PythonScriptComponent / UvRunComponent - Script execution
- SqlComponent / TemplatedSqlComponent - SQL execution
- DefsFolderComponent - Load from folders

Loading and building:
- load_defs() - Load component definitions
- build_component_defs() - Build from components
- ComponentTree - Component hierarchy
- scaffold_component() - Generate component scaffolding

Resolution:
- Resolvable - Resolvable values
- ResolutionContext - Resolution context
- ResolvedAssetSpec - Resolved asset specifications
Definitions
Package and organize pipeline code.

Core:
- Definitions - Bundle all definitions
- @repository - Define repositories (legacy)
- RepositoryDefinition - Programmatic repositories

Loading helpers:
- load_assets_from_current_module() - Auto-load assets
- load_assets_from_modules() / load_assets_from_package_name() - Load from packages
- load_asset_checks_from_modules() - Load checks
- load_definitions_from_module() - Load all definitions
Errors
Handle and raise Dagster-specific errors.

Common Errors:
- DagsterError - Base error class
- DagsterInvalidDefinitionError - Invalid definition
- DagsterInvariantViolationError - Invariant violation
- DagsterExecutionInterruptedError - Interrupted execution
- DagsterTypeCheckError - Type check failure
- DagsterConfigMappingFunctionError - Config error
Utilities
Helper functions and utilities.

Utilities:
- configured() - Create configured variants
- file_relative_path() - Resolve relative paths
- with_resources() - Bind resources to assets
- reconstructable() - Make jobs reconstructable
- make_values_resource() - Create simple resources
- make_email_on_run_failure_sensor() - Email alerts
- serialize_value() / deserialize_value() - Serialize objects

Warnings:
- BetaWarning - Beta feature warning
- PreviewWarning - Preview feature warning
Migration Guides
- From Airflow: See Airflow Integration Guide
- Asset-based APIs: Modern asset-based APIs are preferred over legacy op/job patterns
- Pythonic Config: Use ConfigurableResource instead of the @resource decorator
Related Resources
Quickstart
Build your first pipeline in 5 minutes
Core Concepts
Learn fundamental Dagster concepts
Examples
Browse example projects
Community
Get help on Slack
