Assets
Assets are the core building blocks of Dagster. An asset is a data object that represents a file, table, model, or any persistent artifact produced by your data pipeline. Assets describe what data should exist and how to compute it, rather than prescribing a specific execution order.
Why Assets Matter
Assets provide a declarative approach to data pipelines:
- Data-centric thinking: Focus on the data artifacts you need, not the tasks that produce them
- Automatic lineage: Dagster tracks dependencies between assets automatically
- Observability: See when assets were last updated, their freshness, and materialization history
- Selective execution: Materialize individual assets or subsets based on your needs
- Cross-pipeline dependencies: Assets can depend on outputs from different jobs and pipelines
Basic Asset Definition
Define an asset using the @asset decorator. The function name becomes the asset name. For an asset defined by a function named my_asset:
- The asset name is my_asset
- When materialized, it writes data to a JSON file
- Dagster tracks when this asset was last materialized
Asset Dependencies
Assets can depend on other assets. Dagster infers dependencies from function parameters: when an asset function named downstream_asset takes a parameter named upstream_asset, Dagster automatically determines that downstream_asset depends on upstream_asset by matching the parameter name to the upstream asset name.
Non-Argument Dependencies
For dependencies that don’t pass data (e.g., checking that a table exists before querying), use the deps parameter of @asset.
Asset Configuration
The @asset decorator accepts many parameters to customize behavior:
- Naming & Organization
- Metadata & Description
- Partitions & Backfills
- Automation
Multi-Assets
Sometimes a single computation produces multiple assets. Use the @multi_asset decorator for this.
Asset Materialization
Materializing an asset means computing its value and persisting it. You can materialize assets:
- In the UI: Click the “Materialize” button
- Via CLI: dagster asset materialize
- Programmatically: Using schedules, sensors, or automation conditions
- In tests: Call the asset function directly or use materialize()
Asset Checks
Asset checks validate the quality of your assets.
Source Assets
Source assets represent external data that Dagster doesn’t manage.
Best Practices
Keep assets focused and composable
Each asset should represent a single logical data artifact. Break large computations into multiple assets that can be materialized independently.
Use meaningful names
Asset names should clearly describe the data they represent. Use key_prefix to organize assets into namespaces like ["raw", "staging", "analytics"].
Add metadata and descriptions
Document your assets with descriptions and metadata. This helps team members understand what each asset contains and how it’s used.
Leverage the asset graph
Think about your data as a graph of dependencies. The Dagster UI visualizes this graph, making it easy to understand data lineage.
Related Documentation
- Ops, Jobs & Graphs - Lower-level computation primitives
- Partitions & Backfills - Time-based and custom partitioning
- IO Managers - Control how asset data is stored and loaded
- Automation - Automatically materialize assets based on conditions
API Reference
- @asset - Asset decorator
- @multi_asset - Multi-asset decorator
- AssetIn - Configure asset inputs
- AssetOut - Configure asset outputs
- AssetExecutionContext - Runtime context for assets
