Tasks
Tasks are the building blocks of Datablast pipelines. Each task represents a unit of work that can be executed as part of your data processing workflow.
Task Types
SQL Tasks
Execute SQL queries against various databases:
- BigQuery (bq.sql) - Google Cloud BigQuery
- Snowflake (sf.sql) - Snowflake data warehouse
- Athena (athena.sql) - AWS Athena
- PostgreSQL (pg.sql) - PostgreSQL databases
Python Tasks
Execute Python scripts for data processing:
- Python (python) - Custom Python logic and ML workflows
Sensor Tasks
Wait for external conditions before proceeding:
- BigQuery Sensors - Wait for tables, partitions, or query results
- Cloud Storage Sensors - Wait for files in GCS or S3
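As a rough illustration of the idea behind a sensor, the sketch below polls a condition until it becomes true or a timeout expires. It is not Datablast's implementation: the function name `wait_for` and its parameters are hypothetical, and a real sensor would check a table, partition, or storage object instead of a plain Python callable.

```python
import time


def wait_for(condition, timeout=60.0, interval=1.0):
    """Poll condition() until it returns True or `timeout` seconds pass.

    Hypothetical helper for illustration only; not part of Datablast.
    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    import os

    # A local file stands in for "a file in GCS or S3" in this sketch.
    arrived = wait_for(lambda: os.path.exists("input.csv"), timeout=3, interval=1)
    print("file arrived" if arrived else "timed out")
```

The condition is always checked at least once, so a timeout of zero still reports a condition that is already satisfied.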
Configuration Methods
Tasks can be configured using:
- YAML Files - Separate configuration files
- Annotations - Configuration directly in code files
Basic Configuration
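A task definition carries a unique name, a type, an optional list of dependencies, and a script to run, as in the sample in this section. The sketch below shows one way a `depends` list can be turned into an execution order. It is an assumption-laden illustration, not how Datablast schedules tasks: the task names, the `TASKS` list, and the `execution_order` function are all hypothetical, and the definitions are written as plain dicts instead of being loaded from YAML.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task definitions mirroring the fields of a task file
# (name, type, depends, run). None of these names come from Datablast.
TASKS = [
    {"name": "report", "type": "bq.sql", "depends": ["clean"], "run": "report.sql"},
    {"name": "clean", "type": "python", "depends": ["raw"], "run": "clean.py"},
    {"name": "raw", "type": "bq.sql", "depends": [], "run": "raw.sql"},
]


def execution_order(tasks):
    """Return task names ordered so that every task follows its dependencies."""
    names = [task["name"] for task in tasks]
    if len(names) != len(set(names)):
        raise ValueError("task names must be unique")

    known = set(names)
    for task in tasks:
        for dep in task.get("depends", []):
            if dep not in known:
                raise ValueError(f"{task['name']} depends on unknown task {dep}")

    # Map each task to the set of tasks that must run before it.
    graph = {task["name"]: set(task.get("depends", [])) for task in tasks}
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(TASKS))
```

For the three tasks above the dependencies form a single chain, so the only valid order is `raw`, then `clean`, then `report`.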

    name: task.name                # Unique task identifier
    type: bq.sql                   # Task type
    description: Task description  # Human-readable description
    depends:
      - task1
      - task2
    run: script.sql                # Script file to execute

Learn More
For detailed configuration examples and advanced features:
- SQL Tasks - SQL task configuration and examples
- Python Tasks - Python task configuration and examples
- Sensor Tasks - Sensor configuration and examples
For best practices and advanced topics:
- SQL Development - SQL best practices and optimization
- Python Development - Python best practices and optimization
- Development Guides - Platform capabilities and best practices