Tasks

Tasks are the building blocks of Datablast pipelines. Each task is a single unit of work, such as a SQL query, a Python script, or a sensor check, that runs as one step of your data processing workflow.

Task Types

SQL Tasks

Execute SQL queries against one of the supported platforms:

BigQuery (bq.sql) - Google Cloud BigQuery
Snowflake (sf.sql) - Snowflake data warehouse
Athena (athena.sql) - AWS Athena
PostgreSQL (pg.sql) - PostgreSQL databases

Python Tasks

Execute Python scripts for data processing:

Python (python) - Custom Python logic and ML workflows

Sensor Tasks

Wait for external conditions before proceeding:

BigQuery Sensors - Wait for tables, partitions, or query results
Cloud Storage Sensors - Wait for files in GCS or S3
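
To make "wait for external conditions" concrete, the sketch below shows the general polling pattern a sensor follows. It is not Datablast's implementation: `wait_for`, `poll_interval`, and `timeout` are hypothetical names chosen for illustration, and the condition is any function you supply (for example, a check that a file exists).

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool],
             poll_interval: float = 1.0,
             timeout: float = 60.0) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only; not part of Datablast.
    Returns True if the condition was met, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True   # condition met: downstream work may proceed
        if time.monotonic() >= deadline:
            return False  # gave up waiting
        time.sleep(poll_interval)
```

A caller would pass a cheap check, such as `lambda: os.path.exists(path)`, and only continue when `wait_for` returns True.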

Configuration Methods

Tasks can be configured in either of two ways:

YAML Files - Separate configuration files
Annotations - Configuration directly in code files

Basic Configuration

name: task.name                    # Unique task identifier
type: bq.sql                       # Task type
description: Task description      # Human-readable description
depends:                           # Tasks this task depends on
  - task1
  - task2
run: script.sql                    # Script file to execute
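
The `depends` list implies an ordering: a task should run only after the tasks it names. The sketch below illustrates that idea with Python's standard `graphlib` module. It is an assumption about what the field means in practice, not a description of Datablast's scheduler. `execution_order` is a hypothetical helper, the task definitions are plain dictionaries mirroring the fields above, and `task1.sql` and `task2.sql` are made-up file names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(tasks):
    """Return task names ordered so each task follows everything it depends on.

    Hypothetical helper for illustration only; not part of Datablast.
    """
    # Map each task name to the set of names listed under `depends`.
    graph = {task["name"]: set(task.get("depends", [])) for task in tasks}
    known = set(graph)
    for name, deps in graph.items():
        missing = deps - known
        if missing:
            raise ValueError(f"{name} depends on unknown task(s): {sorted(missing)}")
    # static_order() raises graphlib.CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(graph).static_order())


# Task definitions mirroring the YAML fields shown above (file names are made up).
tasks = [
    {"name": "task.name", "type": "bq.sql", "depends": ["task1", "task2"], "run": "script.sql"},
    {"name": "task1", "type": "bq.sql", "run": "task1.sql"},
    {"name": "task2", "type": "bq.sql", "run": "task2.sql"},
]

order = execution_order(tasks)
print(order)  # task1 and task2 (in either order) appear before task.name
```

Rejecting unknown names up front means a typo in `depends` fails with a clear message rather than being treated as an extra task.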

Learn More

For detailed configuration examples and advanced features:

SQL Tasks - SQL task configuration and examples
Python Tasks - Python task configuration and examples
Sensor Tasks - Sensor configuration and examples

For best practices and advanced topics:

SQL Development - SQL best practices and optimization
Python Development - Python best practices and optimization
Development Guides - Platform capabilities and best practices