
Datablast Quickstart Guide

Welcome to the Datablast Quickstart! In ≈10 minutes you will:

  1. Prepare a repository for Datablast Scheduler
  2. Define your first pipeline
  3. Add SQL & Python tasks
  4. Configure notifications
  5. Deploy and monitor the run

Create a Git repository with the following layout. Datablast automatically scans it to discover assets.

  • pipeline.yml – high-level schedule & config
  • tasks/ – contains all SQL, Python or YAML tasks
    • sample_task.sql
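
If you prefer to create this layout from a script, the following is a minimal sketch using only the Python standard library. The function name scaffold_repo and the placeholder file contents are illustrative assumptions, not part of Datablast.

```python
from pathlib import Path
import tempfile


def scaffold_repo(root: Path) -> list[str]:
    """Create the minimal layout and return the relative paths of the files written."""
    # Placeholder contents (assumption): replace them with your real config and query.
    files = {
        "pipeline.yml": 'schedule: "0 3 * * *" # UTC cron\n',
        "tasks/sample_task.sql": "-- placeholder query\nSELECT 1;\n",
    }
    for rel, content in files.items():
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)  # creates tasks/ when needed
        path.write_text(content, encoding="utf-8")
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Try it in a throwaway directory; pass your repository path instead to keep the files.
    with tempfile.TemporaryDirectory() as tmp:
        print(scaffold_repo(Path(tmp)))
```

Run it once against an empty repository directory, then commit the result.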

More details: Project Structure


pipeline.yml

schedule: "0 3 * * *" # UTC cron
notifications:
  slack:
    - name: demo-notifications
      connection: "demo-slack"
      failure: ":red_circle: Pipeline has failed!"

This pipeline will run daily at 03:00 UTC and post a Slack message on failures.
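
To sanity-check when the schedule above should next fire, here is a small sketch that computes the next 03:00 UTC occurrence. It is not a general cron parser and it is not how Datablast evaluates schedules; it only covers the fixed "minute hour * * *" case, and next_daily_run is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 3, minute: int = 0) -> datetime:
    """Return the next hour:minute UTC strictly after `now`.

    `now` should be timezone-aware; a naive datetime is interpreted as local time.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now_utc:
        # Today's slot has already passed (or is exactly now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_daily_run(datetime.now(timezone.utc)).isoformat())
```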

Learn every field in Pipeline Config.


Create tasks/daily_orders.sql:

-- @blast.name: marts.daily_orders
-- @blast.type: bq.sql
SELECT *
FROM `raw.orders`
WHERE status = 'paid';
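
The comment header in the task above carries its name and type. As an illustration of how such annotations could be read from a file, here is a hedged sketch. The regular expression and the read_annotations helper are assumptions made for this example and are not Datablast's actual parser.

```python
import re

# Matches lines such as "-- @blast.name: marts.daily_orders" or "# @blast.type: python".
ANNOTATION = re.compile(r"^\s*(?:--|#)\s*@blast\.(\w+):\s*(.+?)\s*$")


def read_annotations(source: str) -> dict[str, str]:
    """Collect @blast.<key>: <value> pairs from comment lines in a task file."""
    found: dict[str, str] = {}
    for line in source.splitlines():
        match = ANNOTATION.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found


SAMPLE_SQL = """-- @blast.name: marts.daily_orders
-- @blast.type: bq.sql
SELECT *
FROM `raw.orders`
WHERE status = 'paid';
"""

if __name__ == "__main__":
    print(read_annotations(SAMPLE_SQL))
```

A check like this can be handy in a pre-commit hook to catch a task file that is missing its name or type before you push.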

Create tasks/python/cleanup.py:

# @blast.name: util.cleanup_files
# @blast.type: python
def run(**kwargs):
    print("Cleaning temporary files …")

Explore task options: SQL Tasks · Python Tasks


Datablast ships with Slack & Discord integrations.

Add a Slack webhook ID to your pipeline.

Full guide: Notifications


  1. Push your repository to a Git host (GitHub/GitLab/Bitbucket).
  2. Datablast detects changes & schedules the pipeline.
  3. Watch runs in the Datablast UI.