Sensor Overview
Sensor tasks in Datablast let a pipeline wait for an external condition, such as a table, partition, or file becoming available, before it proceeds. This guide covers the available sensor types and how to configure them.
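Conceptually, a sensor re-checks a condition until it holds or a deadline passes. The sketch below illustrates that idea in plain Python. It is not Datablast's implementation: the function name `wait_for` and the `poke_interval` and `timeout` arguments are hypothetical names chosen for illustration, and this guide does not state what polling interval or timeout Datablast actually uses.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool],
             poke_interval: float = 1.0,
             timeout: float = 10.0,
             sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper for illustration only. Returns True if the
    condition was met, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poke_interval)


if __name__ == "__main__":
    # Stand-in condition: becomes true on the third check.
    checks = {"count": 0}

    def table_exists() -> bool:
        checks["count"] += 1
        return checks["count"] >= 3

    print(wait_for(table_exists, poke_interval=0.1, timeout=5.0))
```

The condition is always checked at least once, so a resource that is already available never incurs a wait.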
Sensor Types
BigQuery Sensors
- Table Sensor: Wait for BigQuery tables to be available
- Partition Sensor: Wait for specific table partitions
- Query Sensor: Wait for query results to meet conditions
Cloud Storage Sensors
- GCS Object Sensor: Wait for Google Cloud Storage objects
- S3 Key Sensor: Wait for Amazon S3 objects
Custom Sensors
- Custom Logic: Implement custom sensor logic
- API Sensors: Wait for API responses
- Database Sensors: Wait for database conditions
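This overview does not show what a custom sensor looks like in Datablast, so the following is only a sketch of the kind of check a database sensor might evaluate on each poll. It uses SQLite from the Python standard library as a stand-in database. The `events` table, its `dt` column, and the function name `rows_available` are assumptions made for illustration.

```python
import sqlite3


def rows_available(conn: sqlite3.Connection, day: str) -> bool:
    """Return True once the events table holds at least one row for `day`.

    Hypothetical condition check; table and column names are assumed.
    """
    cur = conn.execute("SELECT COUNT(*) FROM events WHERE dt = ?", (day,))
    (count,) = cur.fetchone()
    return count > 0


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (dt TEXT, payload TEXT)")
    print(rows_available(conn, "2024-01-01"))   # checked before any insert
    conn.execute("INSERT INTO events VALUES ('2024-01-01', 'example')")
    print(rows_available(conn, "2024-01-01"))   # checked after the insert
```

Passing the date as a bound parameter rather than formatting it into the SQL string keeps the check safe if the value ever comes from outside the pipeline.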
Basic Configuration
YAML Configuration
    name: "wait.for.users.table"
    type: "bq.sensor.table"
    description: "Wait for users table to be available"
    parameters:
      table_id: "project.dataset.users"
      project_id: "my-project"
Annotation-based Configuration
    # @blast.name: wait.for.users.table
    # @blast.type: bq.sensor.table
    # @blast.description: Wait for users table to be available
    # @blast.parameters.table_id: project.dataset.users
    # @blast.parameters.project_id: my-project
BigQuery Sensors
Table Sensor
Wait for BigQuery tables to be available.
    name: "wait.for.users.table"
    type: "bq.sensor.table"
    description: "Wait for users table to be available"
    parameters:
      table_id: "project.dataset.users"
      project_id: "my-project"
Partition Sensor
Wait for specific table partitions to be available.
    name: "wait.for.daily.partition"
    type: "bq.sensor.partition"
    description: "Wait for today's partition to be available"
    parameters:
      table_id: "project.dataset.events"
      partition_id: "{{ ds }}"  # Today's date
      project_id: "my-project"
Query Sensor
Wait for query results to meet specific conditions.
    name: "wait.for.data.availability"
    type: "bq.sensor.query"
    description: "Wait for data to be available"
    parameters:
      sql: "SELECT COUNT(*) FROM project.dataset.events WHERE dt = '{{ ds }}'"
      project_id: "my-project"
Cloud Storage Sensors
GCS Object Sensor
Wait for Google Cloud Storage objects to be available.
    name: "wait.for.gcs.files"
    type: "gcs.sensor.object_sensor_with_prefix"
    description: "Wait for files to be uploaded to GCS"
    parameters:
      bucket: "my-data-bucket"
      prefix: "incoming/data/{{ ds }}"
      project_id: "my-project"
S3 Key Sensor
Wait for Amazon S3 objects to be available.
    name: "wait.for.s3.file"
    type: "s3.sensor.key_sensor"
    description: "Wait for S3 file to be available"
    parameters:
      bucket_name: "my-s3-bucket"
      bucket_key: "data/{{ ds }}/events.parquet"
Sensor Parameters
Common Parameters
    parameters:
      # Connection parameters
      project_id: "my-project"
      connection_id: "my-connection"

      # Sensor-specific parameters
      table_id: "project.dataset.table"
      partition_id: "{{ ds }}"
      bucket: "my-bucket"
      prefix: "data/{{ ds }}"

      # Jinja template support
      date_filter: "{{ ds }}"
      time_filter: "{{ ts }}"
Jinja Template Support
Sensors support Jinja templates for dynamic parameter values:
    parameters:
      table_id: "project.dataset.events"
      partition_id: "{{ ds }}"      # Today's date
      date_filter: "{{ prev_ds }}"  # Previous day
      time_filter: "{{ ts }}"       # Current timestamp
Best Practices
Sensor Design
- Clear purpose: Each sensor should have a clear, specific purpose
- Appropriate conditions: Choose conditions that accurately reflect data availability
- Error handling: Implement proper error handling and logging
- Performance: Optimize sensor queries for efficiency
Parameter Configuration
- Use templates: Leverage Jinja templates for dynamic values
- Validate parameters: Ensure parameter values are correct
- Document purpose: Include clear descriptions of sensor behavior
- Test thoroughly: Validate sensor behavior in different scenarios
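One way to test templated parameters before deploying a sensor is to preview the values they will take for a given run date. The sketch below is a standard-library stand-in for that substitution, not the Jinja engine itself and not a Datablast API. It assumes that `ds` and `prev_ds` render as `YYYY-MM-DD` dates and `ts` as an ISO 8601 timestamp, which this guide does not confirm; the helper names `template_context` and `preview` are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches simple placeholders such as {{ ds }} or {{ds}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def template_context(run_time: datetime) -> dict:
    """Build the assumed values for ds, prev_ds, and ts."""
    return {
        "ds": run_time.strftime("%Y-%m-%d"),
        "prev_ds": (run_time - timedelta(days=1)).strftime("%Y-%m-%d"),
        "ts": run_time.isoformat(),
    }


def preview(value: str, context: dict) -> str:
    """Substitute placeholders in `value`; unknown names raise KeyError."""
    return _PLACEHOLDER.sub(lambda m: context[m.group(1)], value)


if __name__ == "__main__":
    ctx = template_context(datetime(2024, 3, 1, 6, 30))
    for raw in ("incoming/data/{{ ds }}", "data/{{ prev_ds }}/events.parquet"):
        print(raw, "->", preview(raw, ctx))
```

Raising on an unknown placeholder name, rather than leaving it in place, surfaces misspelled variables early.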
Monitoring
- Track execution: Monitor sensor execution times and success rates
- Set alerts: Configure alerts for sensor failures
- Log results: Maintain detailed logs of sensor behavior
- Review regularly: Periodically review and optimize sensors
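As a rough illustration of tracking execution times and success rates, the sketch below summarizes a list of run records per sensor. The `SensorRun` record and its fields are hypothetical; how Datablast actually exposes run history is not covered in this guide, so the records would need to come from whatever logs or metadata are available.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SensorRun:
    """Hypothetical record of one sensor execution."""
    name: str
    seconds_waited: float
    succeeded: bool


def summarize(runs: list[SensorRun]) -> dict:
    """Return the success rate and mean wait time for each sensor name."""
    by_name: dict[str, list[SensorRun]] = {}
    for run in runs:
        by_name.setdefault(run.name, []).append(run)
    return {
        name: {
            "success_rate": sum(r.succeeded for r in group) / len(group),
            "mean_wait_s": mean(r.seconds_waited for r in group),
        }
        for name, group in by_name.items()
    }


if __name__ == "__main__":
    history = [
        SensorRun("wait.for.users.table", 30.0, True),
        SensorRun("wait.for.users.table", 90.0, False),
        SensorRun("wait.for.s3.file", 10.0, True),
    ]
    for name, stats in summarize(history).items():
        print(name, stats)
```

A summary like this can feed an alert rule, for example flagging any sensor whose success rate falls below a chosen threshold.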
Troubleshooting
Common Issues
Sensor Timeout
- Issue: Sensor waits indefinitely
- Solution: Check parameter values and data availability
- Debug: Review sensor logs and data sources
Parameter Errors
- Issue: Invalid parameter values
- Solution: Validate parameter formats and values
- Debug: Check parameter syntax and templates
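Two parameter mistakes are easy to catch before a sensor ever runs: a `table_id` that is not in the `project.dataset.table` form used throughout this guide, and a template with unbalanced braces. The following sketch checks for both. The function name `check_parameters` and the exact pattern are assumptions for illustration, not Datablast's own validation, and a `table_id` that itself contains a template should be checked after rendering.

```python
import re

# Assumed shape: project.dataset.table, with hyphens allowed in the project.
_TABLE_ID = re.compile(r"^[\w-]+\.\w+\.\w+$")


def check_parameters(params: dict) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    table_id = params.get("table_id")
    if table_id is not None and not _TABLE_ID.match(table_id):
        problems.append(f"table_id {table_id!r} is not project.dataset.table")
    for key, value in params.items():
        # A template needs as many opening as closing brace pairs.
        if isinstance(value, str) and value.count("{{") != value.count("}}"):
            problems.append(f"{key} has unbalanced template braces: {value!r}")
    return problems


if __name__ == "__main__":
    good = {"table_id": "project.dataset.events", "partition_id": "{{ ds }}"}
    bad = {"table_id": "dataset.events", "partition_id": "{{ ds }"}
    print(check_parameters(good))
    print(check_parameters(bad))
```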
Connection Issues
- Issue: Sensor cannot connect to data source
- Solution: Check connection configurations and credentials
- Debug: Test connections independently
Next Steps
- BigQuery Sensors – BigQuery-specific sensor configuration
- Cloud Storage Sensors – GCS and S3 sensor configuration
- Custom Sensors – Custom sensor implementations
- Sensor Parameters – Parameter configuration and validation
- Sensor Monitoring – Monitoring and alerting strategies