What is dbt (data build tool)?
dbt helps analytics engineers and data engineers transform data inside the warehouse (it’s the T in ELT). You write SQL (plus a bit of YAML for config), and dbt handles dependency ordering, environments, tests, documentation, and deployment.
Project basics
dbt_project.yml → defines your project (models path, materializations, configs).
Models → each model is a single SELECT in a .sql file; dbt builds them into views/tables.
Adapters (connectors) → Snowflake, BigQuery, Redshift, Postgres, Databricks, DuckDB, etc.
Core commands
dbt run → compiles + runs your SQL models.
dbt build → runs seeds, models, snapshots, and tests together, in dependency order.
dbt test → executes tests.
dbt compile → only compiles your SQL (models are not built).
dbt docs generate/serve → builds and serves your project docs.
Materializations (build strategies)
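A materialization is usually chosen per model with a config() call at the top of the .sql file. The sketch below shows the incremental strategy covered in this section. It assumes a hypothetical model file and two hypothetical columns, order_id and updated_at, that the original does not mention; config(), is_incremental(), and {{ this }} are standard dbt Jinja constructs.

```sql
-- models/orders_incremental.sql (hypothetical model name)
-- Assumption: raw_orders has order_id (unique) and updated_at columns.
{{ config(materialized='incremental', unique_key='order_id') }}

SELECT
    order_id,
    customer_id,
    order_amount,
    updated_at
FROM {{ ref('raw_orders') }}

{% if is_incremental() %}
-- Applied only on incremental runs: keep rows newer than what the target table already holds.
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```

The first run builds the full table; later runs process only the rows that pass the filter. Running dbt run --full-refresh rebuilds the table from scratch.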
view (default): lightweight; stores no data, so the query re-runs against upstream data every time the view is read.
table: persists results (fast reads; higher storage).
incremental: only processes new/changed data; supports MERGE on many adapters.
ephemeral: inlines as CTEs (not persisted).
Sources & seeds
sources: reference raw tables that are loaded outside dbt; you can run freshness checks on them (dbt source freshness).
seeds: load small CSVs from your repo into the warehouse (dbt seed).
Testing
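Generic (schema) tests in YAML: unique, not_null, accepted_values, relationships.
Singular (data) tests in SQL: custom assertions written as SELECT queries that return the failing rows; a test passes when no rows come back.
Run: dbt test, dbt test --select test_type:generic (or test_type:singular), or target a model with dbt test --select <model_name>.
Documentation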
Write descriptions in YAML or Markdown; dbt generates a searchable HTML site.
Customize the landing page with overview.md and add images/assets in your project.
Example model: models/customer_spending.sql
SELECT
    customer_id,
    SUM(order_amount) AS total_spent
FROM
    {{ ref('raw_orders') }} -- ref() resolves to the raw_orders model and records it as a dependency
GROUP BY
    customer_id
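The singular tests described under Testing are plain SELECT queries that return the rows violating an assertion. A minimal sketch against the model above follows. Two things here are assumptions not stated in the original: the file name, and the business rule that a customer's total should never be negative.

```sql
-- tests/assert_total_spent_not_negative.sql (hypothetical file name)
-- Singular test: dbt treats every returned row as a failure.
-- Assumption: a negative total_spent is invalid in this project.
SELECT
    customer_id,
    total_spent
FROM {{ ref('customer_spending') }}
WHERE total_spent < 0
```

Because the test depends on customer_spending through ref(), running dbt test --select customer_spending should pick it up along with any generic tests declared on that model.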
dbt turns SQL + YAML into reliable, tested, documented, and deployable data transformations for modern ELT.