DBT 101

What is dbt (data build tool)?

dbt helps analytics engineers/ data engineers transform data inside the warehouse (it’s the T in ELT). You write SQL (plus a bit of YAML for config), and dbt handles dependencies, environments, tests, documentation, and deployment.

Project basics

dbt_project.yml → defines your project (models path, materializations, configs).

Models → each model is a single SELECT in a .sql file; dbt builds them into views/tables.

Adapters (connectors) → Snowflake, BigQuery, Redshift, Postgres, Databricks, DuckDB, etc.

Core commands

dbt run → compiles + runs your SQL models.

dbt build → runs models and tests (and snapshots if present).

dbt test → executes tests.

dbt compile → only compiles (no execution).

dbt docs generate/serve → builds and serves your project docs.

Materializations (build strategies)

view (default): lightweight, always reads from source each query.

table: persists results (fast reads; higher storage).

incremental: only processes new/changed data; supports MERGE on many adapters.

ephemeral: inlines as CTEs (not persisted).

Sources & seeds

sources: reference external tables not built by dbt; you can run freshness checks.

seeds: load small CSVs from your repo into the warehouse (dbt seed).

Testing

Generic (schema) tests in YAML: unique, not_null, accepted_values, relationships.

Singular (data) tests in SQL: custom assertions as SELECT queries.

Run: dbt test, dbt test --select test_type:generic|singular, or target a model.

Documentation

Write descriptions in YAML or Markdown; dbt generates a searchable HTML site.

Customize the landing page with overview.md and add images/assets in your project.

        - models/customer_spending.sql
SELECT
customer_id,
SUM(order_amount) AS total_spent
FROM
{{ ref('raw_orders') }} -- 'ref' function creates a dependency on the 'raw_orders' table
GROUP BY
customer_id

DBT turns SQL + YAML into reliable, tested, documented, and deployable data transformations for modern ELT.