A

Apache Airflow

Apache Airflow is an open-source platform for programmatically authoring, scheduling, and monitoring data workflows.

What is Apache Airflow?

Apache Airflow is an open-source platform to programmatically author, schedule, and monitor workflows using directed acyclic graphs (DAGs) of tasks.
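
To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use the Airflow API. It relies only on the standard-library graphlib module, and the task names (extract, transform, load) and the run_order helper are hypothetical, chosen for illustration. It shows how declaring dependencies between tasks determines a valid execution order, and why cycles are not allowed.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical three-step pipeline (not Airflow code).
# Each key is a task; its value is the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order for a DAG of task names."""
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)  # ['extract', 'transform', 'load']

# A cycle makes the graph impossible to order, which is why
# workflows must be acyclic.
cyclic = {"a": {"b"}, "b": {"a"}}
try:
    run_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```

In Airflow itself, a DAG definition declares the tasks and their dependencies, and the scheduler works out which tasks are ready to run. The sketch above only mirrors that ordering logic.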

Airflow concepts

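The core Airflow concepts are:

  • DAG (directed acyclic graph): the definition of a workflow as a set of tasks and the dependencies between them
  • Operator: a template for a unit of work, such as running a Python function or a shell command
  • Task: an instance of an operator placed in a DAG
  • Scheduler: the component that triggers workflow runs and submits tasks whose dependencies are met
  • Executor: the mechanism that determines how and where tasks are run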

Common misconceptions

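  • "Airflow is just cron": it is a full orchestration platform that handles task dependencies, retries, and monitoring, not only time-based triggering.
  • "Airflow processes data": it orchestrates the work; the processing itself is typically done by the systems that its tasks call.
  • "Airflow is simple": there is a learning curve, especially for complex workflows.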