Hyperion Data Pipeline

Introduction

Hyperion's data pipeline consists of two major components:

Windmill - used for the majority of heavier workloads. Windmill (https://www.windmill.dev/) is an open source workflow engine that orchestrates ETL (Extract, Transform, Load) operations, fetching data from external sources and loading it into the Hyperion database.
n8n.io - used for lighter workloads where timing and capacity are sacrificed for simplicity and development speed.

Rationale for choosing 2 workflow tools

As briefly mentioned above, while Windmill and n8n serve virtually identical use cases, simple workflows that don't handle large volumes of data (i.e. more than 10 MB) are a lot quicker to build in n8n, which allows for rapid prototyping and creation of simple enrichment services.

Windmill

Windmill is the main data pipeline for Hyperion. While it can run scripts in Golang, Python, JavaScript, etc., Golang was chosen as the main language for ETL scripts due to its significantly better performance (up to 10x in real-world testing).
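To give a feel for what an ETL script in Golang might look like, here is a minimal sketch of the three stages. It is written as a standalone program so it can be run locally; an actual Windmill Go script is expected to follow the entrypoint conventions described in the Windmill documentation instead. The record shapes, the cleaning rules, and the in-memory store standing in for the Hyperion database are all hypothetical and chosen only for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// RawRecord is a hypothetical shape for data fetched from an external source.
type RawRecord struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// CleanRecord is a hypothetical shape for rows loaded into the database.
type CleanRecord struct {
	ID   int
	Name string
}

// memoryStore is an in-memory stand-in for the Hyperion database (assumption).
type memoryStore struct {
	rows []CleanRecord
}

// extract parses a JSON payload that would normally come from an external source.
func extract(payload []byte) ([]RawRecord, error) {
	var records []RawRecord
	if err := json.Unmarshal(payload, &records); err != nil {
		return nil, fmt.Errorf("extract: %w", err)
	}
	return records, nil
}

// transform trims and lowercases names, dropping records whose name is empty.
func transform(raw []RawRecord) []CleanRecord {
	clean := make([]CleanRecord, 0, len(raw))
	for _, r := range raw {
		name := strings.TrimSpace(r.Name)
		if name == "" {
			continue
		}
		clean = append(clean, CleanRecord{ID: r.ID, Name: strings.ToLower(name)})
	}
	return clean
}

// load appends the rows to the store and returns how many were written.
func (s *memoryStore) load(rows []CleanRecord) int {
	s.rows = append(s.rows, rows...)
	return len(rows)
}

// runETL chains the three stages and reports the number of loaded rows.
func runETL(payload []byte, store *memoryStore) (int, error) {
	raw, err := extract(payload)
	if err != nil {
		return 0, err
	}
	return store.load(transform(raw)), nil
}

func main() {
	payload := []byte(`[{"id":1,"name":" Alpha "},{"id":2,"name":""}]`)
	store := &memoryStore{}
	n, err := runETL(payload, store)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("loaded %d record(s)\n", n)
}
```

Keeping extract, transform, and load as separate functions makes each stage easy to test on its own before wiring it into a flow.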

It is strongly recommended for anyone wanting to work on Windmill to first read the documentation on the project website: https://www.windmill.dev/docs/intro, especially the Golang script and Flow tutorials.

Also be sure to review:

n8n

Since we're using the free version of n8n, a lot of management features, like variables or sharing workflows between users, are missing. Due to these constraints, avoid using n8n for anything mission-critical.
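Combined with the data volume guidance from the rationale section, the choice between the two tools can be summarized as a rule of thumb. The sketch below is only an illustration of that rule, not an existing Hyperion component: the function name, the engine constants, and the reading of "10 MB" as 10,000,000 bytes are assumptions.

```go
package main

import "fmt"

// largeVolumeBytes is the "large volume" threshold from the rationale section.
// Interpreting 10 MB as 10,000,000 bytes is an assumption.
const largeVolumeBytes int64 = 10 * 1000 * 1000

// Engine names one of the two workflow tools (hypothetical type).
type Engine string

const (
	Windmill Engine = "windmill"
	N8n      Engine = "n8n"
)

// chooseEngine applies the rule of thumb: mission-critical or large-volume
// workflows go to Windmill, everything else can be prototyped in n8n.
func chooseEngine(estimatedBytes int64, missionCritical bool) Engine {
	if missionCritical || estimatedBytes > largeVolumeBytes {
		return Windmill
	}
	return N8n
}

func main() {
	fmt.Println(chooseEngine(2_000_000, false))  // small, non-critical enrichment
	fmt.Println(chooseEngine(50_000_000, false)) // large data volume
	fmt.Println(chooseEngine(2_000_000, true))   // small but mission-critical
}
```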

Service diagram

TODO
