HedgeIOT: Building a Reproducible JupyterHub/JupyterLab Platform


On HedgeIOT, I was responsible for making the development environment reliable for multiple users, not just for one local setup.

[Figure: HedgeIOT JupyterLab]

The goal was simple: anyone on the team should be able to log in and start working in the same environment without dependency drift or manual setup.

Project Scope

Repo: https://github.com/VU-HedgeIOT/HedgeIOT_Jupyterhub_env

I delivered a JupyterHub/JupyterLab deployment focused on reproducibility and team-scale usability:

  • Multi-user container spawning with DockerSpawner
  • Persistent per-user /work directories
  • Shared read-only /shared_data mount for datasets and example notebooks
  • Admin-focused containers with extra tooling and optional SSH remote development

[Figure: JupyterHub architecture and volume mounts]

What I Built

1) Multi-user runtime that stays consistent

Each user gets an isolated container, but the environment definition is shared and repeatable. This reduced onboarding friction and removed “works on my machine” issues.
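
To illustrate the idea, a shared spawner setup might look roughly like the sketch below. It is not taken from the repository: the image tag and network name are hypothetical, and in a real jupyterhub_config.py the object `c` is provided by `get_config()` rather than the stand-in used here so the snippet can run on its own.

```python
from types import SimpleNamespace

# Hypothetical values for illustration only; the real image tag and
# network name would come from the deployment itself.
LAB_IMAGE = "hedgeiot/jupyterlab:2024.1"
DOCKER_NETWORK = "jupyterhub-network"


def configure_spawner(c, image=LAB_IMAGE, network=DOCKER_NETWORK):
    """Apply DockerSpawner settings so every user starts from the same image."""
    # Launch each single-user server in its own Docker container.
    c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
    # One pinned image for everyone: this is the shared environment definition.
    c.DockerSpawner.image = image
    # Attach user containers to a named Docker network (assumed to exist).
    c.DockerSpawner.network_name = network
    # Remove stopped containers; persistent state belongs in volumes instead.
    c.DockerSpawner.remove = True
    return c


def make_stand_in_config():
    """Stand-in for the `c` object JupyterHub normally provides."""
    return SimpleNamespace(
        JupyterHub=SimpleNamespace(),
        DockerSpawner=SimpleNamespace(),
    )
```

Pinning a specific image tag, rather than a moving tag such as `latest`, is what keeps every user's container on the same environment over time.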

2) Storage model for daily work and collaboration

I separated personal and shared storage clearly:

  • /work for each user’s persistent workspace
  • /shared_data for common assets that should remain stable and read-only

Because /shared_data is mounted read-only, users cannot accidentally modify shared datasets or example notebooks, while everyone still works from the same files.
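
The /work and /shared_data split could be expressed with DockerSpawner's `volumes` and `read_only_volumes` options, roughly as sketched here. The host path and the volume naming pattern are assumptions for illustration, not the project's actual values.

```python
def build_mounts(shared_host_path="/srv/hedgeiot/shared_data"):
    """Return (volumes, read_only_volumes) as dicts of source -> container path.

    The host path default and the volume name pattern are hypothetical.
    """
    # DockerSpawner expands {username} per user, so each user gets a
    # separate named volume that survives container restarts.
    volumes = {"jupyterhub-user-{username}": "/work"}
    # The same host directory is mounted read-only into every container.
    read_only_volumes = {shared_host_path: "/shared_data"}
    return volumes, read_only_volumes


# In jupyterhub_config.py the result would be assigned like this:
#   volumes, read_only = build_mounts()
#   c.DockerSpawner.volumes = volumes
#   c.DockerSpawner.read_only_volumes = read_only
```

Using a named volume per user keeps personal work independent of the container lifecycle, while the read-only bind mount enforces the "stable dataset" rule at the filesystem level rather than by convention.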

3) Operational tooling for maintainers

For debugging and maintenance, I added admin-focused containers with extra tooling and optional SSH access for remote development.
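
One way to hand maintainers a different container is a pre-spawn hook that picks the image based on the user's admin flag. The sketch below assumes that approach and uses hypothetical image names; the repository may select admin containers differently.

```python
from types import SimpleNamespace

# Hypothetical image names for illustration only.
USER_IMAGE = "hedgeiot/jupyterlab:2024.1"
ADMIN_IMAGE = "hedgeiot/jupyterlab-admin:2024.1"


def select_image(spawner):
    """Pre-spawn hook: give hub admins the image with extra tooling."""
    if spawner.user.admin:
        spawner.image = ADMIN_IMAGE
    else:
        spawner.image = USER_IMAGE


def fake_spawner(is_admin):
    """Minimal stand-in for a spawner, used only to try the hook locally."""
    return SimpleNamespace(user=SimpleNamespace(admin=is_admin), image=None)


# In jupyterhub_config.py the hook would be registered like this:
#   c.Spawner.pre_spawn_hook = select_image
```

Keeping the choice in a single hook means regular users never receive the heavier admin image, and the rule for who gets which container lives in one place.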

Technical Challenges I Solved

  • Standardized environment bootstrapping for all users
  • Balanced isolation with shared resources
  • Designed mounts/permissions to prevent accidental dataset changes
  • Enabled smoother troubleshooting for deployment maintainers

Outcome

The team could onboard faster, run notebooks in a consistent environment, and focus on HedgeIOT logic instead of local setup issues.

For me, this project demonstrates practical MLOps/platform engineering experience: container orchestration, reproducibility, user isolation, and operational maintainability.