Data Science Template

The data-science template is the core of this meta-project. It follows a modular structure inspired by industry best practices.

Folder Structure

├── .github/          # GitHub Actions CI/CD workflows
├── .gitlab/          # GitLab CI/CD pipelines
├── app/              # Streamlit dashboard
├── api/              # FastAPI model serving
├── data/             # Local data storage (Git ignored)
│   ├── external/
│   ├── interim/
│   ├── processed/
│   └── raw/
├── dockerfiles/      # Optimized Docker images
├── models/           # Trained models (Git ignored)
├── notebooks/        # Jupyter notebooks for EDA and experimentation
├── src/              # Main project source code
│   └── {{ package_name }}/
│       ├── core/     # Core logic and utilities
│       ├── model/    # Model training and prediction
│       └── ...
├── tests/            # Pytest test suite
├── Makefile          # Automation tasks
├── pyproject.toml    # Dependency management
└── README.md         # Project documentation
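
Because data/ is Git ignored, its subfolders may not exist in a fresh clone. The following is a minimal sketch, not part of the template itself, of how code could resolve and create the four data stages listed above. The names DATA_STAGES, data_dirs, and ensure_data_dirs are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Hypothetical constant: the four data stages shown in the folder tree.
DATA_STAGES = ("external", "interim", "processed", "raw")


def data_dirs(project_root: Path) -> dict[str, Path]:
    """Map each data stage to its directory under <project_root>/data."""
    return {stage: project_root / "data" / stage for stage in DATA_STAGES}


def ensure_data_dirs(project_root: Path) -> dict[str, Path]:
    """Create any missing data directories and return the stage mapping."""
    dirs = data_dirs(project_root)
    for path in dirs.values():
        # exist_ok=True makes the call safe to repeat on an existing tree.
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

Resolving paths from one place keeps notebooks, the API, and the dashboard from hard-coding their own copies of the layout.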

Core Concepts

  • Separation of Concerns: Logic is in src/, UI is in app/, API is in api/.

  • Reproducible Environment: uv.lock pins exact dependency versions, so every developer installs the same ones.

  • Docker Ready: Multi-stage Dockerfiles for optimized production builds.
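
To illustrate the separation of concerns described above, here is a dependency-free sketch. It assumes a toy predict function standing in for real model code, and the FastAPI and Streamlit layers are reduced to plain functions so the example runs without either framework. All names (Prediction, predict, predict_endpoint, render_prediction) are hypothetical.

```python
from dataclasses import dataclass


# --- Would live under src/ (core logic, no UI or HTTP concerns) ---
@dataclass(frozen=True)
class Prediction:
    label: str
    score: float


def predict(features: list[float], threshold: float = 0.5) -> Prediction:
    """Toy stand-in for a model: the score is the mean of the features."""
    if not features:
        raise ValueError("features must not be empty")
    score = sum(features) / len(features)
    label = "positive" if score >= threshold else "negative"
    return Prediction(label=label, score=score)


# --- Would live under api/ (thin adapter returning a JSON-ready dict) ---
def predict_endpoint(payload: dict) -> dict:
    result = predict(payload["features"])
    return {"label": result.label, "score": result.score}


# --- Would live under app/ (thin adapter formatting text for display) ---
def render_prediction(features: list[float]) -> str:
    result = predict(features)
    return f"{result.label} ({result.score:.2f})"
```

Both adapters call the same predict function, so a change to the model logic is made once and picked up by the API and the dashboard alike.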

The Makefile

The Makefile is your main entry point for automation:

  • make dev-install: Set up the virtual environment and project directories.

  • make check: Run Ruff (lint) and Mypy (types).

  • make test: Run all tests with coverage.

  • make api: Launch the FastAPI server.

  • make app: Launch the Streamlit dashboard.
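
As a rough idea of what make test would pick up from tests/, here is a minimal pytest-style sketch. The function scale_to_unit is hypothetical and defined inline only to keep the example self-contained; in a real project it would be imported from the package under src/.

```python
# Hypothetical function under test; in the template it would be imported
# from the package in src/ rather than defined in the test file.
def scale_to_unit(values: list[float]) -> list[float]:
    """Min-max scale values to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Pytest collects functions whose names start with "test_".
def test_scale_to_unit_bounds():
    assert scale_to_unit([2.0, 4.0, 6.0]) == [0.0, 0.5, 1.0]


def test_scale_to_unit_constant_input():
    assert scale_to_unit([3.0, 3.0]) == [0.0, 0.0]
```

Plain assert statements are enough here, since pytest reports the compared values when an assertion fails.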