Full-Stack Scientific Software Engineering

From FastAPI to PyTorch Bindings and Climate Modeling

Jared Frazier

2026-06-13

LinguaLoop

A Modular, Tested, and Packaged Language Learning App

Overview

  • Motivation
    • Practice language listening skills
    • Transcribe a short segment of a YouTube video
    • Compare your transcript against the reference transcript
docker run -p 49152:49152 lingua-loop

System Architecture

  • API/Web Layer
    • Handle client HTTP requests
    • Handle web server responses
  • Business Logic/Services Layer
    • Parse and normalize user input
    • Compute scoring metrics for transcription attempt
  • Data Layer
    • Use “boundary” (aka integrations) to scrape reference YouTube transcripts
    • Read/write to SQLite database
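
The services-layer steps above (normalize user input, then score the attempt) can be sketched as follows. The slides do not say which metrics LinguaLoop computes, so word error rate (WER) is assumed here purely for illustration, and the function names `normalize` and `word_error_rate` are hypothetical rather than taken from the project.

```python
import re
import unicodedata


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation, and split into word tokens."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace everything except word characters, whitespace, and apostrophes.
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()


def word_error_rate(attempt: str, reference: str) -> float:
    """Word-level edit distance divided by the reference length (assumed metric)."""
    hyp, ref = normalize(attempt), normalize(reference)
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic-programming edit distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from attempt
                            curr[j - 1] + 1,      # extra word in attempt
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

With this sketch, `word_error_rate("the cat sat", "the cat sat down")` evaluates to 0.25: one missing word out of four reference words.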

Testing Strategy

tests/
├── boundary/
│   └── test_youtube_transcript_api.py
├── integration/
│   └── test_integration.py
└── unit/
    ├── api/
    ├── db/
    └── services/
  • Multi-layer testing
  • Extensive use of mocking to isolate dependencies (DB, services, external APIs)
  • Unit tests validate service logic in complete isolation
  • Integration tests run full stack: API → service → database (no mocks)
  • CI/CD on GitHub requires all tests to pass on every PR
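
A minimal sketch of the mocking approach described above, using the standard library's `unittest.mock`. The `ScoringService` class, its `fetch` and `save` collaborators, and the simple positional-match score are hypothetical stand-ins, not LinguaLoop's actual code; the point is only that the external transcript client and the database are replaced by mocks so the service logic runs in isolation.

```python
from unittest.mock import Mock


class ScoringService:
    """Hypothetical service; names are illustrative only."""

    def __init__(self, transcript_client, repository):
        self._client = transcript_client
        self._repo = repository

    def score_attempt(self, video_id: str, attempt: str) -> float:
        reference = self._client.fetch(video_id)
        ref_words = reference.lower().split()
        hyp_words = attempt.lower().split()
        # Fraction of reference words matched at the same position.
        matches = sum(r == h for r, h in zip(ref_words, hyp_words))
        score = matches / len(ref_words) if ref_words else 0.0
        self._repo.save(video_id, score)
        return score


def test_score_attempt_uses_mocks():
    # Neither the external API nor the database is touched.
    client = Mock()
    client.fetch.return_value = "guten morgen zusammen"
    repo = Mock()

    service = ScoringService(client, repo)
    score = service.score_attempt("abc123", "Guten Morgen")

    assert score == 2 / 3
    client.fetch.assert_called_once_with("abc123")
    repo.save.assert_called_once_with("abc123", 2 / 3)
```

An integration test for the same path would instead construct the real client and a real SQLite database and send the request through the API.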

From Code to Package

Relevance

  • Mocks emulate subsystem behavior for isolated validation
  • Integration tests verify interactions across components
  • CI pipelines provide automated validation
  • Built independently, full stack, from start to finish, with no vibe coding
  • Scale: 1k Python, 500 JavaScript LOC

FTorch

Build System and CI/CD Improvements for One of the Most Widely-Used Fortran-PyTorch Interfaces

Overview

  • Library for coupling PyTorch ML models directly to Fortran code
  • Used in the Community Earth System Model and the ICON model, and at the UK Atomic Energy Authority
  • Developed primarily by the University of Cambridge

Contributions

  • CI/CD pipelines for Intel oneAPI and GCC toolchains — multi-compiler support, cross-platform reliability
  • Automatic pkg-config file generation — simplified integration into legacy build systems
  • Static library builds — enabling deployment in environments with dynamic linking restrictions
  • Diagnosed and resolved subtle compilation issues — improved test-suite stability and build reproducibility
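
To show what pkg-config file generation produces, here is a rough Python sketch that renders a `.pc` file from a template. FTorch generates its file within its own build system, not with this code; the library name `ftorch`, the placeholder version, the install prefix, and the `lib`/`include` layout are all assumptions made for the example.

```python
from string import Template

# "$$" emits a literal "$", so "${prefix}" is left for pkg-config to expand.
PC_TEMPLATE = Template("""\
prefix=$prefix
libdir=$${prefix}/lib
includedir=$${prefix}/include

Name: $name
Description: $description
Version: $version
Libs: -L$${libdir} -l$name
Cflags: -I$${includedir}
""")


def render_pc(name: str, version: str, prefix: str, description: str) -> str:
    """Return the text of a pkg-config file for a library (illustrative only)."""
    return PC_TEMPLATE.substitute(
        name=name, version=version, prefix=prefix, description=description
    )


if __name__ == "__main__":
    # All values below are placeholders, not FTorch's real metadata.
    print(render_pc("ftorch", "0.0.0", "/opt/ftorch", "Fortran-PyTorch interface"))
```

Once such a file is on `PKG_CONFIG_PATH`, a legacy Makefile can obtain flags with `pkg-config --cflags --libs ftorch` instead of hard-coding include and library paths.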

Relevance

  • Build & CI/CD engineering — automated multi-compiler testing directly maps to DPPS integration/verification needs
  • Cross-platform reliability — essential for distributed pipelines running across heterogeneous infrastructure
  • Legacy system integration (pkg-config, static builds) — relevant for interfacing DPPS with existing telescope software
  • Distributed, international collaboration — contributions to an open, community-driven project mirror CTAO’s team structure
  • Scale: 1k YAML (CI/CD), 1k CMake, 2.5k Fortran, 500 C++ LOC

ICON

Code Ownership of the Upper Atmosphere Components of the Icosahedral Nonhydrostatic (ICON) Model

Overview

  • Unified global weather prediction and climate model co-developed by the German Weather Service (DWD), MeteoSwiss, and other institutions
  • Massively parallel with MPI, OpenMP, and OpenACC offloading
  • Operational at DWD since 2015 for weather forecasting in Germany and Switzerland

Contributions

  • Refactored 10k+ LOC in upper-atmosphere components into modular structure, enabling GPU porting and separation of responsibilities
  • Integrated FTorch (Fortran-PyTorch) into the ICON build system
  • Leading the adoption of Enzyme for automatic differentiation and the modernization of compiler infrastructure at DKRZ

Relevance

  • Large-scale refactoring (10k+ LOC) demonstrates ability to navigate and improve complex, multi-author Fortran codebases — directly relevant to DPPS common software development
  • Build system integration & compiler modernization mirrors the multi-toolchain integration and verification work required by DPPS
  • Handling scientific merge requests across institutions mirrors the distributed, collaborative review process described in the role
  • Scale: 550k Fortran (10k Upper Atmosphere), 3k m4 (GNU Autotools) LOC

Summary

  • LinguaLoop — full-stack Python app with CI/CD, packaging (PyPI), containerization, and multi-layer testing
  • FTorch — build system engineering with multi-compiler CI/CD pipelines and legacy system integration
  • ICON — large-scale Fortran refactoring (10k+ LOC), compiler modernization, and HPC software stewardship

Questions?

Appendix

Demo: LinguaLoop