Enzosol: AI-Powered Protein Design
Published:
Period | Hours / week (split: programming/software/RL/Bayesian) | Advanced programming | Software dev | RL | Bayesian | Deliverable |
---|---|---|---|---|---|---|
Q1 (Sep–Nov 2025) | 11h (4/3/2/2) | Python internals (C-API, buffer protocol), NumPy/SciPy memory model; vectorization & numba; pick C++20+pybind11 or Rust+pyo3 for extensions; profiling basics (cProfile, perf) | Git strategies, pytest + coverage, mypy/ruff, packaging with poetry/pdm; CLI design; docs with Sphinx/MkDocs; CI via GitHub Actions & pre-commit | Refresher: MDPs/bandits; implement tabular DP/MC, ε-greedy; Gymnasium intro; clean-from-scratch code | Probability refresher; conjugate models; prior/posterior predictive checks; PyMC or Stan basics; ArviZ diagnostics | Tiny PyPI package (bio kernel or parser), benchmarks + doc site |
Q2 (Dec 2025–Feb 2026) | 10h (4/2/2/2) | Native acceleration: C++/Rust ext modules; cache-aware data layouts; PyTorch custom ops shim; flamegraphs; microbenchmarks; intro GPU via Triton or CUDA kernels | Repro/data: DVC for datasets, dataset cards; semantic versioning; release wheels (Linux/macOS) | Policy Gradient (REINFORCE) from scratch; advantage baselines; experiment tracking (MLflow/W&B) | Hierarchical models; GLMs; HMC/NUTS tuning; LOO/WAIC model comparison | Accelerated op (e.g., k-mer tally/UMI dedupe) as a PyTorch/JAX extension + reproducible DVC pipeline |
Q3 (Mar–May 2026) | 11h (3/3/3/2) | GPU depth: memory coalescing, warp/wavefront basics; streams & async; JAX jit/pjit mental model | Workflow engines: Snakemake/Nextflow + Docker; config mgmt with Hydra; API sketch with FastAPI | Value-based RL: DQN (clean-room), target nets, replay buffers; sanity-check OOD & reward scaling | Gaussian Processes for time-series expression; sparse/inducing points; calibration | End-to-end pipeline (Nextflow) producing features → API serving a GPU-accelerated op; DQN repo with reproducible results |
Q4 (Jun–Aug 2026) | 10h (3/2/3/2) | HPC touches: SIMD/AVX, OpenMP; SLURM; distributed training intro (FSDP/torch.distributed) | Observability: metrics/logging/tracing; perf budgets; simple K8s deploy or autoscaling container | PPO/A2C with robust training loops; eval protocol; basic safe/constrained tricks | Bayesian deconvolution for multi-omics; VI/ADVI; prior sensitivity & SBC | Year-capstone: open-source “fast-omics-kernels” + preprint-style tech report OR “RL-guided assay selection (sim)” with Bayesian uncertainty; public demo & docs |
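As an illustration of the Q1 "ε-greedy, clean-from-scratch" item, here is a minimal sketch of an ε-greedy agent on a Bernoulli bandit, using only the standard library. The function name `run_epsilon_greedy`, the arm means, and the default parameters are assumptions chosen for the example, not part of the plan itself.

```python
import random


def run_epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy policy.

    true_means: success probability of each arm (hypothetical values).
    Returns (value estimates, pull counts), one entry per arm.
    """
    rng = random.Random(seed)  # local RNG so runs are reproducible
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running mean reward per arm
    counts = [0] * n_arms       # number of pulls per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate
            # (ties resolve to the lowest index).
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample mean for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pull counts:", counts)
    print("estimates:  ", [round(e, 3) for e in estimates])
```

The incremental mean update avoids storing reward histories, and seeding a local `random.Random` keeps runs repeatable, which fits the plan's emphasis on reproducible results.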
Published:
This infographic turns complex scientific concepts into an easily digestible visual narrative, covering everything from data acquisition challenges to the latest deep learning integration methods and their application to understanding disease.
Published:
Generally, there isn’t much going on in the summer…
Published:
Applications for this degree are typically open from April 1 to May 15. Applicants need a bachelor’s degree in biology or a related field, English proficiency at C1 level (IELTS 6.5), and a passing result on a knowledge test, which covers general biology and includes some programming essay questions. The program spans two years: the first three semesters are dedicated to coursework, followed by an internship (preferably in the lab where you plan to write your master’s thesis) and, finally, the thesis. The thesis can be written in any lab that specializes in computational biology or bioinformatics, which leaves considerable flexibility.