This is Parsa Hariri.
Hello! My name is Parsa, and I work as a computational biologist and bioinformatician at the University of Göttingen. I specialize in modeling biological networks, such as those found in the brain and in embryos. If you’re interested in a scientific discussion, want to know more about my educational journey, or have any other questions, please don’t hesitate to reach out. I also organize talks about bioinformatics; you can learn more about them here
Latest Projects
Master's Thesis
Image segmentation is a fundamental task in computer vision, with applications in medical imaging, autonomous driving, and object recognition. Recent advancements in machine learning have led to the development of powerful models like U-Net, diffusion models, and transformers, which show promise in segmenting images with high precision.
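Segmentation quality is commonly scored with intersection-over-union (IoU), regardless of which model produced the mask. A minimal sketch, with made-up toy masks:

```python
import numpy as np

def iou(pred, target):
    # pred/target: boolean masks of the same shape
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

# hypothetical 4x4 example: a 2x2 predicted mask vs. a 3x3 ground truth
pred = np.zeros((4, 4), bool); pred[1:3, 1:3] = True
target = np.zeros((4, 4), bool); target[1:4, 1:4] = True
print(iou(pred, target))  # 4 / 9 ≈ 0.444
```

The overlap here is 4 pixels and the union is 9, so a perfect prediction would score 1.0 and a disjoint one 0.0.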
GANs for Biomedical Image Augmentation
Generative Adversarial Networks (GANs) are a class of neural networks consisting of two models: a generator and a discriminator. The generator creates fake data, while the discriminator evaluates it against real data. Both networks train together in a competitive process, where the generator improves at creating realistic outputs, and the discriminator becomes better at distinguishing between real and fake data. Over time, the generator learns to produce high-quality, realistic images that are hard for the discriminator to distinguish from real data.
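The competitive objective described above can be made concrete with toy numbers. The discriminator scores below are hypothetical, and the generator term uses the standard non-saturating loss; this is a sketch of the loss bookkeeping, not a full training loop:

```python
import numpy as np

def bce(p, y):
    # binary cross-entropy between predicted probabilities p and labels y
    eps = 1e-7
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# hypothetical discriminator outputs for one batch
d_real = np.array([0.9, 0.8, 0.95])  # D(x): should be near 1
d_fake = np.array([0.1, 0.2, 0.05])  # D(G(z)): should be near 0

# discriminator: classify real as 1, fake as 0
d_loss = bce(d_real, np.ones(3)) + bce(d_fake, np.zeros(3))
# generator (non-saturating): push D to call fakes real
g_loss = bce(d_fake, np.ones(3))
```

Early in training the discriminator wins easily (low `d_loss`, high `g_loss`, as here); at equilibrium D outputs roughly 0.5 everywhere and the generator's samples are indistinguishable from real data.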
Skin Cancer Detection with ViTs
Skin cancer is one of the most common types of cancer worldwide, with millions of new cases diagnosed each year. Early detection is critical in improving patient outcomes, as it increases the chances of successful treatment. In recent years, the advancement of artificial intelligence (AI) techniques has opened new possibilities for detecting skin cancer at an early stage using automated methods. In particular, Vision Transformers (ViTs) have emerged as a powerful tool in medical imaging, providing state-of-the-art performance in tasks like image classification and segmentation.

The goal of this project is to explore the use of ViTs for the classification of skin cancer images. Using data from the ISIC 2024 Challenge, it investigates the efficacy of ViTs in identifying high-risk cancerous lesions from high-resolution 3D Total Body Photography (3D-TBP) images, and compares multiple ViT variants to determine which architecture offers the best performance for skin cancer detection.

The ISIC 2024 dataset poses a particular challenge: its images are high-resolution, requiring efficient model architectures that can handle large amounts of data while maintaining precision in prediction.

In the experiments, the LeViT model achieved the highest accuracy (94%), while models like CaiT and Simple ViT also performed well. Token-to-Token ViT ran into memory issues, likely due to its high-resolution patching technique.
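All ViT variants share a first step: cutting the image into fixed-size patches that become the transformer's tokens. A minimal sketch of that patching step in NumPy (sizes are illustrative; the learned linear projection and position embeddings are omitted):

```python
import numpy as np

def patchify(img, patch):
    # img: (H, W, C) array -> (num_patches, patch*patch*C) token matrix
    H, W, C = img.shape
    ph, pw = H // patch, W // patch
    p = img[:ph * patch, :pw * patch]                 # drop any ragged border
    p = p.reshape(ph, patch, pw, patch, C)            # split both axes
    p = p.transpose(0, 2, 1, 3, 4)                    # group patch grid first
    return p.reshape(ph * pw, patch * patch * C)      # flatten each patch

img = np.random.rand(224, 224, 3)   # a standard ViT input size
tokens = patchify(img, 16)          # 14*14 = 196 tokens, each 16*16*3 = 768-dim
```

The memory pressure mentioned above follows directly from this step: token count grows quadratically as resolution increases or patch size shrinks, which is plausibly what hurt Token-to-Token ViT on 3D-TBP images.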
Latest Posts
Enzosol: AI-Powered Protein Design
Must Learn
| Period | Time / wk | Advanced programming | Software dev | RL | Bayesian | Deliverable |
| --- | --- | --- | --- | --- | --- | --- |
| Q1 (Sep–Nov 2025) | 11h (4/3/2/2) | Python internals (C-API, buffer protocol), NumPy/SciPy memory model; vectorization & numba; pick C++20+pybind11 or Rust+pyo3 for extensions; profiling basics (cProfile, perf) | Git strategies, pytest + coverage, mypy/ruff, packaging with poetry/pdm; CLI design; docs with Sphinx/MkDocs; CI via GitHub Actions & pre-commit | Refresher: MDPs/bandits; implement tabular DP/MC, ε-greedy; Gymnasium intro; clean-from-scratch code | Probability refresher; conjugate models; prior/posterior predictive checks; PyMC or Stan basics; ArviZ diagnostics | Tiny PyPI package (bio kernel or parser), benchmarks + doc site |
| Q2 (Dec 2025–Feb 2026) | 10h (4/2/2/2) | Native acceleration: C++/Rust ext modules; cache-aware data layouts; PyTorch custom ops shim; flamegraphs; microbenchmarks; intro GPU via Triton or CUDA kernels | Repro/data: DVC for datasets, dataset cards; semantic versioning; release wheels (Linux/Mac) | Policy gradient (REINFORCE) from scratch; advantage baselines; experiment tracking (MLflow/W&B) | Hierarchical models; GLMs; HMC/NUTS tuning; LOO/WAIC model comparison | Accelerated op (e.g., k-mer tally/UMI dedupe) as a PyTorch/JAX extension + reproducible DVC pipeline |
| Q3 (Mar–May 2026) | 11h (3/3/3/2) | GPU depth: memory coalescing, warp/wavefront basics; streams & async; JAX jit/pjit mental model | Workflow engines: Snakemake/Nextflow + Docker; config mgmt with Hydra; API sketch with FastAPI | Value-based RL: DQN (clean-room), target nets, replay buffers; sanity-check OOD & reward scaling | Gaussian processes for time-series expression; sparse/inducing points; calibration | End-to-end pipeline (Nextflow) producing features → API serving a GPU-accelerated op; DQN repo with reproducible results |
| Q4 (Jun–Aug 2026) | 10h (3/2/3/2) | HPC touches: SIMD/AVX, OpenMP; SLURM; distributed training intro (FSDP/torch.distributed) | Observability: metrics/logging/tracing; perf budgets; simple K8s deploy or autoscaling container | PPO/A2C with robust training loops; eval protocol; basic safe/constrained tricks | Bayesian deconvolution for multi-omics; VI/ADVI; prior sensitivity & SBC | Year-capstone: open-source “fast-omics-kernels” + preprint-style tech report OR “RL-guided assay selection (sim)” with Bayesian uncertainty; public demo & docs |
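The Q1 RL items above (tabular methods, ε-greedy) can be sketched as a toy multi-armed bandit; the arm rewards, seed, and hyperparameters here are illustrative, not part of the plan:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])  # hypothetical 3-armed bandit

def eps_greedy(steps=5000, eps=0.1):
    q = np.zeros(3)       # running value estimates per arm
    counts = np.zeros(3)  # pulls per arm
    for _ in range(steps):
        if rng.random() < eps:
            a = int(rng.integers(3))        # explore: random arm
        else:
            a = int(np.argmax(q))           # exploit: current best arm
        r = rng.normal(true_means[a], 0.1)  # noisy reward
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]      # incremental sample mean
    return q

q = eps_greedy()  # estimates converge toward true_means; arm 2 wins
```

With enough steps the estimates approach the true means and the greedy choice settles on the best arm, which is the core explore/exploit trade-off the Q1 refresher targets before moving to REINFORCE and DQN in later quarters.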