Blog

Technical articles on biomedical data science, machine learning pipelines, and reproducible research.

BPCells, LogNormalize and Integration: Design Decisions Behind a Single-Nucleus RNA-seq Pipeline in R

June 01, 2026 · 11 min read

Why an snRNA-seq pipeline that runs on a 16 GB laptop is built on BPCells on-disk matrices, uses LogNormalize instead of SCTransform, and compares RPCA and Harmony integration side by side: the reasoning, not the boilerplate.

single-cell · snRNA-seq · Seurat · BPCells · R · bioinformatics · integration · Alzheimer

Bulk RNA-seq with DESeq2: The Design Decisions That Make Differential Expression Trustworthy

June 01, 2026 · 9 min read

What actually decides whether a DESeq2 result holds up: the design formula, log-fold-change shrinkage, and a DESeq2-vs-edgeR cross-check — applied to 192 5xFAD Alzheimer mouse samples. The combined-tissue model returns a cross-validated microglial (disease-associated) signature: 418 DE genes in DESeq2, 465 in edgeR, 417 concordant.

bulk RNA-seq · DESeq2 · edgeR · R · bioinformatics · Alzheimer · transcriptomics · differential expression

From MaxQuant to Results: How I Structure an LFQ Proteomics Pipeline in R

May 28, 2026 · 14 min read

A walkthrough of the modular LFQ proteomics pipeline I use in practice — from MaxQuant ProteinGroups output through normalisation, mixed imputation, limma/eBayes differential abundance, and visualisation. Real code, real dataset, and the decisions that most tutorials quietly skip.

proteomics · LFQ · R · bioinformatics · limma · DEP · MaxQuant

Missing Data Imputation in Label-Free Quantitative Proteomics: A Mixed Strategy Approach

May 07, 2026 · 12 min read

Missing values are unavoidable in label-free quantitative proteomics. Learn when and how to apply MNAR versus MAR imputation strategies using a robust mixed approach that classifies missingness patterns at the protein-condition level.

proteomics · imputation · data science · R · bioinformatics
