Proteomics & Single-Cell Genomics

01

Mass Spectrometry

Mass Spectrometry–Based Proteomics

Tandem Mass Tag (TMT)-based quantitative proteomics enables multiplexed quantification of protein abundance across multiple conditions with high dynamic range and sensitivity. I employ label-free quantification (LFQ) and isobaric labeling (TMT 6-, 10-, 11-, and 16-plex) coupled with liquid chromatography–tandem mass spectrometry (LC-MS/MS) to measure proteome-wide changes.

Using subcellular fractionation (soluble, membrane, and insoluble fractions), I isolate organelles and proteostatic compartments, then quantify dysregulated proteins in complex disease models. For Parkinson's disease (PD) models, combined subcellular fractionation and TMT-proteomics identified enrichment of aggresome/autophagy markers (HDAC6, p62/SQSTM1, LAMP1, LAMP2, TFEB) in insoluble fractions under dual-hit (PFF + IFN-γ) conditions, linking proteostatic dysfunction to neuroinflammation-driven neuronal loss.

Downstream analysis uses Proteome Discoverer, MaxQuant, and Perseus for protein quantification, normalization, and statistical testing. Pathway analysis via GSEA, GO enrichment, and IPA reveals dysregulated biological processes (e.g., interferon-γ signaling, lysosomal degradation, antigen presentation).
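The normalization step can be illustrated with a minimal sketch. It assumes simple median scaling across TMT channels and uses made-up intensities; the channel names and the helpers `median_normalize` and `log2_fold_change` are hypothetical and do not reproduce the exact procedures in Proteome Discoverer, MaxQuant, or Perseus.

```python
import math
from statistics import mean, median

def median_normalize(intensities):
    """Scale each channel so its median matches the median of all channel medians.

    intensities: dict mapping channel name -> list of reporter intensities,
    with proteins in the same order in every channel.
    """
    channel_medians = {ch: median(vals) for ch, vals in intensities.items()}
    target = median(channel_medians.values())
    return {
        ch: [v * target / channel_medians[ch] for v in vals]
        for ch, vals in intensities.items()
    }

def log2_fold_change(normalized, treated, control):
    """Per-protein log2 ratio of mean treated to mean control intensity."""
    n_proteins = len(next(iter(normalized.values())))
    result = []
    for i in range(n_proteins):
        t = mean(normalized[ch][i] for ch in treated)
        c = mean(normalized[ch][i] for ch in control)
        result.append(math.log2(t / c))
    return result

# Toy data (hypothetical): three proteins measured in four TMT channels.
raw = {
    "ctrl_1": [100.0, 200.0, 400.0],
    "ctrl_2": [110.0, 220.0, 440.0],
    "dual_1": [90.0, 180.0, 720.0],
    "dual_2": [95.0, 190.0, 760.0],
}
norm = median_normalize(raw)
lfc = log2_fold_change(norm, treated=["dual_1", "dual_2"], control=["ctrl_1", "ctrl_2"])
```

After scaling, loading differences between channels no longer masquerade as biology: only the third toy protein keeps a fold change.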

Volcano Plot — Differential Protein Expression

A. Differential Proteomics

PFF + IFN-γ vs. Control in iPSC-derived dopamine neurons (log2 FC, -log10 padj)

Red: upregulated proteins (stress, autophagy markers). Blue: downregulated (dopaminergic, synaptic markers). Dashed line: padj = 0.05 threshold.

Pathway Enrichment — Top GO/KEGG Terms

B. Pathway Analysis

GO Biological Process and KEGG Pathway enrichment (-log10 FDR)

Red: upregulated pathways in dual-hit condition. Blue: downregulated pathways in neurons.

Protein Abundance Across Conditions

C. Protein Quantification

Selected lysosomal and neuroprotection markers across conditions (normalized TMT intensity)

Grouped bars: Control, PFF, IFN-γ, PFF+IFN-γ. Key proteins: LAMP1, LAMP2, TFEB, NRF2, TH, HDAC6, SQSTM1, LC3B.

02

Single-Cell Transcriptomics

Single-Cell RNA Sequencing (scRNA-seq)

10x Genomics Chromium-based scRNA-seq captures transcriptomes from thousands of individual cells, enabling unbiased identification of cell type heterogeneity, developmental trajectories, and stress-responsive populations. I process raw sequencing data through Cell Ranger for alignment, barcode demultiplexing, and gene-cell matrix construction.

Analysis leverages Seurat (R) and Scanpy (Python) for clustering, dimensionality reduction (UMAP, t-SNE), and differential expression testing. In iPSC-derived neuronal cultures and patient-derived Parkinson's neurons, scRNA-seq reveals cellular heterogeneity including dopamine neurons (TH+, DAT+, NURR1+), cortical neurons (MAP2+), astrocytes (GFAP+, AQP4+), microglia-like cells (AIF1+), neural progenitors (SOX2+, NESTIN+), and stress-responsive/apoptotic populations. Single-cell transcriptomics is essential for resolving cell-type-specific responses to disease stimuli and identifying biomarkers for disease progression.
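Before clustering, counts are usually normalized per cell and screened for mitochondrial content. The sketch below illustrates both ideas in plain Python rather than calling Seurat or Scanpy. It assumes the common log1p-of-counts-per-10,000 scheme and the human `MT-` gene prefix; the function names and the toy cell are hypothetical.

```python
import math

def log_normalize(cell_counts, scale=10_000):
    """Rescale one cell's counts to sum to `scale`, then apply log1p."""
    total = sum(cell_counts.values())
    if total == 0:
        raise ValueError("cell has no counts; it should be removed during QC")
    return {gene: math.log1p(c * scale / total) for gene, c in cell_counts.items()}

def percent_mito(cell_counts, prefix="MT-"):
    """Percentage of a cell's counts that come from mitochondrial genes."""
    total = sum(cell_counts.values())
    mito = sum(c for gene, c in cell_counts.items() if gene.startswith(prefix))
    return 100.0 * mito / total

# Toy cell (hypothetical counts): gene symbol -> UMI count.
cell = {"TH": 3, "MAP2": 6, "MT-CO1": 1, "GFAP": 0}
normalized = log_normalize(cell)
mito_pct = percent_mito(cell)
```

Cells whose mitochondrial percentage exceeds a chosen cutoff are typically dropped before normalization; the cutoff itself is dataset-dependent and is not specified here.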

UMAP Clustering — Cellular Diversity

A. Single-Cell Transcriptome Landscape

UMAP projection of ~1000 cells from iPSC-derived neuronal culture (~8 clusters)

Each point: single cell. Colors: cell type (DA neurons, cortical neurons, astrocytes, progenitors, microglia, cycling cells, stressed/apoptotic, floor plate).

Gene Expression Markers — Dot Plot

B. Marker Gene Expression

Bubble plot: marker genes (columns) vs. cell clusters (rows). Size = % expressing, color = avg expression.

Genes: TH, DAT, NURR1, MAP2, GFAP, AIF1, MKI67, SOX2. Clusters: DA Neurons, Cortical, Astrocytes, Microglia, Progenitors.
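The two quantities encoded in a dot plot can be computed as in the following sketch. It assumes that "% expressing" means the share of cells with a nonzero value and that average expression is taken over all cells in the cluster; `dot_plot_stats` and the toy values are hypothetical, not output from Seurat or Scanpy.

```python
from collections import defaultdict

def dot_plot_stats(expression, clusters):
    """Summarize one gene per cluster.

    expression: per-cell expression values for a single gene.
    clusters:   per-cell cluster labels, same order as `expression`.
    Returns {cluster: {"pct_expressing": ..., "mean_expression": ...}}.
    """
    grouped = defaultdict(list)
    for value, label in zip(expression, clusters):
        grouped[label].append(value)
    return {
        label: {
            "pct_expressing": 100.0 * sum(v > 0 for v in vals) / len(vals),  # dot size
            "mean_expression": sum(vals) / len(vals),                         # dot color
        }
        for label, vals in grouped.items()
    }

# Toy example (hypothetical values): TH expression in five cells.
th_expression = [2.0, 0.0, 1.0, 0.0, 0.0]
labels = ["DA Neurons", "DA Neurons", "DA Neurons", "Astrocytes", "Astrocytes"]
stats = dot_plot_stats(th_expression, labels)
```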

Gene Expression Across Cell Types

C. Neurodegeneration Gene Expression

PD-relevant genes (TH, SNCA, LRRK2, GBA) across cell types. Grouped bar chart.

Shows distribution across DA Neurons, Cortical Neurons, Progenitors. Key for identifying cell-type-specific vulnerability.

03

Bulk Transcriptomics

Bulk Transcriptomics & Pathway Analysis

Bulk RNA-seq quantifies transcript abundance across samples with high sensitivity and a wide dynamic range. I align raw FASTQ reads to the reference genome with STAR or HISAT2 and count reads per gene with featureCounts, or quantify transcripts directly from the reads with Salmon. Differential expression analysis uses DESeq2, edgeR, or limma with false discovery rate (FDR) correction.

Gene set enrichment analysis (GSEA) and GO/KEGG pathway analysis map dysregulated genes to biological functions. This approach is powerful for identifying drivers of disease phenotypes and drug response signatures. Combined with proteomics, bulk RNA-seq tests whether protein-level changes are also reflected at the transcript level.

MA Plot — Log Fold Change vs. Mean Expression

A. Differential Expression (MA Plot)

PFF + IFN-γ vs. Control. X-axis: log2 mean expression, Y-axis: log2 fold change.

Red dots: significantly upregulated (FDR < 0.05, |log2FC| > 1). Blue dots: significantly downregulated.
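The two MA-plot axes can be derived from group means as sketched below. This is a simplified illustration that assumes already-normalized mean counts and a pseudocount of 1; it is not DESeq2's shrinkage-based estimate, and `ma_values` is a hypothetical helper.

```python
import math

def ma_values(treated_mean, control_mean, pseudocount=1.0):
    """Return (A, M) for one gene.

    A: log2 of the mean expression across both groups (x-axis).
    M: log2 fold change, treated over control (y-axis).
    The pseudocount keeps genes with zero counts finite.
    """
    t = treated_mean + pseudocount
    c = control_mean + pseudocount
    a = math.log2((t + c) / 2.0)
    m = math.log2(t / c)
    return a, m

# Toy genes (hypothetical normalized mean counts): (treated, control).
up_gene = ma_values(7.0, 1.0)
flat_gene = ma_values(3.0, 3.0)
```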

GSEA Enrichment Curve

B. Gene Set Enrichment Analysis (GSEA)

Running enrichment score for HALLMARK_INTERFERON_GAMMA_RESPONSE in PFF+IFN-γ condition.

Curve indicates cumulative enrichment of IFN-γ response genes. Positive NES: upregulated in dual-hit condition.
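The running enrichment score behind such a curve can be sketched as follows. This follows the standard weighted running-sum idea (hits step up in proportion to their ranking score, misses step down by a constant) and stops at the raw enrichment score; the NES would additionally require permutation-based normalization, which is omitted. Gene names and scores are toy placeholders, and `running_enrichment` is a hypothetical helper rather than the fgsea or GSEApy implementation.

```python
def running_enrichment(ranked_genes, scores, gene_set, weight=1.0):
    """Return (curve, es) for genes already sorted by descending score.

    curve: running sum after each gene in the ranked list.
    es:    the value on the curve with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_miss = len(hits) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if n_miss == 0 or hit_norm == 0:
        raise ValueError("need at least one scored hit and one miss")

    running = 0.0
    curve = []
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm   # step up on a gene-set member
        else:
            running -= 1.0 / n_miss                  # step down otherwise
        curve.append(running)
    es = max(curve, key=abs)
    return curve, es

# Toy ranked list (hypothetical): the gene set sits at the top of the ranking.
ranked = ["g1", "g2", "g3", "g4", "g5"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0]
curve, es = running_enrichment(ranked, scores, gene_set={"g1", "g2"})
```

A gene set concentrated at the top of the ranking yields a positive score, matching the interpretation of a positive NES as upregulation.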

04

Bioinformatics Pipeline

Integrated Bioinformatics Workflow

Multi-omics analysis requires seamless integration of sample preparation, sequencing, and computational processing. The pipeline below outlines each stage from wet-lab library construction through final interpretation and visualization:

Methods

Detailed Methodology & Techniques

Comprehensive descriptions of key experimental techniques, assay platforms, and analytical methods referenced throughout this page.

TMT Labeling

I perform TMT-based quantitative proteomics using 10-plex and 16-plex labeling for multiplexed protein quantification across conditions. Workflow includes protein extraction, tryptic digestion, TMT labeling, high-pH reversed-phase fractionation, and LC-MS/MS analysis on Orbitrap platforms. TMT enables precise relative quantification of >6,000 proteins per experiment, which I apply to compare disease vs control, drug-treated vs vehicle, and time-course samples in iPSC neuronal models.

Multi-Omics Integration Pipeline

Parallel proteomics, single-cell transcriptomics, and bulk RNA-seq workflows converging into an integrated bioinformatics analysis for disease biomarker discovery and drug response mapping.

LANE 01

TMT Proteomics

LC-MS/MS · Quantitative Protein

Sample Preparation

Cell lysis, reduction, alkylation, trypsin digest. Subcellular fractionation (soluble, membrane, insoluble).

TMT Labeling

Isobaric TMT 10/11/16-plex labeling. Multiplexed quantification across conditions.

LC-MS/MS Acquisition

High-resolution Orbitrap mass spectrometry. DDA or DIA acquisition modes.

Database Search

Proteome Discoverer / MaxQuant search. Perseus for downstream statistics. FDR correction, normalization, imputation.

Differential Expression

Volcano plots (log₂FC vs. −log₁₀ padj). Subcellular compartment–resolved analysis.

LANE 02

scRNA-seq

10x Genomics Chromium

Cell Capture

10x Chromium microfluidic capture. GEM generation & barcoding. >5,000 cells per condition.

Library Prep & Sequencing

cDNA amplification, library construction. Illumina NovaSeq / NextSeq. ~50k reads per cell.

Alignment & QC

Cell Ranger pipeline. Doublet removal, mitochondrial QC. Count matrix generation.

Clustering & Annotation

Seurat / Scanpy. UMAP/t-SNE visualization. Automated cell-type annotation + manual curation.

Trajectory & DEG

Pseudotime analysis (Monocle). Differential gene expression per cluster. Ligand–receptor mapping.

LANE 03

Bulk Transcriptomics

RNA-seq · RT-qPCR

RNA Extraction

TRIzol or column-based. RIN > 8.0 QC. Poly-A selection or ribo-depletion.

Library & Sequencing

Stranded mRNA library prep. Illumina paired-end sequencing. ≥30M reads per sample.

Alignment & Counts

STAR / HISAT2 alignment. featureCounts gene quantification. Batch correction.

DESeq2 Analysis

Differential expression (Wald test). MA plots, heatmaps, PCA. Shrinkage estimators.

Pathway Enrichment

GSEA, GO (BP/MF/CC), KEGG. IFN-γ signaling, lysosomal degradation, antigen presentation.

CONVERGENCE

Integrated Analysis

Cross-Omics Correlation

Protein–mRNA concordance analysis. Identify post-transcriptionally regulated targets.

Pathway Consensus

Overlay GSEA results across proteomics and transcriptomics for robust pathway calls.

Biomarker Nomination

Multi-omics concordant hits for target engagement and pharmacodynamic biomarkers.

Cell-Type Deconvolution

Map single-cell signatures onto bulk data to resolve cellular composition shifts.

Network Analysis

Protein–protein interaction networks layered with transcriptional regulation data.

Therapeutic Prioritization

Druggable target identification from convergent dysregulated nodes across datasets.
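As an illustration of the cross-omics correlation step, the sketch below computes a Pearson correlation between protein and mRNA log2 fold changes and flags genes whose protein changes strongly while the transcript barely moves. The cutoffs, gene names, and values are hypothetical placeholders chosen for the example, not results.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def discordant_genes(protein_lfc, mrna_lfc, protein_cut=1.0, mrna_cut=0.5):
    """Genes with a large protein change but a small mRNA change.

    Such genes are candidates for post-transcriptional regulation.
    """
    return sorted(
        gene
        for gene in protein_lfc
        if gene in mrna_lfc
        and abs(protein_lfc[gene]) >= protein_cut
        and abs(mrna_lfc[gene]) < mrna_cut
    )

# Toy log2 fold changes (hypothetical).
protein = {"geneA": 2.0, "geneB": 1.5, "geneC": -1.0, "geneD": 0.1}
mrna = {"geneA": 1.8, "geneB": 0.1, "geneC": -0.9, "geneD": 0.0}

shared = sorted(set(protein) & set(mrna))
r = pearson([protein[g] for g in shared], [mrna[g] for g in shared])
candidates = discordant_genes(protein, mrna)
```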

Software & Analysis Ecosystem

Proteome Discoverer · MaxQuant · Perseus · Seurat · Scanpy · DESeq2 · Cell Ranger · STAR · GSEA · IPA · Monocle · Python · R / Bioconductor · Plotly

Fig. — TMT Proteomics: Protein Identification Depth

Number of quantified proteins across TMT-10plex fractionation strategies

Orbitrap Exploris 480. 24 fractions. Proteome Discoverer 2.5.

LC-MS/MS Analysis

I operate and analyze data from LC-MS/MS workflows on Thermo Orbitrap (Exploris 480, Q Exactive HF) and triple-quadrupole platforms. Applications span discovery proteomics (DDA), targeted quantification (PRM/MRM), phosphoproteomics (TiO2/IMAC enrichment), and small molecule analysis (HPLC-MS for dopamine, metabolites). I handle all stages from sample preparation through database search and statistical analysis.

LC-MS/MS: Proteome Coverage by Acquisition Mode

Number of unique proteins identified across DDA, DIA, and PRM acquisition strategies

Orbitrap Exploris 480. 120-min gradient. 24 fractions (DDA). Human iPSC-neuron lysate.

Pathway Analysis

I perform Gene Set Enrichment Analysis (GSEA) and pathway analysis using ranked gene/protein lists from differential expression analyses. Tools include fgsea (R), GSEApy (Python), and Enrichr. I analyze GO biological process, KEGG, Reactome, and Hallmark gene sets to identify dysregulated pathways. This analysis revealed lysosomal, autophagy, and inflammatory pathways as the most significantly altered in PFF+IFN-γ treated dopaminergic neurons.
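Alongside rank-based GSEA, over-representation analysis asks whether a pathway contributes more hits to a list of significant genes than chance would predict. It is typically scored with a hypergeometric (one-sided Fisher) test, sketched here with the standard library only. The function name and the toy numbers are hypothetical, and multiple-testing correction across pathways is not shown.

```python
from math import comb

def hypergeom_pvalue(n_background, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when drawing n_selected genes without replacement.

    n_background: genes in the tested universe.
    n_in_set:     universe genes annotated to the pathway.
    n_selected:   significant genes submitted for enrichment.
    n_overlap:    significant genes that belong to the pathway.
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, i) * comb(n_background - n_in_set, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers (hypothetical): 10 genes in the universe, 4 in the pathway,
# 3 significant genes, 2 of which fall in the pathway.
p = hypergeom_pvalue(10, 4, 3, 2)
```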

Fig. — GSEA: Top Enriched Pathways

Normalized enrichment scores for top dysregulated pathways in PFF+IFN-γ DA neurons

fgsea analysis. Hallmark + KEGG gene sets. FDR < 0.05.

Bioinformatics Tools

I use DESeq2 for bulk RNA-seq differential expression, Seurat (R) and Scanpy (Python) for single-cell RNA-seq analysis, and custom Python/R pipelines for proteomics data processing. Analyses include normalization, batch correction (Harmony, ComBat), clustering, trajectory inference (Monocle3, PAGA), cell type annotation (SingleR, CellTypist), and integration across modalities. All pipelines are version-controlled and documented in Jupyter/R Markdown notebooks.

Bioinformatics Pipeline: Gene Counts by Analysis Stage

Number of features retained at each quality-control filtering step

Bulk RNA-seq pipeline. STAR alignment → featureCounts → DESeq2. Thresholds: >10 counts in ≥3 samples.
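The stated threshold (>10 counts in ≥3 samples) can be expressed as a short pre-filter. This is a minimal sketch on a plain dictionary of counts; `filter_low_counts` and the toy genes are hypothetical, and in practice the filter would run on the featureCounts matrix before DESeq2.

```python
def filter_low_counts(counts, min_count=10, min_samples=3):
    """Keep genes with more than `min_count` reads in at least `min_samples` samples.

    counts: dict mapping gene -> list of raw counts, one per sample.
    """
    return {
        gene: row
        for gene, row in counts.items()
        if sum(c > min_count for c in row) >= min_samples
    }

# Toy count matrix (hypothetical): four samples per gene.
counts = {
    "geneA": [12, 15, 11, 0],    # kept: three samples exceed 10
    "geneB": [11, 12, 3, 2],     # dropped: only two samples exceed 10
    "geneC": [10, 10, 10, 10],   # dropped: 10 is not strictly greater than 10
    "geneD": [0, 0, 0, 0],       # dropped: never detected
}
kept = filter_low_counts(counts)
```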

Volcano Plot Analysis

I generate volcano plots as the primary visualization for differential expression results from proteomics and transcriptomics. Plots display log2 fold change vs −log10 adjusted p-value, with thresholds for significance (FDR < 0.05) and effect size (|log2FC| > 1). I annotate key genes/proteins of interest and use color coding to highlight pathway-specific hits (e.g., lysosomal genes in red, mitochondrial in blue). Generated using ggplot2 (R) or matplotlib/seaborn (Python).
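The quantities behind a volcano plot can be sketched without any plotting library: Benjamini-Hochberg adjustment of raw p-values, then classification of each feature by the FDR and fold-change thresholds. The helper names and toy values are hypothetical, and the code assumes every adjusted p-value is greater than zero so the log transform is defined.

```python
import math

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def volcano_point(log2fc, padj, fc_cut=1.0, alpha=0.05):
    """Return (x, y, label) for one feature; label is 'up', 'down', or 'ns'."""
    y = -math.log10(padj)
    if padj < alpha and log2fc > fc_cut:
        label = "up"
    elif padj < alpha and log2fc < -fc_cut:
        label = "down"
    else:
        label = "ns"
    return log2fc, y, label

# Toy inputs (hypothetical): raw p-values and log2 fold changes for four features.
pvals = [0.01, 0.04, 0.03, 0.005]
lfcs = [2.0, 0.5, -1.5, -3.0]
padj = benjamini_hochberg(pvals)
points = [volcano_point(fc, q) for fc, q in zip(lfcs, padj)]
labels = [label for _, _, label in points]
```

The second feature passes the FDR threshold but not the effect-size threshold, so it stays uncolored.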

Volcano Plot: PFF+IFN-γ vs Control Proteomics

log₂(fold change) vs −log₁₀(adjusted p-value) for differential protein expression

Significance thresholds: |log₂FC| > 1 (dashed vertical), padj < 0.05 (dashed horizontal). Red: up, Blue: down.

Dimensionality Reduction

I apply UMAP and t-SNE for visualization of high-dimensional single-cell and proteomics datasets. I tune the key parameters (n_neighbors and min_dist for UMAP; perplexity for t-SNE) to balance preservation of local and global structure. I use these embeddings for cluster visualization, trajectory analysis, and identifying rare cell populations in scRNA-seq datasets containing 10,000–100,000+ cells.

UMAP: Proteomics Sample Clustering

UMAP projection of proteomics samples colored by experimental condition

UMAP on top 500 variable proteins. n_neighbors=15, min_dist=0.3. Conditions: Ctrl, PFF, IFN-γ, PFF+IFN-γ.

Proteome Discoverer

I use Thermo Proteome Discoverer (v2.4/2.5) for database searching, protein identification, and quantification of LC-MS/MS proteomics data. Workflows include Sequest HT and MS Amanda search engines, Percolator for FDR control, and TMT reporter ion quantification. I configure custom processing and consensus workflows for specific experimental designs including phosphoproteomics (ptmRS node) and label-free quantification (Minora feature detection).
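The FDR control mentioned above rests on target-decoy counting, which can be sketched as follows. This shows only the counting and q-value step on already-scored peptide-spectrum matches (PSMs); Percolator's rescoring model is not reproduced, and the function name and toy scores are hypothetical.

```python
def target_decoy_qvalues(psms):
    """Assign q-values to PSMs by target-decoy counting.

    psms: list of (score, is_decoy) tuples; higher score means a better match.
    Returns a list of (score, is_decoy, q_value) sorted by descending score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each score threshold: decoys / targets accepted so far.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this threshold or any more permissive one.
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvalues)]

# Toy PSMs (hypothetical scores): True marks a decoy hit.
psms = [(9.0, False), (8.0, False), (7.0, True), (6.0, False), (5.0, True)]
scored = target_decoy_qvalues(psms)
accepted = [s for s, is_decoy, q in scored if not is_decoy and q <= 0.01]
```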

Proteome Discoverer: Search Engine Comparison

Protein identifications by search engine at 1% FDR

Same raw file. UniProt Human (20,600 entries). Percolator FDR 1%. MS Amanda + Sequest HT consensus.

MaxQuant Analysis

I use MaxQuant for label-free quantification (LFQ) and SILAC-based proteomics data analysis. MaxQuant's Andromeda search engine and match-between-runs feature enable deep proteome coverage from single-shot analyses. Downstream statistical analysis uses Perseus for imputation, normalization, t-tests, and hierarchical clustering. I also integrate MaxQuant output with custom Python scripts for pathway analysis and multi-omics data integration.

MaxQuant: LFQ Quantification Depth

Proteins quantified with and without match-between-runs (MBR)

MaxQuant v2.4.3. LFQ min ratio count: 2. Match time window: 0.7 min. Human neuron lysate.