CSE 599 · Academic Case Study

FNO-Diffusion for Brain MRI Segmentation

Goal: evaluate brain tumor segmentation on BraTS 2021 by replicating two baselines (an FNO segmentation model and a diffusion model with a U-shaped backbone) and testing a hybrid that integrates FNO blocks into both the diffusion and supervised branches. Hypothesis: FNO's global spectral modeling combined with diffusion-based refinement could outperform either model alone.

Implementation: PyTorch on 1× NVIDIA Tesla V100; training time was roughly 2–8 hours depending on the model.

69.44% · Diffusion with U-shape (best)
65.13% · FNO baseline
57.03% · FNO-Diffusion hybrid

Quick Facts

Core dataset and experiment setup.

Dataset

BraTS 2021 (2D slices from 3D volumes)

Task

Brain tumor segmentation (multi-class)

Metric

DICE coefficient

Compute

1× NVIDIA Tesla V100

Data split

70% train / 20% val / 10% test

Training time

~2–8 hours

Dataset + Preprocessing

  • BraTS 2021 volume size: 240×240×155, split into per-slice 2D images.
  • Data augmentation: random horizontal and vertical flips.
  • Labels converted to one-hot format with background and tumor subregion classes.
Class labels legend (Figure 1, PDF page 3).
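The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the function names and the assumption of four classes (background plus three tumor subregions) are mine.

```python
import numpy as np

NUM_CLASSES = 4  # assumed: background + three tumor subregions


def volume_to_slices(volume):
    """Split a 240x240x155 BraTS volume into 155 axial 2D slices."""
    # volume: (H, W, D) -> list of (H, W) arrays along the last axis
    return [volume[:, :, i] for i in range(volume.shape[-1])]


def random_flip(image, mask, rng):
    """Random horizontal and vertical flips, applied jointly to image and mask."""
    if rng.random() < 0.5:
        image, mask = np.flip(image, axis=1), np.flip(mask, axis=1)
    if rng.random() < 0.5:
        image, mask = np.flip(image, axis=0), np.flip(mask, axis=0)
    return image, mask


def to_one_hot(mask, num_classes=NUM_CLASSES):
    """Convert an integer label mask (H, W) to one-hot (num_classes, H, W)."""
    return np.eye(num_classes, dtype=np.float32)[mask].transpose(2, 0, 1)
```

Flipping the image and mask together keeps the augmentation label-consistent, which is the one invariant this step must preserve.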

Problem & Motivation

Accurate tumor boundary delineation is clinically important, yet manual annotation is slow and error-prone.

  • Reliable segmentation is important for treatment planning and follow-up.
  • Boundary precision is hard in heterogeneous lesions.
  • This project tests whether global spectral modeling plus diffusion refinement helps.

Approach

Three model tracks under the same data split and DICE evaluation.
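Since all three tracks are scored with the DICE coefficient, here is a minimal soft-Dice sketch for reference; the averaging over classes is an assumption, as the report does not specify how per-class scores are aggregated.

```python
import torch


def dice_coefficient(pred, target, eps=1e-6):
    """Soft Dice, averaged over classes.
    pred, target: (N, C, H, W) probability / one-hot tensors."""
    inter = (pred * target).sum(dim=(0, 2, 3))
    denom = pred.sum(dim=(0, 2, 3)) + target.sum(dim=(0, 2, 3))
    dice_per_class = (2 * inter + eps) / (denom + eps)
    return dice_per_class.mean()
```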

A) FNO Segmentation Baseline

Global context via spectral convolution with Fourier layers.
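A spectral convolution layer of the kind used in FNO can be sketched as below. The hyperparameters follow the report's FNO baseline (modes k1=k2=10, width 16); the class and parameter names are illustrative, and for brevity this sketch retains only one low-frequency corner rather than both, as the canonical FNO implementation does.

```python
import torch
import torch.nn as nn


class SpectralConv2d(nn.Module):
    """Fourier layer: FFT -> truncate to low modes -> learned complex mixing -> inverse FFT."""

    def __init__(self, channels=16, modes1=10, modes2=10):
        super().__init__()
        self.modes1, self.modes2 = modes1, modes2
        scale = 1.0 / (channels * channels)
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, modes1, modes2, dtype=torch.cfloat)
        )

    def forward(self, x):
        # x: (batch, channels, H, W)
        x_ft = torch.fft.rfft2(x)  # complex spectrum, shape (B, C, H, W//2 + 1)
        out_ft = torch.zeros_like(x_ft)
        # Mix channels on the retained low-frequency modes only
        out_ft[:, :, :self.modes1, :self.modes2] = torch.einsum(
            "bixy,ioxy->boxy", x_ft[:, :, :self.modes1, :self.modes2], self.weight
        )
        return torch.fft.irfft2(out_ft, s=x.shape[-2:])
```

Because every retained Fourier mode spans the whole image, one such layer mixes information globally, which is the source of the "global context" behavior noted in the results.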

B) Diffusion with U-shape Baseline

Dual-path diffusion-guided supervision with U-shape locality and skip connections.

C) Proposed FNO-Diffusion Hybrid

Replace UNet modules with FNO blocks in diffusion and supervised branches.

FNO architecture (Figure 2, PDF page 3).
Dual-path diffusion supervision (Figure 3, PDF page 4).
Training Details

FNO: modes k1=k2=10, width 16, 3 repeated blocks per branch, batch 8, epochs 50, Adam lr 3e-4, GELU, Dice + CE (lambda=0.5).
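The Dice + CE objective with lambda = 0.5 can be sketched as below. The exact combination formula is an assumption (the report gives only the components and the weight), so treat this as one plausible reading rather than the project's implementation.

```python
import torch
import torch.nn.functional as F


def dice_ce_loss(logits, target, lam=0.5, eps=1e-6):
    """Cross-entropy plus lambda-weighted soft-Dice loss (assumed combination).
    logits: (N, C, H, W) raw scores; target: (N, H, W) integer class labels."""
    ce = F.cross_entropy(logits, target)
    probs = torch.softmax(logits, dim=1)
    one_hot = F.one_hot(target, num_classes=logits.shape[1]).permute(0, 3, 1, 2).float()
    inter = (probs * one_hot).sum(dim=(0, 2, 3))
    denom = probs.sum(dim=(0, 2, 3)) + one_hot.sum(dim=(0, 2, 3))
    dice = ((2 * inter + eps) / (denom + eps)).mean()
    return ce + lam * (1 - dice)
```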

Diffusion U-shape: Adam lr 1e-2, batch 32, max 300 epochs, early stop 50, EMA 0.99, Dice + CE, dynamic class weights, unsupervised weight 10.

FNO-Diffusion: SGD momentum 0.9, weight decay 3e-5, lr 0.001, batch 32, EMA 0.99, timesteps 1000 (sampling 10), FNO modes [16,16], width 32, blocks/channel 3, time embedding 512, dropout 0.5.
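Both diffusion configurations track an exponential moving average of the weights with decay 0.99. A minimal sketch of that mechanism (class and method names are mine):

```python
import copy

import torch


class EMA:
    """Exponential moving average of model weights (decay 0.99 as in the report)."""

    def __init__(self, model, decay=0.99):
        self.decay = decay
        self.shadow = copy.deepcopy(model).eval()
        for p in self.shadow.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        # shadow <- decay * shadow + (1 - decay) * current weights
        for s, p in zip(self.shadow.parameters(), model.parameters()):
            s.mul_(self.decay).add_(p, alpha=1 - self.decay)
```

Calling `update` after each optimizer step and evaluating with `ema.shadow` smooths out step-to-step weight noise, which is particularly useful for the unstable diffusion training noted in the lessons below.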

Visual Results

Three-column qualitative comparison: Input, Ground Truth, Prediction.

FNO Baseline (Figure 4, PDF page 8)

Input MRI · Ground Truth · Prediction

DICE 65.13% · Captures global context but loses boundary precision.

Diffusion with U-shape (Figure 5, PDF page 8)

Input MRI · Ground Truth · Prediction

DICE 69.44% · Best qualitative and quantitative segmentation quality.

FNO-Diffusion Hybrid (Figure 6, PDF page 8)

Input MRI · Ground Truth · Prediction

DICE 57.03% · Hybrid underperformed, especially on fine local boundaries.

Quantitative Results

Ranked DICE comparison on the test set.

Rank  Model                    DICE
1     Diffusion with U-shape   69.44%
2     FNO baseline             65.13%
3     FNO-Diffusion hybrid     57.03%


DICE by Model (bar chart).

Lessons Learned

What was easy

  • The diffusion U-shape pipeline was easier to reproduce than MedSegDiff-V2 thanks to its clearer modular structure.
  • Forward diffusion, denoising, and loss components were easier to isolate for debugging.
  • Swapping components (including FNO blocks) was straightforward in the modular baseline.

What was difficult

  • Reproducing MedSegDiff-V2 for multiclass failed despite reasonable binary performance.
  • Training was unstable and hard to tune for multiclass segmentation.
  • Timestep embeddings with FNO blocks caused shape mismatches and conditioning issues.
MedSegDiff-V2 binary segmentation result (Figure 7): binary segmentation was reasonable but failed to generalize to multiclass.

Future Work

1. Prioritize stable baselines

Start from a reliable U-Net diffusion baseline before introducing complex hybrid modules.

2. Keep 2D before 3D

Start with 2D slices and only extend to 3D after strong stability is confirmed.

3. Run loss ablations

Test dynamic weighting strategies and loss balancing with controlled ablation studies.

4. Preserve local pathways

Use spectral blocks for global context while retaining conv/attention paths for boundary detail.

Project Context

Role

  • Shuzhen Zhang

Scope: Course project focused on baseline reproduction and hybrid-model evaluation.

Course: CSE 599

Year: 2026