theodore-ioann committed on
Commit 5b303e8 · verified · 1 Parent(s): fe47d51

Upload 15 files

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ examples/ISIC_0012880.jpg filter=lfs diff=lfs merge=lfs -text
+ examples/ISIC_0015972.jpg filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,14 +1,48 @@
- ---
- title: Skin Lesion Segmentation
- emoji: 🔥
- colorFrom: indigo
- colorTo: pink
- sdk: gradio
- sdk_version: 5.31.0
- app_file: app.py
- pinned: false
- license: mit
- short_description: Different models trained on skin lesion segmentation
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # ISIC 2018 Skin Lesion Segmentation
+ This project explores unsupervised and supervised image segmentation methods on the **ISIC 2018 skin lesion dataset**. It compares simple segmentation techniques such as **KMeans** and **Gaussian Mixture Models (GMM)** against deep learning models (Unet, an Inception-inspired CNN, and SegFormer). The deep models are trained on ISIC data, evaluated on the test set, and their performance is compared with the baseline models.
+
+ ## Goals
+ - Segment skin lesions from dermoscopic images.
+ - Compare baseline unsupervised methods (KMeans, GMM) with deep learning models (Unet, Inception-inspired CNN, SegFormer).
+ - Evaluate masks using standard metrics: **IoU**, **F1-score**, **Accuracy** (a minimal metric sketch follows this list).
+ - Visualize results with overlays (predictions vs. ground truth).
+ - Explore which morphological operations (erosion, dilation, opening, closing) can improve the quality of the segmentations.
+
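A minimal sketch of how these metrics can be computed for a single predicted/ground-truth mask pair, mirroring the flattened-pixel approach used in `utils.py` (the mask arrays below are placeholders, not project data):

```python
import numpy as np
from sklearn.metrics import accuracy_score, jaccard_score, f1_score

# Placeholder binary masks (1 = lesion, 0 = background); real masks come from the ISIC data.
gt_mask = np.zeros((128, 128), dtype=np.uint8)
gt_mask[40:90, 40:90] = 1
pred_mask = np.zeros((128, 128), dtype=np.uint8)
pred_mask[45:95, 45:95] = 1

# Metrics are computed over the flattened per-pixel labels.
gt_flat, pred_flat = gt_mask.flatten(), pred_mask.flatten()
print("Accuracy:", accuracy_score(gt_flat, pred_flat))
print("IoU (Jaccard):", jaccard_score(gt_flat, pred_flat))
print("F1 (Dice):", f1_score(gt_flat, pred_flat))
```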
+ ## Dataset
+ - **ISIC 2018 Challenge - Task 1**
+ - ~2,600 dermoscopic images and ground truth lesion masks.
+ - Downloaded from [ISIC Archive](https://challenge.isic-archive.com/data/#2018). The images and masks are stored in different folders.
+ - The dataset is split into training, validation, and test sets.
+ - Due to the nature of the task, there is a notable class imbalance, which we have to take into account when training and evaluating the models.
+
+ ![Class Distribution](results/class_distribution.png)
+
+ ## Unsupervised Methods
+ - **KMeans**: Clustering algorithm that partitions the image into K clusters based on distances between pixel values (see the sketch after this section).
+ - Results:
+ ![KMeans Results](results/kmeans_distribution.png)
+
+ - **Gaussian Mixture Models (GMM)**: Probabilistic model that assumes the pixel values are generated by a mixture of Gaussian distributions.
+ - Results:
+ ![GMM Results](results/gmm_distribution.png)
+
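A minimal sketch of the pixel-clustering idea behind both baselines, assuming a 128×128 RGB image array (the random image below is only a stand-in for a resized dermoscopic image):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

# Stand-in for a resized dermoscopic image; the project uses 128x128 RGB crops.
image = np.random.randint(0, 256, size=(128, 128, 3), dtype=np.uint8)
pixels = image.reshape(-1, 3)  # one row per pixel, RGB values as features

# KMeans: hard assignment of every pixel to one of two clusters.
kmeans_mask = KMeans(n_clusters=2, n_init=10).fit_predict(pixels).reshape(128, 128)

# GMM: fits two Gaussians to the pixel values, then takes the most likely component per pixel.
gmm_mask = GaussianMixture(n_components=2).fit(pixels).predict(pixels).reshape(128, 128)

# Which cluster is "lesion" is arbitrary; utils.fix_labels flips the prediction
# when the opposite assignment agrees better with the ground truth.
```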
+ ## Supervised Methods
+ - **Unet**: A convolutional neural network architecture designed for biomedical image segmentation.
+ - Results:
+ ![Unet Results](results/unet_distribution.png)
+ - **Inception CNN**: A custom architecture inspired by the Inception model, designed for image segmentation tasks.
+ - Results:
+ <!-- ![Inception Results](results/inception_distribution.png) -->
+ - **SegFormer**: A transformer-based model that captures long-range dependencies in images, achieving state-of-the-art results in various vision tasks.
+ - Results:
+ ![SegFormer Results](results/segformer_distribution.png)
+
+ ## Evaluation results
+ From the evaluation of the models on the test set, we can see that the deep learning models outperform the unsupervised methods in terms of IoU, F1-score, and accuracy. The SegFormer model achieves the best results, followed by Unet and the Inception CNN. Overall, it is clear that the deep learning models segment skin lesions better than the unsupervised methods. It is also worth noting that all models significantly outperform the majority baseline (which would achieve an accuracy of 76.4%).
+
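As a sanity check on that 76.4% figure: the majority baseline labels every pixel as background, so its accuracy is simply the background fraction of the test pixels. A toy illustration (the proportions below are made up to match the reported figure):

```python
import numpy as np

# Toy ground truth with 23.6% lesion pixels and 76.4% background pixels.
gt = np.zeros(1000, dtype=np.uint8)
gt[:236] = 1

baseline_pred = np.zeros_like(gt)    # predict "background" everywhere
print((baseline_pred == gt).mean())  # 0.764 = background fraction
```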
+ ## How to Run
+ 1. **Setup**: Install dependencies with `pip install -r requirements.txt`.
+ 2. **Training**: Run `python main.py --model segformer --epochs 20 --visualize True` to train the SegFormer (or any of the other models).
+ 3. **Testing**: Run `python main.py --model segformer --visualize True` to evaluate and visualize model predictions on the test set, with a `segformer.pt` file in the same directory.
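`main.py` itself is not included in this commit; purely as an illustration, a single training step for the two-class segmentation models could look roughly like the sketch below (the optimizer, learning rate, and batch shapes are assumptions, not the project's actual training code):

```python
import torch
import torch.nn as nn
from supervised import UNet  # any of the models defined in supervised.py

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = UNet(num_classes=2).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed hyperparameters
criterion = nn.CrossEntropyLoss()  # per-pixel 2-class loss; class weights could counter the imbalance

# Placeholder batch: images (B, 3, 128, 128) and integer masks (B, 128, 128) with values {0, 1}.
images = torch.rand(4, 3, 128, 128, device=device)
masks = torch.randint(0, 2, (4, 128, 128), device=device)

logits = model(images)           # (B, 2, 128, 128)
loss = criterion(logits, masks)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```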
app.py ADDED
@@ -0,0 +1,81 @@
+ import gradio as gr
+ import torch
+ from PIL import Image
+ from sklearn.cluster import KMeans
+ from sklearn.mixture import GaussianMixture
+ from utils import *
+ from supervised import *
+
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+
+ # Load Models
+ models = {
+     "unet": UNet(num_classes=2).to(device),
+     "segformer": Segformer(num_classes=2).to(device),
+     "inception": Inception(num_classes=2).to(device),
+     "kmeans": KMeans(n_clusters=2),
+     "gmm": GaussianMixture(n_components=2),
+ }
+
+ models["unet"].load_state_dict(torch.load("unet.pt", map_location=device))
+ models["segformer"].load_state_dict(torch.load("segformer.pt", map_location=device))
+ models["inception"].load_state_dict(torch.load("inception.pt", map_location=device))
+
+ for model in models.values():
+     if isinstance(model, (UNet, Segformer, Inception)):
+         model.eval()
+
+ # Inference function
+ def inference(image, model_name, postprocess_mode):
+     model = models[model_name]
+     status_text = f"✅ Inference with {model_name.upper()} and postprocessing mode: {postprocess_mode}"
+     # Pass the device so the input tensor ends up on the same device as the loaded models.
+     bw_mask, overlay = predict_and_visualize_single(model, image, postprocess_mode=postprocess_mode, device=device)
+     return overlay, bw_mask, status_text
+
+ # Gradio Interface
+ with gr.Blocks(theme=gr.themes.Base(primary_hue="rose", secondary_hue="slate")) as demo:
+     gr.Markdown("## 🩺 Skin Lesion Segmentation")
+     gr.Markdown("Upload a skin image, choose a model, and view segmentation results.")
+
+     with gr.Row():
+         with gr.Column(scale=1):
+             image_input = gr.Image(type='numpy', label="📷 Upload Image")
+             model_choice = gr.Radio(
+                 choices=["unet", "segformer", "inception", "kmeans", "gmm"],
+                 label="Model",
+                 value="unet"
+             )
+             post_choice = gr.Radio(
+                 choices=["none", "open", "close", "erosion", "dilation"],
+                 label="Postprocessing",
+                 value="none"
+             )
+             run_btn = gr.Button("▶ Run Segmentation")
+
+         with gr.Column(scale=2):
+             with gr.Row():
+                 overlay_output = gr.Image(type='numpy', label="🎯 Overlay")
+                 mask_output = gr.Image(type='numpy', label="🖤 Predicted Mask")
+             status = gr.Textbox(label="Status", interactive=False)
+
+     with gr.Row():
+         gr.Examples(
+             examples=["./examples/ISIC_0012880.jpg", "./examples/ISIC_0015972.jpg"],
+             inputs=[image_input],
+             label="Use Example Images"
+         )
+
+     with gr.Accordion("ℹ️ Legend", open=False):
+         gr.Markdown("""
+         - **🔴 Red**: Predicted lesion overlay
+         - **⚪ White**: Binary mask
+         - **Postprocessing**: Cleans up noisy segmentation
+         """)
+
+     run_btn.click(
+         fn=inference,
+         inputs=[image_input, model_choice, post_choice],
+         outputs=[overlay_output, mask_output, status]
+     )
+
+ demo.launch(share=True)
examples/ISIC_0012880.jpg ADDED

Git LFS Details

  • SHA256: b41abada1baaeb678002cea86ac171b6b802ab24440c6ec0d0ae78076d29eb23
  • Pointer size: 132 Bytes
  • Size of remote file: 2.51 MB
examples/ISIC_0015972.jpg ADDED

Git LFS Details

  • SHA256: fd21eec581f03f74b3abb98e0251ebf124cb73202a459e9bf605ae360287c768
  • Pointer size: 132 Bytes
  • Size of remote file: 3.83 MB
inception.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18c17a1d87be5906b31a76558132e3c3fc16e643747b8e0859c25cb914eadce9
+ size 14355657
requirements.txt ADDED
@@ -0,0 +1,9 @@
+ torch
+ torchvision
+ numpy
+ pillow
+ scikit-learn
+ matplotlib
+ opencv-python
+ gradio
+ transformers
results/class_distribution.png ADDED
results/gmm_distribution.png ADDED
results/kmeans_distribution.png ADDED
results/segformer_distribution.png ADDED
results/unet_distribution.png ADDED
segformer.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0dcb4e5c9d19ab4caffaa324e4cf26090bcc9d37fb47840e38d81268691d8341
+ size 14946661
supervised.py ADDED
@@ -0,0 +1,141 @@
+ import torch
+ import torch.nn as nn
+ import torch.nn.functional as F
+ import torchvision.transforms as T
+ from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor
+
+ #=======================================
+ #========= UNet Architecture ===========
+ #=======================================
+ class UNet(nn.Module):
+     def __init__(self, in_channels=3, num_classes=2):
+         super(UNet, self).__init__()
+
+         def conv_block(in_c, out_c):
+             return nn.Sequential(
+                 nn.Conv2d(in_c, out_c, kernel_size=3, padding=1),
+                 nn.BatchNorm2d(out_c),
+                 nn.ReLU(inplace=True),
+                 nn.Conv2d(out_c, out_c, kernel_size=3, padding=1),
+                 nn.BatchNorm2d(out_c),
+                 nn.ReLU(inplace=True)
+             )
+
+         self.encoder1 = conv_block(in_channels, 64)
+         self.pool1 = nn.MaxPool2d(2)
+
+         self.encoder2 = conv_block(64, 128)
+         self.pool2 = nn.MaxPool2d(2)
+
+         self.encoder3 = conv_block(128, 256)
+         self.pool3 = nn.MaxPool2d(2)
+
+         self.bottleneck = conv_block(256, 512)
+
+         self.upconv3 = nn.ConvTranspose2d(512, 256, kernel_size=2, stride=2)
+         self.decoder3 = conv_block(512, 256)
+
+         self.upconv2 = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)
+         self.decoder2 = conv_block(256, 128)
+
+         self.upconv1 = nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2)
+         self.decoder1 = conv_block(128, 64)
+
+         self.final = nn.Conv2d(64, num_classes, kernel_size=1)
+
+     def forward(self, x):
+         enc1 = self.encoder1(x)
+         enc2 = self.encoder2(self.pool1(enc1))
+         enc3 = self.encoder3(self.pool2(enc2))
+
+         bottleneck = self.bottleneck(self.pool3(enc3))
+
+         dec3 = self.upconv3(bottleneck)
+         dec3 = torch.cat((dec3, enc3), dim=1)
+         dec3 = self.decoder3(dec3)
+
+         dec2 = self.upconv2(dec3)
+         dec2 = torch.cat((dec2, enc2), dim=1)
+         dec2 = self.decoder2(dec2)
+
+         dec1 = self.upconv1(dec2)
+         dec1 = torch.cat((dec1, enc1), dim=1)
+         dec1 = self.decoder1(dec1)
+
+         return self.final(dec1)
+
+ #=======================================
+ #======= Inception Architecture ========
+ #=======================================
+ class InceptionBlock(nn.Module):
+     def __init__(self, in_channels, out_channels):
+         super(InceptionBlock, self).__init__()
+         self.b1 = nn.Sequential(nn.Conv2d(in_channels, out_channels, kernel_size=1),
+                                 nn.ReLU(inplace=True))
+         self.b2 = nn.Sequential(
+             nn.Conv2d(in_channels, out_channels, kernel_size=1),
+             nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
+             nn.ReLU(inplace=True)
+         )
+         self.b3 = nn.Sequential(
+             nn.Conv2d(in_channels, out_channels, kernel_size=1),
+             nn.Conv2d(out_channels, out_channels, kernel_size=5, padding=2),
+             nn.ReLU(inplace=True)
+         )
+         self.b4 = nn.Sequential(
+             nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
+             nn.Conv2d(in_channels, out_channels, kernel_size=1),
+             nn.ReLU(inplace=True)
+         )
+
+     def forward(self, x):
+         b1 = self.b1(x)
+         b2 = self.b2(x)
+         b3 = self.b3(x)
+         b4 = self.b4(x)
+         return torch.cat([b1, b2, b3, b4], dim=1)
+
+ class Inception(nn.Module):
+     def __init__(self, in_channels=3, num_classes=2):
+         super(Inception, self).__init__()
+         self.inception1 = InceptionBlock(in_channels, 64)
+         self.inception2 = InceptionBlock(256, 128)
+         self.inception3 = InceptionBlock(512, 256)
+
+         self.conv1x1 = nn.Conv2d(1024, num_classes, kernel_size=1)
+         self.upsample = nn.Upsample(scale_factor=8, mode='bilinear', align_corners=True)
+
+         # Initialize weights after the layers exist; otherwise the loop below has nothing to initialize.
+         self.weights_init()
+
+     def weights_init(self):
+         for m in self.modules():
+             if isinstance(m, nn.Conv2d) or isinstance(m, nn.Linear):
+                 nn.init.kaiming_normal_(m.weight, nonlinearity="relu")
+
+     def forward(self, x):
+         height, width = x.shape[2], x.shape[3]
+         x = self.inception1(x)
+         x = self.inception2(x)
+         x = self.inception3(x)
+         x = self.conv1x1(x)
+         x = F.interpolate(x, size=(height, width), mode='bilinear', align_corners=True)
+         return x
+
+ #=======================================
+ #======= SegFormer Architecture ========
+ #=======================================
+ class Segformer(nn.Module):
+     def __init__(self, model_name='nvidia/segformer-b0-finetuned-ade-512-512', num_classes=2):
+         super(Segformer, self).__init__()
+         self.model = SegformerForSemanticSegmentation.from_pretrained(
+             model_name,
+             num_labels=num_classes,
+             ignore_mismatched_sizes=True
+         )
+         self.processor = SegformerImageProcessor.from_pretrained(model_name)
+         self.normalizer = T.Normalize(mean=self.processor.image_mean, std=self.processor.image_std)
+
+     def forward(self, x):
+         x = self.normalizer(x)
+         logits = self.model(pixel_values=x).logits  # Shape: [B, C, H', W']
+         logits = F.interpolate(logits, size=(x.shape[2], x.shape[3]), mode='bilinear', align_corners=True)
+         return logits
unet.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d58601e6433b7ebeab5ec4249ca02e413b62b28cd9e69b1719a368cb6deae5bb
+ size 30861979
utils.py ADDED
@@ -0,0 +1,205 @@
+ import cv2
+ import torch
+ import numpy as np
+ from PIL import Image
+ import matplotlib.pyplot as plt
+ from supervised import UNet, Segformer, Inception
+ from sklearn.cluster import KMeans
+ from sklearn.mixture import GaussianMixture
+ from torchvision import transforms
+ from sklearn.metrics import accuracy_score, jaccard_score, f1_score, confusion_matrix, ConfusionMatrixDisplay
+
+ def postprocess(masks, mode="open", kernel_size=5, iters=1):
+     kernel = np.ones((kernel_size, kernel_size), np.uint8)
+     if mode == "open":
+         new_masks = [cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_OPEN, kernel, iterations=iters) for mask in masks]
+     elif mode == "close":
+         new_masks = [cv2.morphologyEx(mask.astype(np.uint8), cv2.MORPH_CLOSE, kernel, iterations=iters) for mask in masks]
+     elif mode == "erosion":
+         new_masks = [cv2.erode(mask.astype(np.uint8), kernel, iterations=iters) for mask in masks]
+     elif mode == "dilation":
+         new_masks = [cv2.dilate(mask.astype(np.uint8), kernel, iterations=iters) for mask in masks]
+     else:
+         new_masks = masks
+     return new_masks
+
+ def fix_labels(pred_masks, gt_masks, lesion_positive=True):
+     """
+     Flip predicted masks if needed based on GT, and ensure lesion is 1.
+     If lesion_positive=True, final output has lesion as 1.
+     """
+     fixed_preds = []
+
+     for pred, gt in zip(pred_masks, gt_masks):
+         pred = pred.astype(np.uint8)
+         gt = (gt > 0).astype(np.uint8)
+
+         # Flatten for metric comparison
+         pred_flat = pred.flatten()
+         gt_flat = gt.flatten()
+
+         # Try both label assignments
+         iou_0 = jaccard_score(gt_flat, (pred_flat == 0))
+         iou_1 = jaccard_score(gt_flat, (pred_flat == 1))
+
+         # Flip if label 0 gives better IoU
+         if iou_0 > iou_1:
+             pred = 1 - pred
+
+         # Optional: ensure lesion is positive (class 1)
+         if lesion_positive:
+             # If GT has more lesion pixels than background, make sure pred does too
+             gt_lesion_ratio = np.sum(gt) / gt.size
+             pred_lesion_ratio = np.sum(pred) / pred.size
+
+             if pred_lesion_ratio < 0.5 and gt_lesion_ratio > 0.5:
+                 pred = 1 - pred
+
+         fixed_preds.append(pred)
+
+     return fixed_preds
+
+ def evaluate_masks(pred_masks, gt_masks):
+     """
+     Evaluate predicted masks.
+     Returns mean metrics (accuracy, iou, f1).
+     """
+     acc_list = []
+     iou_list = []
+     f1_list = []
+     cm = np.zeros((2, 2), dtype=int)
+     for pred, gt in zip(pred_masks, gt_masks):
+         pred_flat = pred.flatten()
+         gt_flat = (gt.flatten() > 0).astype(np.uint8)
+
+         acc0 = accuracy_score(gt_flat, (pred_flat == 0))
+         acc1 = accuracy_score(gt_flat, (pred_flat == 1))
+
+         acc = accuracy_score(gt_flat, pred_flat)
+         iou = jaccard_score(gt_flat, pred_flat)
+         f1 = f1_score(gt_flat, pred_flat)
+
+         acc_list.append(acc)
+         iou_list.append(iou)
+         f1_list.append(f1)
+         cm += confusion_matrix(gt_flat, pred_flat, labels=[0, 1])
+
+     mean_acc = np.mean(acc_list)
+     mean_iou = np.mean(iou_list)
+     mean_f1 = np.mean(f1_list)
+
+     print(f"Mean Accuracy: {mean_acc:.4f}")
+     print(f"Mean IoU (Jaccard): {mean_iou:.4f}")
+     print(f"Mean F1 Score (Dice): {mean_f1:.4f}")
+
+     disp = ConfusionMatrixDisplay(cm, display_labels=["Background", "Lesion"])
+     disp.plot(cmap="Blues", values_format="d")
+     plt.title("Confusion Matrix (Aggregated)")
+     plt.show()
+
+     # Plot histograms
+     plt.figure(figsize=(15, 4))
+     plt.subplot(1, 3, 1)
+     plt.hist(acc_list, bins=10, color='r', alpha=0.6, edgecolor='black')
+     plt.title("Accuracy Distribution")
+
+     plt.subplot(1, 3, 2)
+     plt.hist(iou_list, bins=10, color='g', alpha=0.6, edgecolor='black')
+     plt.title("IoU Distribution")
+
+     plt.subplot(1, 3, 3)
+     plt.hist(f1_list, bins=10, color='skyblue', alpha=0.6, edgecolor='black')
+     plt.title("F1 Score Distribution")
+
+     plt.tight_layout()
+     plt.show()
+
+ def overlay_mask(image, mask, color=(255, 0, 0), alpha=0.5):
+     """
+     Overlay a binary mask on top of an image.
+     - image: (H, W, 3) numpy array, RGB
+     - mask: (H, W) numpy array, 0/1 values or 0/255
+     - color: RGB tuple for mask color
+     - alpha: transparency factor (0=transparent, 1=opaque)
+     """
+     image = image.copy()
+
+     # Make sure mask is binary 0 or 1
+     if mask.max() > 1:
+         mask = (mask > 127).astype(np.uint8)
+
+     # Create colored mask
+     colored_mask = np.zeros_like(image)
+     colored_mask[:, :, 0] = color[0]
+     colored_mask[:, :, 1] = color[1]
+     colored_mask[:, :, 2] = color[2]
+
+     # Apply mask
+     mask_3d = np.repeat(mask[:, :, np.newaxis], 3, axis=2)
+     overlay = np.where(mask_3d, (1 - alpha) * image + alpha * colored_mask, image)
+
+     return overlay.astype(np.uint8)
+
+
+ def visualize_overlay(image, gt_mask, pred_mask, post_mask=None, alpha=0.5):
+     """
+     Plot original image + overlay GT mask and Predicted mask.
+     """
+     plt.figure(figsize=(18, 6))
+
+     # Original
+     plt.subplot(1, 3, 1)
+     plt.imshow(image)
+     plt.title("Original Image")
+     plt.axis("off")
+
+     # Ground Truth Overlay
+     overlay_gt = overlay_mask(image, gt_mask, color=(0, 255, 0), alpha=alpha)
+     plt.subplot(1, 3, 2)
+     plt.imshow(overlay_gt)
+     plt.title("Ground Truth Overlay (Green)")
+     plt.axis("off")
+
+     # Predicted Overlay
+     overlay_pred = overlay_mask(image, pred_mask, color=(255, 0, 0), alpha=alpha)
+     plt.subplot(1, 3, 3)
+     plt.imshow(overlay_pred)
+     plt.title("Prediction Overlay (Red)")
+     plt.axis("off")
+
+     plt.tight_layout()
+     plt.show()
+
+ def predict_and_visualize_single(model, image_array, postprocess_mode='none', alpha=0.5, device='cpu'):
+     # image_array is an RGB numpy array (as delivered by the Gradio image component).
+     image = Image.fromarray(image_array).convert('RGB')
+     original_np = np.array(image.resize((128, 128)))
+
+     transform = transforms.Compose([
+         transforms.Resize((128, 128)),
+         transforms.ToTensor()
+     ])
+     input_tensor = transform(image).unsqueeze(0).to(device)
+
+     if isinstance(model, (UNet, Segformer, Inception)):
+         with torch.no_grad():
+             output = model(input_tensor)
+             if isinstance(output, dict):
+                 output = output.get("logits", output.get("out"))
+             pred_mask = torch.argmax(output.squeeze(), dim=0).cpu().numpy()
+     elif isinstance(model, (KMeans, GaussianMixture)):
+         # The clustering baselines are fitted on the pixels of this single image.
+         model.fit(original_np.reshape(-1, 3))
+         pred_mask = model.predict(original_np.reshape(-1, 3)).reshape(128, 128)
+
+     if postprocess_mode != 'none':
+         pred_mask = postprocess([pred_mask], mode=postprocess_mode)[0]
+
+     # Resize outputs to 256x256 for display
+     bw_mask = cv2.resize(pred_mask.astype(np.uint8) * 255, (256, 256), interpolation=cv2.INTER_NEAREST)
+     overlay = cv2.resize(overlay_mask(original_np, pred_mask, color=(255, 0, 0), alpha=alpha),
+                          (256, 256),
+                          interpolation=cv2.INTER_LINEAR
+                          )
+
+     return bw_mask, overlay