> **Note:** This repo contains only deployment/demo files.  
> For full source, notebooks, and complete code, see [Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI](https://github.com/Kev-HL/capsule-defect-segmentation-api).  

# Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI

This project addresses a real-world computer vision challenge: detecting and localizing defects on medicinal capsules via image classification and segmentation.  
The aim is to deliver a complete pipeline (data preprocessing, model training and evaluation, and deployment) that demonstrates practical ML engineering from scratch to API.

---

## Main Repo

This is a minimal clone with only the necessary files from the main repo.  
For full source, notebooks, and complete code, see [Capsule Defect Detection and Segmentation with ConvNeXt+U-Net and FastAPI](https://github.com/Kev-HL/capsule-defect-segmentation-api).  

---

## Project Overview

End-to-end defect detection and localization using the **Capsule** class from the **MVTec AD dataset**.  
Key steps include:
- Data preprocessing, formatting, and augmentation
- Model design (pre-trained backbone + custom heads)
- Training, evaluation, and hyperparameter tuning
- Dockerized FastAPI deployment for inference

*Portfolio project to showcase ML workflow and engineering.*

---

## Key Results

- Evaluation dataset: MVTec AD 'capsule' class, 70/15/15 train/val/test split
- Quantitative results on test evaluation:
  - Classification accuracy: **83 %**
  - Classification defect-only accuracy: **75 %**
  - Defect presence accuracy: **91 %**
  - Segmentation quality (mIoU / Dice): **0.79 / 0.73**
  - Segmentation defect-only quality (mIoU / Dice): **0.70 / 0.55**
- Model artifacts:
  - Original model size (.keras / SavedModel): **345 MB**
  - Raw Converted TFLite size (.tflite): **119 MB**
  - Optimized Converted TFLite size (.tflite): **31 MB** (Dynamic Range Quantization applied)
- Container / runtime:
  - Docker image size: **317 MB**
  - Runtime used: **tflite-runtime + Uvicorn/FastAPI**
  - Avg inference latency (inference only, set tensor + invoke): **239 ms**
  - Avg inference latency (single POST request, measured): **271 ms**
  - Average memory usage during inference: **321 MB**
  - Startup time (local): **72 ms**
- Observations:
  - The app returns expected visualizations and class labels for the MVTec-style test images.
  - POST inference latency was measured locally; expect higher latency in real use because of network delays.
  - The dataset is small and highly imbalanced: 351 samples, of which 242 are 'good' and 109 are defective, spread across 5 defect types (~22 per type). The defect is also the only distinctive feature of each sample, and in most cases it is small and varied in shape. As a result, performance is not as strong as desired, and the results lack the statistical confidence needed for real-world use. Without more data, a meaningful improvement would be difficult to achieve.
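
For readers unfamiliar with the segmentation metrics listed above, here is a minimal sketch of how IoU and Dice can be computed for one pair of binary masks. The function name `iou_and_dice` and the toy masks are illustrative assumptions, not code from this project, and the averaging used for the reported figures is not specified here.

```python
import numpy as np


def iou_and_dice(pred_mask, true_mask):
    """Return (IoU, Dice) for two binary masks of the same shape.

    Hypothetical helper for illustration; not taken from the project code.
    """
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)

    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    total = pred.sum() + true.sum()

    # Both masks empty: treat as a perfect match (a common convention).
    if total == 0:
        return 1.0, 1.0

    iou = intersection / union
    dice = 2 * intersection / total
    return float(iou), float(dice)


# Toy example: a 2x2 predicted defect inside a 2x3 ground-truth defect.
pred = np.zeros((4, 4), dtype=np.uint8)
true = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:3] = 1  # 4 predicted pixels
true[1:3, 1:4] = 1  # 6 ground-truth pixels

iou, dice = iou_and_dice(pred, true)
print(f"IoU={iou:.3f}, Dice={dice:.3f}")
```

For a single mask pair, Dice equals 2·IoU / (1 + IoU), so it is never below IoU. Dataset-level figures depend on how per-image or per-class scores are averaged.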

---

## Dataset

- *Capsule* class from [MVTec AD dataset](https://www.mvtec.com/company/research/datasets/mvtec-ad)
- License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
- Dataset folder contains license file  
- Usage is strictly non-commercial/educational
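
With so few samples per defect type, a 70/15/15 split like the one listed under Key Results is usually done per class so that every defect type appears in each subset. The sketch below shows one way to do that with the standard library only. The function `stratified_split`, the placeholder label names, and the per-type counts are assumptions for illustration and may differ from what the main repo does.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=42):
    """Split sample indices into train/val/test, class by class.

    Hypothetical helper for illustration; not taken from the project code.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Illustrative labels: 242 'good' and 109 defective samples in 5 types.
# The defect names and exact per-type counts are placeholders.
labels = ["good"] * 242
for name, count in [("defect_a", 22), ("defect_b", 22), ("defect_c", 22),
                    ("defect_d", 22), ("defect_e", 21)]:
    labels += [name] * count

train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Even with stratification, each defect type contributes only a handful of validation and test samples, which is consistent with the caution about statistical confidence in the observations above.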

---

## Tech Stack

- Python
- TensorFlow
- Scikit-Learn
- Numpy / Pandas
- OpenCV / Pillow
- Ray Tune (Hyperparameter tuning and experiment tracking)
- OmegaConf (Config management)
- Docker, FastAPI, Uvicorn (Deployment)

---

## Folder Structure

```
data/       # Dataset and annotations
app/        # Inference and deployment code and files
models/     # Saved trained models and training logs
```

---

## How to Run

**Build image for deployment:**  
- Requirements:
  - `models/final_model/final_model.tflite` (included)
  - `app/` folder and contents  (included)
  - `Dockerfile` (included)
  - `.dockerignore` (included)
- From the project root, build and run the Docker image:
```sh
docker build -t cv-app .
docker run -p 8000:8000 cv-app
```
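- Open http://localhost:8000 in your browser to access the demo UI (`0.0.0.0` is the address the server binds to inside the container and is not reachable from every browser)  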
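
Besides the demo UI, the running container can be called from a script. The sketch below sends an image as a multipart POST request using only the standard library. The endpoint path `/predict`, the form field name `file`, and the file name `capsule.png` are assumptions, not confirmed by this README. Check the code in `app/` for the actual route (FastAPI apps also serve interactive docs at `/docs` unless that is disabled).

```python
import urllib.request
import uuid


def build_multipart_request(url, image_bytes, filename="capsule.png", field="file"):
    """Build a multipart/form-data POST request carrying one PNG image.

    The field name and URL path are hypothetical; adapt them to the real API.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode(),
        b"Content-Type: image/png\r\n\r\n",
        image_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def send_image(path, url="http://localhost:8000/predict"):
    """Send the image at `path` and return (HTTP status, response content type)."""
    with open(path, "rb") as fh:
        request = build_multipart_request(url, fh.read())
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status, response.headers.get("Content-Type")


# Usage, with the container running and an image available locally:
#   status, content_type = send_image("capsule.png")
```

The response format is not described here, so the sketch only reports the status code and content type rather than parsing the body.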

_Note: For the full source code and steps on how to recreate the model, visit the full repo (see "Main Repo" section near the top)_  

---

## Contact

For questions, reach out via GitHub (Kev-HL).