TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR
TerraSeg is the first self-supervised, domain-agnostic model for LiDAR ground / non-ground segmentation. It is trained on the OmniLiDAR dataset using pseudo-labels produced by the self-supervised PseudoLabeler, and uses an adapted Point Transformer v3 backbone with dataset-specific normalization disabled.
- Code: https://github.com/TedLentsch/TerraSeg
- Paper: https://arxiv.org/abs/2603.27344 (Lentsch et al., CVPR 2026)
Released checkpoints
| File | Variant | Params | Mean mIoU (val) |
|---|---|---|---|
| terraseg_s.pth | Small | ~12M | 93.43 |
| terraseg_b.pth | Base | ~46M | 94.02 |
Mean mIoU is the average over the three evaluation splits used in the paper: nuScenes val, SemanticKITTI val, and Waymo Perception val. No manual annotations are used during training; supervision comes entirely from the self-supervised PseudoLabeler, which is itself a contribution of the TerraSeg paper.
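As a rough illustration of how a mean mIoU figure of this kind can be computed, the sketch below evaluates a two-class (ground / non-ground) mIoU on one split and then averages per-split scores. The function names (`iou`, `miou`, `mean_over_splits`), the toy labels, and the placeholder split scores are assumptions made for this example; the paper's exact evaluation protocol (for instance, how statistics are accumulated across frames) may differ.

```python
# Minimal sketch of binary mIoU and a mean over evaluation splits.
# All names and numbers here are illustrative, not taken from the TerraSeg code.

def iou(pred, target, cls):
    """Intersection-over-union for one class over flat label sequences."""
    tp = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    fp = sum(1 for p, t in zip(pred, target) if p == cls and t != cls)
    fn = sum(1 for p, t in zip(pred, target) if p != cls and t == cls)
    denom = tp + fp + fn
    # A class that never appears in pred or target has no defined IoU.
    return tp / denom if denom else float("nan")


def miou(pred, target, classes=(0, 1)):
    """Mean IoU over the given classes (0 = ground, 1 = non-ground)."""
    return sum(iou(pred, target, c) for c in classes) / len(classes)


def mean_over_splits(per_split_miou):
    """Unweighted mean of per-split mIoU values."""
    return sum(per_split_miou.values()) / len(per_split_miou)


# Toy labels for six points (not real data).
pred = [0, 0, 1, 1, 1, 0]
target = [0, 0, 1, 1, 0, 0]
toy_miou = miou(pred, target)

# Placeholder per-split scores, purely to show the averaging step.
toy_mean = mean_over_splits({"split_a": 0.9, "split_b": 0.8, "split_c": 0.7})
```

In the toy example, ground has IoU 3/4 and non-ground has IoU 2/3, so `toy_miou` is their average. The mean over splits here is unweighted, which is one plausible reading of "averaged across the three evaluation splits".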
Relationship to the paper numbers
These released checkpoints are re-trained with the cleaned, public code release on OmniLiDAR excluding View-of-Delft (VoD). VoD's license made redistribution as part of OmniLiDAR awkward, so it has been permanently dropped from the released training mix. The cleaned training script also incorporates several stability improvements over the experimental script that produced the paper numbers.
The qualitative findings are unchanged: TerraSeg-B remains stronger than TerraSeg-S, and both released checkpoints reach higher mean mIoU than the corresponding paper rows. Because of the changes above, the released checkpoints are not expected to reproduce the paper's exact numbers; for those, refer to the arXiv paper linked above.
License and upstream restrictions
The TerraSeg model weights in this repository are released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. The accompanying source code at https://github.com/TedLentsch/TerraSeg is released separately under the Apache License 2.0; the difference reflects the upstream-data constraints below.
Why CC BY-NC-SA 4.0 for the weights
For public release, TerraSeg is trained on a mixture of 11 major autonomous driving datasets (the OmniLiDAR aggregation, minus VoD). The weights file inherits the most restrictive combination of the upstream dataset licenses, which is CC BY-NC-SA 4.0. Specifically:
- Several datasets (nuScenes, SemanticKITTI, Argoverse 2, MAN TruckScenes) are themselves released under CC BY-NC-SA 4.0. The community-standard practice for autonomous-driving models trained on these datasets is to inherit the NonCommercial + ShareAlike terms in any released model artifact.
- The Waymo Open Dataset Terms of Use explicitly define machine-learning weights and parameters trained on the dataset as "Derivative IP" and restrict their distribution to non-commercial purposes. Because TerraSeg's training mix includes Waymo Perception, the released weights inherit this restriction directly (see the Waymo-specific notice below).
- Other datasets in the mix (ONCE, KITTI-360, AevaScenes) carry custom academic, non-commercial licenses with similar effective restrictions.
CC BY-NC-SA 4.0 is the simplest standardized license that satisfies all of these constraints simultaneously.
Waymo Open Dataset-specific notice
Per the Waymo Open Dataset Terms of Use, the TerraSeg weights constitute Derivative IP of the Waymo Open Dataset and are therefore distributed for non-commercial purposes only. They must not be used in or deployed to any production system, used to assist in the operation of a real-world vehicle, or used in connection with primarily commercial activities. Commercial use of the weights would require an independent commercial agreement with Waymo and with the other upstream dataset providers.
Upstream dataset attributions
If you use the TerraSeg weights, please cite this work and the 11 upstream datasets that contribute to OmniLiDAR. The references below are taken directly from the TerraSeg paper (see Table 2 there for dataset statistics).
| Dataset | Primary reference | License |
|---|---|---|
| SemanticKITTI | Behley et al., ICCV 2019 | CC BY-NC-SA 4.0 |
| Lyft Level 5 AV Dataset | Kesten et al., 2019 | CC BY 4.0 |
| nuScenes | Caesar et al., CVPR 2020 | CC BY-NC-SA 4.0 |
| Waymo Open Dataset | Sun et al., CVPR 2020 | Waymo Open Dataset Terms of Use (non-commercial) |
| PandaSet | Xiao et al., ITSC 2021 | PandaSet custom (CC BY 4.0-aligned, commercial use permitted) |
| ONCE | Mao et al., NeurIPS 2021 | Custom academic / non-commercial |
| KITTI-360 | Liao, Xie, Geiger, T-PAMI 2022 | Custom academic / non-commercial |
| Argoverse 2 Lidar | Wilson et al., NeurIPS 2021 | CC BY-NC-SA 4.0 |
| Zenseact Open Dataset (ZOD) | Alibeigi et al., ICCV 2023 | CC BY 4.0 |
| MAN TruckScenes | Fent et al., NeurIPS 2024 | CC BY-NC-SA 4.0 |
| AevaScenes | Narasimhan et al., 2025 | Custom non-commercial research license |
Quick start (Python)
import torch
from terraseg import TerraSegPredictor
predictor = TerraSegPredictor(
    variant="S",  # "S" for Small (~12M) or "B" for Base (~46M)
    checkpoint_path="hf://TedLentsch/TerraSeg/terraseg_s.pth",
)
coord = torch.randn(50_000, 3, device="cuda") # Your (N, 3) point cloud in meters.
labels = predictor.predict(coord=coord) # Shape (N,) with datatype uint8. Labels: 0 = ground, 1 = non-ground.
TerraSeg runs in FP32; the predictor optionally accepts compile_model=True to wrap the model with torch.compile for extra throughput on supported GPUs.
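Once labels are available, a common next step is to separate the cloud into ground and non-ground points. The sketch below shows that step in plain Python, using the label convention stated above (0 = ground, 1 = non-ground). The helper `split_by_label` and the toy points are hypothetical and are not part of the `terraseg` package.

```python
# Minimal sketch: split a point cloud by predicted label.
# `split_by_label` is a hypothetical helper, not a terraseg API.

GROUND, NON_GROUND = 0, 1  # label convention from the quick start above


def split_by_label(points, labels):
    """Return (ground_points, non_ground_points) for matching sequences."""
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")
    ground = [p for p, lab in zip(points, labels) if lab == GROUND]
    non_ground = [p for p, lab in zip(points, labels) if lab == NON_GROUND]
    return ground, non_ground


# Toy (x, y, z) points in meters with made-up labels.
points = [(0.0, 0.0, -1.7), (1.0, 0.5, -1.6), (2.0, 0.0, 0.3), (5.0, -1.0, -1.7)]
labels = [0, 0, 1, 0]

ground, non_ground = split_by_label(points, labels)
ground_fraction = len(ground) / len(points)
```

With the tensors from the quick start, the same split can be written with boolean masks, for example `coord[labels == 0]` for ground and `coord[labels == 1]` for non-ground.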
Quick start (ROS2)
The accompanying ROS2 package (terraseg_ros2) wraps the predictor and subscribes to a sensor_msgs/PointCloud2 topic. See the main repository README for the full launch recipe and topic API.
Citation
If you use TerraSeg in your research, please cite:
@inproceedings{lentsch2026terraseg,
  title={TerraSeg: Self-Supervised Ground Segmentation for Any LiDAR},
  author={Lentsch, Ted and Montiel-Marín, Santiago and Caesar, Holger and Gavrila, Dariu M},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
and the upstream dataset references summarized in the attribution table above.