Real-Time Multi-Object Tracking with Ultralytics
Ultralytics YOLOv8 runs detection and tracking from a single .track() call. There is no external tracker to wire up and no association step to write yourself: load a model, pick a backend (ByteTrack or BoT-SORT), and you have a working multi-object tracker.
One call in, tracked IDs out. The same model(...) interface you already use for detection now returns persistent object IDs across frames.
Why use .track()?
- All-in-one pipeline: detection and tracking happen in a single call.
- Swappable backends: switch with tracker='bytetrack.yaml' or 'botsort.yaml'.
- Familiar API: identical to the YOLOv8 model(...) detection interface.
Install & import
pip install ultralytics --upgrade
Then, in your script or notebook:
from ultralytics import YOLO
import cv2
Minimal tracking example
The only thing that changes between the two backends is the tracker argument.
ByteTrack
model = YOLO('yolov8n.pt')
results = model.track(
    source='input.mp4',
    tracker='bytetrack.yaml',
    show=True,
)
BoT-SORT
model = YOLO('yolov8n.pt')
results = model.track(
    source='input.mp4',
    tracker='botsort.yaml',  # the only line that differs
    show=True,
)
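Because the two calls differ only in the tracker argument, one rough way to compare backends is to count how many distinct IDs each one assigns on the same clip. Fewer IDs for the same footage often points to fewer identity switches, although it is no substitute for a proper tracking metric. The sketch below assumes each per-frame result exposes boxes.id, which is None when nothing is tracked in that frame; count_unique_ids is a hypothetical helper, not part of Ultralytics.

```python
def count_unique_ids(results):
    """Count distinct track IDs across an iterable of per-frame results.

    Assumes each result has .boxes.id, which is None when no object is
    tracked in that frame (hypothetical helper, not an Ultralytics API).
    """
    seen = set()
    for result in results:
        boxes = getattr(result, "boxes", None)
        ids = getattr(boxes, "id", None)
        if ids is None:
            continue  # no tracked objects in this frame
        seen.update(int(i) for i in ids)
    return len(seen)


# Usage sketch (requires ultralytics and a local input.mp4):
# from ultralytics import YOLO
# for cfg in ("bytetrack.yaml", "botsort.yaml"):
#     model = YOLO("yolov8n.pt")  # fresh model so tracker state is not shared
#     results = model.track(source="input.mp4", tracker=cfg, stream=True)
#     print(cfg, count_unique_ids(results))
```

The helper only iterates and reads boxes.id, so it works on a list of results or on the generator returned with stream=True.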
Key parameters
| Parameter | What it does | Example |
|---|---|---|
| source | Input video, image, or camera stream | 'video.mp4', 0 |
| tracker | Tracker config to use | 'bytetrack.yaml' |
| conf | Detection confidence threshold | conf=0.4 |
| iou | IoU threshold for suppressing overlapping detections (NMS) | iou=0.5 |
| persist | Keep tracker state between successive .track() calls so IDs carry over from frame to frame | persist=True |
| device | Run on CPU or GPU | 'cpu', 'cuda' |
| stream | Frame-by-frame generator mode | stream=True |
Drawing tracked IDs with OpenCV
For a live camera feed, pass each frame in and read box.id back out:
import cv2
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    if not ret:
        print("No frames received")
        break

    # Switch tracker here: 'bytetrack.yaml' or 'botsort.yaml'
    results = model.track(frame, tracker='bytetrack.yaml', persist=True)

    if results[0].boxes is not None:
        for box in results[0].boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0].tolist())
            track_id = int(box.id[0]) if box.id is not None else -1
            class_id = int(box.cls[0]) if box.cls is not None else -1
            class_name = model.names[class_id] if class_id in model.names else "unknown"
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f'{class_name}-{track_id}', (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    cv2.imshow('Tracked', frame)
    if cv2.waitKey(1) & 0xFF == 27:  # ESC to quit
        break

cap.release()
cv2.destroyAllWindows()
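The per-box extraction in the loop above can be factored into a small function that returns plain Python data and optionally filters by class name. This is a minimal sketch: extract_tracks and its keep_classes parameter are hypothetical names, and it assumes boxes.xyxy, boxes.cls, and boxes.id are row-aligned as the loop uses them. Unlike the loop, which labels untracked boxes with -1, this version returns nothing for a frame that carries no track IDs.

```python
def extract_tracks(result, names, keep_classes=None):
    """Flatten one frame's result into a list of dicts.

    Assumes result.boxes exposes xyxy, cls and id, row-aligned.
    names maps class index to class name (for example model.names).
    keep_classes is an optional set of class names to keep.
    """
    boxes = getattr(result, "boxes", None)
    if boxes is None or boxes.id is None:
        return []  # nothing detected, or detections without track IDs
    tracks = []
    for xyxy, cls, track_id in zip(boxes.xyxy, boxes.cls, boxes.id):
        class_name = names.get(int(cls), "unknown")
        if keep_classes is not None and class_name not in keep_classes:
            continue  # skip classes the caller did not ask for
        x1, y1, x2, y2 = (int(v) for v in xyxy)
        tracks.append(
            {"id": int(track_id), "class": class_name, "box": (x1, y1, x2, y2)}
        )
    return tracks
```

Inside the camera loop this would be called as extract_tracks(results[0], model.names, keep_classes={"person"}), with the drawing code iterating over the returned dicts.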
Tips & tricks
- Use stream=True when iterating over a video or camera source; results then arrive one frame at a time from a generator instead of accumulating in a list.
- Filter to specific classes by checking box.cls before drawing.
- Pick yolov8n/yolov8s for speed, yolov8l/yolov8x for accuracy.
- Use track_id to compute dwell time, entry/exit counts, and object trails.
- Flip the tracker argument to benchmark ByteTrack against BoT-SORT on your own footage.
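To make the track_id tip concrete, here is one possible bookkeeping class for trails, dwell time, and entry/exit counts. It is a sketch under stated assumptions: TrackHistory is a hypothetical helper, the counting line is horizontal at y = line_y in pixel coordinates (y grows downward), and the frame rate is constant and known.

```python
from collections import defaultdict, deque


class TrackHistory:
    """Per-ID bookkeeping: trails, dwell time, and line-crossing counts.

    Hypothetical helper. Assumes a horizontal counting line at y = line_y
    and a known, constant frame rate.
    """

    def __init__(self, line_y, fps=30.0, trail_len=30):
        self.line_y = line_y
        self.fps = fps
        # Each trail keeps only the most recent trail_len box centres.
        self.trails = defaultdict(lambda: deque(maxlen=trail_len))
        self.first_seen = {}
        self.last_seen = {}
        self.entries = 0  # centre moved down the image across the line
        self.exits = 0    # centre moved up the image across the line

    def update(self, frame_idx, track_id, box):
        """Record one tracked box (x1, y1, x2, y2) for the given frame."""
        x1, y1, x2, y2 = box
        center = ((x1 + x2) / 2, (y1 + y2) / 2)
        trail = self.trails[track_id]
        if trail:
            prev_y = trail[-1][1]
            if prev_y < self.line_y <= center[1]:
                self.entries += 1
            elif prev_y >= self.line_y > center[1]:
                self.exits += 1
        trail.append(center)
        self.first_seen.setdefault(track_id, frame_idx)
        self.last_seen[track_id] = frame_idx

    def dwell_seconds(self, track_id):
        """Seconds between the first and last frame the ID was seen."""
        frames = self.last_seen[track_id] - self.first_seen[track_id]
        return frames / self.fps
```

In a tracking loop you would call update(frame_idx, track_id, (x1, y1, x2, y2)) once per tracked box, then draw the points in trails[track_id], for example with cv2.polylines. Which direction counts as an entry depends on camera placement, so swap the two counters if your scene is oriented the other way.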
Where it fits
- Surveillance and security systems
- Sports and player tracking
- Retail and footfall analytics
- Industrial and assembly-line inspection
- Smart traffic monitoring
Takeaway: one .track() call gives you a full detection-plus-tracking pipeline. Start with yolov8n.pt and ByteTrack, then scale the model or swap the backend once you know your accuracy and speed targets.