TrackEval Code Logic and File Structure, Explained / MOT Challenge Evaluation Metrics


Data format

- Note: the train/test data downloaded from https://motchallenge.net/ is not organised in the (default) folder layout that TrackEval expects

MOT Challenge train/val/test

det.txt

3,-1,1433,512,60,100,0,-1,-1,-1
3,-1,1048,437,49,124,0,-1,-1,-1
3,-1,1087,552,78,177,0,-1,-1,-1
3,-1,1504,514,51,101,0,-1,-1,-1

gt.txt

48,1,335,811,128,270,1,1,0.85675
49,1,335,809,130,272,1,1,0.85326
50,1,335,808,132,272,1,1,0.85579

img1/

f"{frame:06d}.jpg"  # zero-padded, 1-based frame number

000001.jpg
000002.jpg
000003.jpg

seqinfo.ini

[Sequence]
name=MOT20-01
imDir=img1
frameRate=25
seqLength=429
imWidth=1920
imHeight=1080
imExt=.jpg

TrackEval

frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z (x, y, z are unused for 2D box tracking and are set to -1)

All frame numbers, target IDs and bounding boxes are 1-based.

Here is an example from the sample data offered by TrackEval

1,6.0,343.8669738769531,828.7033081054688,124.1097412109375,248.24200439453125,1,-1,-1,-1
1,7.0,1023.822265625,606.1856689453125,83.2244873046875,195.18109130859372,1,-1,-1,-1
1,8.0,1067.532958984375,513.0377197265625,52.221435546875,142.29217529296875,1,-1,-1,-1
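
To make the format concrete, here is a minimal sketch of writing tracker output in this layout (the sequence name, IDs and boxes below are invented for the example):

# Minimal sketch: write tracker results in the 10-column MOTChallenge layout
# (frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z).
# The last three columns are unused for 2D box tracking and are set to -1.
results = [
    # (frame, track_id, left, top, width, height, confidence) -- invented values
    (1, 6, 343.87, 828.70, 124.11, 248.24, 1.0),
    (1, 7, 1023.82, 606.19, 83.22, 195.18, 1.0),
]
with open('MOT20-01.txt', 'w') as f:   # hypothetical output file name
    for frame, tid, left, top, w, h, conf in results:
        f.write(f'{frame},{tid},{left:.2f},{top:.2f},{w:.2f},{h:.2f},{conf},-1,-1,-1\n')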

Folder Hierarchy

  • gt

    • gt.txt

      txt details
      11,1,227,812,140,269,1,1,0.83704
      12,1,230,811,137,270,1,1,0.83764
      13,1,233,810,135,271,1,1,0.83456
      14,1,236,809,133,272,1,1,0.8315
      15,1,239,808,131,273,1,1,0.82847
      
    • seqinfo.ini (same format as the MOT17 training data)

      txt details
      [Sequence]
      name=MOT20-01
      imDir=img1
      frameRate=25
      seqLength=429
      imWidth=1920
      imHeight=1080
      imExt=.jpg
      
  • trackers

refer to default_dataset_config = trackeval.datasets.MotChallenge2DBox.get_default_dataset_config() in run_mot_challenge.py

"GT_FOLDER": "/path/to/TrackEval/data/gt/mot_challenge/",
"TRACKERS_FOLDER": "/path/to/TrackEval/data/trackers/mot_challenge/",
"OUTPUT_FOLDER": None,
"TRACKERS_TO_EVAL": [
    "MPNTrack"
],
"TRACKER_SUB_FOLDER": "data",
"OUTPUT_SUB_FOLDER": "",
curr_file = os.path.join(self.tracker_fol, tracker,
    self.tracker_sub_fol, seq + '.txt')

Refer to mot_challenge_2d_box.py to see how the txt path is built.
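
With the default config above, and taking MOT17-02-FRCNN as an example sequence name, curr_file resolves to roughly:

/path/to/TrackEval/data/trackers/mot_challenge/MOT17-train/MPNTrack/data/MOT17-02-FRCNN.txt

The MOT17-train segment is the BENCHMARK + '-' + SPLIT_TO_EVAL split folder that the dataset class appends to TRACKERS_FOLDER (unless SKIP_SPLIT_FOL is set).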

Pipeline

Data Preparation & Arguments

  • GT_FOLDER: sequence with ground truth
    • in the format of MOT17
    • refer to mot_challenge_2d_box.py
`default_dataset_config` details
code_path = utils.get_code_path()
'GT_FOLDER': os.path.join(code_path, 'data/gt/mot_challenge/'),  # Location of GT data
'TRACKERS_FOLDER': os.path.join(code_path, 'data/trackers/mot_challenge/'),  # Trackers location
`dataset_config` details
{
    "PRINT_CONFIG": True,
    "GT_FOLDER": "/path/to/TrackEval/data/gt/mot_challenge/",
    "TRACKERS_FOLDER": "/path/to/TrackEval/data/trackers/mot_challenge/",
    "OUTPUT_FOLDER": None,
    "TRACKERS_TO_EVAL": [
        "MPNTrack"
    ],
    "CLASSES_TO_EVAL": [
        "pedestrian"
    ],
    "BENCHMARK": "MOT17",
    "SPLIT_TO_EVAL": "train",
    "INPUT_AS_ZIP": False,
    "DO_PREPROC": True,
    "TRACKER_SUB_FOLDER": "data",
    "OUTPUT_SUB_FOLDER": "",
    "TRACKER_DISPLAY_NAMES": None,
    "SEQMAP_FOLDER": None, ...
}
`eval_config` details
{
    "USE_PARALLEL": False,
    "NUM_PARALLEL_CORES": 1,
    "BREAK_ON_ERROR": True,
    "RETURN_ON_ERROR": False,
    "LOG_ON_ERROR": "/path/to/TrackEval/error_log.txt",
    "PRINT_RESULTS": True,
    "PRINT_ONLY_COMBINED": False,
    "PRINT_CONFIG": True,
    "TIME_PROGRESS": True,
    "DISPLAY_LESS_PROGRESS": False,
    "OUTPUT_SUMMARY": True,
    "OUTPUT_EMPTY_CLASSES": True,
    "OUTPUT_DETAILED": True,
    "PLOT_CURVES": True
}
`metrics_config` details
{
    "METRICS": [
        "HOTA",
        "CLEAR",
        "Identity",
        "VACE"
    ],
    "THRESHOLD": 0.5
}

TrackEval builds its own argument parser from these config keys: every key becomes a command-line flag, and the parsed values are merged back into the eval/dataset/metrics config dicts (list-valued options are parsed as lists).
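
A rough sketch of what that parser amounts to (simplified; see scripts/run_mot_challenge.py for the actual merging and type handling):

import argparse
import trackeval   # assumes the TrackEval package is importable

default_eval_config = trackeval.Evaluator.get_default_eval_config()
default_dataset_config = trackeval.datasets.MotChallenge2DBox.get_default_dataset_config()
default_metrics_config = {'METRICS': ['HOTA', 'CLEAR', 'Identity'], 'THRESHOLD': 0.5}
config = {**default_eval_config, **default_dataset_config, **default_metrics_config}

parser = argparse.ArgumentParser()
for setting, default in config.items():
    if isinstance(default, list) or default is None:
        parser.add_argument('--' + setting, nargs='+')   # list-valued arguments
    else:
        parser.add_argument('--' + setting)
args = vars(parser.parse_args())   # overrides are merged back into the config dicts

On the command line this means passing e.g. --TRACKERS_TO_EVAL MPNTrack --METRICS HOTA CLEAR Identity, and each value is converted back to the type of the corresponding default.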

Init Dataset

_get_seq_info

gt_set = self.config['BENCHMARK'] + '-' + self.config['SPLIT_TO_EVAL']
self.gt_set = gt_set
if self.config["SEQMAP_FOLDER"] is None:
  seqmap_file = os.path.join(self.config['GT_FOLDER'], 'seqmaps', self.gt_set + '.txt')

seqmap

name
TUD-Stadtmitte
TUD-Campus
PETS09-S2L1
ETH-Bahnhof
ETH-Sunnyday
ETH-Pedcross2
ADL-Rundle-6
ADL-Rundle-8
KITTI-13
KITTI-17
Venice-2

Eval Sequence

  • Initialise the evaluator: evaluator = trackeval.Evaluator(eval_config) (defined in eval.py)
    • then run evaluator.evaluate(dataset_list, metrics_list)
  • eval_sequence in eval.py does the per-sequence work:
raw_data = dataset.get_raw_seq_data(tracker, seq)
seq_res = {}
for cls in class_list:
    seq_res[cls] = {}
    data = dataset.get_preprocessed_seq_data(raw_data, cls)
    for metric, met_name in zip(metrics_list, metric_names):
        seq_res[cls][met_name] = metric.eval_sequence(data)
return seq_res
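
So for each sequence the returned structure is nested as class -> metric family -> metric fields. A sketch of its shape (field contents elided):

# Sketch of the nested structure returned by eval_sequence for one sequence
# (field values elided; see each metric's self.fields for the full list):
seq_res_example = {
    'pedestrian': {          # one entry per class in class_list
        'HOTA': {},          # HOTA fields: HOTA, DetA, AssA, LocA, ...
        'CLEAR': {},         # CLEAR fields: CLR_TP, CLR_FN, CLR_FP, IDSW, MOTA, ...
        'Identity': {},      # Identity fields: IDF1, IDP, IDR, ...
    },
}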

Load Data

Data is loaded into a dict keyed by frame_id (timestep); each value is the list of split rows for that frame, e.g.:

['21', '1', '912', '484', '97', '109', '0', '7', '1']

convert to ndarray

time_data = np.asarray(read_data[time_key], dtype=np.float)  # note: np.float is removed in NumPy >= 1.24; plain float behaves the same
raw_data['dets'][t] = np.atleast_2d(time_data[:, 2:6])
raw_data['ids'][t] = np.atleast_1d(time_data[:, 1]).astype(int)
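
A minimal sketch of what this loading amounts to for one gt.txt or tracker txt (ignoring zip handling, crowd-ignore regions and the other bookkeeping done in _load_raw_file):

import numpy as np
from collections import defaultdict

def load_mot_txt(path, num_timesteps):
    """Group a MOTChallenge txt by frame and split into ids / boxes per timestep (sketch)."""
    read_data = defaultdict(list)
    with open(path) as f:
        for line in f:
            row = line.strip().split(',')
            read_data[row[0]].append(row)              # key: frame id as a string

    raw = {'ids': [None] * num_timesteps, 'dets': [None] * num_timesteps}
    for t in range(num_timesteps):
        time_key = str(t + 1)                          # frames are 1-based
        if time_key in read_data:
            time_data = np.asarray(read_data[time_key], dtype=float)
            raw['dets'][t] = np.atleast_2d(time_data[:, 2:6])        # x, y, w, h
            raw['ids'][t] = np.atleast_1d(time_data[:, 1]).astype(int)
        else:
            raw['dets'][t] = np.empty((0, 4))
            raw['ids'][t] = np.empty(0).astype(int)
    return raw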

Calculate IoU Similarity

MotChallenge2DBox._calculate_similarities -> MotChallenge2DBox._calculate_box_ious

similarity_scores = []
for t, (gt_dets_t, tracker_dets_t) in enumerate(zip(raw_data['gt_dets'], raw_data['tracker_dets'])):
    ious = self._calculate_similarities(gt_dets_t, tracker_dets_t)
    similarity_scores.append(ious)
raw_data['similarity_scores'] = similarity_scores

How to Calculate IoU?

code details
# layout: (x0, y0, x1, y1)
min_ = np.minimum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])
max_ = np.maximum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])

intersection = np.maximum(min_[..., 2] - max_[..., 0], 0) * np.maximum(min_[..., 3] - max_[..., 1], 0)

area1 = (bboxes1[..., 2] - bboxes1[..., 0]) * (bboxes1[..., 3] - bboxes1[..., 1])
area2 = (bboxes2[..., 2] - bboxes2[..., 0]) * (bboxes2[..., 3] - bboxes2[..., 1])

union = area1[:, np.newaxis] + area2[np.newaxis, :] - intersection
intersection[area1 <= 0 + np.finfo('float').eps, :] = 0
intersection[:, area2 <= 0 + np.finfo('float').eps] = 0
intersection[union <= 0 + np.finfo('float').eps] = 0
union[union <= 0 + np.finfo('float').eps] = 1
ious = intersection / union
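
As a sanity check, the same arithmetic wrapped in a small standalone function on toy corner-format boxes (not part of TrackEval; note that in mot_challenge_2d_box.py the (x, y, w, h) MOT boxes are converted to corners before this step):

import numpy as np

def box_ious_xyxy(bboxes1, bboxes2):
    """IoU matrix for boxes in (x0, y0, x1, y1) layout, same arithmetic as above."""
    min_ = np.minimum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])
    max_ = np.maximum(bboxes1[:, np.newaxis, :], bboxes2[np.newaxis, :, :])
    intersection = np.maximum(min_[..., 2] - max_[..., 0], 0) * np.maximum(min_[..., 3] - max_[..., 1], 0)
    area1 = (bboxes1[..., 2] - bboxes1[..., 0]) * (bboxes1[..., 3] - bboxes1[..., 1])
    area2 = (bboxes2[..., 2] - bboxes2[..., 0]) * (bboxes2[..., 3] - bboxes2[..., 1])
    union = area1[:, np.newaxis] + area2[np.newaxis, :] - intersection
    return intersection / np.maximum(union, np.finfo('float').eps)

gt = np.array([[0., 0., 10., 10.]])
pred = np.array([[5., 5., 15., 15.], [20., 20., 30., 30.]])
print(box_ious_xyxy(gt, pred))   # [[0.1428..., 0.]]  ->  25 / (100 + 100 - 25)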

get_preprocessed_seq_data in mot_challenge_2d_box.py

Remove tracker detections matched to distractor-class ground-truth boxes

matching_scores[matching_scores < 0.5 - np.finfo('float').eps] = 0
match_rows, match_cols = linear_sum_assignment(-matching_scores)

[Note]: here gt_dets = raw_data['gt_dets'][timestep] and tracker_dets = raw_data['tracker_dets'][timestep].

The preprocessed data is finally returned as a dict of per-timestep arrays plus a few counters, roughly as sketched below.
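
A rough sketch of that dict (key names as in the TrackEval source; the placeholder values stand in for the real per-timestep lists):

data = {
    'gt_ids': [],             # per-timestep int arrays, ids relabelled to 0..num_gt_ids-1
    'tracker_ids': [],        # per-timestep int arrays, ids relabelled to 0..num_tracker_ids-1
    'gt_dets': [],            # per-timestep gt boxes
    'tracker_dets': [],       # per-timestep tracker boxes
    'similarity_scores': [],  # per-timestep (num_gt x num_tracker) IoU matrices
    'num_timesteps': 0,       # number of frames in the sequence
    'num_gt_ids': 0, 'num_tracker_ids': 0,
    'num_gt_dets': 0, 'num_tracker_dets': 0,
}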

Eval continued. (CLEAR metrics)

Calculates CLEAR metrics for one sequence

clear.py CLEAR.eval_sequence

Init counters

self.fields:

['MOTA', 'MOTP', 'MODA', 'CLR_Re', 'CLR_Pr', 'MTR', 'PTR', 'MLR', 'sMOTA', 'CLR_F1', 'FP_per_frame', 'MOTAL', 'MOTP_sum', 'CLR_TP', ...]
# Variables counting global association
num_gt_ids = data['num_gt_ids']
gt_id_count = np.zeros(num_gt_ids)  # For MT/ML/PT
gt_matched_count = np.zeros(num_gt_ids)  # For MT/ML/PT
gt_frag_count = np.zeros(num_gt_ids)  # For Frag

# Note that IDSWs are counted based on the last time each gt_id was present (any number of frames previously),
# but are only used in matching to continue current tracks based on the gt_id in the single previous timestep.
prev_tracker_id = np.nan * np.zeros(num_gt_ids)  # For scoring IDSW
prev_timestep_tracker_id = np.nan * np.zeros(num_gt_ids)  # For matching IDSW

LOOP OVER TIMESTEPS

  • FP & FN
    • FP: a tracker detection with no matching gt box (wrong detection)
    • FN: a gt box with no matching tracker detection (missed detection)

Degenerate case

code details
if len(gt_ids_t) == 0:
    res['CLR_FP'] += len(tracker_ids_t)
    continue
if len(tracker_ids_t) == 0:
    res['CLR_FN'] += len(gt_ids_t)
    gt_id_count[gt_ids_t] += 1
    continue

Match tracker_dets to gt_dets by maximum bipartite matching

[Note]: here the tracker and gt detections are re-matched at every timestep; gt_ids and tracker_ids have no a-priori correspondence.

This differs from how the CLEAR paper frames it (a correspondence carried over from frame to frame by the evaluation); in TrackEval the score matrix instead gives a large bonus to pairs that continue the previous timestep's match, so IDSWs are minimised first and IoU (MOTP) maximised second.
The purpose of the matching is simply to decide which gt track a given prediction is scored against.

code details
# Hungarian algorithm to find best matches
match_rows, match_cols = linear_sum_assignment(-score_mat)
actually_matched_mask = score_mat[match_rows, match_cols] > 0 + np.finfo('float').eps
match_rows = match_rows[actually_matched_mask]
match_cols = match_cols[actually_matched_mask]

matched_gt_ids = gt_ids_t[match_rows]
matched_tracker_ids = tracker_ids_t[match_cols]

This yields the matched gt/tracker pairs for the current timestep.
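
A tiny standalone illustration of this step (toy numbers): linear_sum_assignment minimises total cost, so the score matrix is negated to maximise total similarity, and pairs whose score is (near-)zero are discarded afterwards.

import numpy as np
from scipy.optimize import linear_sum_assignment

# rows = gt detections, cols = tracker detections (toy similarity values)
score_mat = np.array([[0.9, 0.1, 0.0],
                      [0.0, 0.0, 0.0],
                      [0.2, 0.8, 0.0]])
match_rows, match_cols = linear_sum_assignment(-score_mat)    # maximise total similarity
keep = score_mat[match_rows, match_cols] > np.finfo('float').eps
match_rows, match_cols = match_rows[keep], match_cols[keep]
print(match_rows, match_cols)   # [0 2] [0 1]: gt0<->trk0, gt2<->trk1; gt1 stays unmatched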

Calculate IDSW

  • the gt track has been matched at some earlier timestep (prev_tracker_id for that gt is not NaN)
  • its current tracker ID differs from that previous one

prev_tracker_id and prev_timestep_tracker_id are arrays indexed by gt track id (effectively mappings from gt id to a tracker id): the former remembers the last match however many frames ago (used for scoring IDSW), the latter only the previous timestep (used to bias the matching).

For each matched gt track, check whether its current tracker ID is inconsistent with its previous one; e.g. if gt track 3 was last matched to tracker ID 7 and is now matched to tracker ID 9, IDSW is incremented by one.

code details
# Calc IDSW for MOTA
prev_matched_tracker_ids = prev_tracker_id[matched_gt_ids]
is_idsw = (np.logical_not(np.isnan(prev_matched_tracker_ids))) & (
    np.not_equal(matched_tracker_ids, prev_matched_tracker_ids))
res['IDSW'] += np.sum(is_idsw)

Calculate basic TP, FP, FN

code details
# Calculate and accumulate basic statistics
num_matches = len(matched_gt_ids)
res['CLR_TP'] += num_matches
res['CLR_FN'] += len(gt_ids_t) - num_matches
res['CLR_FP'] += len(tracker_ids_t) - num_matches
if num_matches > 0:
    res['MOTP_sum'] += sum(similarity[match_rows, match_cols])

Record gt matchings

code details
# Update counters for MT/ML/PT/Frag and record for IDSW/Frag for next timestep
gt_id_count[gt_ids_t] += 1
gt_matched_count[matched_gt_ids] += 1
not_previously_tracked = np.isnan(prev_timestep_tracker_id)
prev_tracker_id[matched_gt_ids] = matched_tracker_ids
prev_timestep_tracker_id[:] = np.nan
prev_timestep_tracker_id[matched_gt_ids] = matched_tracker_ids
# [Note]: prev_timestep_tracker_id is effectively a mapping from gt_id to the
# tracker_id it was matched to (an array indexed by gt_id, not a Python dict).
# At this point it has already been updated for the *next* timestep, so it would
# read more clearly as:
#   cur_timestep_tracker_id[matched_gt_ids] = matched_tracker_ids   # record
#   prev_timestep_tracker_id = cur_timestep_tracker_id              # carry over
currently_tracked = np.logical_not(
    np.isnan(prev_timestep_tracker_id))
gt_frag_count += np.logical_and(not_previously_tracked,
                                currently_tracked)

Calculate MT/ML/PT/Frag/MOTP

tracked_ratio = gt_matched_count[gt_id_count > 0] / gt_id_count[gt_id_count > 0]
  • MT (Mostly Tracked): tracked_ratio > 80%
  • PT (Partially Tracked): 20% <= tracked_ratio <= 80%
  • ML (Mostly Lost): tracked_ratio < 20%
  • Frag = number of times a gt track resumes being tracked after a gap (gt_frag_count - 1, summed over tracks)
code details
tracked_ratio = gt_matched_count[gt_id_count > 0] / gt_id_count[
    gt_id_count > 0]
res['MT'] = np.sum(np.greater(tracked_ratio, 0.8))
res['PT'] = np.sum(np.greater_equal(tracked_ratio, 0.2)) - res['MT']
res['ML'] = num_gt_ids - res['MT'] - res['PT']
res['Frag'] = np.sum(np.subtract(gt_frag_count[gt_frag_count > 0], 1))
res['MOTP'] = res['MOTP_sum'] / np.maximum(1.0, res['CLR_TP'])
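
A toy example of the MT/PT/ML thresholds (made-up counts, not from any real sequence):

import numpy as np

# Three gt tracks, each present in 100 frames, matched in 85 / 50 / 10 frames respectively.
gt_id_count = np.array([100., 100., 100.])
gt_matched_count = np.array([85., 50., 10.])
tracked_ratio = gt_matched_count / gt_id_count      # [0.85, 0.5, 0.1]
MT = np.sum(tracked_ratio > 0.8)                    # 1 (mostly tracked)
PT = np.sum(tracked_ratio >= 0.2) - MT              # 1 (partially tracked)
ML = len(gt_id_count) - MT - PT                     # 1 (mostly lost)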

Calculate MOTP

code details
if num_matches > 0:
  res['MOTP_sum'] += sum(similarity[match_rows, match_cols])

For other sub-metrics, refer to CLEAR._compute_final_fields
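
For reference, the main derived fields follow the standard CLEAR definitions (up to the divide-by-zero guards that _compute_final_fields adds), e.g.:

MOTA   = 1 - (CLR_FN + CLR_FP + IDSW) / (CLR_TP + CLR_FN)
MODA   = 1 - (CLR_FN + CLR_FP) / (CLR_TP + CLR_FN)
MOTP   = MOTP_sum / CLR_TP                 # mean IoU over true positives
CLR_Re = CLR_TP / (CLR_TP + CLR_FN)        # recall
CLR_Pr = CLR_TP / (CLR_TP + CLR_FP)        # precision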

Combine Sequence

At this point every sequence has been evaluated individually.

res = {}
for curr_seq in sorted(seq_list):
    res[curr_seq] = eval_sequence(curr_seq, dataset, tracker, class_list,
                                  metrics_list, metric_names)

The per-sequence results each contain several groups of metrics (one per class and metric family).

Then they are combined.

# Combine results over all sequences and then over all classes

# collecting combined cls keys (cls averaged, det averaged, super classes)
combined_cls_keys = []
res['COMBINED_SEQ'] = {}
# combine sequences for each class
for c_cls in class_list:
    res['COMBINED_SEQ'][c_cls] = {}
    for metric, metric_name in zip(metrics_list, metric_names):
        # [Note]: this gathers, for the current class and metric,
        # the per-sequence results of every sequence.
        curr_res = {seq_key: seq_value[c_cls][metric_name]
                    for seq_key, seq_value in res.items()
                    if seq_key != 'COMBINED_SEQ'}
        res['COMBINED_SEQ'][c_cls][metric_name] = metric.combine_sequences(curr_res)

curr_res maps each sequence name to that sequence's results for the current class and metric.

Then, e.g., res['COMBINED_SEQ']['pedestrian']['HOTA'] is produced by combining the per-sequence metrics in curr_res.

e.g. for CLEAR metrics

# override abstract function
def combine_sequences(self, all_res):
    """Combines metrics across all sequences"""
    res = {}
    for field in self.summed_fields:
        res[field] = self._combine_sum(all_res, field)
    res = self._compute_final_fields(res)
    return res
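
_combine_sum itself (defined in the base metric class) is just a field-wise sum over the per-sequence results; roughly:

def _combine_sum(all_res, field):
    """Sum one counter field (e.g. 'CLR_TP') over all per-sequence result dicts."""
    return sum(all_res[k][field] for k in all_res.keys())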

The rest is some post-processing, e.g., metric.print_table, metric.summary_results, etc.


Metrics

https://github.com/JonathonLuiten/TrackEval#currently-implemented-metrics

| Metric Family | Sub metrics | Paper | Code | Notes |
|---|---|---|---|---|
| HOTA metrics | HOTA, DetA, AssA, LocA, DetPr, DetRe, AssPr, AssRe | paper | code | Recommended tracking metric |
| CLEARMOT metrics | MOTA, MOTP, MT, ML, Frag, etc. | paper | code | |
| Identity metrics | IDF1, IDP, IDR | paper | code | |
| VACE metrics | ATA, SFDA | paper | code | |
| Track mAP metrics | Track mAP | paper | code | Requires confidence scores |
| J & F metrics | J&F, J, F | paper | code | Only for Seg Masks |
| ID Euclidean | ID Euclidean | paper | code | |

ref

  • TrackEval