2D Box
Object Tracking
License: CC BY-NC-SA 4.0


A large, high-diversity, one-shot database for generic object tracking in the wild

Key Features

  • Large-Scale
    The dataset contains more than 10,000 video segments of real-world moving objects and over 1.5 million manually labeled bounding boxes.
  • Generic Classes
    The dataset is built on the WordNet hierarchy and covers 560+ classes of real-world moving objects and 80+ classes of motion patterns.
  • One-Shot
    The dataset encourages the development of general-purpose trackers by following the one-shot rule: object classes in the training and test sets do not overlap.
  • Unified Training Data
    Fair comparison of deep trackers is ensured by a protocol under which all approaches use the same training data provided by the dataset.
  • Extra Labeling
    The dataset provides extra labels including object visible ratios and motion classes as additional supervision for handling specific challenges.
  • Efficient Evaluation
    The test set embodies 84 object classes and 32 motion classes with only 180 video segments, allowing for efficient evaluation.


Please cite this paper if GOT-10k helps your research. [PDF] [BibTeX]

Data Annotation

Each sequence folder contains 4 annotation files and 1 meta file. A brief description of these files follows (let N denote the sequence length):

  • groundtruth.txt -- An N×4 matrix with each line representing object location [xmin, ymin, width, height] in one frame.
  • cover.label -- An N×1 array representing object visible ratios, with levels ranging from 0 to 8.
  • absence.label -- A binary N×1 array indicating whether the object is absent or present in each frame.
  • cut_by_image.label -- A binary N×1 array indicating whether the object is cut by the image boundary in each frame.
  • meta_info.ini -- Meta information about the sequence, including object and motion classes, video URL and more.
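
The file descriptions above can be turned into a small reader. The following is a minimal sketch using only the Python standard library, not the official toolkit. It assumes that values in groundtruth.txt are separated by commas and/or whitespace, that the .label files hold one number per line, and that meta_info.ini is a standard INI file; the helper names (parse_boxes, to_corners, parse_flags, parse_meta) are hypothetical.

```python
import configparser
import re


def parse_boxes(text):
    """Parse groundtruth.txt content into (xmin, ymin, width, height) tuples."""
    boxes = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: values are separated by commas and/or whitespace.
        values = [float(v) for v in re.split(r"[,\s]+", line)]
        if len(values) != 4:
            raise ValueError(f"expected 4 values per line, got {len(values)}: {line!r}")
        boxes.append(tuple(values))
    return boxes


def to_corners(box):
    """Convert (xmin, ymin, width, height) to (xmin, ymin, xmax, ymax)."""
    xmin, ymin, width, height = box
    return (xmin, ymin, xmin + width, ymin + height)


def parse_flags(text):
    """Parse an N×1 label file (one number per frame) into a list of ints."""
    return [int(float(token)) for token in text.split()]


def parse_meta(text):
    """Parse meta_info.ini content into {section: {key: value}}.

    Interpolation is disabled so that '%' characters (e.g. in URLs) are kept as-is.
    Section and key names are whatever the file contains; none are assumed here.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}
```

The functions take file contents rather than paths, so a sequence can be read with, for example, parse_boxes(pathlib.Path(seq_dir, "groundtruth.txt").read_text()), where seq_dir is a placeholder for a sequence folder. A consistency check worth adding is that every per-frame file in a sequence yields the same length N.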

Values 0 to 8 in cover.label correspond to the following ranges of object visible ratio: 0%, (0%, 15%], (15%, 30%], (30%, 45%], (45%, 60%], (60%, 75%], (75%, 90%], (90%, 100%) and 100%, respectively.
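
The level-to-range mapping can be written as a lookup table. The sketch below is illustrative only; the names COVER_RANGES, cover_range and cover_level are hypothetical and are not part of the dataset or its toolkits.

```python
import math

# (low, high) visible-ratio bounds in percent for cover levels 0..8.
# Levels 1..6 are half-open intervals (low, high]; level 7 is the open
# interval (90, 100); levels 0 and 8 are the single values 0% and 100%.
COVER_RANGES = [
    (0, 0), (0, 15), (15, 30), (30, 45), (45, 60),
    (60, 75), (75, 90), (90, 100), (100, 100),
]


def cover_range(level):
    """Return the (low, high) percentage bounds for a cover level."""
    if not 0 <= level <= 8:
        raise ValueError(f"cover level must be in 0..8, got {level}")
    return COVER_RANGES[level]


def cover_level(ratio_percent):
    """Return the cover level for a visible ratio given in percent."""
    if not 0 <= ratio_percent <= 100:
        raise ValueError(f"ratio must be in [0, 100], got {ratio_percent}")
    if ratio_percent == 0:
        return 0
    if ratio_percent == 100:
        return 8
    # (0, 15] -> 1, (15, 30] -> 2, ..., (75, 90] -> 6, (90, 100) -> 7
    return min(math.ceil(ratio_percent / 15), 7)
```

A typical use would be filtering frames by visibility, for instance keeping only frames whose level is at least 5 (more than 60% of the object visible).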

Data Format

The downloaded and extracted full dataset should follow the file structure:

|-- GOT-10k/
    |-- train/
    |  |-- GOT-10k_Train_000001/
    |  |   ......
    |  |-- GOT-10k_Train_009335/
    |  |-- list.txt
    |-- val/
    |  |-- GOT-10k_Val_000001/
    |  |   ......
    |  |-- GOT-10k_Val_000180/
    |  |-- list.txt
    |-- test/
    |  |-- GOT-10k_Test_000001/
    |  |   ......
    |  |-- GOT-10k_Test_000180/
    |  |-- list.txt
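
Given this layout, the sequence folders of a subset can be enumerated as sketched below. The sketch assumes that list.txt contains one sequence folder name per line, which the description above does not state explicitly; if the file is missing, it falls back to listing subdirectories. The function name list_sequences is hypothetical.

```python
from pathlib import Path


def list_sequences(root, subset):
    """Return the sequence folders of one subset ('train', 'val' or 'test').

    Assumption: list.txt holds one sequence folder name per line.
    """
    subset_dir = Path(root) / subset
    list_file = subset_dir / "list.txt"
    if list_file.is_file():
        lines = list_file.read_text().splitlines()
        names = [line.strip() for line in lines if line.strip()]
    else:
        # Fallback: every subdirectory is treated as a sequence folder.
        names = sorted(p.name for p in subset_dir.iterdir() if p.is_dir())
    return [subset_dir / name for name in names]
```

For example, list_sequences("GOT-10k", "val") would return one path per validation sequence, each of which holds the annotation files described under Data Annotation.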



The benchmark offers lightweight, compile-free toolkits written in pure Python and MATLAB. Tutorials and examples are available in the corresponding repositories.


Huang, Lianghua; Zhao, Xin; Huang, Kaiqi. GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence. Institute of Electrical and Electronics Engineers (IEEE).



Data Summary
Provided by
Institute of Automation Chinese Academy of Sciences
The Institute of Automation, Chinese Academy of Sciences (CASIA), one of the earliest national automation institutes in China, was established in October 1956.