State of the art on the MARS dataset

We summarize the state-of-the-art methods on the MARS dataset, reporting mAP and rank-1, 5, 10, 20 accuracies. These are not the only relevant performance measurements: other metrics, such as recognition time, are also important. Please contact me at
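
As a rough illustration of the two metrics named above, the following minimal sketch computes rank-k accuracy and mAP from ranked match lists. It assumes each query has already been turned into a list of booleans (gallery entries sorted by increasing distance, `True` where the identity matches) and that any protocol-specific filtering of gallery entries has been applied beforehand. The function names are hypothetical and are not taken from the official MARS evaluation code.

```python
from typing import Dict, List, Sequence


def rank_k_hit(ranked_matches: Sequence[bool], k: int) -> bool:
    """Return True if a correct match appears among the top-k gallery entries."""
    return any(ranked_matches[:k])


def average_precision(ranked_matches: Sequence[bool]) -> float:
    """Average precision for one query: mean of the precision at each correct match."""
    hits = 0
    precision_sum = 0.0
    for position, is_match in enumerate(ranked_matches, start=1):
        if is_match:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def evaluate(all_ranked_matches: List[Sequence[bool]],
             ks: Sequence[int] = (1, 5, 10, 20)) -> Dict[str, float]:
    """Return rank-k accuracies and mAP, in percent, averaged over all queries."""
    n = len(all_ranked_matches)
    results = {
        f"rank-{k}": 100.0 * sum(rank_k_hit(m, k) for m in all_ranked_matches) / n
        for k in ks
    }
    results["mAP"] = 100.0 * sum(average_precision(m) for m in all_ranked_matches) / n
    return results


if __name__ == "__main__":
    # Two toy queries: the first is matched at rank 1, the second at rank 2.
    toy = [[True, False, True, False], [False, True, False, False]]
    print(evaluate(toy, ks=(1, 5)))
```

Rank-k only asks whether the first correct match falls within the top k, while mAP rewards ranking all correct matches early, which is why the two can diverge noticeably for the same method.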

Reference MARS Notes
"MARS: A Video Benchmark for Large-Scale Person Re-identification", Liang Zheng, Zhi Bie, Yifan Sun, Jingdong Wang, Chi Su, Shengjin Wang, Qi Tian, ECCV 2016. 2.6 6.4 12.4 0.8 HOG3D [1] + kissme [2], Euclidean distance, single query [3] + kissme [2], single query.
18.6 33.0 45.9 8.0 HistLBP [4] + XQDA [5], single query
30.6 46.2 59.2 15.5 BoW [6] + kissme [2], single query
60.0 77.9 87.9 42.4 IDE, average pooling, Euclidean distance, single query + kissme, max pooling, Euclidean distance, single query
68.3 82.6 89.4 49.3 IDE + kissme, max pooling, Euclidean distance, multiple query
Current state of the art
"Learning Compact Appearance Representation for Video-based Person Re-Identification", Wei Zhang, Shengnan Hu, Kan Liu, arXiv 2017. 55.5 70.2 80.2 - A frame selection step is used before feature pooling.
"Re-ranking Person Re-identification with k-reciprocal Encoding", Zhun Zhong, Liang Zheng, Donglin Cao, Shaozi Li, CVPR 2017. 67.78 - - 57.98 IDE (CaffeNet) + re-ranking, single query.
73.94 - - 68.45 IDE (ResNet50) + re-ranking, single query.
"See the forest for the trees: Joint spatial and temporal recurrent neural networks for video-based person re-identification", Zhen Zhou, Yan Huang, Wei Wang, Liang Wang, and Tieniu Tan, CVPR 2017. 70.6 90.0 97.6 50.7 Single query. Handles both spatial and temporal information.
"Quality Aware Network for Set to Set Recognition", Yu Liu, Junjie Yan, Wanli Ouyang, CVPR 2017. 73.74 84.90 91.62 51.70 P-QAN (GoogLeNet), single query. Numbers are provided by the authors and are not reported in the paper.
"In Defense of the Triplet Loss for Person Re-Identification", Alexander Hermans, Lucas Beyer and Bastian Leibe, arXiv 2017. 79.80 91.36 - 67.70 Fine-tuned TriNet with Euclidean distance, single query.
81.21 90.76 - 77.43 TriNet + re-ranking [7]
Works that use the dataset for training but do not report results
"Simple Online and Realtime Tracking with a Deep Association Metric", Nicolai Wojke, Alex Bewley, Dietrich Paulus, arXiv 2017. - - - - The CNN model is trained on MARS.
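
Several Notes entries above mention average or max pooling followed by Euclidean distance. A minimal sketch of that idea, assuming each tracklet is given as a list of equal-length frame-level feature vectors, is shown below. The helper names are hypothetical, the features are toy values, and the sketch covers neither metric learning (kissme, XQDA) nor the multiple-query setting.

```python
import math
from typing import List, Sequence


def pool_features(frame_features: Sequence[Sequence[float]], mode: str = "avg") -> List[float]:
    """Collapse per-frame feature vectors into one tracklet-level vector."""
    columns = list(zip(*frame_features))  # one tuple per feature dimension
    if mode == "avg":
        return [sum(c) / len(c) for c in columns]
    if mode == "max":
        return [max(c) for c in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_gallery(query: Sequence[float], gallery: Sequence[Sequence[float]]) -> List[int]:
    """Return gallery indices sorted by increasing distance to the query."""
    return sorted(range(len(gallery)), key=lambda i: euclidean(query, gallery[i]))


if __name__ == "__main__":
    # A toy tracklet with two frames and two feature dimensions.
    tracklet = [[1.0, 4.0], [3.0, 2.0]]
    query = pool_features(tracklet, mode="avg")
    gallery = [[10.0, 10.0], [2.0, 3.0], [0.0, 0.0]]
    print(query, rank_gallery(query, gallery))
```

Average pooling smooths over noisy frames, whereas max pooling keeps the strongest response per dimension; which one works better depends on the feature extractor.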


[1] Kläser, A., Marszałek, M., Schmid, C.: A spatio-temporal descriptor based on 3D-gradients. In: BMVC (2008).
[2] Kostinger, M., Hirzer, M., Wohlhart, P., Roth, P.M., Bischof, H.: Large scale metric learning from equivalence constraints. In: CVPR. pp. 2288–2295 (2012).
[3] Han, J., Bhanu, B.: Individual recognition using gait energy image. Pattern Analysis and Machine Intelligence, IEEE Transactions on 28(2), 316–322 (2006).
[4] Xiong, F., Gou, M., Camps, O., Sznaier, M.: Person re-identification using kernel-based metric learning methods. In: ECCV (2014).
[5] Liao, S., Hu, Y., Zhu, X., Li, S.Z.: Person re-identification by local maximal occurrence representation and metric learning. In: CVPR (2015).
[6] Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: A benchmark. In: ICCV (2015).
[7] Zhong, Z., Zheng, L., Cao, D., Li, S.: Re-ranking person re-identification with k-reciprocal encoding. In: CVPR (2017).