SFGN: Representing the Sequence with One Super Frame

Video-based person re-identification (V-Re-ID) is more robust than image-based person re-identification (I-Re-ID) because of the additional temporal information. However, the high storage overhead of video sequences largely limits the applications of …

Self-Correction for Human Parsing

Labeling pixel-level masks for fine-grained semantic segmentation tasks, e.g. human parsing, remains challenging. The ambiguous boundaries between different semantic parts and the categories with similar appearance are usually confusing, …

Hierarchical Temporal Modeling with Mutual Distance Matching for Video Based Person Re-Identification

Compared with image-based person re-identification (re-ID), video-based person re-ID can exploit more cues from appearance and temporal information, and has therefore received widespread attention recently. However, due to the different …