Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation | IEEE Conference Publication | IEEE Xplore