publications

publications by category in reverse chronological order.

2024

  1. Localizing Events in Videos with Multimodal Queries
    Gengyuan Zhang, Mang Ling Ada Fok, Yan Xia, and 5 more authors
    arXiv preprint arXiv:2406.10079, 2024

2023

  1. SPOT! Revisiting Video-Language Models for Event Understanding
    Gengyuan Zhang, Jinhe Bi, Jindong Gu, and 1 more author
    arXiv preprint arXiv:2311.12919, 2023
  2. Multi-event Video-Text Retrieval
    Gengyuan Zhang, Jisen Ren, Jindong Gu, and 1 more author
    In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
  3. Can Vision-Language Models be a Good Guesser? Exploring VLMs for Times and Location Reasoning
    Gengyuan Zhang, Yurui Zhang, Kerui Zhang, and 1 more author
    arXiv preprint arXiv:2307.06166, 2023
  4. A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models
    Jindong Gu, Zhen Han, Shuo Chen, and 7 more authors
    arXiv preprint arXiv:2307.12980, 2023

2022

  1. CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering
    Yao Zhang, Haokun Chen, Ahmed Frikha, and 5 more authors
    arXiv preprint arXiv:2211.10567, 2022

2021

  1. Time-dependent Entity Embedding is not All You Need: A Re-evaluation of Temporal Knowledge Graph Completion Models under a Unified Framework
    Zhen Han*, Gengyuan Zhang*, Yunpu Ma, and 1 more author
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Nov 2021