publications
This publication list may be outdated. For the most up-to-date and comprehensive list, please visit my Google Scholar profile.
2026
- preprint
When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents. 2026
- preprint
- ICLR
TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models. In The Fourteenth International Conference on Learning Representations (ICLR), 2026
- ACL
Lost in Execution: On the Multilingual Robustness of Tool Calling in Large Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026
- ICML
"Someone Hid It": Query-Agnostic Black-Box Attacks on LLM-Based Retrieval. In Proceedings of the International Conference on Machine Learning (ICML), 2026
- ACL
Are LLMs Reliable Rankers? Rank Manipulation via Two-Stage Token Optimization. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026
- ACL
Value-Action Alignment in Large Language Models under Privacy-Prosocial Conflict. In Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026
- ACL
Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs. In Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL), 2026
- AAAI
Mitigating Hallucinations in Large Language Models via Causal Reasoning. In Proceedings of the AAAI Conference on Artificial Intelligence, 2026
2025
- IJCNLP-AACL
AD-AGENT: A Multi-agent Framework for End-to-end Anomaly Detection. In Findings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (IJCNLP-AACL), 2025
- preprint
- preprint
- preprint
A Large-Scale Simulation on Large Language Models for Decision-Making in Political Science. 2025
- preprint
DrugAgent: A Theory-Driven LLM Multi-Agent System for Automating Machine Learning Programming in Drug Discovery. Available at SSRN 5746063, 2025
- preprint
DrugAgent: Automating AI-aided Drug Discovery Programming through LLM Multi-Agent Collaboration. 2025
- ICCV
Secure On-Device Video OOD Detection Without Backpropagation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct 2025
- preprint
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective. Oct 2025
-
NLP-ADBench: NLP Anomaly Detection Benchmark. In Findings of the Association for Computational Linguistics: EMNLP 2025, Nov 2025
2024
- preprint
- preprint
Towards More Accurate US Presidential Election via Multi-step Reasoning with Large Language Models. Nov 2024
-
- JMLR
PyGOD: A Python Library for Graph Outlier Detection. Journal of Machine Learning Research, Nov 2024
2023
- NeurIPS
ADGym: Design Choices for Deep Anomaly Detection. Advances in Neural Information Processing Systems, Nov 2023
- preprint
Inclusive Decision Making via Contrastive Learning and Domain Adaptation. MIS Quarterly (Under Major Revision), Nov 2023
- preprint