Accepted Main Conference Papers
- EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association
Weiqi Wang, Limeng Cui, Xin Liu, Sreyashi Nag, Wenju Xu, Chen Luo, Sheikh Muhammad Sarwar, Yang Li, Hansu Gu, Hui Liu, Changlong Yu, Jiaxin Bai, Yifan Gao, Haiyang Zhang, Qi He, Shuiwang Ji, Yangqiu Song
- TAGExplainer: Narrating Graph Explanations for Text-Attributed Graph Learning Models
Bo Pan, Zhen Xiong, Guanchen Wu, Zheng Zhang, Yifei Zhang, Yuntong Hu, Liang Zhao
- M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja, Lester James Validad Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Triandi Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee
- ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming
Xinwei Yang, Zhaofeng Liu, Chen Huang, Jiashuai Zhang, Tong Zhang, Yifan Zhang, Wenqiang Lei
- The Impossibility of Fair LLMs
Jacy Reese Anthis, Kristian Lum, Michael Ekstrand, Avi Feller, Chenhao Tan
- Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
Ermo Hua, Biqing Qi, Kaiyan Zhang, Kai Tian, Xingtai Lv, Ning Ding, Bowen Zhou
- Bias in Language Models: Beyond Trick Tests and Towards RUTEd Evaluation
Kristian Lum, Jacy Reese Anthis, Kevin Robinson, Chirag Nagpal, Alexander Nicholas D’Amour
- Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models
Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
- The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It
Aaron Nicolson, Shengyao Zhuang, Jason Dowling, Bevan Koopman
- CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction
Jingheng Ye, Zishan Xu, Yinghui Li, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Wenhao Jiang, Hong-Gee Kim, Ruitong Liu, Xin Su, Zifei Shan
- Towards LLM-powered Attentive Listener: A Pragmatic Approach through Quantity Self-Repair
Junlin Li, Bo Peng, Yu-Yin Hsu
- StrucText-Eval: Evaluating Large Language Model’s Reasoning Ability in Structure-Rich Text
Zhouhong Gu, Haoning Ye, Xingzhou Chen, Zeyang Zhou, Hongwei Feng, Yanghua Xiao
- Literature Meets Data: A Synergistic Approach to Hypothesis Generation
Haokun Liu, Yangqiaoyu Zhou, Mingxuan Li, Chenfei Yuan, Chenhao Tan
- GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization
Zhouhong Gu, Xingzhou Chen, Xiaoran Shi, Tao Wang, Suhang Zheng, Tianyu Li, Hongwei Feng, Yanghua Xiao
- Tree-of-Evolution: Tree-Structured Instruction Evolution for Code Generation in Large Language Models
Ziyang Luo, Kaixin Li, Hongzhan Lin, Yuchen Tian, Mohan Kankanhalli, Jing Ma
- Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models
Seunguk Yu, Juhwan Choi, YoungBin Kim
- ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision
Dosung Lee, Wonjun Oh, Boyoung Kim, Minyoung Kim, Joonsuk Park, Paul Hongsuck Seo
- MIRAGE: Exploring How Large Language Models Perform in Complex Social Interactive Environments
Yin Cai, Zhouhong Gu, Zhaohan Du, Zheyu Ye, Shaosheng Cao, Yiqian Xu, Hongwei Feng, Ping Chen
- FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
Hongzhan Lin, Yang Deng, Yuxuan Gu, Wenxuan Zhang, Jing Ma, See-Kiong Ng, Tat-Seng Chua
- Statistical Deficiency for Task Inclusion Estimation
Loïc Fosse, Frederic Bechet, Benoit Favre, Géraldine Damnati, Gwénolé Lecorvé, Maxime DARRIN, Philippe Formont, Pablo Piantanida
- Towards Robust and Efficient Federated Low-Rank Adaptation with Heterogeneous Clients
Jabin Koo, Minwoo Jang, Jungseul Ok
- Dynamic Label Name Refinement for Few-Shot Dialogue Intent Classification
Gyutae Park, Ingeol Baek, Byeongjeong Kim, Joongbo Shin, Hwanhee Lee
- LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs
Kaibo Liu, Zhenpeng Chen, Yiyang Liu, Jie Zhang, Mark Harman, Yudong Han, Yun Ma, Yihong Dong, Ge Li, Gang Huang
- Capture the Key in Reasoning to Enhance CoT Distillation Generalization
Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu
- How to Enable Effective Cooperation Between Humans and NLP Models: A Survey of Principles, Formalizations, and Beyond
Chen Huang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua, Jimmy Huang
- Enhancing Hyperbole and Metaphor Detection with Their Bidirectional Dynamic Interaction and Emotion Knowledge
Li Zheng, Sihang Wang, Hao Fei, Zuquan Peng, Fei Li, Jianming Fu, Chong Teng, Donghong Ji
- UniICL: An Efficient ICL Framework Unifying Compression, Selection, and Generation
Jun Gao, Qi Lv, Zili Wang, Tianxiang Wu, Ziqiang Cao, Wenjie Li
- BelarusianGLUE: Towards a Natural Language Understanding Benchmark for Belarusian
Maksim Aparovich, Volha Harytskaya, Vladislav Poritski, Oksana Volchek, Pavel Smrz
- A Survey on Foundation Language Models for Single-cell Biology
Fan Zhang, Hao Chen, Zhihong Zhu, Ziheng Zhang, Zhenxi Lin, Ziyue Qiao, Yefeng Zheng, Xian Wu
- RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang
- Extending LLM Context Window with Adaptive Grouped Positional Encoding: A Training-Free Method
Xinhao Xu, Jiaxin Li, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding
- Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models
Sungjae Lee, Hyejin Park, Jaechang Kim, Jungseul Ok
- HotelMatch-LLM: Joint Multi-Task Training of Small and Large Language Models for Efficient Multimodal Hotel Retrieval
Arian Askari, Emmanouil Stergiadis, Ilya Gusev, Moran Beladev
- Can Multimodal Large Language Models Understand Spatial Relations?
Jingping Liu, Ziyan Liu, Zhedong Cen, Yan Zhou, Yinan Zou, Weiyan Zhang, Haiyun Jiang, Tong Ruan
- $S^3$ - Semantic Signal Separation
Márton Kardos, Jan Kostkan, Kenneth Enevoldsen, Arnault-Quentin Vermillet, Kristoffer Nielbo, Roberta Rocca
- TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs
Lanxiang Hu, Tajana Rosing, Hao Zhang
- JuStRank: Benchmarking LLM Judges for System Ranking
Ariel Gera, Odellia Boni, Yotam Perlitz, Roy Bar-Haim, Lilach Eden, Asaf Yehudai
- Generating Diverse Training Samples for Relation Extraction with Large Language Models
Zexuan Li, Hongliang Dai, Piji Li
- MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts
Dominik Macko, Jakub Kopál, Robert Moro, Ivan Srba
- Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection
Cilin Yan, Jingyun Wang, Lin Zhang, Ruihui Zhao, Xiaopu Wu, Kai Xiong, Qingsong Liu, Guoliang Kang, Yangyang Kang
- Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation
Aneta Zugecova, Dominik Macko, Ivan Srba, Robert Moro, Jakub Kopál, Katarína Marcinčinová, Matúš Mesarčík
- EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents
Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, Heng Ji
- BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving
Teng Wang, Wing Yin YU, Zhenqi He, Zehua Liu, Hailei Gong, Han Wu, Xiongwei Han, Wei Shi, Ruifeng She, Fangzhou Zhu, Tao Zhong
- LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation
Jakub Šmíd, Pavel Priban, Pavel Kral
- Fusing Highly Specialized Language Models for Comprehensive Expertise
Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Kaiyan Zhang, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun
- HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases
Meng-Chieh Lee, Qi Zhu, Costas Mavromatis, Zhen Han, Soji Adeshina, Vassilis N. Ioannidis, Huzefa Rangwala, Christos Faloutsos
- Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms
Rajvardhan Oak, Muhammad Haroon, Claire Wonjeong Jo, Magdalena Wojcieszak, Anshuman Chhabra
- Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review
Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. Kummerfeld
- MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection
Ziyan Liu, Chunxiao Fan, Haoran Lou, Yuexin Wu, Kaiwei Deng
- EvoWiki: Evaluating LLMs on Evolving Knowledge
Wei Tang, Yixin Cao, Yang Deng, Jiahao Ying, Bo Wang, Yizhe Yang, Yuyue Zhao, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Yong Liao
- Rethinking Repetition Problems of LLMs in Code Generation
Yihong Dong, Yuchen Liu, Xue Jiang, Zhi Jin, Ge Li
- PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension
Kun Ouyang, Yuanxin Liu, Shicheng Li, Yi Liu, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun
- ProcessBench: Identifying Process Errors in Mathematical Reasoning
Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin
- Model Extrapolation Expedites Alignment
Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng
- ATLANTIS: Weak-to-Strong Learning via Importance Sampling
Yi Liu, Guoyin Wang, Shicheng Li, Feifan Song, Xu Sun
- MPVStance: Mitigating Hallucinations in Stance Detection with Multi-Perspective Verification
ZhaoDan Zhang, Zhao Zhang, Jin Zhang, Hui Xu, Xueqi Cheng
- Personality-Guided Code Generation Using Large Language Models
Yaoqi Guo, Zhenpeng Chen, Jie Zhang, Yang Liu, Yun Ma
- PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling
Haojie Xie, Yirong Chen, Xiaofen Xing, Jingkai Lin, Xiangmin Xu
- BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework
Xu Zou
- Rethinking KenLM: Good and Bad Model Ensembles for Efficient Text Quality Filtering in Large Web Corpora
Yungi Kim, Hyunsoo Ha, Sukyung Lee, Jihoo Kim, Seonghoon Yang, Chanjun Park
- Automatic detection of dyslexia based on eye movements during reading in Russian
Anna Laurinavichyute, Anastasiya Lopukhina, David Robert Reich
- LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating
CHAO DENG, Jiale Yuan, Pi Bu, Peijie Wang, Zhong-Zhi Li, Jian Xu, Xiao-Hui Li, Yuan Gao, Jun Song, Bo Zheng, Cheng-Lin Liu
- ObfusLM: Privacy-preserving Language Model Service against Embedding Inversion Attacks
Yu Lin, Ruining Yang, Yunlong Mao, Qizhi Zhang, Jue Hong, Quanwei Cai, Ye Wu, Huiqi Liu, Zhiyu Chen, Bing Duan, Sheng Zhong
- Interlocking-free Selective Rationalization Through Genetic-based Learning
Federico Ruggeri, Gaetano Signorelli
- Re-identification of De-identified Documents with Autoregressive Infilling
Lucas Georges Gabriel Charpentier, Pierre Lison
- Modeling Uncertainty in Composed Image Retrieval via Probabilistic Embedding
Haomiao Tang, Jinpeng Wang, Yuang Peng, GuangHao Meng, Ruisheng Luo, Bin Chen, Long Chen, Yaowei Wang, Shu-Tao Xia
- Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models
Junfeng Tian, Da Zheng, Yang Chen, Rui Wang, Colin Zhang, Debing Zhang
- Doc-React: Multi-page Heterogeneous Document Question-answering
Junda Wu, Yu Xia, Tong Yu, Xiang Chen, Sai Sree Harsha, Akash V Maharaj, Ruiyi Zhang, Victor Bursztyn, Sungchul Kim, Ryan A. Rossi, Julian McAuley, Yunyao Li, Ritwik Sinha
- CECT dataset: Overcoming Data Scarcity in Context-Aware E-Commerce MT
Mikołaj Pokrywka, Wojciech Kusa, Mieszko Rutkowski, Mikołaj Koszowski
- APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts
Honghua Dong, Qidong Su, Yubo Gao, Zhaoyu Li, Yangjun Ruan, Gennady Pekhimenko, Chris J. Maddison, Xujie Si
- A Measure of the System Dependence of Automated Metrics
Pius von Däniken, Jan Milan Deriu, Mark Cieliebak
- Evaluating Lexical Proficiency in Neural Language Models
Cristiano Ciaccio, Alessio Miaschi, Felice Dell’Orletta
- Autoregressive Speech Synthesis without Vector Quantization
Lingwei Meng, Long Zhou, Shujie LIU, Sanyuan Chen, Bing Han, Shujie HU, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen M. Meng, Furu Wei
- Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM’s Nest
Letian Peng, Zilong Wang, Feng Yao, Jingbo Shang
- FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Large Language Models
Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma
- Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar, Kanchan Chandra
- Capturing Author Self Beliefs in Social Media Language
Siddharth Mangalik, Adithya V Ganesan, Abigail B. Wheeler, Nicholas Kerry, Jeremy D. W. Clifton, H. Schwartz, Ryan L. Boyd
- Neural Topic Modeling with Large Language Models in the Loop
Xiaohao Yang, He Zhao, Weijie Xu, YUANYUAN QI, Jueqing Lu, Dinh Phung, Lan Du
- HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
Abhilasha Ravichander, Shrusti Ghela, David Wadden, Yejin Choi
- Synergizing LLMs with Global Label Propagation for Multimodal Fake News Detection
Shuguo Hu, Jun Hu, Huaiwen Zhang
- “Yes, My LoRD.” Guiding Language Model Extraction with Locality Reinforced Distillation
Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, RongHua Li, Jianliang Xu, Haibo Hu
- Jailbreak Large Vision-Language Models Through Multi-Modal Linkage
Yu Wang, Xiaofei Zhou, Yichen Wang, Geyuan Zhang, Tianxing He
- Wait, that’s not an option: LLMs Robustness with Incorrect Multiple-Choice Options
Gracjan Góral, Emilia Wiśnios, Piotr Sankowski, Paweł Budzianowski
- The Hidden Attention of Mamba Models
Ameen Ali Ali, Itamar Zimerman, Lior Wolf
- KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding
Shi Luohe, Zuchao Li, Lefei Zhang, Baoyuan Qi, Liu Guoming, Hai Zhao
- LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models
YAN WANG, Ling Ding, Tien N Nguyen, Shaohua Wang, Yanan Zheng
- MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset
Weiqi Wang, Yangqiu Song
- Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions
Hang Li, Tianlong Xu, Kaiqi Yang, Yucheng Chu, Yanling Chen, Yichi Song, Qingsong Wen, Hui Liu
- Real-time Fake News from Adversarial Feedback
Sanxing Chen, Yukun Huang, Bhuwan Dhingra
- Improve Vision Language Model Chain-of-thought Reasoning
Ruohong Zhang, Bowen Zhang, Yanghao Li, Haotian Zhang, Zhiqing Sun, Zhe Gan, Yinfei Yang, Ruoming Pang, Yiming Yang
- On the Mutual Influence of Gender and Occupation in LLM Representations
Haozhe An, Connor Baumler, Abhilasha Sancheti, Rachel Rudinger
- Disentangling Memory and Reasoning Ability in Large Language Models
Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang
- Open-World Attribute Mining for E-Commerce Products with Multimodal Self-Correction Instruction Tuning
Jiaqi Li, Yanming Li, Xiaoli Shen, Chuanyi Zhang, Guilin Qi, Sheng Bi
- Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attributions Explainability
Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro
- Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling
Yang Yuguang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao
- LangSAMP: Language-Script Aware Multilingual Pretraining
Yihong Liu, Haotian Ye, Chunlan Ma, Mingyang Wang, Hinrich Schuetze
- RelationalCoder: Relational Representation of Complex Tables for Program-Based Processing and Reasoning
Haoyu Dong, Yue Hu, Huailiang Peng, Yanan Cao
- Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study
Bolei Ma, Berk Yoztyurk, Anna-Carolina Haensch, Xinpeng Wang, Markus Herklotz, Frauke Kreuter, Barbara Plank, Matthias Aßenmacher
- TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos
Fanheng Kong, Jingyuan Zhang, Hongzhi Zhang, Shi Feng, Daling Wang, Linhao Yu, Xingguang Ji, Yu Tian, V. W., Fuzheng Zhang
- Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs
Zhuo Li, Yuhao Du, Jinpeng Hu, Xiang Wan, Anningzhe Gao
- Binary Classifier Optimization for Large Language Model Alignment
Seungjae Jung
- UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs’ Memorization
Md Nayem Uddin, Amir Saeidi, Divij Handa, Agastya Seth, Tran Cao Son, Eduardo Blanco, Steven Corman, Chitta Baral
- From Information to Insight: Leveraging LLMs for Open Aspect-Based Educational Summarization
Yang Zhong, Diane Litman
- AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset
Charles Nimo, Tobi Olatunji, Abraham Toluwase Owodunni, Tassallah Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Ezinwanne C. Aka, Folafunmi Omofoye, Foutse Yuehgoh, Timothy Faniran, Bonaventure F. P. Dossou, Moshood O. Yekini, Jonas Kemp, Katherine A Heller, Jude Chidubem Omeke, Chidi Asuzu MD, Naome A Etori, Aïmérou Ndiaye, Ifeoma Okoh, Evans Doe Ocansey, Wendy Kinara, Michael Best, Irfan Essa, Stephen Edward Moore, Chris Fourie, Mercy Nyamewaa Asiedu
- Root Defense Strategies: Ensuring Safety of LLM at the Decoding Level
Xinyi Zeng, Yuying Shang, Jiawei Chen, Jingyuan Zhang, Yu Tian
- In-the-wild Audio Spatialization with Flexible Text-guided Localization
Tianrui Pan, Jie Liu, Zewen Huang, Jie Tang, Gangshan Wu
- L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim
- Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion
Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Mosen Alharthi, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, MA Zhuoheng, Yuhao Du, He Zhang, Saied Alshahrani, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu
- What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs
Sangyeop Kim, Yohan Lee, Yongwoo Song, Kimin Lee
- ECERC: Evidence-Cause Attention Network for Multi-Modal Emotion Recognition in Conversation
Tao Zhang, Zhenhua Tan
- CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System
Li Hu, Guoqiang Chen, Xiuwei Shang, Shaoyin Cheng, Benlong Wu, Li Gangyang, Xu Zhu, Weiming Zhang, Nenghai Yu
- Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals’ Subjective Text Perceptions
Matthias Orlikowski, Jiaxin Pei, Paul Röttger, Philipp Cimiano, David Jurgens, Dirk Hovy
- Exploring Forgetting in Large Language Model Pre-Training
Chonghua Liao, Ruobing Xie, Xingwu Sun, Haowen Sun, Zhanhui Kang
- Call for Rigor in Reporting Quality of Instruction Tuning Data
Hyeonseok Moon, Jaehyung Seo, Heuiseok Lim
- Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks
Virgile Rennard, Christos Xypolopoulos, Michalis Vazirgiannis
- AndroidLab: Developing and Evaluating Android Agents in A Reproducible Environment
Yifan Xu, Xiao Liu, Xueqiao Sun, Siyi Cheng, Hao Yu, Hanyu Lai, Shudan Zhang, Dan Zhang, Jie Tang, Yuxiao Dong
- Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment
Yongxin Huang, Kexin Wang, Goran Glavaš, Iryna Gurevych
- Multimodal Transformers are Hierarchical Modal-wise Heterogeneous Graphs
Yijie Jin, Junjie Peng, Xuanchao Lin, Haochen Yuan, Lan Wang, Cangzhi Zheng
- Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking
Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Shaokai Chen, Mengshu Sun, Binbin Hu, Zhiqiang Zhang, Lei Liang, Wen Zhang, Huajun Chen
- LLäMmlein 🐑: Transparent, Compact and Competitive German-Only Language Models from Scratch
Jan Pfister, Julia Wunderle, Andreas Hotho
- Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues
Youngmin Kim, Jiwan Chung, Jisoo Kim, Sunghyun Lee, Sangkyu Lee, Junhyeok Kim, Cheoljong Yang, Youngjae Yu
- How Much Do Pretrained Language Models Know About Word Senses?
Simone Teglia, Simone Tedeschi, Roberto Navigli
- When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations
Huaizhi Ge, Yiming Li, Qifan Wang, Yongfeng Zhang, Ruixiang Tang
- HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter
Manuel Tonneau, Diyi Liu, Niyati Malhotra, Scott A. Hale, Samuel Fraiberger, Victor Orozco-Olvera, Paul Röttger
- LegalAgentBench: Evaluating LLM Agents in Legal Domain
Haitao Li, Junjie Chen, Jingli Yang, Qingyao Ai, Wei Jia, Youfeng Liu, Kai Lin, Yueyue WU, Guozhi Yuan, Yiran HU, Wuyue Wang, Yiqun LIU, Minlie Huang
- Inference Compute-Optimal Video Vision Language Models
Peiqi Wang, ShengYun Peng, Xuewen Zhang, Hanchao Yu, Yibo Yang, Lifu Huang, Fujun Liu, Qifan Wang
- Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models
Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina
- Digital Gatekeepers: Google’s Role in Curating Hashtags and Subreddits
Amrit Poudel, Yifan Ding, Tim Weninger, Jürgen Pfeffer
- Behind Closed Words: Creating and Investigating the forePLay Annotated Dataset for Polish Erotic Discourse
Anna Kołos, Katarzyna Lorenc, Emilia Wiśnios, Agnieszka Karlińska
- Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales
Maor Reuben, Ortal Slobodin, Idan-Chaim Cohen, Aviad Elyashar, Orna Braun-Lewensohn, Odeya Cohen, Rami Puzis
- Did Translation Models Get More Robust Without Anyone Even Noticing?
Ben Peters, Andre Martins
- Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset
Dan SU, Kezhi Kong, Ying Lin, Joseph Jennings, Brandon Norick, Markus Kliegl, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Hierarchical Level-Wise News Article Clustering via Multilingual Matryoshka Embeddings
Hans William Alexander Hanley, Zakir Durumeric
- Contrastive Perplexity for Controlled Generation: An Application in Detoxifying Large Language Models
Tassilo Klein, Moin Nabi
- INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent
Haohang Li, Yupeng Cao, Yangyang Yu, Shashidhar Reddy Javaji, Zhiyang Deng, Yueru He, Yuechen Jiang, Zining Zhu, K.P. Subbalakshmi, Jimin Huang, Lingfei Qian, Xueqing Peng, Jordan W. Suchow, Qianqian Xie
- Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Griffin Thomas Adams, Jeremy Howard, Iacopo Poli
- Gender Inclusivity Fairness Index (GIFI): A Multilevel Framework for Evaluating Gender Diversity in Large Language Models
Zhengyang Shan, Emily Diana, Jiawei Zhou
- D.Va: Validate Your Demonstration First Before You Use It
Qi Zhang, Zhiqing Xiao, Ruixuan Xiao, Lirong Gao, Junbo Zhao
- Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists?
Jiwan Chung, Janghan Yoon, Junhyeong Park, Sangeyl Lee, Joowon Yang, Sooyeon Park, Youngjae Yu
- MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation
Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou
- Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning
Hui Liu, Wenya Wang, Hao Sun, Chris XING TIAN, Chenqi Kong, Xin Dong, Haoliang Li
- Direct Prompt Optimization with Continuous Representations
Yangkun Wang, Zihan Wang, Jingbo Shang
- uMedSum: A Unified Framework for Advancing Medical Abstractive Summarization
Aishik Nagar, Yutong Liu, Andy T. Liu, Viktor Schlegel, Vijay Prakash Dwivedi, Arun-Kumar Kaliya-Perumal, GUNA PRATHEEP KALANCHIAM, Yili Tang, Robby T. Tan
- GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement
Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen
- Context-Aware Sentiment Forecasting via LLM-based Multi-Perspective Role-Playing Agents
Fanhang Man, Huandong Wang, Jianjie Fang, Zhaoyi Deng, Baining Zhao, Xinlei Chen, Yong Li
- TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data
Xiang Huang, Jiayu Shen, Shanshan Huang, Sitao Cheng, Xiaxia Wang, Yuzhong Qu
- AndroidGen: Building an Android Language Agent under Data Scarcity
Hanyu Lai, Junjie Gao, Xiao Liu, Yifan Xu, Shudan Zhang, Yuxiao Dong, Jie Tang
- Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation
Mingxuan Xia, Haobo Wang, Yixuan Li, Zewei Yu, Jindong Wang, Junbo Zhao, Runze Wu
- BQA: Body Language Question Answering Dataset for Video Large Language Models
Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe
- A Survey of Post-Training Scaling in Large Language Models
Hanyu Lai, Xiao Liu, Junjie Gao, Jiale Cheng, Zehan Qi, Yifan Xu, Shuntian Yao, Dan Zhang, Jinhua Du, Zhenyu Hou, Xin Lv, Minlie Huang, Yuxiao Dong, Jie Tang
- Position-aware Automatic Circuit Discovery
Tal Haklay, Hadas Orgad, David Bau, Aaron Mueller, Yonatan Belinkov
- HyperFM: Fact-Centric Multimodal Fusion for Link Prediction over Hyper-Relational Knowledge Graphs
Yuhuan Lu, Weijian Yu, Xin Jing, Dingqi Yang
- Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model
Gregor Geigle, Florian Schneider, Carolin Holtermann, Chris Biemann, Radu Timofte, Anne Lauscher, Goran Glavaš
- Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation
Dimitris Gkoumas, Maria Liakata
- Ensemble Watermarks for Large Language Models
Georg Niess, Roman Kern
- $\mathsf{Con Instruction}$: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities
Jiahui Geng, Thy Thy Tran, Preslav Nakov, Iryna Gurevych
- TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge
Cheng-Han Chiang, Hung-yi Lee, Michal Lukasik
- DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation
Hanghui Guo, Jia Zhu, Shimin Di, Weijie Shi, Zhangze Chen, Jiajie Xu
- Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura
- ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Junjie Ye, Zhengyin Du, Xuesong Yao, Weijian Lin, Yufei Xu, Zehui Chen, Zaiyuan Wang, Sining Zhu, Zhiheng Xi, Siyu Yuan, Tao Gui, Qi Zhang, Xuanjing Huang, Jiecao Chen
- Mixture of insighTful Experts (MoTE): The Synergy of Reasoning Chains and Expert Mixtures in Self-Alignment
Zhili Liu, Yunhao GOU, Kai Chen, Lanqing HONG, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James Kwok
- MAPS: Motivation-Aware Personalized Search via LLM-Driven Consultation Alignment
Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Ming He, Jianping Fan, Xiao Zhang, Jun Xu
- Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework
Jundong Xu, Hao Fei, Meng Luo, Qian Liu, Liangming Pan, William Yang Wang, Preslav Nakov, Mong-Li Lee, Wynne Hsu
- LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs
Jianghao Chen, Junhong Wu, Yangyifan Xu, Jiajun Zhang
- Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training
Yuanfan Li, Zhaohan Zhang, Chengzhengxu Li, Chao Shen, Xiaoming Liu
- Cultural Learning-Based Culture Adaptation of Language Models
Chen Cecilia Liu, Anna Korhonen, Iryna Gurevych
- A-TASC: Asian TED-Based Automatic Subtitling Corpus
Yuhan Zhou, Naoki Yoshinaga
- Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu
- Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs
Yuchen Fu, Zifeng Cheng, Zhiwei Jiang, Zhonghui Wang, Yafeng Yin, Zhengliang Li, Qing Gu
- No Questions are Stupid, but some are Poorly Posed: Understanding Poorly-Posed Information-Seeking Questions
Neha Srikanth, Rachel Rudinger, Jordan Lee Boyd-Graber
- Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs
Rupak Sarkar, Neha Srikanth, Taylor Hudson, Rachel Rudinger, Claire Bonial, Philip Resnik
- Grounded, or a Good Guesser? A Per-Question Balanced Dataset to Separate Blind from Grounded Models for Embodied Question Answering
Miles Shelton, Nate Wingerd, Kritim K Rijal, Ayush Garg, Adelina Gutic, Brett Barnes, Catherine Finegan-Dollak
- Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models
Olga Loginova, Oleksandr Bezrukov, Alexey Kravets
- Towards Reward Fairness in RLHF: From a Resource Allocation Perspective
Sheng Ouyang, Yulan Hu, Ge Chen, Qingyang Li, Fuzheng Zhang, Yong Liu
- Taming LLMs with Gradient Grouping
Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu
- LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews
Sukannya Purkayastha, Zhuang Li, Anne Lauscher, Lizhen Qu, Iryna Gurevych
- Revisiting Common Assumptions about Arabic Dialects in NLP
Amr Keleg, Sharon Goldwater, Walid Magdy
- Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification
Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane S. Corneil
- Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas
Nishant Balepur, Vishakh Padmakumar, Fumeng Yang, Shi Feng, Rachel Rudinger, Jordan Lee Boyd-Graber
- Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above
Nishant Balepur, Rachel Rudinger, Jordan Lee Boyd-Graber
- Detection of Human and Machine-Authored Fake News in Urdu
Muhammad Zain Ali, Yuxia Wang, Bernhard Pfahringer, Tony C Smith
- An Efficient Task-Oriented Dialogue Policy: Evolutionary Reinforcement Learning Injected by Elite Individuals
Yangyang Zhao, Ben Niu, Libo Qin, Shihan Wang
- SR-LLM: Rethinking the Structured Representation in Large Language Model
Jiahuan Zhang, Tianheng Wang, Ziyi Huang, Yulong Wu, HANQING WU, Dongbai Chen, Linfeng Song, Yue Zhang, Guozheng Rao, Kaicheng Yu
- Learning Sparsity for Effective and Efficient Music Performance Question Answering
Xingjian Diao, Tianzhen Yang, Chunhui Zhang, Weiyi Wu, Ming Cheng, Jiang Gui
- Taming Language Models for Text-attributed Graph Learning with Decoupled Aggregation
Chuang Zhou, Zhu Wang, Shengyuan Chen, Jiahe Du, Qiyuan Zheng, Zhaozhuo Xu, Xiao Huang
- Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering
Zifeng Cheng, Zhonghui Wang, Yuchen Fu, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu
- Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence
Jinghan He, Kuan Zhu, Haiyun Guo, Junfeng Fang, Zhenglin Hua, Yuheng Jia, Ming Tang, Tat-Seng Chua, Jinqiao Wang
- Hierarchical Document Refinement for Long-context Retrieval-augmented Generation
Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou
- Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations
Chaoyi Xiang, Chunhua Liu, Simon De Deyne, Lea Frermann
- TEACH: A Contrastive Knowledge Adaptive Distillation Framework for Ancient Chinese Understanding
Yuting Wei, Qi Meng, Yuanxing Xu, Bin Wu
- RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation
Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen
- Progressive Multimodal Reasoning via Active Retrieval
Guanting Dong, Chenghao Zhang, Mengjie Deng, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen
- Pre-training Distillation for Large Language Models: A Design Space Exploration
Hao Peng, Xin Lv, Yushi Bai, Zijun Yao, Jiajie Zhang, Lei Hou, Juanzi Li
- Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions
Pu Jian, Donglei Yu, Jiajun Zhang, Shuo Ren, Wen Yang
- LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks
Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li
- Battling against Tough Resister: Strategy Planning with Adversarial Game for Non-collaborative Dialogues
Haiyang Wang, Zhiliang Tian, Yuchen Pan, Xin Song, Xin Niu, Minlie Huang, Bin Zhou
- Cross-model Transferability among Large Language Models on the Platonic Representations of Concepts
Youcheng Huang, Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv
- FoldMoE: Efficient Long Sequence MoE Training via Attention-MoE Pipelining
Guichao Zhu, Lintian Lei, Yuhao QING, Yichao Fu, Fanxin Li, Dong HUANG, Zekai Sun, Heming Cui
- LongReward: Improving Long-context Large Language Models with AI Feedback
Jiajie Zhang, Zhongni Hou, Xin Lv, Shulin Cao, Zhenyu Hou, Yilin Niu, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li
- Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles
Yuxi Xia, Pedro Henrique Luz de Araujo, Klim Zaporojets, Benjamin Roth
- UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
Boxi Yu, Yuxuan Zhu, Pinjia He, Daniel Kang
- Towards Better Evaluation for Generated Patent Claims
Lekang Jiang, Pascal A. Scherz, Stefan Goetz
- Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs
Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych
- Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis
Kejian Zhu, Shangqing Tu, Zhuoran Jin, Lei Hou, Juanzi Li, Jun Zhao
- Do Large Language Models have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs
Yanzhu Guo, Simone Conia, Zelin Zhou, Min Li, Saloni Potdar, Henry Xiao
- Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning
Zhu Xu, Zhiqiang Zhao, Zihan Zhang, Yuchi Liu, Quanwei Shen, Fei Liu, Yu Kuang, Jian He, Conglin Liu
- Conformity in Large Language Models
Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos
- Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings
Chenghao Sun, Zhen Huang, Yonggang Zhang, Le Lu, Houqiang Li, Xinmei Tian, Xu Shen, Jieping Ye
- Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding
Lukas Kinder, Lukas Edman, Alexander Fraser, Tobias Käfer
- FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling
Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Sun Ao, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, Yuxuan Li, Jie Zhou, Hao Zhou, Jianyong Wang, Maosong Sun, Zhiyuan Liu
- VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism
Congzhi Zhang, Jiawei Peng, Zhenglin Wang, Yilong Lai, Haowen Sun, Heng Chang, Fei Ma, Weijiang Yu
- Past Meets Present: Creating Historical Analogy with Large Language Models
Nianqi Li, Siyu Yuan, Jiangjie Chen, Jiaqing Liang, Feng Wei, Zujie Liang, Deqing Yang, Yanghua Xiao
- Meta-Reflection: A Feedback-Free Reflection Learning Framework
Yaoke Wang, Yun Zhu, Xintong Bao, Wenqiao Zhang, Suyang Dai, Kehan Chen, Wenqiang Li, Gang Huang, Siliang Tang, Yueting Zhuang
- Cross-Lingual Transfer of Cultural Knowledge: An Asymmetric Phenomenon
Chen Zhang, Zhiyuan Liao, Yansong Feng
- Read it in Two Steps: Translating Extremely Low-Resource Languages with Code-Augmented Grammar Books
Chen Zhang, Jiuheng Lin, Xiao Liu, Zekai Zhang, Yansong Feng
- Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs
Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui
- Automating Legal Concept Interpretation with LLMs: Retrieval, Generation, and Evaluation
Kangcheng Luo, Quzhe Huang, Cong Jiang, Yansong Feng
- Visual Evidence Prompting Mitigates Hallucinations in Large Vision-Language Models
Wei Li, Zhen Huang, Houqiang Li, Le Lu, Yang Lu, Xinmei Tian, Xu Shen, Jieping Ye
- Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen
- TokAlign: Efficient Vocabulary Adaptation via Token Alignment
Chong Li, Jiajun Zhang, Chengqing Zong
- AceEdit: Advancing Continuous Knowledge Editing For Large Language Models
Qi Li, Xiaowen Chu
- The Impact of Token Granularity on the Predictive Power of Language Model Surprisal
Byung-Doh Oh, William Schuler
- Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models
Xiaochen Zhu, Georgi Karadzhov, Chenxi Whitehouse, Andreas Vlachos
- BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering
Taolin Zhang, Dongyang Li, Qizhou Chen, Chengyu Wang, Xiaofeng He
- Dynamic and Generalizable Process Reward Modeling
Zhangyue Yin, Qiushi Sun, Zhiyuan Zeng, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang
- AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness
Zixin Chen, Hongzhan Lin, Kaixin Li, Ziyang Luo, Zhen Ye, Guang Chen, Zhiyong Huang, Jing Ma
- Towards Text-Image Interleaved Retrieval
Xin Zhang, Ziqi Dai, Yongqi Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Jun Yu, Wenjie Li, Min Zhang
- Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition
Guangcheng Zhu, Ruixuan Xiao, Zhen Zhu, Gengyu Lyu, Junbo Zhao, Haobo Wang
- An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning
Wei Sun, Qianlong Du, Fuwei Cui, Jiajun Zhang
- QAEncoder: Towards Aligned Representation Learning in Question Answering Systems
Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang
- Game Development as Human-LLM Interaction
Jiale Hong, Hongqiu Wu, Hai Zhao
- Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases
Rena Wei Gao, Xuetong Wu, Tatsuki Kuribayashi, Mingrui Ye, Siya Qi, Carsten Roever, Yuanxing Liu, Zheng Yuan, Jey Han Lau
- DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking
Zhuoqun Li, Haiyang Yu, Xuanang Chen, Hongyu Lin, Yaojie Lu, Fei Huang, Xianpei Han, Yongbin Li, Le Sun
- Leveraging Human Production-Interpretation Asymmetries to Test LLM Cognitive Plausibility
Suet-Ying Lam, Qingcheng Zeng, Jingyi Wu, Rob Voigt
- SurveyPilot: an Agentic Framework for Automated Human Opinion Collection from Social Media
Viet Thanh Pham, Lizhen Qu, Zhuang Li, Suraj Sharma, Gholamreza Haffari
- Sharper and Faster mean Better: Towards More Efficient Vision-Language Model for Hour-scale Long Video Understanding
Daoze Zhang, Yuze Zhao, Jintao Huang, Yingda Chen
- Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions
Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu, Deli Zhao, Lidong Bing
- How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian
Andrea Pedrotti, Giulia Rambelli, Caterina Villani, Marianna Bolognesi
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang
- ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification
Bowen Wei, Ziwei Zhu
- Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization
Chaoqun Cui, Liangbin Huang, Shijing Wang, Zhe Tong, Zhaolong Huang, Xiao Zeng, Xiaofeng Liu
- Sparse Latents Steer Retrieval-Augmented Generation
Chunlei Xin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Xuanang Chen, Xinyan Guan, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun
- Improving the Calibration of Confidence Scores in Text Generation Using the Output Distribution’s Characteristics
Lorenzo Jaime Yu Flores, Ori Ernst, Jackie CK Cheung
- Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders
Boyi Deng, Yu Wan, Baosong Yang, Yidan Zhang, Fuli Feng
- SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model
Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Zhaoxin Fan, Bo Tang, Jihao Zhao, Jiawei Yang, Shichao Song, Mengwei Wang
- AnRe: Analogical Replay for Temporal Knowledge Graph Forecasting
Guo Tang, Zheng Chu, Wenxiang Zheng, Junjia Xiang, Yizhuo Li, Weihao Zhang, Ming Liu, Bing Qin
- Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
Zhiyuan Zeng, Qinyuan Cheng, Zhangyue Yin, Yunhua Zhou, Xipeng Qiu
- Text is All You Need: LLM-enhanced Incremental Social Event Detection
Zitai Qiu, Congbo Ma, Jia Wu, Jian Yang
- Multimodal Pragmatic Jailbreak on Text-to-image Models
Tong Liu, Zhixin Lai, Jiawen Wang, Gengyuan Zhang, Shuo Chen, Philip Torr, Vera Demberg, Volker Tresp, Jindong Gu
- Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks
Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang
- Discourse Relation-Enhanced Neural Coherence Modeling
Wei Liu, Michael Strube
- Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models
Kuofeng Gao, Shu-Tao Xia, Ke Xu, Philip Torr, Jindong Gu
- from Benign import Toxic: Jailbreaking the Language Model via Adversarial Metaphors
Yu Yan, Sheng Sun, Zenghao Duan, Teli Liu, Min Liu, Zhiyi Yin, Qi Li, Lei Jingyu
- ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Contrastive Framework
Hengyuan Zhang, Chenming Shang, Sizhe Wang, Dongdong Zhang, Feng Yao, Renliang Sun, Yiyao Yu, Yujiu Yang, Furu Wei
- MorphMark: Flexible Adaptive Watermarking for Large Language Models
Zongqi Wang, Tianle Gu, Baoyuan Wu, Yujiu Yang
- A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou
- On the Limit of Language Models as Planning Formalizers
Cassie Huang, Li Zhang
- Learning to Generate Structured Output with Schema Reinforcement Learning
Yaxi Lu, Haolun Li, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Zhiyuan Liu, Fangming Liu, Maosong Sun
- On the Robustness of RAG Systems in Educational Question Answering under Knowledge Discrepancies
Tianshi Zheng, Weihan Li, Jiaxin Bai, Weiqi Wang, Yangqiu Song
- Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning
Peichao Lai, Zhengfeng Zhang, Wentao Zhang, Fangcheng Fu, Bin CUI
- Improve Safety Training of Large Language Models with Safety-Critical Singular Vectors Localization
Peijian Gu, Quan Wang, Zhendong Mao
- WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models
Huawen Feng, Pu Zhao, Qingfeng Sun, Can Xu, Fangkai Yang, Lu Wang, Qianli Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
- A Triple-View Framework for Fine-Grained Emotion Classification with Clustering-Guided Contrastive Learning
Junqing Gong, Binhan Yang, Wei Shen
- Quantification of Large Language Model Distillation
Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xeron Du, Sirui He, Haihong Wu, Tianci Liu, Jiaheng Liu, Hamid Alinejad-Rokny, Min Yang, Yitao Liang, Zhoufutu Wen, Shiwen Ni
- Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
Zihan Qiu, Zeyu Huang, Bo Zheng, Kaiyue Wen, Zekun Wang, Rui Men, Ivan Titov, Dayiheng Liu, Jingren Zhou, Junyang Lin
- Pandora’s Box or Aladdin’s Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models
Jinyang Wu, Shuai Zhang, Feihu Che, Mingkuan Feng, Pengpeng Shao, Jianhua Tao
- Stepwise Reasoning Disruption Attack of LLMs
Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu
- Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge
Qiyuan Zhang, Yufei Wang, Yuxin Jiang, Liangyou Li, Chuhan Wu, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma
- Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models
Mingyang Wang, Heike Adel, Lukas Lange, Yihong Liu, Ercong Nie, Jannik Strötgen, Hinrich Schuetze
- Optimizing Decomposition for Optimal Claim Verification
Yining Lu, Noah Ziems, Hy Dang, Meng Jiang
- GradOT: Training-free Gradient-preserving Offsite-tuning for Large Language Models
Kai Yao, Zhaorui Tan, Penglei Gao, Lichun Li, Kaixin Wu, Yinggui Wang, Yuan Zhao, Yixin Ji, Jianke Zhu, Wei Wang
- Knowledge Boundary of Large Language Models: A Survey
Moxin Li, Yong Zhao, Wenxuan Zhang, Shuaiyi Li, Wenya Xie, See-Kiong Ng, Tat-Seng Chua, Yang Deng
- Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning
Hai-Long Sun, Zhun Sun, Houwen Peng, Han-Jia Ye
- MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System
Jihao Zhao, Zhiyuan Ji, Zhaoxin Fan, Hanyu Wang, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li
- Mitigating Selection Bias with Node Pruning and Auxiliary Options
Hyeong Kyu Choi, Weijie Xu, Chi Xue, Stephanie Eckman, Chandan K. Reddy
- Dually Self-Improved Counterfactual Data Augmentation Using Large Language Model
Luhao Zhang, Xinyu Zhang, Linmei Hu, Dandan Song, Liqiang Nie
- RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation
Shi-Qi Yan, Quan Liu, Zhen-Hua Ling
- Improving Parallel Sentence Mining for Low-Resource and Endangered Languages
Shu Okabe, Katharina Hämmerl, Alexander Fraser
- Learning to Reason from Feedback at Test-Time
Yanyang Li, Michael Lyu, Liwei Wang
- L-CiteEval: A Suite for Evaluating Fidelity of Long-context Models
Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang
- SECRET: Semi-supervised Clinical Trial Document Similarity Search
Trisha Das, Afrah Shafquat, Mandis Beigi, Jacob Aptekar, Jimeng Sun
- Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models’ Uncertainty?
Jiayu Liu, Qing Zong, Weiqi Wang, Yangqiu Song
- Geometric Signatures of Compositionality Across a Language Model’s Lifetime
Jin Hwa Lee, Thomas Jiralerspong, Lei Yu, Yoshua Bengio, Emily Cheng
- Pattern Recognition or Medical Knowledge? The Problem with Multiple-Choice Questions in Medicine
Maxime Griot, Jean Vanderdonckt, Demet YUKSEL, Coralie Hemptinne
- People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text
Jenna Russell, Marzena Karpinska, Mohit Iyyer
- YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model
Hu Yiwen, Song Huatong, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Xin Zhao, Ji-Rong Wen
- Your Model is Overconfident, and Other Lies We Tell Ourselves
Timothee Mickus, Aman Sinha, Raúl Vázquez
- Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention
Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch
- Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models
Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim
- What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma
Han Meng, Yancan Chen, Yunan Li, Yitian Yang, Jungup Lee, Renwen Zhang, Yi-Chieh Lee
- ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors
Yuguo Yin, Yuxin Xie, Wenyuan Yang, Dongchao Yang, Jinghan Ru, Xianwei Zhuang, Liming Liang, Yuexian Zou
- Enhancing Transformers for Generalizable First-Order Logical Entailment
Tianshi Zheng, Jiazheng Wang, Zihao Wang, Jiaxin Bai, Hang Yin, Zheye Deng, Yangqiu Song, Jianxin Li
- Self-Taught Agentic Long Context Understanding
Yufan Zhuang, Xiaodong Yu, Jialian Wu, Ximeng Sun, Ze Wang, Jiang Liu, Yusheng Su, Jingbo Shang, Zicheng Liu, Emad Barsoum
- Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi
- OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu
- CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter
Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, Zhongchao Shi
- ConSim: Measuring Concept-Based Explanations’ Effectiveness with Automated Simulatability
Antonin Poché, Alon Jacovi, Agustin Martin Picard, Victor Boutin, Fanny Jourdan
- Decoding Reading Goals from Eye Movements
Omer Shubi, Cfir Avraham Hadar, Yevgeni Berzak
- Uncovering Visual-Semantic Psycholinguistic Properties from the Distributional Structure of Text Embedding Space
Si Wu, Sebastian Bruch
- GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent
Bin Xie, Rui Shao, Gongwei Chen, Kaiwen Zhou, Yinchuan Li, Jie Liu, Min Zhang, Liqiang Nie
- P² Law: Scaling Law for Post-Training After Model Pruning
Xiaodong Chen, Yuxuan Hu, Xiaokang Zhang, Yanling Wang, Cuiping Li, Hong Chen, Jing Zhang
- Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats
Kuleen Sasse, Carlos Alejandro Aguirre, Isabel Cachola, Sharon Levy, Mark Dredze
- Lost in the Context: Insufficient and Distracted Attention to Contexts in Preference Modeling
Shihan Dou, Jiayi Chen, Chenhao Huang, Feng Chen, Wei Chengzhi, Huiyuan Zheng, Shichun Liu, Yan Liu, Chenxiao Liu, Chao Xin, Lin Yan, Zongzhang Zhang, Tao Gui, Qi Zhang, Xuanjing Huang
- Entailment-Preserving First-order Logic Representations in Natural Language Entailment
Jinu Lee, Qi Liu, Runzhi Ma, Vincent Han, Ziqi Wang, Heng Ji, Julia Hockenmaier
- Enhancing Multimodal Continual Instruction Tuning with BranchLoRA
Duzhen Zhang, Yong Ren, Zhong-Zhi Li, Yahan Yu, Jiahua Dong, Chenxing Li, Zhilong Ji, Jinfeng Bai
- Enhancing Automated Interpretability with Output-Centric Feature Descriptions
Yoav Gur-Arieh, Roy Mayan, Chen Agassy, Atticus Geiger, Mor Geva
- Towards Effective and Efficient Continual Pre-training of Large Language Models
Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen
- Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization
Yihao Huang, Chong Wang, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Jian Zhang, Yang Liu, Geguang Pu
- mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding
Anwen Hu, Haiyang Xu, Liang Zhang, Jiabo Ye, Ming Yan, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou
- What Makes a Good Natural Language Prompt?
Do Xuan Long, Duy Dinh, Ngoc-Hai Nguyen, Kenji Kawaguchi, Nancy F. Chen, Shafiq Joty, Min-Yen Kan
- Limited-Resource Adapters Are Regularizers, Not Linguists
Marcell Fekete, Nathaniel Romney Robinson, Ernests Lavrinovics, Djeride Jean-Baptise, Raj Dabre, Johannes Bjerva, Heather Lent
- X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents
Weiqi Wu, Hongqiu Wu, Hai Zhao
- Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral
Shivani Kumar, David Jurgens
- Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models
Zheyuan Liu, Guangyao Dou, Xiangchi Yuan, Chunhui Zhang, Zhaoxuan Tan, Meng Jiang
- NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning
Zheyuan Zhang, Yiyang Li, Nhi Ha Lan Le, Zehong Wang, Tianyi Ma, Vincent Galassi, Keerthiram Murugesan, Nuno Moniz, Werner Geyer, Nitesh V Chawla, Chuxu Zhang, Yanfang Ye
- ReLearn: Unlearning via Learning for Large Language Models
Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang
- Understanding Cross-Domain Adaptation in Low-Resource Topic Modeling
Pritom Saha Akash, Kevin Chen-Chuan Chang
- UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models
Boyang XUE, Fei Mi, Qi Zhu, Hongru WANG, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning
Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang
- HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation
Jie Ouyang, Tingyue Pan, Mingyue Cheng, Ruiran Yan, Yucong Luo, Jiaying Lin, Qi Liu
- Uncertainty Propagation on LLM Agent
Qiwei Zhao, Dong Li, Yanchi Liu, Wei Cheng, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Huaxiu Yao, Chen Zhao, Haifeng Chen, Xujiang Zhao
- Beyond Position: the emergence of wavelet-like properties in Transformers
Valeria Ruscio, Umberto Nanni, Fabrizio Silvestri
- Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs
Giovanni Servedio, Alessandro De Bellis, Dario Di Palma, Vito Walter Anelli, Tommaso Di Noia
- Disentangling Biased Knowledge from Reasoning in Large Language Models via Machine Unlearning
Zheyuan Liu, Suraj Maharjan, Fanyou Wu, Rahil Parikh, Belhassen Bayar, Srinivasan H. Sengamedu, Meng Jiang
- LLaMAs Have Feelings Too: Unveiling Sentiment and Emotion Representations in LLaMA Models Through Probing
Dario Di Palma, Alessandro De Bellis, Giovanni Servedio, Vito Walter Anelli, Fedelucio Narducci, Tommaso Di Noia
- CxGGEC: Construction-Guided Grammatical Error Correction
Yayu Cao, Tianxiang Wang, Lvxiaowei Xu, Zhenyao Wang, Ming Cai
- Beyond Sequences: Two-dimensional Representation and Dependency Encoding for Code Generation
Xiangyu Zhang, Yu Zhou, Guang Yang, Wei Cheng, Taolue Chen
- HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs
Qing Li, Jiahui Geng, Zongxiong Chen, Derui Zhu, Yuxia Wang, Congbo Ma, Chenyang Lyu, Fakhri Karray
- What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations
Dongqi Liu, Chenxi Whitehouse, Xi Yu, Louis Mahon, Rohit Saxena, Zheng Zhao, Yifu QIU, Mirella Lapata, Vera Demberg
- NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering
Ruisheng Cao, Hanchong Zhang, Tiancheng Huang, Zhangyi Kang, Yuxin Zhang, Liangtai Sun, Hanqi Li, Yuxun Miao, Shuai Fan, Lu Chen, Kai Yu
- ProvBench: A Benchmark of Legal Provision Recommendation for Contract Auto-Reviewing
Xiuxuan Shen, Zhongyuan Jiang, Junsan Zhang, Junxiao Han, Yao Wan, Chengjie Guo, Bingcheng Liu, Jie Wu, Renxiang Li, Philip S. Yu
- F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching
Yushen CHEN, Zhikang Niu, Ziyang Ma, Keqi Deng, Chunhui Wang, Jian Zhao, Kai Yu, Xie Chen
- LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks
Anna Bavaresco, Raffaella Bernardi, Leonardo Bertolazzi, Desmond Elliott, Raquel Fernández, Albert Gatt, Esam Ghaleb, Mario Giulianelli, Michael Hanna, Alexander Koller, Andre Martins, Philipp Mondorf, Vera Neplenbroek, Sandro Pezzelle, Barbara Plank, David Schlangen, Alessandro Suglia, Aditya K Surikuchi, Ece Takmaz, Alberto Testoni
- AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation
Xiechi Zhang, Zetian Ouyang, Linlin Wang, Gerard de Melo, Zhu Cao, Xiaoling Wang, Ya Zhang, Yanfeng Wang, Liang He
- CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis
Bohan Zhang, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang
- Efficiently Identifying Watermarked Segments in Mixed-Source Texts
Xuandong Zhao, Chenwen Liao, Yu-Xiang Wang, Lei Li
- FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings
Tong Liu, Xiao Yu, Wenxuan Zhou, Jindong Gu, Volker Tresp
- Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks
Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Xun Wang, Si-Qing Chen, Michael J. Wooldridge, Janet B. Pierrehumbert, Furu Wei
- Towards a More Generalized Approach in Open Relation Extraction
Qing Wang, Yuepei Li, Qiao Qiao, Kang Zhou, Qi Li
- Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home
Viktor Moskvoretskii, Maria Marina, Mikhail Salnikov, Nikolay Ivanov, Sergey Pletenev, Daria Galimzianova, Nikita Krayko, Vasily Konovalov, Irina Nikishina, Alexander Panchenko
- Evaluating Language Models as Synthetic Data Generators
Seungone Kim, Juyoung Suk, Xiang Yue, Vijay Viswanathan, Seongyun Lee, Yizhong Wang, Kiril Gashteovski, Carolin Lawrence, Sean Welleck, Graham Neubig
- Can Graph Descriptive Order Affect Solving Graph Problems with LLMs?
Yuyao Ge, Shenghua Liu, Baolong Bi, Yiwei Wang, Lingrui Mei, Wenjie Feng, Lizhe Chen, Xueqi Cheng
- Learning to Rewrite: Generalized LLM-Generated Text Detection
Ran Li, Wei Hao, Weiliang Zhao, Junfeng Yang, Chengzhi Mao
- Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search
Linhao Yu, Xingguang Ji, Yahui Liu, Fanheng Kong, Chenxi Sun, Jingyuan Zhang, Hongzhi Zhang, V. W., Fuzheng Zhang, Deyi Xiong
- GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs
Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Maria Krylova, Venediktov Egor, Zuev Aleksandr, Evgeny Burnaev
- Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis
Hong Huang, Dapeng Wu
- Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models
Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Qing Yu, Go Irie, Yixuan Li, Hai Helen Li, Ziwei Liu, Kiyoharu Aizawa
- AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models
Yuhang Wu, Wenmeng Yu, Yean Cheng, Yan Wang, Xiaohan Zhang, Jiazheng Xu, Ming Ding, Yuxiao Dong
- Biased LLMs can Influence Political Decision-Making
Jillian Fisher, Shangbin Feng, Robert Aron, Thomas Richardson, Yejin Choi, Daniel W Fisher, Jennifer Pan, Yulia Tsvetkov, Katharina Reinecke
- LexTempus: Enhancing Temporal Generalizability of Legal Language Models Through Dynamic Mixture of Experts
Santosh T.Y.S.S, Tuan-Quang Vuong
- That is Unacceptable: the Moral Foundations of Canceling
Soda Marem Lo, Oscar Araque, Rajesh Sharma, Marco Antonio Stranisci
- FloorPlan-LLaMa: Aligning Architects’ Feedback and Domain Knowledge in Architectural Floor Plan Generation
Jun Yin, Pengyu Zeng, Haoyuan Sun, Yuqin Dai, Han Zheng, Miao Zhang, Yachao Zhang, Shuai Lu
- TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding
Max Ku, Thomas Chong, Jonathan Leung, Krish Shah, Alvin Yu, Wenhu Chen
- FineReason: Evaluating and Improving LLMs’ Deliberate Reasoning through Reflective Puzzle Solving
Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Chaoqun Liu, Lidong Bing, Deli Zhao, Anh Tuan Luu, Yu Rong
- The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs
Sergey Berezin, Reza Farahbakhsh, Noel Crespi
- Identifying Reliable Evaluation Metrics for Scientific Text Revision
Leane Jourdan, Nicolas Hernandez, Florian Boudin, Richard Dufour
- Can Language Models Reason about Individualistic Human Values and Preferences?
Liwei Jiang, Taylor Sorensen, Sydney Levine, Yejin Choi
- BERT-like Models for Slavic Morpheme Segmentation
Dmitry Morozov, Lizaveta Astapenka, Anna Glazkova, Timur Garipov, Olga Lyashevskaya
- Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling
Xianzhen Luo, Yixuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu
- Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering
Xinyu Tang, Xiaolei Wang, Zhihao Lv, Yingqian Min, Xin Zhao, Binbin Hu, Ziqi Liu, Zhiqiang Zhang
- Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference
Jiazheng Li, Hanqi Yan, Yulan He
- Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo
- MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models
Shojiro Yamabe, Futa Kai Waseda, Tsubasa Takahashi, Koki Wataoka
- Dynamic Scaling of Unit Tests for Code Reward Modeling
Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang
- UniConv: Unifying Retrieval and Response Generation for Large Language Model in Conversation
Fengran Mo, Yifan Gao, Chuan Meng, Xin Liu, Zhuofeng Wu, Kelong Mao, Zhengyang Wang, Pei Chen, Zheng Li, Xian Li, Bing Yin, Meng Jiang
- Tracking Life’s Ups and Downs: Mining Life Events from Social Media Posts for Mental Health Analysis
Minghao Lv, Siyuan Chen, Haoan Jin, Minghao Yuan, Qianqian Ju, Yujia Peng, Kenny Q. Zhu, Mengyue Wu
- Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control
Shengpeng Ji, Qian Chen, Wen Wang, Jialong Zuo, Minghui Fang, Ziyue Jiang, Hai Huang, Zehan Wang, Xize Cheng, Siqi Zheng, Zhou Zhao
- PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression
Haoran Que, Wenge Rong
- Towards Effective Extraction and Evaluation of Factual Claims
Dasha Metropolitansky, Jonathan Larson
- Beyond Facts: Evaluating Intent Hallucination in Large Language Models
Yijie Hao, Haofei Yu, Jiaxuan You
- A Systematic Study of Compositional Syntactic Transformer Language Models
Yida Zhao, Hao Xve, Xiang Hu, Kewei Tu
- M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation
Zhaopeng Feng, Jiayuan Su, Jiamei Zheng, Jiahan Ren, Yan Zhang, Jian Wu, Hongwei Wang, Zuozhu Liu
- SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition
Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Junhao Huang, Conghui He, Dahua Lin, Jiaqi Wang
- Personalized Text Generation with Contrastive Activation Steering
Jinghao Zhang, Yuting Liu, Wenjie Wang, Qiang Liu, Shu Wu, Liang Wang, Tat-Seng Chua
- Gumbel Reranking: Differentiable End-to-End Reranker Optimization
Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin
- Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback
Lester James Validad Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi
- SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection
Yi-Fan Lu, Xian-Ling Mao, Tian Lan, Tong Zhang, Yu-Shi Zhu, Heyan Huang
- The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project
Angelina Aspra Aquino, Lester James Validad Miranda, Elsie Marie T. Or
- DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation
Jennifer Chen, Aidar Myrzakhan, Yaxin Luo, Hassaan Muhammad Khan, Sondos Mahmoud Bsharat, Zhiqiang Shen
- G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems
Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang
- Deontological Keyword Bias: The Impact of Modal Verbs on Normative Judgments of Language Models
Bumjin Park, Leejinsil, Jaesik Choi
- LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning
Weijie Shi, Han Zhu, Jiaming Ji, Mengze Li, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Sirui Han, Yike Guo
- Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context
Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi
- ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation
Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Zhiyuan Liu, Maosong Sun
- The Cross-linguistic Role of Animacy in Grammar Structures
Nina Gregorio, Matteo Gay, Sharon Goldwater, Edoardo Ponti
- LexGen: Domain-aware Multilingual Lexicon Generation
Ayush Maheshwari, Atul Kumar Singh, N J Karthika, Krishnakant Bhatt, Preethi Jyothi, Ganesh Ramakrishnan
- How to Train Long-Context Language Models (Effectively)
Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen
- MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
Qizhi Pei, Lijun Wu, Zhuoshi Pan, Yu Li, Honglin Lin, Chenlin Ming, Xin Gao, Conghui He, Rui Yan
- Mining Complex Patterns of Argumentative Reasoning in Natural Language Dialogue
Ramon Ruiz-Dolz, Zlata Kikteva, John Lawrence
- OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use
Xueyu Hu, Tao Xiong, Biao Yi, Ruixuan Xiao, Zishu Wei, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shenzhi Wang, Xinchen Xu, Shuofei Qiao, Zhaokai Wang, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu
- Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning
Mingfei Lau, Qian Chen, Yeming Fang, Tingting Xu, Tongzhou Chen, Pavel Golik
- LLM as a Broken Telephone: Iterative Generation Distorts Information
Amr Mohamed, Mingmeng Geng, Michalis Vazirgiannis, Guokan Shang
- VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues
Jianshu Zhang, Dongyu Yao, Renjie Pi, Paul Pu Liang, Yi R. Fung
- Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation
Xiang Geng, Zhejian Lai, Jiajun Chen, Hao Yang, Shujian Huang
- Combining Domain and Alignment Vectors Provides Better Knowledge-Safety Trade-offs in LLMs
Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar
- Can Uniform Meaning Representation Help GPT-4 Translate from Indigenous Languages?
Shira Wein
- Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models
Fan Zhang, Shulin Tian, Ziqi Huang, Yu Qiao, Ziwei Liu
- LLMs Struggle to Describe the Haystack without Human Help: A Social Science-Inspired Evaluation of Topic Models
Zongxia Li, Lorena Calvo-Bartolomé, Alexander Miserlis Hoyle, Paiheng Xu, Daniel Kofi Stephens, Juan Francisco Fung, Alden Dima, Jordan Lee Boyd-Graber
- ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models
Ziyue Wang, Chi Chen, Fuwen Luo, Yurui Dong, Yuanchi Zhang, Yuzhuang Xu, Xiaolong Wang, Peng Li, Yang Liu
- Enough Coin Flips Can Make LLMs Act Bayesian
Ritwik Gupta, Rodolfo Corona, Jiaxin Ge, Eric Wang, Dan Klein, Trevor Darrell, David M. Chan
- Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games
Wenye Lin, Jonathan Roberts, Yunhan Yang, Samuel Albanie, Zongqing Lu, Kai Han
- A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens
Zhijie Nie, Richong Zhang, Zhanyu Wu
- Commonsense Reasoning in Arab Culture
Abdelrahman Sadallah, Junior Cedric Tonga, Khalid Almubarak, Saeed Almheiri, Farah Atif, Chatrine Qwaider, Karima Kadaoui, Sara Shatnawi, Yaser Alesh, Fajri Koto
- AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents
Junting Lu, Zhiyang Zhang, Fangkai Yang, Jue Zhang, Lu Wang, Chao Du, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
- Translation and Fusion Improves Cross-lingual Information Extraction
Yang Chen, Vedaant Shah, Alan Ritter
- Conditional Dichotomy Quantification via Geometric Embedding
Shaobo Cui, Wenqing Liu, Yiyang Feng, Jiawei Zhou, Boi Faltings
- Aligning Large Language Models with Implicit Preferences from User-Generated Content
Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun, Ming Zeng, Pei Chen, Zhihan Zhang, Yifan Gao, Ruijie Wang, Priyanka Nigam, Bing Yin, Meng Jiang
- VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions
Yuyan Chen, Jiyuan Jia, Jiaxin Lu, Siyue Li, Yu Guan, Ming Yang, Qingpei Guo
- Large Language Models are Good Relational Learners
Fang Wu, Vijay Prakash Dwivedi, Jure Leskovec
- SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data
Michael Ogezi, Freda Shi
- Distilling an End-to-End Voice Assistant Without Instruction Training Data
William Barr Held, Yanzhe Zhang, Weiyan Shi, Minzhi Li, Michael J Ryan, Diyi Yang
- CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games
Shuhang Xu, Fangwei Zhong
- CER: Confidence Enhanced Reasoning in LLMs
Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah
- Watermarking Large Language Models: An Unbiased and Low-risk Method
Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, Michael Chau
- On Synthetic Data Strategies for Domain-Specific Generative Retrieval
Haoyang Wen, Jiang Guo, Yi Zhang, Jiarong Jiang, Zhiguo Wang
- LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates
Ying Shen, Lifu Huang
- CONFETTI: Conversational Function-Calling Evaluation Through Turn-Level Interactions
Tamer Alkhouli, Katerina Margatina, James Gung, Raphael Shu, Claudia Zaghi, Monica Sunkara, Yi Zhang
- Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others from Conversational Cues
Anthony Sicilia, Malihe Alikhani
- Uncertainty in Causality: A New Frontier
Shaobo Cui, Luca Mouchel, Boi Faltings
- SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs
Michael J Ryan, Omar Shaikh, Aditri Bhagirath, Daniel Frees, William Barr Held, Diyi Yang
- When People are Floods: Analyzing Dehumanizing Metaphors in Immigration Discourse with Large Language Models
Julia Mendelsohn, Ceren Budak
- AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection
Weidi Luo, Shenghong Dai, Xiaogeng Liu, Suman Banerjee, Huan Sun, Muhao Chen, Chaowei Xiao
- Improving Model Factuality with Fine-grained Critique-based Evaluator
Yiqing Xie, Wenxuan Zhou, Pradyot Prakash, Di Jin, Yuning Mao, Quintin Fettes, Arya Talebzadeh, Sinong Wang, Han Fang, Carolyn Rose, Daniel Fried, Hejia Zhang
- Building a Long Text Privacy Policy Corpus with Multi-Class Labels
David Stein, Florencia Marotta-Wurgler
- x-SAL: Leading Symbolic Reasoning across Languages via Cross-lingual Symbolic-Aided Language Model
Leonardo Ranaldi, Giulia Pucci
- When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models
Samuel Joseph Amouyal, Aya Meltzer-Asscher, Jonathan Berant
- Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models
Zixiang Xu, Yanbo Wang, Yue Huang, Xiuying Chen, Jieyu Zhao, Meng Jiang, Xiangliang Zhang
- VLSBench: Unveiling Visual Leakage in Multimodal Safety
Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao
- Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning
Sky CH-Wang, Darshan Girish Deshpande, Smaranda Muresan, Anand Kannappan, Rebecca Qian
- Subword models struggle with word learning, but surprisal hides it
Bastian Bunzeck, Sina Zarrieß
- Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation
Jonibek Mansurov, Akhmed Sakip, Alham Fikri Aji
- Conspiracy Theories and Where to Find Them on TikTok
Francesco Corso, Francesco Pierri, Gianmarco De Francisci Morales
- Growing Through Experience: Scaling Episodic Grounding in Language Models
Chunhui Zhang, Sirui Wang, Zhongyu Ouyang, Xiangchi Yuan, Soroush Vosoughi
- LLM as Entity Disambiguator for Biomedical Entity-Linking
Christophe Ye, Cassie S. Mitchell
- Exploiting the Shadows: Unveiling Privacy Leaks through Lower-Ranked Tokens in Large Language Models
Yuan Zhou, Zhuo Zhang, Xiangyu Zhang
- Towards Geo-Culturally Grounded LLM Generations
Piyawat Lertvittayakumjorn, David Kinney, Vinodkumar Prabhakaran, Donald Martin Jr., Sunipa Dev
- Attacking Vision-Language Computer Agents via Pop-ups
Yanzhe Zhang, Tao Yu, Diyi Yang
- Explicit and Implicit Data Augmentation for Social Event Detection
Congbo Ma, Yuxia Wang, Jia Wu, Jian Yang, Jing Du, Zitai Qiu, Qing Li, Hu Wang, Preslav Nakov
- In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents
Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Rajan Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister
- Revisiting Classical Chinese Event Extraction with Ancient Literature Information
Xiaoyi Bao, Zhongqing Wang, Jinghang Gu, Chu-Ren Huang
- Unanswerability Evaluation for Retrieval Augmented Generation
Xiangyu Peng, Prafulla Kumar Choubey, Caiming Xiong, Chien-Sheng Wu
- SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention
Chengshuai Zhao, Zhen Tan, Chau-Wai Wong, Xinyan Zhao, Tianlong Chen, Huan Liu
- Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning
Erxin Yu, Jing Li, Ming Liao, Qi Zhu, Boyang Xue, Minghui Xu, Baojun Wang, Lanqing Hong, Fei Mi, Lifeng Shang
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework
Kunlun Zhu, Yifan Luo, Dingling Xu, Yukun Yan, Zhenghao Liu, Shi Yu, Ruobing Wang, Shuo Wang, Yishan Li, Nan Zhang, Xu Han, Zhiyuan Liu, Maosong Sun
- A Survey on Patent Analysis: From NLP to Multimodal AI
Homaira Huda Shomee, Zhu Wang, Sathya N. Ravi, Sourav Medya
- SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification
Chengye Wang, Yifei Shen, Zexi Kuang, Arman Cohan, Yilun Zhao
- MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents
Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Zhe Wang, Zhenhailong Wang, Cheng Qian, Xiangru Tang, Heng Ji, Jiaxuan You
- Sinhala Encoder-only Language Models and Evaluation
Tharindu Ranasinghe, Hansi Hettiarachchi, Nadeesha Chathurangi Naradde Vidana Pathirana, Damith Premasiri, Lasitha Uyangodage, Isuri Nanomi Arachchige, Alistair Plum, Paul Rayson, Ruslan Mitkov
- LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing
Zhengxiang Wang, Veronika Makarova, Zhi Li, Jordan Kodner, Owen Rambow
- SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?
Haomin Zhuang, Yihua Zhang, Kehan Guo, Jinghan Jia, Gaowen Liu, Sijia Liu, Xiangliang Zhang
- Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges
Bolei Ma, Yuting Li, Wei Zhou, Ziwei Gong, Yang Janet Liu, Katja Jasinskaja, Annemarie Friedrich, Julia Hirschberg, Frauke Kreuter, Barbara Plank
- LocAgent: Agentic Code Localization with Graph-Based Indexing
Zhaoling Chen, Xiangru Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor Prasanna, Arman Cohan, Xingyao Wang
- COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation
Raghvendra Kumar, Mohammed Salman S A, Aryan Sahu, Tridib Nandi, Pragathi Y P, Sriparna Saha, Jose G Moreno
- Mind the Gap: Static and Interactive Evaluations of Large Audio Models
Minzhi Li, William Barr Held, Michael J Ryan, Kunat Pipatanakul, Potsawee Manakul, Hao Zhu, Diyi Yang
- Understanding In-context Machine Translation for Low-Resource Languages: A Case Study on Manchu
Renhao Pei, Yihong Liu, Peiqin Lin, François Yvon, Hinrich Schuetze
- CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs
Jizhan Fang, Tianhe Lu, Yunzhi Yao, Ziyan Jiang, Xin Xu, Huajun Chen, Ningyu Zhang
- TripleFact: Defending Data Contamination in the Evaluation of LLM-driven Fake News Detection
Cheng Xu, Nan Yan
- MUSTS: MUltilingual Semantic Textual Similarity Benchmark
Tharindu Ranasinghe, Hansi Hettiarachchi, Constantin Orasan, Ruslan Mitkov
- Meaning Beyond Truth Conditions: Evaluating Discourse Level Understanding via Anaphora Accessibility
Xiaomeng Zhu, Zhenghao Zhou, Simon Charlow, Robert Frank
- Benchmarking Systematic Relational Reasoning with Large Language and Reasoning Models
Irtaza Khalid, Amir Masoud Nourollah, Steven Schockaert
- Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation
Senyu Li, Zipeng Sun, Jiayi Wang, Xue Liu, Pontus Stenetorp, Siva Reddy, David Ifeoluwa Adelani
- Building Better: Avoiding Pitfalls in Developing Language Resources when Data is Scarce
Nedjma Ousidhoum, Meriem Beloucif, Saif M. Mohammad
- Can Large Language Models Accurately Generate Answer Keys for Health-related Questions?
Davis Bartels, Deepak Gupta, Dina Demner-Fushman
- Literary Evidence Retrieval via Long-Context Language Models
Katherine Thai, Mohit Iyyer
- BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages
Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas, Meriem Beloucif, Christine de Kock, Nirmal Surange, Daniela Teodorescu, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Alham Fikri Aji, Felermino D. M. A. Ali, Ilseyar Alimova, Vladimir Araujo, Nikolay Babakov, Naomi Baes, Ana-Maria Bucur, Andiswa Bukula, Guanqun Cao, Rodrigo Tufiño, Rendi Chevi, Chiamaka Ijeoma Chukwuneke, Alexandra Ciobotaru, Daryna Dementieva, Murja Sani Gadanya, Robert Geislinger, Bela Gipp, Oumaima Hourrane, Oana Ignat, Falalu Ibrahim Lawan, Rooweither Mabuya, Rahmad Mahendra, Vukosi Marivate, Alexander Panchenko, Andrew Piper, Charles Henrique Porto Ferreira, Vitaly Protasov, Samuel Rutunda, Manish Shrivastava, Aura Cristina Udrea, Lilian Diana Awuor Wanzare, Sophie Wu, Florian Valentin Wunderlich, Hanif Muhammad Zhafran, Tianhui Zhang, Yi Zhou, Saif M. Mohammad
- SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation
Yufei Tian, Jiao Sun, Nanyun Peng, Zizhao Zhang
- CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era
Yanlin Feng, Simone Papicchio, Sajjadur Rahman
- Empathy Prediction from Diverse Perspectives
Francine Chen, Scott Carter, Tatiana Lau, Nayeli Suseth Bravo, Sumanta Bhattacharyya, Kate Sieck, Charlene C. Wu
- Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice
Federico Ravenda, Seyed Ali Bahrainian, Andrea Raballo, Antonietta Mira, Noriko Kando
- INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models
Aum Kendapadi, Kerem Zaman, Rakesh R Menon, Shashank Srivastava
- Circuit Stability Characterizes Language Model Generalization
Alan Sun
- Comparing LLM-generated and human-authored news text using formal syntactic theory
Olga Zamaraeva, Dan Flickinger, Francis Bond, Carlos Gómez-Rodríguez
- Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes
Sharan Maiya, Yinhong Liu, Ramit Debnath, Anna Korhonen
- White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs
Yixin Wan, Kai-Wei Chang
- AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions
Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang, Jordan Bannister, Mirko Bronzi, Arsene Fansi Tchango, Md Abul Bashar, Richi Nayak, Kerrie Mengersen
- Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence
Mohsen Fayyaz, Ali Modarressi, Hinrich Schuetze, Nanyun Peng
- SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence
Zhining Liu, Rana Ali Amjad, Ravinarayana Adkathimar, Tianxin Wei, Hanghang Tong
- The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects
Yixin Wan, Kai-Wei Chang
- A Little Human Data Goes A Long Way
Dhananjay Ashok, Jonathan May
- Mitigating Shortcut Learning with InterpoLated Learning
Michalis Korakakis, Andreas Vlachos, Adrian Weller
- Toward Automatic Discovery of a Canine Phonetic Alphabet
Theron S. Wang, Xingyuan Li, Hridayesh Lekhak, Tuan Minh Dang, Mengyue Wu, Kenny Q. Zhu
- DavIR: Data Selection via Implicit Reward for Large Language Models
Haotian Zhou, Tingkai Liu, Qianli Ma, Yufeng Zhang, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang
- Byte Latent Transformer: Patches Scale Better Than Tokens
Artidoro Pagnoni, Ramakanth Pasunuru, Pedro Rodriguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason E Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srini Iyer
- DiffuseDef: Improved Robustness to Adversarial Attacks via Iterative Denoising
Zhenhao Li, Huichi Zhou, Marek Rei, Lucia Specia
- Identifying Cellular Niches in Spatial Transcriptomics: An Investigation into the Capabilities of Large Language Models
Huanhuan Wei, Xiao Luo, Hongyi Yu, Jinping Liang, Luning Yang, Lixing Lin, Alexandra Popa, Xiting Yan
- Culture Matters in Toxic Language Detection in Persian
Zahra Bokaei, Walid Magdy, Bonnie Webber
- Bitnet.cpp: Efficient Edge Inference for Ternary LLMs
Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei
- Instance-Selection-Inspired Undersampling Strategies for Bias Reduction in Small and Large Language Models for Binary Text Classification
Guilherme Fonseca, Washington Cunha, Gabriel Prenassi, Marcos André Gonçalves, Leonardo Chaves Dutra da Rocha
- Forward Knows Efficient Backward Path: Saliency-Guided Memory-Efficient Fine-tuning of Large Language Models
Yeachan Kim, SangKeun Lee
- Focus on What Matters: Enhancing Medical Vision-Language Models with Automatic Attention Alignment Tuning
Aofei Chang, Le Huang, Alex James Boyd, Parminder Bhatia, Taha Kass-Hout, Cao Xiao, Fenglong Ma
- LLMs + Persona-Plug = Personalized LLMs
Jiongnan Liu, Yutao Zhu, Shuting Wang, Xiaochi Wei, Erxue Min, Yu Lu, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou
- Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition
Masato Mita, Ryo Yoshida, Yohei Oseki
- IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data
Tao Feng, Lizhen Qu, Niket Tandon, Gholamreza Haffari
- INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages
Hao Yu, Jesujoba Oluwadara Alabi, Andiswa Bukula, Jian Yun Zhuang, En-Shiun Annie Lee, Tadesse Kebede Guge, Israel Abebe Azime, Happy Buzaaba, Blessing Kudzaishe Sibanda, Godson Koffi Kalipe, Jonathan Mukiibi, Salomon Kabongo Kabenamualu, Mmasibidi Setaka, Lolwethu Ndolela, Nkiruka Odu, Rooweither Mabuya, Shamsuddeen Hassan Muhammad, Salomey Osei, Sokhar Samb, Dietrich Klakow, David Ifeoluwa Adelani
- Boosting Long-Context Information Seeking via Query-Guided Activation Refilling
Hongjin Qian, Zheng Liu, Peitian Zhang, Zhicheng Dou, Defu Lian
- Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration
Tianyi Bai, Ling Yang, Zhen Hao Wong, Fupeng Sun, Xinlin Zhuang, Jiahui Peng, Chi Zhang, Lijun Wu, Qiu Jiantao, Wentao Zhang, Binhang Yuan, Conghui He
- AdaDHP: Fine-Grained Fine-Tuning via Dual Hadamard Product and Adaptive Parameter Selection
Han Liu, Changya Li, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Wei Wang, Hong Yu
- KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph
Jinhao Jiang, Kun Zhou, Xin Zhao, Yang Song, Chen Zhu, Hengshu Zhu, Ji-Rong Wen
- Curriculum Debiasing: Toward Robust Parameter-Efficient Fine-Tuning Against Dataset Biases
Mingyu Lee, Yeachan Kim, Wing-Lam Mok, SangKeun Lee
- Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings
Austin Xu, Srijan Bansal, Yifei Ming, Semih Yavuz, Shafiq Joty
- On the Reliability of Large Language Models for Causal Discovery
Tao Feng, Lizhen Qu, Niket Tandon, Zhuang Li, Xiaoxi Kang, Gholamreza Haffari
- Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts
Jingxuan Li, Yuning Yang, Shengqi Yang, Linfan Zhang, Ying Nian Wu
- TeRDy: Temporal Relation Dynamics through Frequency Decomposition for Temporal Knowledge Graph Completion
Ziyang Liu, Chaokun Wang
- Incorporating Domain Knowledge into Materials Tokenization
Yerim Oh, Jun-Hyung Park, Junho Kim, SungHo Kim, SangKeun Lee
- PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization
Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng Lin, Binxing Fang
- Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks
Rana Shahroz, Zhen Tan, Sukwon Yun, Charles Fleming, Tianlong Chen
- Semantic-Eval: A Semantic Comprehension Evaluation Framework for Large Language Models Generation without Training
Shusheng Li, Jiale Li, Yifei Qu, Xinwei Shi, Yanliang Guo, Ziyi He, Yubo Wang, Wenjun Tan
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases
Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen
- When to Speak, When to Abstain: Contrastive Decoding with Abstention
Hyuhng Joon Kim, Youna Kim, Sang-goo Lee, Taeuk Kim
- On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs
Herun Wan, Minnan Luo, Zhixiong Su, Guang Dai, Xiang Zhao
- Investigating and Extending Homans’ Social Exchange Theory with Large Language Model based Agents
Lei Wang, Zheqing Zhang, Xu Chen
- A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models
Jiesong Liu, Brian Park, Xipeng Shen
- If Attention Serves as a Cognitive Model of Human Memory Retrieval, What is the Plausible Memory Representation?
Ryo Yoshida, Shinnosuke Isono, Kohei Kajikawa, Taiga Someya, Yushi Sugimoto, Yohei Oseki
- Aligning VLM Assistants with Personalized Situated Cognition
Yongqi Li, Shen Zhou, Xiaohu Li, Xin Miao, Jintao Wen, Mayi Xu, Jianhao Chen, Birong Pan, Hankun Kang, Yuanyuan Zhu, Ming Zhong, Tieyun Qian
- Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models
Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, Hongming Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu
- Faster Speculative Decoding via Effective Draft Decoder with Pruned Candidate Tree
Huanran Zheng, Xiaoling Wang
- Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models
Zhuojun Ding, Wei Wei, Chenghao Fan
- Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents
Tao Wu, Jingyuan Chen, Wang Lin, Mengze Li, Yumeng Zhu, Ang Li, Kun Kuang, Fei Wu
- CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction
Jiali Chen, Xusen Hei, HongFei Liu, Yuancheng Wei, Zikun Deng, Jiayuan Xie, Yi Cai, Li Qing
- Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling
Junyi Li, Hwee Tou Ng
- The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI
Dana R Alsagheer, Abdulrahman Kamal, Mohammad Kamal, Cosmo Yang Wu, Weidong Shi
- Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean
SungHo Kim, Nayeon Kim, Taehee Jeon, SangKeun Lee
- SpeechFake: A Large-Scale Multilingual Speech Deepfake Dataset Incorporating Cutting-Edge Generation Methods
Wen Huang, Yanmei Gu, Zhiming Wang, Huijia Zhu, Yanmin Qian
- ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation
Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li
- InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes Under Herd Behavior
Huisheng Wang, Zhuoshi Pan, Hangjing Zhang, Mingxiao Liu, Hanqing Gao, H. Vicky Zhao
- Enhancing Neural Machine Translation Through Target Language Data: A $k$NN-LM Approach for Domain Adaptation
Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, JingBo Zhu
- Multi-level Relevance Document Identifier Learning for Generative Retrieval
Fuwei Zhang, Xiaoyu Liu, Xinyu Jia, Yingfei Zhang, Shuai Zhang, Xiang Li, Fuzhen Zhuang, Wei Lin, Zhao Zhang
- EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Ping Luo
- Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder
Siting Li, Pang Wei Koh, Simon Shaolei Du
- NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization
Hyuntak Kim, Byung-Hak Kim
- HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models
Xiao Wang, Jingyun Hua, Weihong Lin, Yuanxing Zhang, Fuzheng Zhang, Jianlong Wu, Di Zhang, Liqiang Nie
- Uni-Retrieval: A Multi-Style Retrieval Framework for STEM’s Education
Yanhao Jia, Xinyi Wu, Li Hao, Qinglin Zhang, Yuxiao Hu, Shuai Zhao, Wenqi Fan
- DenseLoRA: Dense Low-Rank Adaptation of Large Language Models
Lin Mu, Xiaoyu Wang, Li Ni, Yang Li, Zhize Wu, Peiquan Jin, Yiwen Zhang
- Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis
Jisoo Mok, Ik-hwan Kim, Sangkwon Park, Sungroh Yoon
- Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models
Yuheng Chen, Pengfei Cao, Yubo Chen, Yining Wang, Shengping Liu, Kang Liu, Jun Zhao
- Towards Context-Robust LLMs: A Gated Representation Fine-tuning Approach
Shenglai Zeng, Pengfei He, Kai Guo, Tianqi Zheng, Hanqing Lu, Yue Xing, Hui Liu
- Seeking Rational Demonstrations for Large Language Models: A Domain Generalization Approach to Unsupervised Cross-Domain Keyphrase Generation
Guangzhen Zhao, Yu Yao, Dechang Kong, Zhenjiang Dong
- On Support Samples of Next Word Prediction
Yuqian Li, Yupei Du, Yufang Liu, Feifei Feng, Mou Xiao Feng, Yuanbin Wu
- WebWalker: Benchmarking LLMs in Web Traversal
Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, Fei Huang
- From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models
Yidan Wang, Yubing Ren, Yanan Cao, Binxing Fang
- AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs
Hongxin Li, Jingfan Chen, Jingran Su, Yuntao Chen, Li Qing, Zhaoxiang Zhang
- Introducing Graph Context into Language Models through Parameter-Efficient Fine-Tuning for Lexical Relation Mining
Jingwen Sun, Zhiyi Tian, Yu He, Jingwei Sun, Guangzhong Sun
- S-RAG: A Novel Audit Framework for Detecting Unauthorized Use of Personal Data in RAG Systems
Zhirui Zeng, Jiamou Liu, Meng-Fen Chiang, Jialing He, Zijian Zhang
- Praetor: A Fine-Grained Generative LLM Evaluator with Instance-Level Customizable Evaluation Criteria
Yongqi Leng, Renren Jin, Yue Chen, Zhuowen Han, Ling Shi, Jianxiang Peng, Lei Yang, Juesi Xiao, Deyi Xiong
- LexKeyPlan: Planning with Keyphrases and Retrieval Augmentation for Legal Text Generation: A Case Study on European Court of Human Rights Cases
Santosh T.Y.S.S, Elvin Quero Hernandez
- Mitigating Gender Confounding Bias from Spoken Language in Dementia Detection via Weight Masking
Zhecheng Sheng, Xiruo Ding, Brian Hur, Changye Li, Trevor Cohen, Serguei V. S. Pakhomov
- MCS-Bench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in Chinese Classical Studies
Yang Liu, Jiahuan Cao, Hiuyi Cheng, Yongxin Shi, Kai Ding, Lianwen Jin
- The Knowledge Microscope: Features as Better Analytical Lenses than Neurons
Yuheng Chen, Pengfei Cao, Kang Liu, Jun Zhao
- From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding
Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao
- PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance
Haoran Li, Wenbin Hu, Huihao Jing, Yulin Chen, Qi Hu, Sirui Han, Tianshu Chu, Peizhao Hu, Yangqiu Song
- Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View
Yanran Wu, Inez Hua, Yi Ding
- ExpeTrans: LLMs Are Experiential Transfer Learners
Jinglong Gao, Xiao Ding, Lingxiao Zou, Bibo Cai, Bing Qin, Ting Liu
- Cool-Fusion: Fuse Large Language Models without Training
Cong Liu, Xiaojun Quan, Yan Pan, Weigang Wu, Xu Chen, Liang Lin
- DAPE V2: Process Attention Score as Feature Map for Length Extrapolation
Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
- MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training
Hui Huang, Jiaheng Liu, Yancheng He, Shilong Li, Bing Xu, Conghui Zhu, Muyun Yang, Tiejun Zhao
- LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation
Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Xin Zhao, Bingning Wang, Weipeng Chen
- APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs
Yuxiang Huang, Mingye Li, Xu Han, Chaojun Xiao, Weilin Zhao, Sun Ao, Hao Zhou, Jie Zhou, Zhiyuan Liu, Maosong Sun
- PPT: A Minor Language News Recommendation Model via Cross-Lingual Preference Pattern Transfer
Yiyang Zhang, Nan Chen
- GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis
Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin
- Top-$n\sigma$: Eliminating Noise in Logit Space for Robust Token Sampling of LLM
Chenxia Tang, Jianchun Liu, Hongli Xu, Liusheng Huang
- SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation
Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, Deyu Zhou
- Mitigating Non-Representative Prototypes and Representation Bias in Few-Shot Continual Relation Extraction
Thanh Duc Pham, Nam Le Hai, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen
- MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts
Wei Tao, Haocheng Lu, Xiaoyang Qu, Bin Zhang, Kai Lu, Jiguang Wan, Jianzong Wang
- PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration
Ziqian Zeng, Jianwei Wang, Junyao Yang, Zhengdong Lu, Haoran Li, Huiping Zhuang, Cen Chen
- Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang, Jiahui Peng, Ren Ma, Yinfan Wang, Tianyi Bai, Xingjian Wei, Qiu Jiantao, Chi Zhang, Ying Qian, Conghui He
- GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning
Qingchen Yu, Zifan Zheng, Ding Chen, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li
- Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition
Kehua Feng, Keyan Ding, Tan Hongzhi, Kede Ma, Zhihua Wang, Shuangquan Guo, Cheng Yuzhou, Ge Sun, Guozhou Zheng, Qiang Zhang, Huajun Chen
- DTCRS: Dynamic Tree Construction for Recursive Summarization
Guanran Luo
- A Generative Adaptive Replay Continual Learning Model for Temporal Knowledge Graph Reasoning
Zhiyu Zhang, Wei Chen, Youfang Lin, Huaiyu Wan
- ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search
Yize Zhang, Tianshu Wang, Sirui Chen, Kun Wang, Xingyu Zeng, Hongyu Lin, Xianpei Han, Le Sun, Chaochao Lu
- PKAG-DDI: Pairwise Knowledge-Augmented Language Model for Drug-Drug Interaction Event Text Generation
Ziyan Wang, Zhankun Xiong, Feng Huang, Wen Zhang
- Knowledge-Augmented Multimodal Clinical Rationale Generation for Disease Diagnosis with Small Language Models
Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, Richard Yi Da Xu, Yunya Song, Xian Yang
- TWIST: Text-encoder Weight-editing for Inserting Secret Trojans in Text-to-Image Models
Xindi Li, Zhe Liu, Tong Zhang, Jiahao Chen, Qingming Li, Jinbao Li, Shouling Ji
- Frictional Agent Alignment Framework: Slow Down and Don’t Break Things
Abhijnan Nath, Carine Graff, Andrei Bachinin, Nikhil Krishnaswamy
- Powerformer: Efficient and High-Accuracy Privacy-Preserving Language Model with Homomorphic Encryption
Dongjin Park, Eunsang Lee, Joon-Woo Lee
- Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs
Weixiang Zhao, Yulin Hu, Yang Deng, Jiahe Guo, Xingyu Sui, Xinyang Han, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
- Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision?
Zihao Li, Lecheng Zheng, Bowen Jin, Dongqi Fu, Baoyu Jing, Yikun Ban, Jingrui He, Jiawei Han
- Towards Enhanced Immersion and Agency for LLM-based Interactive Drama
Hongqiu Wu, Weiqi Wu, Tianyang Xu, Jiameng Zhang, Hai Zhao
- Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures
Shun Inadumi, Nobuhiro Ueda, Koichiro Yoshino
- Improving Factuality with Explicit Working Memory
Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Yi Sun, Luke Zettlemoyer, Gargi Ghosh, Wen-tau Yih
- Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models
Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He
- Dynamic Parallel Tree Search for Efficient LLM Reasoning
Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao
- Pre$^3$: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
Junyi Chen, Shihao Bai, Zaijun Wang, Siyu Wu, Chuheng Du, Hailong Yang, Ruihao Gong, Shengzhong Liu, Fan Wu, Guihai Chen
- SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL
Ge Qu, Jinyang Li, Bowen Qin, Xiaolong Li, Nan Huo, Chenhao Ma, Reynold Cheng
- GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models
Tao Zhang, Ziqian Zeng, Yuxiang Xiao, Huiping Zhuang, Cen Chen, James R. Foulds, Shimei Pan
- Large Language and Protein Assistant for Protein-Protein Interactions Prediction
Peng Zhou, Pengsen Ma, Jianmin Wang, Xibao Cai, Haitao Huang, Wei Liu, Longyue Wang, Lai Hou Tim, Xiangxiang Zeng
- SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement
Runnan Fang, Xiaobin Wang, Yuan Liang, Shuofei Qiao, Jialong Wu, Zekun Xi, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
- An Empirical Study of Many-to-Many Summarization with Large Language Models
Jiaan Wang, Fandong Meng, Zengkui Sun, Yunlong Liang, Yuxuan Cao, Jiarong Xu, Haoxiang Shi, Jie Zhou
- Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models
Suhang Wu, Jialong Tang, Chengyi Yang, Pei Zhang, Baosong Yang, Junhui Li, Min Zhang, Jinsong Su
- GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents
Lingxiao Diao, Xinyue Xu, Wanxuan Sun, Cheng Yang, Zhuosheng Zhang
- TC–RAG: Turing–Complete RAG’s Case study on Medical LLM Systems
Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xinyu Ma, Xu Chu, Junfeng Zhao, Yasha Wang
- SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning
Zexiong Ma, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, Bing Xie
- MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models
Zhongzhan Huang, Guoming Ling, Shanshan Zhong, Hefeng Wu, Liang Lin
- Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG
Xin Sun, Jianan Xie, Zhongqi Chen, Qiang Liu, Shu Wu, Yuehe Chen, Bowen Song, Zilei Wang, Weiqiang Wang, Liang Wang
- PwnGPT: Automatic Exploit Generation Based on Large Language Models
Wanzong Peng, Lin Ye, Xuetao Du, Hongli Zhang, Dongyang Zhan, Yunting Zhang, Yicheng Guo, Chen Zhang
- VMLU Benchmarks: A comprehensive benchmark toolkit for Vietnamese LLMs
Cuc Thi Bui, Nguyen Truong Son, Truong Van Trang, Lam Viet Phung, Pham Nhut Huy, Hoang Anh Le, Quoc Huu Van, Phong Nguyen-Thuan Do, Van Le Tran Truc, Duc Thanh Chau, Le-Minh Nguyen
- Scaling Laws for RNN LLM in Long-Context Scenarios with State Size
Kai Liu, Jianfei Gao, Kai Chen
- Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes
Bocheng Li, Zhujin Gao, Linli Xu
- A Strategic Coordination Framework of Small LMs Matches Large LMs in Data Synthesis
Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Lijun Wu, Conghui He
- Defining and Evaluating Visual Language Models’ Basic Spatial Abilities: A Perspective from Psychometrics
Wenrui Xu, Dalin Lyu, Weihang Wang, Jie Feng, Chen Gao, Yong Li
- SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation
Wenyu Zhang, Wei En Ng, Lixin Ma, Yuwen Wang, Junqi Zhao, Allison Koenecke, Boyang Li, Wang Lu
- Enhancing Retrieval Systems with Inference-Time Logical Reasoning
Felix Faltings, Wei Wei, Yujia Bao
- Using Subtext to Enhance Generative IDRR
Zhipang Wang, Yu Hong, Weihao Sun, Guodong Zhou
- User-side Model Consistency Monitoring for Open Source Large Language Models Inference Services
Qijun Miao, Zhixuan Fang
- Jailbreaking? One Step Is Enough!
Weixiong Zheng, Peijian Zeng, YiWei Li, Hongyan Wu, Nankai Lin, Junhao Chen, Aimin Yang, Yongmei Zhou
- Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning
Yongxin Xu, Ruizhe Zhang, Xinke Jiang, Yujie Feng, Yuzhen Xiao, Xinyu Ma, Runchuan Zhu, Xu Chu, Junfeng Zhao, Yasha Wang
- PaSa: An LLM Agent for Comprehensive Academic Paper Search
Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, Weinan E
- Less Mature is More Adaptable for Sentence-level Language Modeling
Abhilasha Sancheti, David Dale, Artyom Kozhevnikov, Maha Elbayad
- EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts
Subhajit Chaudhury, Payel Das, Sarathkrishna Swaminathan, Georgios Kollias, Elliot Nelson, Khushbu Pahwa, Tejaswini Pedapati, Igor Melnyk, Matthew Riemer
- UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter Efficient Fine-Tuning of Large Models
Xueyan Zhang, Jinman Zhao, Zhifei Yang, Yibo Zhong, Shuhao Guan, Linbo Cao, Yining Wang
- Agri-CM$^3$: A Chinese Massive Multi-modal, Multi-level Benchmark for Agricultural Understanding and Reasoning
Haotian Wang, Yi Guan, Fanshu Meng, Chao Zhao, Lian Yan, Yang Yang, Jingchi Jiang
- TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification
Junnan Zhu, Min Xiao, Yining Wang, Feifei Zhai, Yu Zhou, Chengqing Zong
- CaLMQA: Exploring culturally specific long-form question answering across 23 languages
Shane Arora, Marzena Karpinska, Hung-Ting Chen, Ipsita Bhattacharjee, Mohit Iyyer, Eunsol Choi
- Croppable Knowledge Graph Embedding
Yushan Zhu, Wen Zhang, Zhiqiang Liu, Mingyang Chen, Lei Liang, Huajun Chen
- HyKGE: A Hypothesis Knowledge Graph Enhanced RAG Framework for Accurate and Reliable Medical LLMs Responses
Xinke Jiang, Ruizhe Zhang, Yongxin Xu, Rihong Qiu, Yue Fang, Zhiyuan Wang, Jinyi Tang, Hongxin Ding, Xu Chu, Junfeng Zhao, Yasha Wang
- LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Wang Yan, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi
- BeamLoRA: Beam-Constraint Low-Rank Adaptation
Naibin Gu, Zhenyu Zhang, Xiyu Liu, Peng Fu, Zheng Lin, Shuohuan Wang, Yu Sun, Hua Wu, Weiping Wang, Haifeng Wang
- GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art
Chenkai Zhang, Yiming Lei, Zeming Liu, Haitao Leng, ShaoGuo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang
- UniLR: Unleashing the Power of LLMs on Multiple Legal Tasks with a Unified Legal Retriever
Ang Li, Yiquan Wu, Yifei Liu, Ming Cai, Lizhi Qing, Shihang Wang, Yangyang Kang, Chengyuan Liu, Fei Wu, Kun Kuang
- Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models
Haoran Ye, TianZe Zhang, Yuhang Xie, Liyuan Zhang, Yuanyi Ren, Xin Zhang, Guojie Song
- Beyond Dialogue: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model
Yeyong Yu, Runsheng Yu, Haojie Wei, Zhanqiu Zhang, Quan Qian
- ACECODER: Acing Coder RL via Automated Test-Case Synthesis
Huaye Zeng, Dongfu Jiang, Haozhe Wang, Ping Nie, Xiaotong Chen, Wenhu Chen
- Quantifying Semantic Emergence in Language Models
Hang Chen, Xinyu Yang, Jiaying Zhu, Wenya Wang
- DebateCoder: Towards Collective Intelligence of LLMs via Test Case Driven LLM Debate for Code Generation
Jizheng Chen, Kounianhua Du, Xinyi Dai, Weiming Zhang, Xihuai Wang, Yasheng Wang, Ruiming Tang, Weinan Zhang, Yong Yu
- The Tug of War Within: Mitigating the Fairness-Privacy Conflicts in Large Language Models
Chen Qian, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao
- GraphInsight: Unlocking Insights in Large Language Models for Graph Structure Understanding
Yukun Cao, Shuo Han, Zengyi Gao, Zezhong Ding, Xike Xie, S Kevin Zhou
- Phonotomizer: A Compact, Unsupervised, Online Training Approach to Real-Time, Multilingual Phonetic Segmentation
Michael S. Yantosca, Albert M. K. Cheng
- A Multi-persona Framework for Argument Quality Assessment
Bojun Jin, Jianzhu Bao, Yufang Hou, Yang Sun, Yice Zhang, Huajie Wang, Bin Liang, Ruifeng Xu
- Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification
Chengwu Liu, Ye Yuan, Yichun Yin, Yan Xu, Xin Xu, Zaoyu Chen, Lifeng Shang, Qun Liu, Ming Zhang
- SAM Decoding: Speculative Decoding via Suffix Automaton
Yuxuan Hu, Ke Wang, Xiaokang Zhang, Fanjin Zhang, Cuiping Li, Hong Chen, Jing Zhang
- PsyAdvisor: A Plug-and-Play Strategy Advice Planner with Proactive Questioning in Psychological Conversations
Yuxin Hu, Danni Liu, Bo Liu, Yida Chen, Jiuxin Cao, Yan Liu
- $HomeBench$: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices
Silin Li, Yuhang Guo, Jiashu Yao, Zeming Liu, Haifeng Wang
- Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment
Xueyao Zhang, Yuancheng Wang, Chaoren Wang, Ziniu Li, Zhuo Chen, Zhizheng Wu
- GiFT: Gibbs Fine-Tuning for Code Generation
Haochen Li, Wanjin Feng, Xin Zhou, Zhiqi Shen
- Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models
Yiwen Jiang, Deval Mehta, Wei Feng, Zongyuan Ge
- Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction
Xiaowei Zhu, Yubing Ren, Yanan Cao, Xixun Lin, Fang Fang, Yangxi Li
- RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph
Junsik Kim, Jinwook Park, Kangil Kim
- RolePlot: A Systematic Framework for Evaluating and Enhancing the Plot-Progression Capabilities of Role-Playing Agents
Pinyi Zhang, Siyu An, Lingfeng Qiao, Yifei Yu, Jingyang Chen, Jie Wang, Di Yin, Xing Sun, Kai Zhang
- TreeRL: LLM Reinforcement Learning with On-Policy Tree Search
Zhenyu Hou, Ziniu Hu, Yujiang Li, Rui Lu, Jie Tang, Yuxiao Dong
- Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model
Emre Can Acikgoz, Jeremiah Greer, Akul Datta, Ze Yang, William Zeng, Oussama Elachqar, Emmanouil Koukoumidis, Dilek Hakkani-Tür, Gokhan Tur
- Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation
Yupu Liang, Yaping Zhang, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou
- SDPO: Segment-Level Direct Preference Optimization for Social Agents
Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang
- KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors
Zhiyang Qi, Takumasa Kaneko, Keiko Takamizo, Mariko Ukiyo, Michimasa Inaba
- SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing
Xiangchao Yan, Shiyang Feng, Jiakang Yuan, Renqiu Xia, Bin Wang, Lei Bai, Bo Zhang
- Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning
Yexing Du, Youcheng Pan, Ziyang Ma, Bo Yang, Yifan Yang, Keqi Deng, Xie Chen, Yang Xiang, Ming Liu, Bing Qin
- AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research
Yilun Zhao, Weiyuan Chen, Zhijian Xu, Yixin Liu, Chengye Wang, Manasi Patwardhan, Lovekesh Vig, Arman Cohan
- Redundancy Principles for MLLMs Benchmarks
Zicheng Zhang, Xiangyu Zhao, Xinyu Fang, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Haodong Duan, Kai Chen, Guangtao Zhai
- WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models
Yifu Chen, Shengpeng Ji, Haoxiao Wang, Ziqing Wang, Siyu Chen, Jinzheng He, Jin Xu, Zhou Zhao
- ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5
Jiaming Zhou, Shiyao Wang, Shiwan Zhao, Jiabei He, Haoqin Sun, Hui Wang, Cheng Liu, Aobo Kong, Yujie Guo, Xi Yang, Yequan Wang, Yonghua Lin, Yong Qin
- Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization
Yao Xiao, Hai Ye, Linyao Chen, Hwee Tou Ng, Lidong Bing, Xiaoli Li, Roy Ka-Wei Lee
- Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization
Yuhao Wang, Keyan Ding, Kehua Feng, Zeyuan Wang, Ming Qin, Xiaotong Li, Qiang Zhang, Huajun Chen
- SINCon: Mitigate LLM-Generated Malicious Message Injection Attack for Rumor Detection
Mingqing Zhang, Qiang Liu, Xiang Tao, Shu Wu, Liang Wang
- Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
Jungwoo Park, Chanwoong Yoon, Hyeon Hwang, Taewhoo Lee, Jaewoo Kang
- Agentic Knowledgeable Self-awareness
Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen
- A Unified Agentic Framework for Evaluating Conditional Image Generation
Jifang Wang, Yangxue, Longyue Wang, Zhenran Xu, Yiyu Wang, Yaowei Wang, Weihua Luo, Kaifu Zhang, Baotian Hu, Min Zhang
- Planning-Driven Programming: A Large Language Model Programming Workflow
Chao Lei, Yanchuan Chang, Nir Lipovetzky, Krista A. Ehinger
- Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering
Yuan Sui, Yufei He, Zifeng Ding, Bryan Hooi
- Nudging: Inference-time Alignment of LLMs via Guided Decoding
Yu Fei, Yasaman Razeghi, Sameer Singh
- Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing
Zhilin Wang, Yafu Li, Jianhao Yan, Yu Cheng, Yue Zhang
- State-offset Tuning: State-based Parameter-Efficient Fine-Tuning for State Space Models
Wonjun Kang, Kevin Galim, Yuchen Zeng, Minjae Lee, Hyung Il Koo, Nam Ik Cho
- SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models
Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari
- Internal and External Impacts of Natural Language Processing Papers
Yu Zhang
- HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Weiran Xu, Yu Sun, Hua Wu
- Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis
Huijun Lian, Zekai Sun, Keqi Chen, Yingming Gao, Ya Li
- From Objectives to Questions: A Planning-based Framework for Educational Mathematical Question Generation
Cheng Cheng, Zhenya Huang, GuanHao Zhao, Yuxiang Guo, Xin Lin, Jinze Wu, Xin Li, Shijin Wang
- RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts
Mingyan Wu, Zhenghao Liu, Yukun Yan, Xinze Li, Shi Yu, Zheni Zeng, Yu Gu, Ge Yu
- Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
Yafu Li, Ronghao Zhang, Zhilin Wang, Huajian Zhang, Leyang Cui, Yongjing Yin, Tong Xiao, Yue Zhang
- An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling
Xuemei Tang, Jun Wang, Qi Su, Chu-Ren Huang, Jinghang Gu
- Accurate KV Cache Quantization with Outlier Tokens Tracing
Yi Su, Yuechi Zhou, Quantong Qiu, Juntao Li, Qingrong Xia, Ping Li, Xinyu Duan, Zhefeng Wang, Min Zhang
- Can Large Language Models Understand Internet Buzzwords Through User-Generated Content
Chen Huang, Junkai Luo, Xinzuo Wang, Wenqiang Lei, Jiancheng Lv
- EAC-MoE: Expert-Selection Aware Compressor for Mixture-of-Experts Large Language Models
Yuanteng Chen, Yuantian Shao, Peisong Wang, Jian Cheng
- Activation Steering Decoding: Mitigating Hallucination in Large Vision-Language Models through Bidirectional Hidden State Intervention
Jingran Su, Jingfan Chen, Hongxin Li, Yuntao Chen, Li Qing, Zhaoxiang Zhang
- Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models
Fangzhi Xu, Qiushi Sun, Kanzhi Cheng, Jun Liu, Yu Qiao, Zhiyong Wu
- Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback
Yucheng Zhou, Lingran Song, Jianbing Shen
- Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging
Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu, Sen Su
- MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation
Lingfeng Zhang, Xiaoshuai Hao, Qinwen Xu, Qiang Zhang, Xinyao Zhang, Pengwei Wang, Jing Zhang, Zhongyuan Wang, Shanghang Zhang, Renjing Xu
- Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging
Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang, Benyou Wang
- CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention
Zekai Ye, Qiming Li, Xiaocheng Feng, Libo Qin, Yichong Huang, Baohang Li, Kui Jiang, Yang Xiang, Zhirui Zhang, Yunfei Lu, Duyu Tang, Dandan Tu, Bing Qin
- Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching
Xiangci Li, Zhiyu Chen, Jason Ingyu Choi, Nikhita Vedula, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi
- Multi-Agent Collaboration for Multilingual Code Instruction Tuning
Jian Yang, Wei Zhang, Yibo Miao, Shanghaoran Quan, Zhenhe Wu, Qiyao Peng, Liqun Yang, Tianyu Liu, Zeyu Cui, Binyuan Hui, Junyang Lin
- Cultivating Gaming Sense for Yourself: Making VLMs Gaming Experts
Wenxuan Lu, Jiangyang He, Zhanqiu Zhang, Tianning Zang, Steven Y. Guo
- Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning
Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Qiushi Sun, Kanzhi Cheng, Junxian He, Jun Liu, Zhiyong Wu
- Accelerating Dense LLMs via L0-regularized Mixture-of-Experts
Zhenyu Zhang, JiuDong Yang, taozhaowen, Meng Chen
- Extending Complex Logical Queries on Uncertain Knowledge Graphs
Weizhi Fei, Zihao Wang, Hang Yin, Yang Duan, Yangqiu Song
- Knowledge Decoupling via Orthogonal Projection for Lifelong Editing of Large Language Models
Haoyu Xu, Pengxiang Lan, Enneng Yang, Guibing Guo, Jianzhe Zhao, Linying Jiang, Xingwei Wang
- $\phi$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation
Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Jun Liu, Qika Lin, Zhiyong Wu
- Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?
Leyi Pan, Aiwei Liu, Shiyu Huang, Yijian Lu, Xuming Hu, Lijie Wen, Irwin King, Philip S. Yu
- Rethinking Reward Model Evaluation Through the Lens of Reward Overoptimization
Sunghwan Kim, Dongjin Kang, Taeyoon Kwon, Hyungjoo Chae, Dongha Lee, Jinyoung Yeo
- LISTN: Lexicon induction with socio-temporal nuance
Christine de Kock
- LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement
Boyi Kang, Xinfa Zhu, Zihan Zhang, Zhen Ye, Mingshuai Liu, Ziqian Wang, Yike Zhu, Guobin Ma, Jun Chen, Longshuai Xiao, Chao Weng, Wei Xue, Lei Xie
- MadaKV: Adaptive Modality-Perception KV Cache Eviction for Efficient Multimodal Long-Context Inference
Kunxi Li, Zhonghua Jiang, Zhouzhou Shen, Zhaode Wang, Chengfei Lv, Shengyu Zhang, Fan Wu, Fei Wu
- Efficient OpAmp Adaptation for Zoom Attention to Golden Contexts
Haoyuan Wu, Rui Ming, Haisheng Zheng, Zhuolun He, Bei Yu
- Bridging Discrete Codec Representations and Speech Language Models
Shengpeng Ji, Minghui Fang, Jialong Zuo, Ziyue Jiang, Dingdong Wang, Hanting Wang, Hai Huang, Zhou Zhao
- Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger
Wenjun Li, Dexun Li, Kuicai Dong, Cong Zhang, Hao Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Liu
- MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark
Qihao Zhao, Yangyu Huang, Tengchao Lv, Lei Cui, Qinzheng Sun, Shaoguang Mao, Xin Zhang, Ying Xin, Qiufeng Yin, Scarlett Li, Furu Wei
- Code-Switching Red-Teaming: LLM Evaluation for Safety and Multilingual Understanding
Haneul Yoo, Yongjin Yang, Hwaran Lee
- Unleashing LLM Reasoning Capability via Scalable Question Synthesis from Scratch
Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Zhaopeng Tu, Qiaoming Zhu, Min Zhang
- DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing
Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh
- PQR: Improving Dense Retrieval via Potential Query Modeling
Junfeng Kang, Rui Li, Qi Liu, Yanjiang Chen, Zheng Zhang, Junzhe Jiang, Heng Yu, Yu Su
- Do Multimodal Large Language Models Truly See What We Point At? Investigating Indexical, Iconic, and Symbolic Gesture Comprehension
Noriki Nishida, Koji Inoue, Hideki Nakayama, Mayumi Bono, Katsuya Takanashi
- Cross-lingual Generalization and Compression: From Language-Specific to Shared Neurons
Frederick Riemenschneider, Anette Frank
- SDBench: A Survey-based Domain-specific LLM Benchmarking and Optimization Framework
Cheng Guo, Hu Kai, Shuxian Liang, Yiyang Jiang, Yi Gao, Xian-Sheng Hua, Wei Dong
- ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents
Yusheng Liao, Shuyang Jiang, Yanfeng Wang, Yu Wang
- Lexical Recall or Logical Reasoning: Probing the Limits of Reasoning Abilities in Large Language Models
Henrike Beyer, Chris Reed
- ChainEdit: Propagating Ripple Effects through Logical Rule-Guided Chain Updates
Zilu Dong, Xiangqing Shen, Zinong Yang, Rui Xia
- HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model
Haiyang Guo, Fanhu Zeng, Ziwei Xiang, Fei Zhu, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu
- Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
Qika Lin, Tianzhe Zhao, Kai He, Zhen Peng, Fangzhi Xu, Ling Huang, Jingying Ma, Mengling Feng
- Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin
- TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition
Tianwei Lin, Jiang Liu, Wenqiao Zhang, Yang Dai, Haoyuan Li, Zhelun Yu, Wanggui He, Juncheng Li, Jiannan Guo, Hao Jiang, Siliang Tang, Yueting Zhuang
- CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models
Ling Shi, Deyi Xiong
- STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
Jaeseong Lee, Seung-won Hwang, Aurick Qiao, Daniel F Campos, Zhewei Yao, Yuxiong He
- Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System
Ziyou Jiang, Mingyang Li, Guowei Yang, Junjie Wang, Yuekai Huang, Zhiyuan Chang, Qing Wang
- FlashAudio: Rectified Flow for Fast and High-Fidelity Text-to-Audio Generation
Huadai Liu, Jialei Wang, Rongjie Huang, Yang Liu, Heng Lu, Wei Xue, Zhou Zhao
- How does Misinformation Affect Large Language Model Behaviors and Preferences?
Miao Peng, Nuo Chen, Jianheng Tang, Jia Li
- YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering
Jennifer D’Souza, Hamed Babaei Giglou, Quentin Münch
- GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding
Ziyin Zhang, Hang Yu, Sage Lee, Peng Di, Jianguo Li, Rui Wang
- MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis
Daniel Philip Rose, Chia-Chien Hung, Marco Lepri, Israa Alqassem, Kiril Gashteovski, Carolin Lawrence
- A Training-free LLM-based Approach to General Chinese Character Error Correction
Houquan Zhou, Bo Zhang, Zhenghua Li, Ming Yan, Min Zhang
- HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models
Songtao Jiang, Yan Zhang, Yeying Jin, Zhihang Tang, Yangyang Wu, Yang Feng, Jian Wu, Zuozhu Liu
- MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale
Jiawei Guo, Tianyu Zheng, Yizhi Li, Yuelin Bai, Bo Li, Yubo Wang, King Zhu, Graham Neubig, Wenhu Chen, Xiang Yue
- SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning
Prabhat Pandey, Rupak Vignesh Swaminathan, K V Vijay Girish, Arunasish Sen, Jian Xie, Grant Strimel, Andreas Schwarz
- Recent Advances in Speech Language Models: A Survey
Wenqian Cui, Dianzhi Yu, Xiaoqi Jiao, Ziqiao Meng, Guangyan Zhang, Qichao Wang, Steven Y. Guo, Irwin King
- LexCLiPR: Cross-Lingual Paragraph Retrieval from Legal Judgments
Rohit Upadhya, Santosh T.Y.S.S
- Multi-task Adversarial Attacks against Black-box Model with Few-shot Queries
Wenqiang Wang, Yan Xiao, Hao Lin, Yangshijie Zhang, Xiaochun Cao
- SPECTRA: Faster Large Language Model Inference with Optimized Internal and External Speculation
Nguyen-Khang Le, Truong Dinh Do, Le-Minh Nguyen
- Multi-level Association Refinement Network for Dialogue Aspect-based Sentiment Quadruple Analysis
Zeliang Tong, Wei Wei, Xiaoye Qu, Rikui Huang, Zhixin Chen, Xingyu Yan
- Innovative Image Fraud Detection with Cross-Sample Anomaly Analysis: The Power of LLMs
QiWen Wang, Zhenghao Lin, Chen Lin, Junqi Yang, Zhenzhe Ying, Weiqiang Wang
- Cooperative or Competitive? Understanding the Interaction between Attention Heads From A Game Theory Perspective
Xiaoye Qu, Zengqi Yu, Dongrui Liu, Wei Wei, Daizong Liu, Jianfeng Dong, Yu Cheng
- MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification
Linzhuang Sun, Hao Liang, Jingxuan Wei, Bihui Yu, Tianpeng Li, Fan Yang, Zenan Zhou, Wentao Zhang
- Graph-Structured Trajectory Extraction from Travelogues
Aitaro Yamamoto, Hiroyuki Otomo, Hiroki Ouchi, Shohei Higashiyama, Hiroki Teranishi, Hiroyuki Shindo, Taro Watanabe
- Learning First-Order Logic Rules for Argumentation Mining
Yang Sun, Guanrong Chen, Hamid Alinejad-Rokny, Jianzhu Bao, Yuqi Huang, Bin Liang, Kam-Fai Wong, Min Yang, Ruifeng Xu
- Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency
Jiafeng Liang, Shixin Jiang, Xuan Dong, Ning Wang, Zheng Chu, Hui Su, Jinlan Fu, Ming Liu, See-Kiong Ng, Bing Qin
- UniRAG: Unified Query Understanding Method for Retrieval Augmented Generation
Rui Li, Liyang He, Qi Liu, Zheng Zhang, Heng Yu, Yuyang Ye, Linbo Zhu, Yu Su
- Contextual Experience Replay for Continual Learning of Language Agents
Yitao Liu, Chenglei Si, Karthik R Narasimhan, Shunyu Yao
- Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning
Qi Sun, Pengfei Hong, Tej Deep Pala, Vernon Toh, U-Xuan Tan, Deepanway Ghosal, Soujanya Poria
- Towards Comprehensive Argument Analysis in Education: Dataset, Tasks, and Method
Yupei Ren, Xinyi Zhou, Ning Zhang, Shangqing Zhao, Man Lan, Xiaopeng Bai
- Browsing Like Human: A Multimodal Web Agent with Experiential Fast-and-Slow Thinking
Haohao Luo, Jiayi Kuang, Wei Liu, Ying Shen, Jian Luan, Yang Deng
- MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation
Yile Liu, Ziwei Ma, Xiu Jiang, Jinglu Hu, Chang Jing, Liang Li
- Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
Guijin Son, Jiwoo Hong, Hyunwoo Ko, James Thorne
- Can MLLMs Understand the Deep Implication Behind Chinese Images?
Chenhao Zhang, Xi Feng, Yuelin Bai, Xeron Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni
- KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan
Mukhammed Togmanov, Nurdaulet Mukhituly, Diana Turmakhan, Jonibek Mansurov, Maiya Goloburda, Akhmed Sakip, Zhuohan Xie, Yuxia Wang, Bekassyl Syzdykov, Nurkhan Laiyk, Alham Fikri Aji, Ekaterina Kochmar, Preslav Nakov, Fajri Koto
- Fast or Slow? Integrating Fast Intuition and Deliberate Thinking for Enhancing Visual Question Answering
Songtao Jiang, Chenyi Zhou, Yan Zhang, Yeying Jin, Zuozhu Liu
- Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages
Hyangsuk Min, Yuho Lee, Minjeong Ban, Jiaqi Deng, Nicole Hee-Yeon Kim, Taewon Yun, Hang Su, Jason Cai, Hwanjun Song
- ClusterAttn: KV Cache Compression under Intrinsic Attention Clustering
Minwei Zhang, Haifeng Sun, Jingyu Wang, Shaolong Li, Wanyi Ning, Qi Qi, Zirui Zhuang, Jianxin Liao
- SHARE: Shared Memory-Aware Open-Domain Long-Term Dialogue Dataset Constructed from Movie Script
Eunwon Kim, Chanho Park, Buru Chang
- Incongruity-aware Tension Field Network for Multi-modal Sarcasm Detection
Jiecheng Zhang, C. L. Philip Chen, Shuzhen Li, Tong Zhang
- Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh
Nurkhan Laiyk, Daniil Orel, Rituraj Joshi, Maiya Goloburda, Yuxia Wang, Preslav Nakov, Fajri Koto
- Stealing Training Data from Large Language Models in Decentralized Training through Activation Inversion Attack
Chenxi Dai, Lin Lu, Pan Zhou
- From Selection to Generation: A Survey of LLM-based Active Learning
Yu Xia, Subhojyoti Mukherjee, Zhouhang Xie, Junda Wu, Xintong Li, Ryan Aponte, Hanjia Lyu, Joe Barrow, Hongjie Chen, Franck Dernoncourt, Branislav Kveton, Tong Yu, Ruiyi Zhang, Jiuxiang Gu, Nesreen K. Ahmed, Yu Wang, Xiang Chen, Hanieh Deilamsalehy, Sungchul Kim, Zhengmian Hu, Yue Zhao, Nedim Lipka, Seunghyun Yoon, Ting-Hao Kenneth Huang, Zichao Wang, Puneet Mathur, Soumyabrata Pal, Koyel Mukherjee, Zhehao Zhang, Namyong Park, Thien Huu Nguyen, Jiebo Luo, Ryan A. Rossi, Julian McAuley
- OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation
Qinglin Zhang, Luyao Cheng, Chong Deng, Qian Chen, Wen Wang, Siqi Zheng, Jiaqing Liu, Hai Yu, Chao-Hong Tan, Zhihao Du, ShiLiang Zhang
- DoMIX: An Efficient Framework for Exploiting Domain Knowledge in Fine-Tuning
Dohoon Kim, Donghun Kang, Taesup Moon
- EAGLE: Expert-Guided Self-Enhancement for Preference Alignment in Pathology Large Vision-Language Model
Meidan Ding, Jipeng Zhang, Wenxuan Wang, Haiqin Zhong, Xiaoqin Wang, Xinheng Lyu, Wenting Chen, Linlin Shen
- CoT-ICL Lab: A Petri Dish for Studying Chain-of-Thought Learning from In-Context Demonstrations
Vignesh Kothapalli, Hamed Firooz, Maziar Sanjabi
- Flexora: Flexible Low-Rank Adaptation for Large Language Models
Chenxing Wei, Yao Shu, Ying Tiffany He, Fei Yu
- QDTSynth: Quality-Driven Formal Theorem Synthesis for Enhancing Proving Performance of LLMs
Lei Wang, Ruobing Zuo, Gaolei He, Jianlin Wang, Zhengfeng Yang
- RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought
Yi Lu, Jiawang Cao, Yongliang Wu, Bozheng Li, Licheng Tang, Yangguang Ji, Chong Wu, Jay Wu, Wenbo Zhu
- QAEval: Mixture of Evaluators for Question-Answering Task Evaluation
Tan Yue, Rui Mao, Xuzhao Shi, Shuo Zhan, Zuhao Yang, Dongyan Zhao
- Debiasing the Fine-Grained Classification Task in LLMs with Bias-Aware PEFT
Daiying Zhao, Xinyu Yang, Hang Chen
- Demystifying Small Language Models for Edge Deployment
Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Wei Liu, Jian Luan, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu
- Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models
Naibin Gu, Peng Fu, Xiyu Liu, Ke Ma, Zheng Lin, Weiping Wang
- Can Vision-Language Models Evaluate Handwritten Math?
Oikantik Nath, Hanani Bathina, Mohammed Safi Ur Rahman Khan, Mitesh M Khapra
- Continual Gradient Low-Rank Projection Fine-Tuning for LLMs
Chenxu Wang, Yilin Lyu, Zicheng Sun, Liping Jing
- Towards Objective Fine-tuning: How LLMs’ Prior Knowledge Causes Potential Poor Calibration?
Ziming Wang, Zeyu Shi, Haoyi Zhou, Shiqi Gao, Qingyun Sun, Jianxin Li
- Can Community Notes Replace Professional Fact-Checkers?
Nadav Borenstein, Greta Warren, Desmond Elliott, Isabelle Augenstein
- Towards Robust ESG Analysis Against Greenwashing Risks: Aspect-Action Analysis with Cross-Category Generalization
Keane Ong, Rui Mao, Deeksha Varshney, Erik Cambria, Gianmarco Mengaldo
- HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States
Yilei Jiang, Xinyan Gao, Tianshuo Peng, Yingshui Tan, Xiaoyong Zhu, Bo Zheng, Xiangyu Yue
- SwiLTra-Bench: The Swiss Legal Translation Benchmark
Joel Niklaus, Jakob Merane, Luka Nenadic, Sina Ahmadi, Yingqiang Gao, Cyrill A. H. Chevalley, Claude Humbel, Christophe Gösken, Lorenzo Tanzi, Thomas Lüthi, Stefan Palombo, Spencer Poff, Boling Yang, Nan Wu, Matthew Guillod, Robin Mamié, Daniel Brunner, Julio Pereyra, Niko Grupen
- Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement
Yichen Dong, Xinglin Lyu, Junhui Li, Daimeng Wei, Min Zhang, Shimin Tao, Hao Yang
- Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Philipp Mondorf, Sondre Wold, Barbara Plank
- Can LLMs Ground when they (Don’t) Know: A Study on Direct and Loaded Political Questions
Clara Lachenmaier, Judith Sieker, Sina Zarrieß
- GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking
Yingjian Chen, Haoran Liu, Yinhong Liu, Jinxiang Xie, Rui Yang, Han Yuan, Yanran Fu, Peng Yuan Zhou, Qingyu Chen, James Caverlee, Irene Li
- SCULPT: Systematic Tuning of Long Prompts
Shanu Kumar, Akhila Yesantarao Venkata, Shubhanshu Khandelwal, Bishal Santra, Parag Agrawal, Manish Gupta
- Crab: A Novel Configurable Role-Playing LLM with Assessing Benchmark
Kai He, Yucheng Huang, Wenqing Wang, Delong Ran, Dongming Sheng, Junxuan Huang, Qika Lin, Jiaxing Xu, Wenqiang Liu, Mengling Feng
- Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models
Yingshui Tan, Boren Zheng, Baihui Zheng, Kerui Cao, Huiyun Jing, Jincheng Wei, Jiaheng Liu, Yancheng He, Wenbo Su, Xiaoyong Zhu, Bo Zheng, Kaifu Zhang
- TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis
Xiaorui Wu, Xiaofeng Mao, Fei Li, Xin Zhang, Donghong Ji, Chong Teng, Xuanhong Li, Zhuang Li
- Cross-Lingual Optimization for Language Transfer in Large Language Models
Jungseob Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim
- ACE: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling
Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao
- MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark
Xiang Yue, Tianyu Zheng, Yuansheng Ni, Yubo Wang, Kai Zhang, Shengbang Tong, Yuxuan Sun, Botao Yu, Ge Zhang, Huan Sun, Yu Su, Wenhu Chen, Graham Neubig
- Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch
Xueru Wen, Jie Lou, Zichao Li, Yaojie Lu, Xing Yu, Yuqiu Ji, Guohai Xu, Hongyu Lin, Ben He, Xianpei Han, Le Sun, Debing Zhang
- Why Safeguarded Ships Run Aground? Aligned Large Language Models’ Safety Mechanisms Tend to Be Anchored in The Template Region
Chak Tou Leong, Qingyu Yin, Jian Wang, Wenjie Li
- LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering
Jinhe Bi, Yujun Wang, Haokun Chen, Xun Xiao, Artur Hecker, Volker Tresp, Yunpu Ma
- Efficient Long Context Language Model Retrieval with Compression
Minju Seo, Jinheon Baek, Seongyun Lee, Sung Ju Hwang
- Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering
Runxuan Liu, luobei, Jiaqi Li, Baoxin Wang, Ming Liu, Dayong Wu, Shijin Wang, Bing Qin
- Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications
Zhe Chen, Yusheng Liao, Shuyang Jiang, Pingjie Wang, YiQiu Guo, Yanfeng Wang, Yu Wang
- Predicting Turn-Taking and Backchannel in Human-Machine Conversations Using Linguistic, Acoustic, and Visual Signals
Yuxin Lin, Yinglin Zheng, Ming Zeng, Wangzheng Shi
- A New Formulation of Zipf’s Meaning-Frequency Law through Contextual Diversity
Ryo Nagata, Kumiko Tanaka-Ishii
- The Mirage of Model Editing: Revisiting Evaluation in the Wild
Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Qi Cao, Dawei Yin, Huawei Shen, Xueqi Cheng
- LAQuer: Localized Attribution Queries in Content-grounded Generation
Eran Hirsch, Aviv Slobodkin, David Wan, Elias Stengel-Eskin, Mohit Bansal, Ido Dagan
- EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning
Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma, Aobo Kong, Fei Huang, Jianbin Jiao, Junge Zhang
- DCG-SQL: Enhancing In-Context Learning for Text-to-SQL with Deep Contextual Schema Link Graph
Jihyung Lee, Jin-Seop Lee, Jaehoon Lee, YunSeok Choi, Jee-Hyong Lee
- Multilingual Gloss-free Sign Language Translation: Towards Building a Sign Language Foundation Model
Sihan Tan, Taro Miyazaki, Kazuhiro Nakadai
- PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy
Shuhao Guan, Moule Lin, Cheng Xu, Xinyi Liu, Jinman Zhao, Jiexin Fan, Qi Xu, Derek Greene
- Digest the Knowledge: Large Language Models empowered Message Passing for Knowledge Graph Question Answering
Junhong Wan, Tao Yu, Kunyu Jiang, Yao Fu, Weihao Jiang, Jiang Zhu
- RecLM: Recommendation Instruction Tuning
Yangqin Jiang, Yuhao Yang, Lianghao Xia, Da Luo, Kangyi Lin, Chao Huang
- DS$^2$-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis
Hongling Xu, Yice Zhang, Qianlong Wang, Ruifeng Xu
- MISP-Meeting: A Real-World Dataset with Multimodal Cues for Long-form Meeting Transcription and Summarization
Hang Chen, Chao-Han Huck Yang, Jia-Chen Gu, Hongxu Yin, Sabato Marco Siniscalchi, Jun Du
- Learning Together to Perform Better: Teaching Small-Scale LLMs to Collaborate via Preferential Rationale Tuning
Sohan Patnaik, Milan Aggarwal, Sumit Bhatia, Balaji Krishnamurthy
- MolRAG: Unlocking the Power of Large Language Models for Molecular Property Prediction
Ziting Xian, Jiawei Gu, Lingbo Li, Eran Segal, Shangsong Liang
- SkillAggregation: Reference-free LLM-Dependent Aggregation
Guangzhi Sun, Anmol Kagrecha, Potsawee Manakul, Phil Woodland, Mark Gales
- MasRouter: Learning to Route LLMs for Multi-Agent Systems
Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi
- Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation
Haozhe Xu, Xiaohua Wang, Changze Lv, Xiaoqing Zheng
- Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation
Peiwen Yuan, Yueqi Zhang, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Jiayi Shi, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li
- Advancing Sequential Numerical Prediction in Autoregressive Models
Xiang Fei, Jinghui Lu, Qi Sun, Hao Feng, Yanjie Wang, Wei Shi, An-Lan Wang, Jingqun Tang, Can Huang
- iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering
Shuai Wang, Yinan Yu
- IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory
Wei Song, Zhenya Huang, Cheng Cheng, Weibo Gao, Bihan Xu, GuanHao Zhao, Fei Wang, Runze Wu
- MLAS-LoRA: Language-Aware Parameters Detection and LoRA-Based Knowledge Transfer for Multilingual Machine Translation
Tianyu Dong, Bo Li, Jinsong Liu, Shaolin Zhu, Deyi Xiong
- M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation
Jiaheng Liu, Ken Deng, Congnan Liu, Jian Yang, Shukai Liu, He Zhu, Peng Zhao, Linzheng Chai, Yanan Wu, Ge Zhang, Yingshui Tan, Zekun Moore Wang, Jin Ke, Zhaoxiang Zhang, Bangyu Xiang, Guoan Zhang, Wenbo Su, Bo Zheng
- Evaluating Design Decisions for Dual Encoder-based Entity Disambiguation
Susanna Rücker, Alan Akbik
- How to Compare Things Properly? A Study on Answering Comparative Questions using Argument Summarization
Irina Nikishina, Saba Anwar, Nikolay Dolgov, Maria Manina, Daria Ignatenko, Artem Shelmanov, Chris Biemann
- FinanceReasoning: Make Financial Numerical Reasoning More Credible, Comprehensive, and Challenging
Zichen Tang, Haihong E, Ziyan Ma, Haoyang He, Jiacheng Liu, Zhongjun Yang, Zihua Rong, Rongjin Li, Kun Ji, Huang Qing, Xinyang Hu, Yang Liu, Qianhe Zheng
- Controllable Style Arithmetic with Language Models
Weiqi Wang, Wengang Zhou, Zongmeng Zhang, Jie Zhao, Houqiang Li
- Masks Can be Learned As An Alternative of Experts
Peiyu Liu, Tianwen Wei, Bo Zhu, Xin Zhao, Shuicheng Yan
- Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment
Chao Wen, Jacqueline Staub, Adish Singla
- Removal of Hallucination on Hallucination: Debate-Augmented RAG
Wentao Hu, Wengyu Zhang, Yiyang Jiang, Chen Jason Zhang, Xiaoyong Wei, Li Qing
- CodeDPO: Aligning Code Models with Self Generated and Verified Source Code
Kechi Zhang, Ge Li, Yihong Dong, Jingjing Xu, Jun Zhang, Jing Su, Yongfei Liu, Zhi Jin
- ProxAnn: Use-Oriented Evaluations of Topic Models and Document Clustering
Alexander Miserlis Hoyle, Lorena Calvo-Bartolomé, Jordan Lee Boyd-Graber, Philip Resnik
- BOOKWORLD: From Novels to Interactive Agent Societies for Story Creation
Yiting Ran, Xintao Wang, Tian Qiu, Jiaqing Liang, Yanghua Xiao, Deqing Yang
- Quantifying Lexical Semantic Shift via Unbalanced Optimal Transport
Ryo Kishino, Hiroaki Yamagiwa, Ryo Nagata, Sho Yokoi, Hidetoshi Shimodaira
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Hao Peng, Yunjia Qi, Xiaozhi Wang, Zijun Yao, Bin Xu, Lei Hou, Juanzi Li
- Adaptive and Robust Translation from Natural Language to Multi-model Query Languages
Gengyuan Shi, Chaokun Wang, Liu Yabin, Jiawei Ren
- SAKE: Steering Activations for Knowledge Editing
Marco Scialanga, Thibault Laugel, Vincent Grari, Marcin Detyniecki
- Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs
Danni Liu, Jan Niehues
- Can external validation tools improve annotation quality for LLM-as-a-Judge?
Arduin Findeis, Floris Weers, Guoli Yin, Ke Ye, Ruoming Pang, Tom Gunter
- One for All: Update Parameterized Knowledge Across Multiple Models with Once Edit
Weitao Ma, Xiyuan Du, Xiaocheng Feng, Lei Huang, Yichong Huang, Huiyi Zhang, Xiaoliang Yang, Baohang Li, Xiachong Feng, Ting Liu, Bing Qin
- VLMInferSlow: Evaluating the Efficiency Robustness of Large Vision-Language Models as a Service
Xiasi Wang, Tianliang Yao, Simin Chen, Runqi Wang, Lei Ye, Kuofeng Gao, Yi Huang, Yuan Yao
- The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs
Nitay Calderon, Roi Reichart, Rotem Dror
- CrisisTS: Coupling Social Media Textual Data and Meteorological Time Series for Urgency Classification
Romain Meunier, Farah Benamara, Véronique Moriceau, Savitha Ramasamy, Zhongzheng Qiao
- How to Mitigate Overfitting in Weak-to-strong Generalization?
Junhao Shi, Qinyuan Cheng, Zhaoye Fei, Yining Zheng, Qipeng Guo, Xipeng Qiu
- Com$^2$: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models
Kai Xiong, Xiao Ding, Yixin Cao, Yuxiong Yan, Li Du, Yufei Zhang, Jinglong Gao, Jiaqian Liu, Bing Qin, Ting Liu
- Dynamic Head Selection for Neural Lexicalized Constituency Parsing
Yang Hou, Zhenghua Li
- My Words Imply Your Opinion: Reader Agent-Based Propagation Enhancement for Personalized Implicit Emotion Analysis
Jian Liao, Yu Feng, Yujin Zheng, Jun Zhao, Suge Wang, JianXing Zheng
- EvolveBench: A Comprehensive Benchmark for Assessing Temporal Awareness in LLMs on Evolving Knowledge
Zhiyuan Zhu, Yusheng Liao, Zhe Chen, Yuhao Wang, Yunfeng Guan, Yanfeng Wang, Yu Wang
- Enabling LLM Knowledge Analysis via Extensive Materialization
Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Simon Razniewski
- Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching
Jialong Zuo, Shengpeng Ji, Minghui Fang, Mingze Li, Ziyue Jiang, Xize Cheng, Xiaoda Yang, Chen Feiyang, Xinyu Duan, Zhou Zhao
- Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs
Jingcheng Niu, Xingdi Yuan, Tong Wang, Hamidreza Saghir, Amir H. Abdi
- CritiQ: Mining Data Quality Criteria from Human Preferences
Honglin Guo, Kai Lv, Qipeng Guo, Tianyi Liang, Zhiheng Xi, Demin Song, Qiuyinzhe Zhang, Yu Sun, Kai Chen, Xipeng Qiu, Tao Gui
- Theoretical Guarantees for Minimum Bayes Risk Decoding
Yuki Ichihara, Yuu Jinnai, Kaito Ariu, Tetsuro Morimura, Eiji Uchibe
- Mutual-Taught for Co-adapting Policy and Reward Models
Tianyuan Shi, Canbin Huang, Fanqi Wan, Longguang Zhong, Ziyi Yang, Weizhou Shen, Xiaojun Quan, Ming Yan
- Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages
Wenhao Zhuang, Yuan Sun, Xiaobing Zhao
- Unmasking Style Sensitivity: A Causal Analysis of Bias Evaluation Instability in Large Language Models
Jiaxu Zhao, Meng Fang, Kun Zhang, Mykola Pechenizkiy
- MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines
Dávid Javorský, Ondřej Bojar, François Yvon
- BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning
Ercong Nie, Bo Shao, Mingyang Wang, Zifeng Ding, Helmut Schmid, Hinrich Schuetze
- What Matters in Evaluating Book-Length Stories? A Systematic Study of Long Story Evaluation
Dingyi Yang, Qin Jin
- PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation
Linhai Zhang, Jialong Wu, Deyu Zhou, Yulan He
- Enhancing Event-centric News Cluster Summarization via Data Sharpening and Localization Insights
Longyin Zhang, Bowei Zou, AiTi Aw
- MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration
Zhitao He, Sandeep Polisetty, Zhiyuan Fan, Shujin Wu, Yuchen Huang, Yi R. Fung
- LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios
Xiaodong Wu, Minhao Wang, Yichen Liu, Xiaoming Shi, He Yan, Lu Xiangju, Junmin Zhu, Wei Zhang
- FEAT: A Preference Feedback Dataset through a Cost-Effective Auto-Generation and Labeling Framework for English AI Tutoring
Hyein Seo, Taewook Hwang, Yohan Lee, Sangkeun Jung
- Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering
Shuzheng Si, Haozhe Zhao, Gang Chen, Cheng Gao, Yuzhuo Bai, Zhitong Wang, Kaikai An, Kangyang Luo, Chen Qian, Fanchao Qi, Baobao Chang, Maosong Sun
- One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs
Junwoo Ha, Hyunjun Kim, Sangyoon Yu, Haon Park, Ashkan Yousefpour, Yuna Park, Suhyun Kim
- RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information
Zhiwei Liu, Kailai Yang, Qianqian Xie, Christine de Kock, Sophia Ananiadou, Eduard Hovy
- Task-Specific Information Decomposition for End-to-End Dense Video Captioning
Zhiyue Liu, Xinru Zhang, Jinyuan Liu
- CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges
Haitao Li, Junjie Chen, Qingyao Ai, Zhumin Chu, Yujia Zhou, Qian Dong, Yiqun Liu
- Context Matters: Semantic Expansion and Taxonomy-Grounded Augmentation for Sexism Detection
Sahrish Khan, Gabriele Pergola, Arshad Jhumka
- Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models
Elena Sofia Ruzzetti, Giancarlo A. Xompero, Davide Venditti, Fabio Massimo Zanzotto
- PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu
- Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information
Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang
- Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training
Zheheng Luo, Xin Zhang, Xiao Liu, Haoling Li, Yeyun Gong, Qi Chen, Peng Cheng
- Sheep’s Skin, Wolf’s Deeds: Are LLMs Ready for Metaphorical Implicit Hate Speech?
Jingjie Zeng, Yuanyuan Sun, Zekun Wang, Liang Yang, Hongfei Lin
- Neuron-Level Sequential Editing for Large Language Models
Houcheng Jiang, Junfeng Fang, Tianyu Zhang, Baolong Bi, An Zhang, Ruipeng Wang, Tao Liang, Xiang Wang
- Automatic Expert Discovery in LLM Upcycling via Sparse Interpolated Mixture-of-Experts
Shengzhuang Chen, Ying Wei, Jonathan Richard Schwarz
- SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation
Keqi Deng, Wenxi Chen, Xie Chen, Phil Woodland
- VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models
Wenqian Cui, Xiaoqi Jiao, Ziqiao Meng, Irwin King
- RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou
- ChronoSense: Exploring Temporal Understanding in Large Language Models with Time Intervals of Events
Duygu Sezen Islakoglu, Jan-Christoph Kalo
- The Role of Deductive and Inductive Reasoning in Large Language Models
Chengkun Cai, Xu Zhao, Haoliang Liu, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang, Serge Belongie, Lei Li
- Disentangling the Roles of Representation and Selection in Data Pruning
Yupei Du, Yingjin Song, Hugh Mee Wong, Daniil Ignatev, Albert Gatt, Dong Nguyen
- FRACTAL: Fine-Grained Scoring from Aggregate Text Labels
Yukti Makhija, Priyanka Agrawal, Rishi Saket, Aravindan Raghuveer
- ACT: Knowledgeable Agents to Design and Perform Complex Tasks
Makoto Nakatsuji, Shuhei Tateishi, Yasuhiro Fujiwara, Ayaka Matsumoto, Narichika Nomoto, Yoshihide Sato
- Logical forms complement probability in understanding language model (and human) performance
Yixuan Wang, Freda Shi
- Length Controlled Generation for Black-box LLMs
Yuxuan Gu, Wenjie Wang, Xiaocheng Feng, Weihong Zhong, Kun Zhu, Lei Huang, Ting Liu, Bing Qin, Tat-Seng Chua
- Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization
Lei Huang, Xiaocheng Feng, Weitao Ma, Yuchun Fan, Xiachong Feng, Yangfan Ye, Weihong Zhong, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Bing Qin
- Global Eye: Breaking the “Fixed Thinking Pattern” during the Instruction Expansion Process
Wenxuan Lu, Wei Liu, Jian Luan, Bin Wang, Songhao Jiang, Tianning Zang
- On Synthesizing Data for Context Attribution in Question Answering
Gorjan Radevski, Kiril Gashteovski, Shahbaz Syed, Christopher Malon, Sebastien Nicolas, Chia-Chien Hung, Timo Sztyler, Verena Heußer, Wiem Ben Rim, Masafumi Enomoto, Kunihiro Takeoka, Masafumi Oyamada, Goran Glavaš, Carolin Lawrence
- TST: A Schema-Based Top-Down and Dynamic-Aware Agent of Text-to-Table Tasks
Peiwen Jiang, Haitong Jiang, Ruhui Ma, Yvonne Jie Chen, Jinhua Cheng
- EventRAG: Enhancing LLM Generation with Event Knowledge Graphs
Zairun Yang, Yilin Wang, Zhengyan Shi, Yuan Yao, Lei Liang, Keyan Ding, Emine Yilmaz, Huajun Chen, Qiang Zhang
- Supervised Fine-Tuning Achieve Rapid Task Adaption Via Alternating Attention Head Activation Patterns
Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin
- Can’t See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs
Wenxuan Wang, Xiaoyuan Liu, Kuiyi Gao, Jen-tse Huang, Youliang Yuan, Pinjia He, Shuai Wang, Zhaopeng Tu
- Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling
Jiayi Zeng, Yizhe Feng, Mengliang He, Wenhui Lei, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou
- TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning
Soumyabrata Chaudhuri, Pranav Purkar, Ritwik Raghav, Shubhojit Mallick, Manish Gupta, Abhik Jana, Shreya Ghosh
- DualGuard: A Parameter Space Transformation Approach for Bidirectional Defense in Split-Based LLM Fine-Tuning
Zihan Liu, Yizhen Wang, Rui Wang, Sai Wu
- Movie101v2: Improved Movie Narration Benchmark
Zihao Yue, Yepeng Zhang, Ziheng Wang, Qin Jin
- Can LLMs Evaluate Complex Attribution in QA? Automatic Benchmarking using Knowledge Graphs
Nan Hu, Jiaoyan Chen, Yike Wu, Guilin Qi, Hongru Wang, Sheng Bi, Yongrui Chen, Tongtong Wu, Jeff Z. Pan
- Value Portrait: Understanding Values of LLMs with Human-aligned Benchmark
Jongwook Han, Dongmin Choi, Woojung Song, Eun-Ju Lee, Yohan Jo
- FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation
Wei Li, Xin Zhang, Zhongxin Guo, Shaoguang Mao, Wen Luo, Guangyue Peng, Yangyu Huang, Houfeng Wang, Scarlett Li
- Do not Abstain! Identify and Solve the Uncertainty
Jingyu Liu, Jingquan Peng, Xiaopeng Wu, Xubin Li, Tiezheng Ge, Bo Zheng, Yong Liu
- Decoding by Contrasting Knowledge: Enhancing Large Language Model Confidence on Edited Facts
Baolong Bi, Shenghua Liu, Lingrui Mei, Yiwei Wang, Junfeng Fang, Pengliang Ji, Xueqi Cheng
- ImpliHateVid: A Benchmark Dataset and Two-stage Contrastive Learning Framework for Implicit Hate Speech Detection in Videos
Mohammad Zia Ur Rehman, Anukriti Bhatnagar, Omkar Kabde, Shubhi Bansal, Nagendra Kumar
- Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions
Leonardo Ranaldi, Marco Valentino, Andre Freitas
- Information Extraction from Visually Rich Documents using LLM-based Organization of Documents into Independent Textual Segments
Aniket Bhattacharyya, Anurag Tripathi, Ujjal Das, Archan Karmakar, Amit Pathak, Maneesh Gupta
- Enhancing Large Language Model’s Capabilities in Open Domains via Autonomous Tool Integration from GitHub
Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Cheng Qian, Zihe Wang, Yujia Qin, Yining Ye, Yaxi Lu, Chen Qian, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun
- LLMs Can Simulate Standardized Patients via Agent Coevolution
Zhuoyun Du, Lujie Zheng, Renjun Hu, Yuyang Xu, Xiawei Li, Ying Sun, Wei Chen, Jian Wu, Haolei Cai, Haochao Ying
- Donate or Create? Comparing Data Collection Strategies for Emotion-labeled Multimodal Social Media Posts
Christopher Bagdon, Aidan Combs, Carina Silberer, Roman Klinger
- Which Demographics do LLMs Default to During Annotation?
Johannes Schäfer, Aidan Combs, Christopher Bagdon, Jiahui Li, Nadine Probol, Lynn Greschner, Sean Papay, Yarik Menchaca Resendiz, Aswathy Velutharambath, Amelie Wuehrl, Sabine Weber, Roman Klinger
- Can You Really Trust Code Copilot? Evaluating Large Language Models from a Code Security Perspective
Yutao Mou, Xiao Deng, Yuxiao Luo, Shikun Zhang, Wei Ye
- From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MarkerGen
Peiwen Yuan, Chuyi Tan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Yueqi Zhang, Jiayi Shi, Boyuan Pan, Yao Hu, Kan Li
- AGD: Adversarial Game Defense Against Jailbreak Attacks in Large Language Models
Shilong Pan, Zhiliang Tian, Zhen Huang, Wanlong Yu, Zhihua Wen, Xinwang Liu, Kai Lu, Minlie Huang, Dongsheng Li
- SCOP: Evaluating the Comprehension Process of Large Language Models from a Cognitive View
Yongjie Xiao, Hongru Liang, Peixin Qin, Yao Zhang, Wenqiang Lei
- Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning
Peiying Yu, Guoxin Chen, Jingjing Wang
- An Expanded Massive Multilingual Dataset for High-Performance Language Technologies
Laurie Burchell, Ona De Gibert Bonet, Nikolay Arefyev, Mikko Aulamo, Marta Bañón, Pinzhen Chen, Mariia Fedorova, Liane Guillou, Barry Haddow, Jan Hajič, Jindřich Helcl, Erik Henriksson, Mateusz Klimaszewski, Ville Komulainen, Andrey Kutuzov, Joona Kytöniemi, Veronika Laippala, Petter Mæhlum, Bhavitvya Malik, Farrokh Mehryary, Vladislav Mikhailov, Nikita Moghe, Amanda Myntti, Dayyán O’Brien, Stephan Oepen, Proyag Pal, Jousia Piha, Sampo Pyysalo, Gema Ramírez-Sánchez, David Samuel, Pavel Stepachev, Jörg Tiedemann, Dušan Variš, Tereza Vojtěchová, Jaume Zaragoza-Bernabeu
- Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
Yue Yang, Ajay Patel, Matt Deitke, Tanmay Gupta, Luca Weihs, Andrew Head, Mark Yatskar, Chris Callison-Burch, Ranjay Krishna, Aniruddha Kembhavi, Christopher Clark
- Hierarchical Attention Generates Better Proofs
Jianlong Chen, Chao Li, Yang Yuan, Andrew C Yao
- Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents
Tianyi Men, Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
- It’s Not Bragging If You Can Back It Up: Can LLMs Understand Braggings?
Jingjie Zeng, Huayang Li, Yuanyuan Sun, Liang Yang, Hongfei Lin
- A Troublemaker with Contagious Jailbreak Makes Chaos in Honest Towns
Tianyi Men, Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao
- Meta-Learning Neural Mechanisms rather than Bayesian Priors
Michael Eric Goodale, Salvador Mascarenhas, Yair Lakretz
- Shifting from Ranking to Set Selection for Retrieval Augmented Generation
Dahyun Lee, Yongrae Jo, Haeju Park, Moontae Lee
- Understanding Large Language Model Vulnerabilities to Social Bias Attacks
Jiaxu Zhao, Meng Fang, Fanghua Ye, Ke Xu, Qin Zhang, Joey Tianyi Zhou, Mykola Pechenizkiy
- ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents
Zhigen Li, Jianxiang Peng, Yanmeng Wang, Yong Cao, Tianhao Shen, Minghui Zhang, Linxi Su, Shang Wu, Yihang Wu, YuQian Wang, Ye Wang, Wei Hu, Jianfeng Li, Shaojun Wang, Jing Xiao, Deyi Xiong
- Pixel-Level Reasoning Segmentation via Multi-turn Conversations
Dexian Cai, Xiaocui Yang, YongKang Liu, Daling Wang, Shi Feng, Yifei Zhang, Soujanya Poria
- Fixing Distribution Shifts of LLM Self-Critique via On-Policy Self-Play Training
Rong Bao, Donglei Yu, Kai Fan, Minpeng Liao
- Inferring Functionality of Attention Heads from their Parameters
Amit Elhelo, Mor Geva
- Faithful and Robust LLM-Driven Theorem Proving for NLI Explanations
Xin Quan, Marco Valentino, Louise A. Dennis, Andre Freitas
- Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing
Jiakuan Xie, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao
- Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation
Wenyu Huang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan
- From Human Reading to NLM Understanding: Evaluating the Role of Eye-Tracking Data in Encoder-Based Models
Luca Dini, Lucia Domenichelli, Dominique Brunato, Felice Dell’Orletta
- Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering
Linhao Ye, Qin Chen, Jie Zhou, Lang Yu, Zhikai Lei, Liang He
- Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
Xiaoyuan Liu, Wenxuan Wang, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Pinjia He, Zhaopeng Tu
- SceneGenAgent: Precise Industrial Scene Generation with Coding Agent
Xiao Xia, Dan Zhang, Zibo Liao, Zhenyu Hou, Tianrui Sun, Jing Li, Ling Fu, Yuxiao Dong
- ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models
Hanxing Ding, Shuchang Tao, Liang Pang, Zihao Wei, Jinyang Gao, Bolin Ding, Huawei Shen, Xueqi Cheng
- Human Alignment: How Much Do We Adapt to LLMs?
Cazalet Tanguy, Ruben Janssens, Tony Belpaeme, Joni Dambre
- Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study
Bashar Alhafni, Nizar Habash
- From Isolates to Families: Using Neural Networks for Automated Language Affiliation
Frederic Blum, Steffen Herbold, Johann-Mattis List
- ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models
Xuxu Liu, Siyuan Liang, Mengya Han, Yong Luo, Aishan Liu, Xiantao Cai, Zheng He, Dacheng Tao
- Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts
Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou
- When Harry Meets Superman: The Role of The Interlocutor in Persona-Based Dialogue Generation
Daniela Occhipinti, Marco Guerini, Malvina Nissim
- ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs
Zhenliang Zhang, Xinyu Hu, Huixuan Zhang, Junzhe Zhang, Xiaojun Wan
- Revisit Self-Debugging with Self-Generated Tests for Code Generation
Xiancai Chen, Zhengwei Tao, Kechi Zhang, Changzhi Zhou, Xinyu Zhang, Wanli Gu, Yuanpeng He, Mengdi Zhang, Xunliang Cai, Haiyan Zhao, Zhi Jin
- InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Dingdong Wang, Jin Xu, Ruihang Chu, Zhifang Guo, Xiong Wang, Jincenzi Wu, Dongchao Yang, Shengpeng Ji, Junyang Lin
- Exploring LLMs’ Ability to Spontaneously and Conditionally Modify Moral Expressions through Text Manipulation
Candida Maria Greco, Lucio La Cava, Lorenzo Zangari, Andrea Tagarelli
- Mixture of Ordered Scoring Experts for Cross-prompt Essay Trait Scoring
Po-Kai Chen, Bo-Wei Tsai, Shao Kuan Wei, Chien-Yao Wang, Jia-Ching Wang, Yi-Ting Huang
- A Sample Offline Saves Time: Knowledge Distillation in the LLM Era
Anshumann, Mohd Abbas Zaidi, Akhil Kedia, Jinwoo Ahn, Taehwak Kwon, Kangwook Lee, Haejun Lee, Joohyung Lee
- Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues
Varsha Suresh, M. Hamza Mughal, Christian Theobalt, Vera Demberg
- ExploraCoder: Advancing Code Generation for Multiple Unseen APIs via Planning and Chained Exploration
Yunkun Wang, Yue Zhang, Zhen Qin, Chen Zhi, Binhua Li, Fei Huang, Yongbin Li, Shuiguang Deng
- Segment First or Comprehend First? Explore the Limit of Unsupervised Word Segmentation with Large Language Models
Zihong Zhang, Liqi He, Zuchao Li, Lefei Zhang, Hai Zhao, Bo Du
- RUBY: An Effective Framework for Multi-Constraint Multi-Hop Question Generation
Wenzhuo Zhao, Shuangyin Li
- Can Indirect Prompt Injection Attacks Be Detected and Removed?
Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi
- Identifying Open Challenges in Language Identification
Rob van der Goot
- The Distracting Effect: Understanding Irrelevant Passages in RAG
Chen Amiraz, Florin Cuconasu, Simone Filice, Zohar Karnin
- Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages
Zeli Su, Ziyin Zhang, Guixian Xu, Jianing Liu, Xu Han, Ting Zhang, Yushuang Dong
- Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights
Célia Nouri, Chloé Clavel, Jean-Philippe Cointet
- CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision
Yifei Lu, Fanghua Ye, Jian Li, Qiang Gao, Cheng Liu, Haibo Luo, Nan Du, Xiaolong Li, Feiliang Ren
- RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models
Hieu Tran, Zonghai Yao, Zhichao Yang, Junda Wang, Yifan Zhang, Feiyun Ouyang, Shuo Han, Hong Yu
- Defense Against Prompt Injection Attack by Leveraging Attack Techniques
Yulin Chen, Haoran Li, Zihao Zheng, Dekai Wu, Yangqiu Song, Bryan Hooi
- Acquisition and Application of Novel Knowledge in Large Language Models
Ziyu Shang, Jianghan Liu, Zhizhao Luo, Peng Wang, Wenjun Ke, Jiajun Liu, Zijie Xu, Guozheng Li
- DNCASR: End-to-End Training for Speaker-Attributed ASR
Xianrui Zheng, Chao Zhang, Phil Woodland
- Exploring Persona Sentiment Sensitivity in Personalized Dialogue Generation
Yonghyun Jun, Hwanhee Lee
- AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang
- LLM-Guided Semantic-Aware Clustering for Topic Modeling
Jianghan Liu, Ziyu Shang, Wenjun Ke, Peng Wang, Zhizhao Luo, Jiajun Liu, Guozheng Li, Yining Li
- Hierarchical Bracketing Encodings for Dependency Parsing as Tagging
Ana Ezquerro, David Vilares, Anssi Yli-Jyrä, Carlos Gómez-Rodríguez
- OASIS: Order-Augmented Strategy for Improved Code Search
Gao Zuchen, Zizheng Zhan, Xianming Li, Erxin Yu, Haotian Zhang, chenbin, Yuqun Zhang, Jing Li
- Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?
Yancheng He, Shilong Li, Jiaheng Liu, Weixun Wang, Xingyuan Bu, Ge Zhang, Z.Y. Peng, Zhaoxiang Zhang, Zhicheng Zheng, Wenbo Su, Bo Zheng
- OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference
Xiangyu Zhao, Shengyuan Ding, Zicheng Zhang, Haian Huang, Maosong Cao, Jiaqi Wang, Weiyun Wang, Xinyu Fang, Wenhai Wang, Guangtao Zhai, Hua Yang, Haodong Duan, Kai Chen
- Dynamic Order Template Prediction for Generative Aspect-Based Sentiment Analysis
Yonghyun Jun, Hwanhee Lee
- Tree-KG: An Expandable KG Construction Framework for Knowledge-intensive Domains
Songjie Niu, Kaisen Yang, Rui Zhao, Yichao Liu, Zonglin Li, Hongning Wang, Wenguang Chen
- Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric
Yuming Yang, Yang Nan, Junjie Ye, Shihan Dou, Xiao Wang, Shuo Li, Huijie Lv, Tao Gui, Qi Zhang, Xuanjing Huang
- Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning
Nan Huo, Jinyang Li, Bowen Qin, Ge Qu, Xiaolong Li, Xiaodong Li, Chenhao Ma, Reynold Cheng
- Minimal Pair-Based Evaluation of Code-Switching
Igor Sterner, Simone Teufel
- DNASpeech: A Contextualized and Situated Text-to-Speech Dataset with Dialogues, Narratives and Actions
Chuanqi Cheng, Hongda Sun, Bo Du, Shuo Shang, Xinrong Hu, Rui Yan
- LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
Qingkai Fang, Yan Zhou, Shoutao Guo, Shaolei Zhang, Yang Feng
- Error Comparison Optimization for Large Language Models on Aspect-Based Sentiment Analysis
Qianlong Wang, Keyang Ding, Hengxin Gao, Hui Wang, Ruifeng Xu
- The AI Gap: How Socioeconomic Status Affects Language Technology Interactions
Elisa Bassignana, Amanda Cercas Curry, Dirk Hovy
- Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set
Florian Eichin, Yang Janet Liu, Barbara Plank, Michael A. Hedderich
- Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia
Samuel Cahyawijaya, Holy Lovenia, Joel Ruben Antony Moniz, Tack Hwa Wong, Mohammad Rifqi Farhansyah, Thant Thiri Maung, Frederikus Hudi, David Anugraha, Muhammad Ravi Shulthan Habibi, Muhammad Reza Qorib, Amit Agarwal, Joseph Marvin Imperial, Hitesh Laxmichand Patel, Vicky Feliren, Bahrul Ilmi Nasution, Manuel Antonio Rufino, Genta Indra Winata, Rian Adam Rajagede, Carlos Rafael Catalan, Mohamed Fazli Mohamed Imam, Priyaranjan Pattnayak, Salsabila Zahirah Pranida, Kevin Pratama, Yeshil Bangera, Adisai Na-Thalang, Patricia Nicole Monderin, Yueqi Song, Christian Simon, Lynnette Hui Xian Ng, Richardy Lobo Sapan, Taki Hasan Rafi, Bin Wang, Supryadi, Kanyakorn Veerakanjana, Piyalitt Ittichaiwong, Matthew Theodore Roque, Karissa Vincentio, Takdanai Kreangphet, Phakphum Artkaew, Kadek Hendrawan Palgunadi, Yanzhi Yu, Rochana Prih Hastuti, William Nixon, Mithil Bangera, Adrian Xuan Wei Lim, Aye Hninn Khine, Hanif Muhammad Zhafran, Teddy Ferdinan, Audra Aurora Izzani, Ayushman Singh, Evan, Jauza Akbar Krito, Michael Anugraha, Fenal Ashokbhai Ilasariya, Haochen Li, John Amadeo Daniswara, Filbert Aurelian Tjiaranata, Eryawan Presma Yulianrifat, Can Udomcharoenchaikit, Fadil Risdian Ansori, Mahardika Krisna Ihsani, Giang Nguyen, Anab Maulana Barik, Dan John Velasco, Rifo Ahmad Genadi, Saptarshi Saha, Chengwei Wei, Isaiah Edri W. Flores, Kenneth Chen Ko Han, Anjela Gail D. Santos, Wan Shen Lim, Kaung Si Phyo, Tim Santos, Meisyarah Dwiastuti, Jiayun Luo, Jan Christian Blaise Cruz, Ming Shan Hee, Ikhlasul Akmal Hanif, M. Alif Al Hakim, Muhammad Rizky Sya’ban, Kun Kerdthaisong, Lester James Validad Miranda, Fajri Koto, Tirana Noor Fatyanosa, Alham Fikri Aji, Jostin Jerico Rosal, Jun Kevin, Robert Wijaya, Onno P. Kampman, Ruochen Zhang, Börje F. Karlsson, Peerat Limkonchotiwat
- Soundwave: Less is More for Speech-Text Alignment in LLMs
Yuhao Zhang, Zhiheng Liu, Fan Bu, Ruiyu Zhang, Benyou Wang, Haizhou Li
- RoToR: Towards More Reliable Responses for Order-Invariant Inputs
Soyoung Yoon, Dongha Ahn, Youngwon Lee, Minkyu Jung, HyungJoo Jang, Seung-won Hwang
- Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
Shivalika Singh, Angelika Romanou, Clémentine Fourrier, Jian Gang Ngui, David Ifeoluwa Adelani, Daniel Vila-Suero, Peerat Limkonchotiwat, Kelly Marchisio, Wei Qi Leong, Yosephine Susanto, Raymond Ng, Shayne Longpre, Sebastian Ruder, Wei-Yin Ko, Antoine Bosselut, Alice Oh, Andre Martins, Daphne Ippolito, Enzo Ferrante, Leshem Choshen, Marzieh Fadaee, Beyza Ermis, Sara Hooker
- Improving Dialogue Discourse Parsing through Discourse-aware Utterance Clarification
Yaxin Fan, Peifeng Li, Qiaoming Zhu
- ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs
Yan Yang, Yixia Li, Hongru Wang, Xuetao Wei, James Jianqiao Yu, Yun Chen, Guanhua Chen
- Words of Warmth: Trust and Sociability Norms for over 26k English Words
Saif M. Mohammad
- BehaviorBox: Automated Behavioral Comparison of Language Models
Lindia Tjuatja, Graham Neubig
- HAF-RM: A Hybrid Alignment Framework for Reward Model Training
Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei
- CULEMO: Cultural Lenses on Emotion - Benchmarking LLMs for Cross-Cultural Emotion Understanding
Tadesse Destaw Belay, Ahmed Haj Ahmed, Alvin C Grissom II, Iqra Ameer, Grigori Sidorov, Olga Kolesnikova, Seid Muhie Yimam
- DiffPO: Diffusion-styled Preference Optimization for Inference Time Alignment of Large Language Models
Ruizhe Chen, Wenhao Chai, Zhifei Yang, Xiaotian Zhang, Ziyang Wang, Tony Quek, Joey Tianyi Zhou, Soujanya Poria, Zuozhu Liu
- MemeQA: Holistic Evaluation of Meme Understanding
Khoi P. N. Nguyen, Terrence Li, Derek Lou Zhou, Gabriel Xiong, Pranav Balu, Nandhan Alahari, Alan Huang, Tanush Chauhan, Harshavardhan Bala, Emre Guzelordu, Affan Kashfi, Aaron Xu, Suyesh Shrestha, Megan Vu, Jerry Wang, Vincent Ng
- LoGU: Long-form Generation with Uncertainty Expressions
Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang
- KiRAG: Knowledge-Driven Iterative Retriever for Enhancing Retrieval-Augmented Generation
Jinyuan Fang, Zaiqiao Meng, Craig MacDonald
- Enhancing Lexicon-Based Text Embeddings with Large Language Models
Yibin Lei, Tao Shen, Yu Cao, Andrew Yates
- CoCoLex: Confidence-guided Copy-based Decoding for Grounded Legal Text Generation
Santosh T.Y.S.S, Youssef Tarek Elkhayat, Oana Ichim, Pranav Shetty, Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu
- Beyond N-Grams: Rethinking Evaluation Metrics and Strategies for Multilingual Abstractive Summarization
Itai Mondshine, Tzuf Paz-Argaman, Reut Tsarfaty
- CC-Tuning: A Cross-Lingual Connection Mechanism for Improving Joint Multilingual Supervised Fine-Tuning
Yangfan Ye, Xiaocheng Feng, Zekun Yuan, Xiachong Feng, Libo Qin, Lei Huang, Weitao Ma, Yichong Huang, Zhirui Zhang, Yunfei Lu, Xiaohui Yan, Duyu Tang, Dandan Tu, Bing Qin
- SConU: Selective Conformal Uncertainty in Large Language Models
Zhiyuan Wang, Qingni Wang, Yue Zhang, Tianlong Chen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu
- MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval
Junjie Zhou, Yongping Xiong, Zheng Liu, Ze Liu, Shitao Xiao, Yueze Wang, Bo Zhao, Chen Jason Zhang, Defu Lian
- When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs
Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang
- UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook
Yidi Jiang, Qian Chen, Shengpeng Ji, Yu Xi, Wen Wang, Chong Zhang, Xianghu Yue, ShiLiang Zhang, Haizhou Li
- KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models
Fnu Mohbat, Mohammed J Zaki
- Multilingual Arbitration: Optimizing Data Pools to Accelerate Multilingual Progress
Ayomide Odumakinde, Daniel D’souza, Pat Verga, Beyza Ermis, Sara Hooker
- Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models
Yuheng Lu, Bingshuo Qian, Caixia Yuan, Huixing Jiang, Xiaojie Wang
- Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models
Yancheng He, Shilong Li, Jiaheng Liu, Yingshui Tan, Weixun Wang, Hui Huang, Xingyuan Bu, Hangyu Guo, Chengwei Hu, Boren Zheng, Zhuoran Lin, Dekai Sun, Zhicheng Zheng, Wenbo Su, Bo Zheng
- PVP: An Image Dataset for Personalized Visual Persuasion with Persuasiveness Ratings, Persuasion Strategies, and Viewer Characteristics
Junseo Kim, Jongwook Han, Dongmin Choi, Jongwook Yoon, Eun-Ju Lee, Yohan Jo
- Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
Zheng Liu, Ze Liu, Zhengyang Liang, Junjie Zhou, Shitao Xiao, Chao Gao, Chen Jason Zhang, Defu Lian
- Tunable LLM-based Proactive Recommendation Agent
Mingze Wang, Chongming Gao, Wenjie Wang, Yangyang Li, Fuli Feng
- AgentRM: Enhancing Agent Generalization with Reward Modeling
Yu Xia, Jingru Fan, Weize Chen, Siyu Yan, Xin Cong, Zhong Zhang, Yaxi Lu, Yankai Lin, Zhiyuan Liu, Maosong Sun
- Score Consistency Meets Preference Alignment: Dual-Consistency for Partial Reward Modeling
Bin Xie, Bingbing Xu, Yige Yuan, Shengmao Zhu, Huawei Shen
- Segment-Based Attention Masking for GPTs
Shahar Katz, Liran Ringel, Yaniv Romano, Lior Wolf
- Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
Yuri Kuratov, Mikhail Arkhipov, Aydar Bulatov, Mikhail Burtsev
- Bi-Tuning with Collaborative Information for Controllable LLM-based Sequential Recommendation
Xinyu Zhang, Linmei Hu, Luhao Zhang, Wentao Cheng, Yashen Wang, Ge Shi, Chong Feng, Liqiang Nie
- A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment
Jean-Philippe Corbeil, Amin Dada, Jean-Michel Attendu, Asma Ben Abacha, Alessandro Sordoni, Lucas Caccia, Francois Beaulieu, Thomas Lin, Jens Kleesiek, Paul Vozila
- DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts
Yuchen Feng, Bowen Shen, Naibin Gu, Jiaxuan Zhao, Peng Fu, Zheng Lin, Weiping Wang
- DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression
Yi Zhao, Zuchao Li, Hai Zhao, Baoyuan Qi, Liu Guoming
- Computation Mechanism Behind LLM Position Generalization
Chi Han, Heng Ji
- IPO: Your Language Model is Secretly a Preference Classifier
Shivank Garg, Ayush Singh, Shweta Singh, Paras Chopra
- Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up
Jiahao Yuan, Dehui Du, Hao Zhang, Zixiang Di, Usman Naseem
- Déjà Vu? Decoding Repeated Reading from Eye Movements
Yoav Meiri, Omer Shubi, Cfir Avraham Hadar, Ariel Kreisberg Nitzav, Yevgeni Berzak
- LLMs can be easily Confused by Instructional Distractions
Yerin Hwang, Yongil Kim, Jahyun Koo, Taegwan Kang, Hyunkyung Bae, Kyomin Jung
- PlanGenLLMs: A Modern Survey of LLM Planning Capabilities
Hui Wei, Zihao Zhang, Shenghua He, Tian Xia, Shijia Pan, Fei Liu
- IAM: Efficient Inference through Attention Mapping between Different-scale LLMs
Yi Zhao, Zuchao Li, Hai Zhao
- nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow
Geliang Ouyang, Jingyao Chen, Zhihe Nie, Yi Gui, Yao Wan, Hongyu Zhang, Dongping Chen
- ZIPA: A family of efficient models for multilingual phone recognition
Jian Zhu, Farhan Samir, Eleanor Chodroff, David R. Mortensen
- GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration
Yoo Yeon Sung, Eve Fleisig, Yu Hou, Ishan Upadhyay, Jordan Lee Boyd-Graber
- That doesn’t sound right: Evaluating speech transcription quality in field linguistics corpora
Eric Le Ferrand, Bo Jiang, Emily Prud’hommeaux, Joshua Hartshorne
- Dynamic Evaluation with Cognitive Reasoning for Multi-turn Safety of Large Language Models
Lanxue Zhang, Yanan Cao, Yuqiang Xie, Fang Fang, Yangxi Li
- Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
William Jurayj, Jeffrey Cheng, Benjamin Van Durme
- From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions
Nathanaël Carraz Rakotonirina, Mohammed Hamdy, Jon Ander Campos, Lucas Weber, Alberto Testoni, Marzieh Fadaee, Sandro Pezzelle, Marco Del Tredici
- Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints
Junxiao Yang, Zhexin Zhang, Shiyao Cui, Hongning Wang, Minlie Huang
- Multilingual Text-to-Image Generation Magnifies Gender Stereotypes
Felix Friedrich, Katharina Hämmerl, Patrick Schramowski, Manuel Brack, Jindřich Libovický, Alexander Fraser, Kristian Kersting
- Adversarial Alignment with Anchor Dragging Drift ($A^3D^2$): Multimodal Domain Adaptation with Partially Shifted Modalities
Jun Sun, Xinxin Zhang, Simin Hong, Jian Zhu, Lingfang Zeng
- A Reality Check on Context Utilisation for Retrieval-Augmented Generation
Lovisa Hagström, Sara Vera Marjanovic, Haeun Yu, Arnav Arora, Christina Lioma, Maria Maistro, Pepa Atanasova, Isabelle Augenstein
- CU-MAM: Coherence-Driven Unified Macro-Structures for Argument Mining
Debela Gemechu, Chris Reed
- Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts
Hongyu Chen, Seraphina Goldfarb-Tarrant
- Text-to-ES Bench: A Comprehensive Benchmark for Converting Natural Language to Elasticsearch Query
Dongge Xue, Zhili Pu, Zhentao Xia, Hongli Sun, Ruihui Hou, Guangya Yu, Yupian Lin, Yongqi Fan, Jingping Liu, Tong Ruan
- AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation
Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu
- Acoustic Individual Identification of White-Faced Capuchin Monkeys Using Joint Multi-Species Embeddings
Álvaro Vega-Hidalgo, Artem Abzaliev, Thore Bergman, Rada Mihalcea
- DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal
Vaibhav Aggarwal, Ojasv Kamal, Abhinav Japesh, Zhijing Jin, Bernhard Schölkopf
- Steering off Course: Reliability Challenges in Steering Language Models
Patrick Queiroz Da Silva, Hari Sethuraman, Dheeraj Rajagopal, Hannaneh Hajishirzi, Sachin Kumar
- Impartial Multi-task Representation Learning via Variance-invariant Probabilistic Decoding
Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu
- If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World
Adrian de Wynter
- Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization on Multi-party Conversation
Luyao Cheng, Hui Wang, Chong Deng, Siqi Zheng, Yafeng Chen, Rongjie Huang, Qinglin Zhang, Qian Chen, Xihao Li, Wen Wang
- Vulnerability of LLMs to Vertically Aligned Text Manipulations
Zhecheng Li, Yiwei Wang, Bryan Hooi, Yujun Cai, Zhen Xiong, Nanyun Peng, Kai-Wei Chang
- AutoMixer: Checkpoint Artifacts as Automatic Data Mixers
Ernie Chang, Yang Li, Patrick Huber, Vish Vogeti, David Kant, Yangyang Shi, Vikas Chandra
- Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow
Behrooz Azarkhalili, Maxwell W. Libbrecht
- Beyond Prompting: An Efficient Embedding Framework for Open-Domain Question Answering
Zhanghao Hu, Hanqi Yan, Qinglin Zhu, Zhenyi Shen, Yulan He, Lin Gui
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark
Jianlyu Chen, Nan Wang, Chaofan Li, Bo Wang, Shitao Xiao, Han Xiao, Hao Liao, Defu Lian, Zheng Liu
- SELF-PERCEPT: Introspection Improves Large Language Models’ Detection of Multi-Person Mental Manipulation in Conversations
Danush Khanna, Pratinav Seth, Sidhaarth Sredharan Murali, Aditya Kumar Guru, Siddharth Shukla, Tanuj Tyagi, Sandeep Chaurasia, Kripabandhu Ghosh
- WE-MATH: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning?
Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Chong Sun, Xiaoshuai Song, Jiapeng Wang, Zhuoma GongQue, Shanglin Lei, YiFan Zhang, Zhe Wei, Miaoxuan Zhang, Runfeng Qiao, Xiao Zong, Yida Xu, Peiqing Yang, Zhimin Bao, Muxi Diao, Chen Li, Honggang Zhang
- Modeling the Evolution of English Noun Compounds with Feature-Rich Diachronic Compositionality Prediction
Filip Miletić, Sabine Schulte im Walde
- What’s the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns
Michael A. Hedderich, Anyi Wang, Raoyuan Zhao, Florian Eichin, Jonas Fischer, Barbara Plank
- V-Oracle: Making Progressive Reasoning in Deciphering Oracle Bones for You and Me
Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Jiapeng Wang, YiFan Zhang, Zhuoma GongQue, Chong Sun, Yida Xu, Yadong Xue, Ye Tian, Zhimin Bao, Lan Yang, Chen Li, Honggang Zhang
- Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension
Fajri Koto, Amir Hossein Yari
- Improving Language and Modality Transfer in Translation by Character-level Modeling
Ioannis Tsiamas, David Dale, Marta R. Costa-jussà
- DialUp! Modeling the Language Continuum by Adapting Models to Dialects and Dialects to Models
Niyati Bafna, Emily Chang, Nathaniel Romney Robinson, David R. Mortensen, Kenton Murray, David Yarowsky, Hale Sirin
- AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs
Nicholas E. Corrado, Julian Katz-Samuels, Adithya M Devraj, Hyokun Yun, Chao Zhang, Yi Xu, Yi Pan, Bing Yin, Trishul Chilimbi
- A Variational Approach for Mitigating Entity Bias in Relation Extraction
Samuel Mensah, Elena Kochkina, Jabez Magomere, Joy Prakash Sain, Simerjot Kaur, Charese Smiley
- Modelling Complex Semantics Relation with Contrastively Fine-Tuned Relational Encoders
Naïm Es-sebbani, Esteban Marquer, Zied Bouraoui
- Error-driven Data-efficient Large Multimodal Model Tuning
Barry Menglong Yao, Qifan Wang, Lifu Huang
- Planning with Diffusion Models for Target-Oriented Dialogue Systems
Hanwen Du, Bo Peng, Xia Ning
- Interactive and Expressive Code-Augmented Planning with Large Language Models
Anthony Zhe Liu, Xinhe Wang, Jacob Sansom, Yao Fu, Jongwook Choi, Sungryull Sohn, Jaekyeom Kim, Honglak Lee
- Synergistic Weak-Strong Collaboration by Aligning Preferences
Yizhu Jiao, Xuchao Zhang, Zhaoyang Wang, Yubo Ma, Zhun Deng, Rujia Wang, Chetan Bansal, Saravan Rajmohan, Jiawei Han, Huaxiu Yao
- Understanding Silent Data Corruption in LLM Training
Jeffrey Jian Ma, Hengzhi Pei, Leonard Lausen, George Karypis
- Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback
Guan-Ting Lin, Prashanth Gurunath Shivakumar, Aditya Gourav, Yile Gu, Ankur Gandhe, Hung-yi Lee, Ivan Bulyko
- Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs
Jungsoo Park, Junmo Kang, Gabriel Stanovsky, Alan Ritter
- BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona T. Diab, Maarten Sap
- Deep Temporal Reasoning in Video Language Models: A Cross-Linguistic Evaluation of Action Duration and Completion through Perfect Times
Olga Loginova, Sofía Ortega Loguinova
- Amplifying Trans and Nonbinary Voices: A Community-Centred Harm Taxonomy for LLMs
Eddie L. Ungless, Sunipa Dev, Cynthia L. Bennett, Rebecca Gulotta, Jasmijn Bastings, Remi Denton
- Enhancing Human Evaluation in Machine Translation with Comparative Judgement
Yixiao Song, Parker Riley, Daniel Deutsch, Markus Freitag
- Infogen: Generating Complex Statistical Infographics from Documents
Akash Ghosh, Aparna Garimella, Pritika Ramu, Sambaran Bandyopadhyay, Sriparna Saha
- Partial Colexifications Improve Concept Embeddings
Arne Rubehn, Johann-Mattis List
- Improved Unbiased Watermark for Large Language Models
Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang
- MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection
Yixian Shen, Qi Bi, Jia-Hong Huang, Hongyi Zhu, Andy D. Pimentel, Anuj Pathania
- Multi-Attribute Steering of Language Models via Targeted Intervention
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
- AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations
Gaurav Verma, Rachneet Kaur, Nishan Srishankar, Zhen Zeng, Tucker Balch, Manuela Veloso
- Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers
Zhijian Xu, Yilun Zhao, Manasi Patwardhan, Lovekesh Vig, Arman Cohan
- On the Acquisition of Shared Grammatical Representations in Bilingual Language Models
Catherine Arnett, Tyler A. Chang, James A. Michaelov, Ben Bergen
- GenKnowSub: Improving Modularity and Reusability of LLMs through General Knowledge Subtraction
Mohammadtaha Bagherifard, Sahar Rajabi, Ali Edalat, Yadollah Yaghoobzadeh
- Using Shapley interactions to understand how models use structure
Diganta Misra, Divyansh Singhvi, Andrej Erkelens, Raghav Jain, Isabel Papadimitriou, Naomi Saphra
- Adversarial Tokenization
Renato Geh, Zilei Shao, Guy Van den Broeck
- Classifying Unreliable Narrators with Large Language Models
Anneliese Brei, Katharine Henry, Abhisheik Sharma, Shashank Srivastava, Snigdha Chaturvedi
- ConceptCarve: Dynamic Realization of Evidence
Eylon Caplan, Dan Goldwasser
- QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering
An Quang Tang, Xiuzhen Zhang, Minh Ngoc Dinh, Zhuang Li
- Navigating Rifts in Human-LLM Grounding: Study and Benchmark
Omar Shaikh, Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
- Substance over Style: Evaluating Proactive Conversational Coaching Agents
Vidya Srinivas, Xuhai Xu, Xin Liu, Kumar Ayush, Isaac Galatzer-Levy, Shwetak Patel, Daniel McDuff, Tim Althoff
- Open-World Planning via Lifted Regression with LLM-Inferred Affordances for Embodied Agents
Xiaotian Liu, Ali Pesaranghader, Hanze Li, Punyaphat Sukcharoenchaikul, Jaehong Kim, Tanmana Sadhu, Hyejeong Jeon, Scott Sanner
- (RSA)$^2$: A Rhetorical-Strategy-Aware Rational Speech Act Framework for Figurative Language Understanding
Cesare Spinoso-Di Piano, David Eric Austin, Pablo Piantanida, Jackie CK Cheung
- The Role of Abstract Representations and Observed Preferences in the Ordering of Binomials in Large Language Models
Zachary Nicholas Houghton, Kenji Sagae, Emily Morgan
- SYNTHIA: Novel Concept Design with Affordance Composition
Hyeonjeong Ha, Xiaomeng Jin, Jeonghwan Kim, Jiateng Liu, Zhenhailong Wang, Khanh Duy Nguyen, Ansel Blume, Nanyun Peng, Kai-Wei Chang, Heng Ji
- Consistent Client Simulation for Motivational Interviewing-based Counseling
Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Nicholas Gabriel Lim, Cameron Tan Shi Ern, Phey Ling Kit, Jenny Giam Xiuhui, John Pinto, Ee-Peng Lim
- AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context
Naba Rizvi, Harper Strickland, Daniel Gitelman, Tristan Cooper, Alexis Morales Flores, Aekta Kallepalli, Akshat Alurkar, Haaset Owens, Saleha Ahmedi, Isha Khirwadkar, Imani N. S. Munyaka, Nedjma Ousidhoum
- Structural Reasoning Improves Molecular Understanding of LLM
Yunhui Jang, Jaehyung Kim, Sungsoo Ahn
- CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration
Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Phey Ling Kit, Nicholas Gabriel Lim, Cameron Tan Shi Ern, Ee-Peng Lim
- Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles
Kuang Wang, Xianfei Li, Shenghao Yang, Li Zhou, Feng Jiang, Haizhou Li
- Targeted Syntactic Evaluation for Grammatical Error Correction
Aomi Koyama, Masato Mita, Su-Youn Yoon, Yasufumi Takama, Mamoru Komachi
- VQ-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos
Tingyu Song, Guo Gan, Tongyan Hu, Yilun Zhao
- Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions
Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, Serina Chang
- SAD-LM: A Large-Scale Generalist Diffusion Language Model
Jaesung Tae, Hamish Ivison, Sachin Kumar, Arman Cohan
- Detecting LLM-Generated Korean Text through Linguistic Feature Analysis
Shinwoo Park, Shubin Kim, Do-Kyung Kim, Yo-Sub Han
- Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL
Hanbing Liu, Haoyang Li, Xiaokang Zhang, Ruotong Chen, Haiyong Xu, Tian Tian, Qi Qi, Jing Zhang
- On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures
Minh Duc Bui, Kyung eun Park, Goran Glavaš, Fabian David Schmidt, Katharina von der Wense
- CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships?
Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee
- Veracity Bias and Beyond: Uncovering LLMs’ Hidden Beliefs in Problem-Solving Reasoning
Yue Zhou, Barbara Di Eugenio
- Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
Meng Li, Guangda Huzhang, Haibo Zhang, Xiting Wang, Anxiang Zeng
- LLM Meets Scene Graph: Can Large Language Models Understand and Generate Scene Graphs? A Benchmark and Empirical Study
Dongil Yang, Minjin Kim, Sunghwan Kim, Beong-woo Kwak, Minjun Park, Jinseok Hong, Woontack Woo, Jinyoung Yeo
- Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems
Haochun Wang, Sendong Zhao, Jingbo Wang, Zewen Qiang, Bing Qin, Ting Liu
- The Invisible Hand: Unveiling Provider Bias in Large Language Models for Code Generation
Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Qingshuang Bao, Weipeng Jiang, Qian Wang, Chao Shen, Yang Liu
- K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean
Minkyeong Jeon, Hyemin Jeong, Yerang Kim, Jiyoung Kim, Jae Hyeon Cho, Byung-Jun Lee
- THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation
Yunlong Liang, Fandong Meng, Jie Zhou
- Neuron Empirical Gradient: Discovering and Quantifying Neurons’ Global Linear Controllability
Xin Zhao, Zehui Jiang, Naoki Yoshinaga
- Can third-parties read our emotions?
Jiayi Li, Yingfan Zhou, Pranav Narayanan Venkit, Halima Binte Islam, Sneha Arya, Shomir Wilson, Sarah Rajtmajer
- OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching
Nghia Huynh Nguyen Hieu, Ngoc Son Nguyen, Huynh Nguyen Dang, Thieu Vo, Truong-Son Hy, Van Nguyen
- World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
Siyin Wang, Zhaoye Fei, Qinyuan Cheng, Shiduo Zhang, Panpan Cai, Jinlan Fu, Xipeng Qiu
- JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs
Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang
- CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models
Xiaqiang Tang, Jian Li, Keyu Hu, Nan Du, Xiaolong Li, Xi Zhang, Weigao Sun, Sihong Xie
- Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models
Yuqiao Tan, Shizhu He, Kang Liu, Jun Zhao
- Enhancing Mathematical Reasoning in LLMs by Stepwise Correction
Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang
- PsyDial: A Large-scale Long-term Conversational Dataset for Mental Health Support
Huachuan Qiu, Zhenzhong Lan
- Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction
Didi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu
- Exclusion of Thought: Mitigating Cognitive Load in Large Language Models for Enhanced Reasoning in Multiple-Choice Tasks
Qihang Fu, Yongbin Qin, Ruizhang Huang, Yanping Chen, Yulin Zhou, Lintao Long
- Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation
Zhi Qu, Yiran Wang, Jiannan Mao, Chenchen Ding, Hideki Tanaka, Masao Utiyama, Taro Watanabe
- VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search
Yikun Wang, Siyin Wang, Qinyuan Cheng, Zhaoye Fei, Liang Ding, Qipeng Guo, Dacheng Tao, Xipeng Qiu
- Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models
JianXing Liao, Junyan Xu, Yatao Sun, Maowen Tang, Sicheng He, Jingxian Liao, Shui Yu, Yun Li, Xiaohong Guan
- LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma, Dongrui Liu, Qian Chen, Linfeng Zhang, Jing Shao
- Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback
Jiakang Yuan, Xiangchao Yan, Bo Zhang, Tao Chen, Botian Shi, Wanli Ouyang, Yu Qiao, Lei Bai, Bowen Zhou
- PerSphere: A Comprehensive Framework for Multi-Faceted Perspective Retrieval and Summarization
Yun Luo, Yingjie Li, Xiangkun Hu, Qinglin Qi, Fang Guo, Qipeng Guo, Zheng Zhang, Yue Zhang
- Prompt-Guided Internal States for Hallucination Detection of Large Language Models
Fujie Zhang, Peiqi Yu, Biao Yi, Baolei Zhang, Tong Li, Zheli Liu
- Typology-Guided Adaptation for African NLP
Ndapa Nakashole
- Don’t Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections
Orfeas Menis Mastromichalakis, Jason Liartis, Kristina Rose, Antoine Isaac, Giorgos Stamou
- ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent
Shangjian Yin, Peijie Huang, JiaTian Chen, Haojing Huang, Yuhong Xu
- FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation
Qinggang Zhang, Zhishang Xiang, Yilin Xiao, Le Wang, Junhui Li, Jinsong Su, Xinrun Wang
- Knowledge Image Matters: Improving Knowledge-Based Visual Reasoning with Multi-Image Large Language Models
Guanghui Ye, Huan Zhao, Zhixue Zhao, Xupeng Zha, Yang Liu, Zhihua Jiang
- Evaluating Personalized Tool-Augmented LLMs from the Perspectives of Personalization and Proactivity
Yupu Hao, Pengfei Cao, Zhuoran Jin, Huanxuan Liao, Yubo Chen, Kang Liu, Jun Zhao
- GUICourse: From General Vision Language Model to Versatile GUI Agent
Wentong Chen, Junbo Cui, Jinyi Hu, Yujia Qin, Junjie Fang, Yue Zhao, Chongyi Wang, Jun Liu, Guirong Chen, Yupeng Huo, Yuan Yao, Yankai Lin, Zhiyuan Liu, Maosong Sun
- Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration
ChaeHun Park, Yujin Baek, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo
- Maximizing the Effectiveness of Larger BERT Models for Compression
Wen-Shu Fan, Su Lu, Shangyu Xing, Xin-Chun Li, De-Chuan Zhan
- Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference
Thanh Le-Cong, Bach Le, Toby Murray
- HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring
Zhixiong Su, Yichen Wang, Herun Wan, Zhaohan Zhang, Minnan Luo
- IndicSynth: A Large-Scale Multilingual Synthetic Speech Dataset for Low-Resource Indian Languages
Divya V Sharma, Vijval Ekbote, Anubha Gupta
- Reinforced IR: A Dual Reinforcement Framework For Domain-Adapted Information Retrieval
Chaofan Li, Jianlyu Chen, Yingxia Shao, Chaozhuo Li, Quanqing Xu, Defu Lian, Zheng Liu
- CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Hao Zhang, Xinyi Dai, Yasheng Wang, Ruiming Tang
- Enhancing Multimodal Retrieval via Complementary Information Extraction and Alignment
Delong Zeng, Yuexiang Xie, Yaliang Li, Ying Shen
- JoPA: Explaining Large Language Model’s Generation via Joint Prompt Attribution
Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, Lu Lin
- Proxy-Driven Robust Multimodal Sentiment Analysis with Incomplete Data
Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An
- Not All Terms Matter: Recall-Oriented Adaptive Learning for PLM-aided Query Expansion in Open-Domain Question Answering
Xinran Chen, Ben He, Xuanang Chen, Le Sun
- A Mutual Information Perspective on Knowledge Graph Embedding
Jiang Li, Xiangdong Su, Zehua Duo, Tian Lan, Xiaotao Guo, Guanglai Gao
- Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race
Lihao Sun, Chengzhi Mao, Valentin Hofmann, Xuechunzi Bai
- IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization
Xinghua Zhang, Haiyang Yu, Cheng Fu, Fei Huang, Yongbin Li
- ProMALex: Progressive Modular Adapters for Multi-Jurisdictional Legal Language Modeling
Santosh T.Y.S.S, Mohamed Hesham Elganayni
- Flipping Knowledge Distillation: Leveraging Small Models’ Expertise to Enhance LLMs in Text Matching
Mingzhe Li, Jing Xiang, Qishen Zhang, Kaiyang Wan, Xiuying Chen
- Disentangling Language Medium and Culture Context for Evaluating Multilingual Large Language Models
Jiahao Ying, Wei Tang, Yiran Zhao, Yixin Cao, Yu Rong, Wenxuan Zhang
- Detecting Sockpuppetry on Wikipedia Using Meta-Learning
Christine de Kock, Luc Raszewski
- Diversity-oriented Data Augmentation with Large Language Models
Zaitian Wang, Jinghan Zhang, Xinhao Zhang, Kunpeng Liu, Pengfei Wang, Yuanchun Zhou
- Can LLMs Understand Unvoiced Speech? Exploring EMG-to-Text Conversion with LLMs
Payal Mohapatra, Akash Pandey, Xiaoyuan Zhang, Qi Zhu
- CoreEval: Automatically Building Contamination-Resilient Datasets with Real-World Knowledge toward Reliable LLM Evaluation
Jingqian Zhao, Bingbing Wang, Geng Tu, Yice Zhang, Qianlong Wang, Bin Liang, Jing Li, Ruifeng Xu
- RiOT: Efficient Prompt Refinement with Residual Optimization Tree
Chenyi Zhou, Zhengyan Shi, Yuan Yao, Lei Liang, Huajun Chen, Qiang Zhang
- Caution for the Environment: LLM Agents are Susceptible to Environmental Distractions
Xinbei Ma, Yiting Wang, Yao Yao, Tongxin Yuan, Aston Zhang, Zhuosheng Zhang, Hai Zhao
- Decoder-Only LLMs can be Masked Auto-Encoders
Dan Qiao, Yuan Gao, Zheming Yang, Di Yang, Ziheng Wu, Pengcheng Lu, Minghui Qiu, Juntao Li, Min Zhang
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark
Rong-Cheng Tu, Zi-Ao Ma, Tian Lan, Yuehao Zhao, Heyan Huang, Xian-Ling Mao
- Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering
Rongzhi Zhu, Xiangyu Liu, Zequn Sun, Yiwei Wang, Wei Hu
- TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models
Xinyi He, Yihao Liu, Mengyu Zhou, Yeye He, Haoyu Dong, Shi Han, Zejian Yuan, Dongmei Zhang
- Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
Maosong Cao, Taolin Zhang, Mo Li, Chuyu Zhang, Yunxin Liu, Conghui He, Haodong Duan, Songyang Zhang, Kai Chen
- CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis
Ruixiang Feng, Shen Gao, Xiuying Chen, Lisi Chen, Shuo Shang
- Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis
Junzhuo Li, Bo Wang, Xiuze Zhou, Peijie Jiang, Jia Liu, Xuming Hu
- ChartLens: Fine-grained Visual Attribution in Charts
Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha
- LESA: Learnable LLM Layer Scaling-Up
Yifei Yang, Zouying Cao, Xinbei Ma, Yao Yao, Zhi Chen, Libo Qin, Hai Zhao
- MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation
Haochen Xue, Feilong Tang, Ming Hu, Yexin Liu, Qidong Huang, Yulong Li, Chengzhi Liu, Zhongxing Xu, Chong Zhang, Chun-Mei Feng, Yutong Xie, Imran Razzak, Zongyuan Ge, Jionglong Su, Junjun He, Yu Qiao
- Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang, Qiuchi Li, Dawei Song, Zheyu Ye, Yan Gao, Yao Hu
- WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
Rajath Rao, Adithya V Ganesan, Oscar Kjell, Jonah Luby, Akshay Raghavan, Scott M. Feltman, Whitney Ringwald, Ryan L. Boyd, Benjamin J. Luft, Camilo J. Ruggero, Neville Ryant, Roman Kotov, H. Schwartz
- Keys to Robust Edits: From Theoretical Insights to Practical Advances
Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang
- Boosting LLM’s Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning
Xiang Zhuang, Bin Wu, Jiyu Cui, Kehua Feng, Xiaotong Li, Huabin Xing, Keyan Ding, Qiang Zhang, Huajun Chen
- MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation
María Andrea Cruz Blandón, Jayasimha Talur, Bruno Charron, Dong Liu, Saab Mansour, Marcello Federico
- The Role of Visual Modality in Multimodal Mathematical Reasoning: Challenges and Insights
Yufang Liu, Yao Du, Tao Ji, Jianing Wang, Yang Liu, Yuanbin Wu, Aimin Zhou, Mengdi Zhang, Xunliang Cai
- The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters
Chulun Zhou, Qiujing Wang, Mo Yu, Xiaoqian Yue, Rui Lu, Jiangnan Li, Yifan Zhou, Shunchi Zhang, Jie Zhou, Wai Lam
- S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Ruotian Ma, Peisong Wang, Cheng Liu, Xingyan Liu, Jiaqi Chen, Bang Zhang, Xin Zhou, Nan Du, Jia Li
- Advancing Collaborative Debates with Role Differentiation through Multi-Agent Reinforcement Learning
Haoran Li, Ziyi Su, Yun Xue, Zhiliang Tian, Yiping Song, Minlie Huang
- Retrieval-Augmented Fine-Tuning With Preference Optimization For Visual Program Generation
Deokhyung Kang, Jeonghun Cho, Yejin Jeon, Sunbin Jang, Minsub Lee, Jawoon Cho, Gary Lee
- STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond
Nils Dycke, Matej Zečević, Ilia Kuznetsov, Beatrix Suess, Kristian Kersting, Iryna Gurevych
- XDAC: XAI-Driven Detection and Attribution of LLM-Generated News Comments in Korean
Wooyoung Go, Hyoungshick Kim, Alice Oh, Yongdae Kim
- CENTAUR: Bridging the Impossible Trinity of Privacy, Efficiency, and Performance in Privacy-Preserving Transformer Inference
Jinglong Luo, Guanzhong Chen, Yehong Zhang, Shiyu Liu, Hui Wang, Yue Yu, Xun Zhou, Yuan Qi, Zenglin Xu
- Silencing Empowerment, Allowing Bigotry: Auditing the Moderation of Hate Speech on Twitch
Prarabdh Shukla, Wei Yin Chong, Yash Patel, Brennan Schaffner, Danish Pruthi, Arjun Bhagoji
- EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models
Che Hyun Lee, Heeseung Kim, Jiheum Yeom, Sungroh Yoon
- TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages
Jafar Isbarov, Arofat Akhundjanova, Mammad Hajili, Kavsar Huseynova, Dmitry Gaynullin, Anar Rzayev, Osman Tursun, Aizirek Turdubaeva, Ilshat Saetov, Rinat Kharisov, Saule Belginova, Ariana Kenbayeva, Amina Alisheva, Abdullatif Köksal, Samir Rustamov, Duygu Ataman
- Look Both Ways and No Sink: Converting LLMs into Text Encoders without Training
Ziyong Lin, Haoyi Wu, Shu Wang, Kewei Tu, Zilong Zheng, Zixia Jia
- Mitigating Posterior Salience Attenuation in Long-Context LLMs with Positional Contrastive Decoding
Zikai Xiao, Ziyang Wang, Wen Ma, Yan Zhang, Wei Shen, Wang Yan, Luqi Gong, Zuozhu Liu
- A Statistical and Multi-Perspective Revisiting of the Membership Inference Attack in Large Language Models
Bowen Chen, Namgi Han, Yusuke Miyao
- Around the World in 24 Hours: Probing LLM Knowledge of Time and Place
Carolin Holtermann, Paul Röttger, Anne Lauscher
- Mining the uncertainty patterns of humans and models in the annotation of moral foundations and human values
Neele Falk, Gabriella Lapesa
- “What do you call a dog that is incontrovertibly true? Dogma”: Testing LLM Generalization through Humor
Alessio Cocchieri, Luca Ragazzi, Paolo Italiani, Giuseppe Tagliavini, Gianluca Moro
- Towards Harmonized Uncertainty Estimation for Large Language Models
Rui Li, Jing Long, Muge Qi, Heming Xia, Lei Sha, Peiyi Wang, Zhifang Sui
- VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare
Anudeex Shetty, Amin Beheshti, Mark Dras, Usman Naseem
- Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media
Zhen Sun, Zongmin Zhang, Xinyue Shen, Ziyi Zhang, Yule Liu, Michael Backes, Yang Zhang, Xinlei He
- From English to Second Language Mastery: Enhancing LLMs with Cross-Lingual Continued Instruction Tuning
Linjuan Wu, Hao-Ran Wei, Baosong Yang, Weiming Lu
- WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks
Anudeex Shetty, Qiongkai Xu, Jey Han Lau
- HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation
Yuhan Chen, Ang Lv, Jian Luan, Bin Wang, Wei Liu
- One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments
Ke Yi, Yuhui Xu, Heng Chang, Yuan Meng, Tong Zhang, Jia Li
- Beyond Logits: Aligning Feature Dynamics for Effective Knowledge Distillation
Guoqiang Gong, Jiaxing Wang, Jin Xu, Deping Xiang, Zicheng Zhang, Leqi Shen, Yifeng Zhang, Junhua Shu, Zhaolong Xing, Zhen Chen, Pengzhang Liu, Ke Zhang
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Jingyang Yuan, Huazuo Gao, Damai Dai, Junyu Luo, Liang Zhao, Zhengyan Zhang, Zhenda Xie, Yuxing Wei, Lean Wang, Zhiping Xiao, Yuqing Wang, Chong Ruan, Ming Zhang, Wenfeng Liang, Wangding Zeng
- DRAE: Dynamic Retrieval-Augmented Expert Networks for Lifelong Learning and Task Adaptation in Robotics
Yayu Long, Kewei Chen, Long Jin, Mingsheng Shang
- MT-RAIG: Novel Benchmark and Evaluation Framework for Retrieval-Augmented Insight Generation over Multiple Tables
Kwangwook Seo, Donguk Kwon, Dongha Lee
- Enhancing Chain-of-Thought Reasoning with Critical Representation Fine-tuning
Chenxi Huang, Shaotian Yan, Liang Xie, Binbin Lin, Sinan Fan, Yue Xin, Deng Cai, Chen Shen, Jieping Ye
- Does the Emotional Understanding of LVLMs Vary Under High-Stress Environments and Across Different Demographic Attributes?
Jaewook Lee, Yeajin Jang, Oh-Woog Kwon, Harksoo Kim
- S2WTM: Spherical Sliced-Wasserstein Autoencoder for Topic Modeling
Suman Adhya, Debarshi Kumar Sanyal
- Learning to Look at the Other Side: A Probing Study of Word Semantics in LLMs with Enabled Bidirectional Attention
Zhaoxin Feng, MA Jianfei, Xiaoyi Bao, Xiaojing Zhao, Emmanuele Chersoni
- Tracing and Dissecting How LLMs Recall Factual Knowledge for Real World Questions
Yiqun Wang, Chaoqun Wan, Sile Hu, Yonggang Zhang, Xiang Tian, Yaowu Chen, Xu Shen, Jieping Ye
- Employing Discourse Coherence Enhancement to Improve Cross-Document Event and Entity Coreference Resolution
Xinyu Chen, Peifeng Li, Qiaoming Zhu
- Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning
Shaobo Wang, Xiangqi Jin, Ziming Wang, Jize Wang, Jiajun Zhang, Kaixin Li, Zichen Wen, Zhong Li, Conghui He, Xuming Hu, Linfeng Zhang
- Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation
Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Rui Ye, Tian Jin, Xiaowen Dong, Yanfeng Wang, Siheng Chen
- SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao
- FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning
Seunghee Kim, Changhyeon Kim, Taeuk Kim
- Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms
Mengru Wang, Ziwen Xu, Shengyu Mao, Shumin Deng, Zhaopeng Tu, Huajun Chen, Ningyu Zhang
- MobiLoRA: Accelerating LoRA-based LLM Inference on Mobile Devices via Context-aware KV Cache Optimization
Borui Li, Yitao Wang, Haoran Ma, Ligeng Chen, Jun Xiao, Shuai Wang
- Language Models Resist Alignment: Evidence From Data Compression
Jiaming Ji, Kaile Wang, Tianyi Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Josef Dai, Yunhuai Liu, Yaodong Yang
- Beyond the Answer: Advancing Multi-Hop QA with Fine-Grained Graph Reasoning and Evaluation
Qichuan Liu, Chentao Zhang, Chenfeng Zheng, Guosheng Hu, Xiaodong Li, Zhihong Zhang
- Mamba Knockout for Unraveling Factual Information Flow
Nir Endy, Idan Daniel Grosbard, Yuval Ran-Milo, Yonatan Slutzky, Itay Tshuva, Raja Giryes
- Small Changes, Big Impact: How Manipulating a Few Neurons Can Drastically Alter LLM Aggression
Jaewook Lee, Junseo Jang, Oh-Woog Kwon, Harksoo Kim
- Towards Widening The Distillation Bottleneck for Reasoning Models
Huifeng Yin, Yu Zhao, Minghao Wu, Xuanfan Ni, Bo Zeng, huaiyu.wh, Tianqi Shi, Liangying Shao, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang
- Curiosity-Driven Reinforcement Learning from Human Feedback
Haoran Sun, Yekun Chai, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang
- T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback
Zehan Wang, Ke Lei, Chen Zhu, Jiawei Huang, Sashuai Zhou, Luping Liu, Xize Cheng, Shengpeng Ji, Zhenhui Ye, Tao Jin, Zhou Zhao
- CoE: A Clue of Emotion Framework for Emotion Recognition in Conversations
Zhiyu Shen, Yunhe Pang, Yanghui Rao, Jianxing Yu
- MPO: Multilingual Safety Alignment via Reward Gap Optimization
Weixiang Zhao, Yulin Hu, Yang Deng, Tongtong Wu, Wenxuan Zhang, Jiahe Guo, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
- QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions
Siyin Wang, Wenyi Yu, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Lu Lu, Yu Tsao, Junichi Yamagishi, Yuxuan Wang, Chao Zhang
- On the Relation Between Fine-Tuning, Topological Properties, and Task Performance in Sense-Enhanced Embeddings
Deniz Ekin Yavas, Timothée Bernard, Benoit Crabbé, Laura Kallmeyer
- Finding Needles in Images: Can Multi-modal LLMs Locate Fine Details?
Parth Thakkar, Ankush Agarwal, Prasad Kasu, Pulkit Bansal, Chaitanya Devaguptapu
- Don’t Half-listen: Capturing Key-part Information in Continual Instruction Tuning
Yongquan He, Wenyuan Zhang, Xuancheng Huang, Peng Zhang, Lingxun Meng, Xiang Zhou, Ke Zeng, Xunliang Cai
- Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction
Yooseop Lee, Suin Kim, Yohan Jo
- Exploring Explanations Improves the Robustness of In-Context Learning
Ukyo Honda, Tatsushi Oka
- Prediction Hubs are Context-Informed Frequent Tokens in LLMs
Beatrix Miranda Ginn Nielsen, Iuri Macocco, Marco Baroni
- Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law
Qiming Ge, Shuhao Xing, Songyang Gao, Yunhua Zhou, Yicheng Zou, Songyang Zhang, Zhi Chen, Hang Yan, Qi Zhang, Qipeng Guo, Kai Chen
- CRUXEVAL-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution
Ruiyang Xu, Jialun Cao, Yaojie Lu, Ming Wen, Hongyu Lin, Xianpei Han, Ben He, Shing-Chi Cheung, Le Sun
- Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs
Haozhen Zhang, Tao Feng, Jiaxuan You
- Rubrik’s Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset
Diana Galvan-Sosa, Gabrielle Gaudeau, Pride Kavumba, Yunmeng Li, Hongyi Gu, Zheng Yuan, Keisuke Sakaguchi, Paula Buttery
- A Dual-Mind Framework for Strategic and Expressive Negotiation Agent
Yutong Liu, Lida Shi, Rui Song, Hao Xu
- Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models
Junjie Wu, Gefei Gu, Yanan Zheng, Dit-Yan Yeung, Arman Cohan
- Revisiting Scaling Laws for Language Models: The Role of Data Quality and Training Strategies
Zhengyu Chen, Siqi Wang, Teng Xiao, Yudong Wang, Shiqi Chen, Xunliang Cai, Junxian He, Jingang Wang
- Limited Generalizability in Argument Mining: State-Of-The-Art Models Learn Datasets, Not Arguments
Marc Feger, Katarina Boland, Stefan Dietze
- Enhancing Machine Translation with Self-Supervised Preference Data
Haoxiang Sun, Ruize Gao, Pei Zhang, Baosong Yang, Rui Wang
- Unveil: Unified Visual-Textual Integration and Distillation for Multi-modal Document Retrieval
Hao Sun, Yingyan Hou, Jiayan Guo, Bo Wang, Chunyu Yang, Jinsong Ni, Yan Zhang
- Don’t Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls
Ante Wang, Linfeng Song, Ye Tian, Dian Yu, Haitao Mi, Xiangyu Duan, Zhaopeng Tu, Jinsong Su, Dong Yu
- MEXMA: Token-level objectives improve sentence representations
João Maria Janeiro, Benjamin Piwowarski, Patrick Gallinari, Loic Barrault
- Sparse-to-Dense: A Free Lunch for Lossless Acceleration of Video Understanding in LLMs
Xuan Zhang, Cunxiao Du, Sicheng Yu, Jiawei Wu, Fengzhuo Zhang, Wei Gao, Qian Liu
- Uncertainty-Aware Iterative Preference Optimization for Enhanced LLM Reasoning
Lei Li, Hehuan Liu, Yaxin Zhou, ZhaoYang Gui, Xudong Weng, Yi Yuan, Zheng Wei, Zang Li
- AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration
Zhexuan Wang, Yutong Wang, Xuebo Liu, Liang Ding, Miao Zhang, Jie Liu, Min Zhang
- Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States
Yang Xiao, Jiashuo Wang, Qiancheng Xu, Changhe Song, Chunpu Xu, Yi Cheng, Wenjie Li, Pengfei Liu
- M-IFEval: On Multilingual Instruction-Following Capability of Large Language Models
Bo Zeng, Chenyang Lyu, Sinuo Liu, Mingyan Zeng, Minghao Wu, Xuanfan Ni, Tianqi Shi, Yu Zhao, Yefeng Liu, Chenyu Zhu, Ruizhe Li, Jiahui Geng, Qing Li, Yu Tong, Longyue Wang, Weihua Luo, Kaifu Zhang
- Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
Andrea Santilli, Adam Golinski, Michael Kirchhof, Federico Danieli, Arno Blaas, Miao Xiong, Luca Zappella, Sinead Williamson
- Representation Bending for Large Language Model Safety
Ashkan Yousefpour, Taeheon Kim, Ryan Sungmo Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Alvin Wan, Harrison Ngan, Youngjae Yu, Jonghyun Choi
- Analyzing LLMs’ Cognition of Knowledge Boundary Across Languages Through the Lens of Internal Representation
Chenghao Xiao, Hou Pong Chan, Hao Zhang, Mahani Aljunied, Lidong Bing, Noura Al Moubayed, Yu Rong
- Enhancing Retrieval-Augmented Generation via Evidence Tree Search
Hao Sun, Hengyi Cai, Yuchen Li, Xuanbo Fan, Xiaochi Wei, Shuaiqiang Wang, Yan Zhang, Dawei Yin
- HalluLens: LLM Hallucination Benchmark
Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, Pascale Fung
- DEEPER Insight into Your User: Directed Persona Refinement for Dynamic Persona Modeling
Aili Chen, Chengyu Du, Jiangjie Chen, Jinghan Xu, Yikai Zhang, Siyu Yuan, Zulong Chen, Liangyue Li, Yanghua Xiao
- Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models
Jie Liu, Wenxuan Wang, SU Yihang, Jingyuan Huang, Yudi Zhang, Cheng-Yi Li, Wenting Chen, Xiaohan Xing, Kao-Jung Chang, Linlin Shen, Michael R. Lyu
- InstructPart: Task-Oriented Part Segmentation with Instruction Reasoning
Zifu Wan, Yaqi Xie, Ce Zhang, Zhiqiu Lin, Zihan Wang, Simon Stepputtis, Deva Ramanan, Katia P. Sycara
- GRaMPa: Subword Regularisation by Skewing Uniform Segmentation Distributions with an Efficient Path-counting Markov Model
Thomas Bauwens, David Kaczér, Miryam de Lhoneux
- Evaluating the Evaluation of Diversity in Commonsense Generation
Tianhui Zhang, Bei Peng, Danushka Bollegala
- Generate First, Then Sample: Enhancing Fake News Detection with LLM-Augmented Reinforced Sampling
Zhao Tong, Yimeng Gu, Huidong Liu, Qiang Liu, Shu Wu, Haichao Shi, Xiao-Yu Zhang
- ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data
Yu Zhang, Ruijie Yu, Jidong Tian, Feng Zhu, Jiapeng Liu, Xiaokang Yang, Yaohui Jin, Yanyan Xu
- Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception
Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng
- ALGEN: Few-shot Inversion Attacks on Textual Embeddings using Alignment and Generation
Yiyi Chen, Qiongkai Xu, Johannes Bjerva
- Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains
Kun Li, Tianhua Zhang, Xixin Wu, Hongyin Luo, James R. Glass, Helen M. Meng
- STaR-SQL: Self-Taught Reasoner for Text-to-SQL
Mingqian He, Yongliang Shen, Wenqi Zhang, Qiuying Peng, Jun Wang, Weiming Lu
- Fairness Beyond Performance: Investigating Reliability Disparities Across Groups in Legal NLP
Santosh T.Y.S.S, Irtiza Chowdhury
- Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection
Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Qing Yang, Dongchen Li, Bing Qin, Ting Liu
- FastMCTS: A Simple Sampling Strategy for Data Synthesis
Peiji Li, Kai Lv, Yunfan Shao, Yichuan Ma, Linyang Li, Xiaoqing Zheng, Xipeng Qiu, Qipeng Guo
- Dialogue-RAG: Enhancing Retrieval for LLMs via Node-Linking Utterance Rewriting
Qiwei Li, Teng Xiao, Zuchao Li, Ping Wang, Mengjia Shen, Hai Zhao
- Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent
Ethan Wilcox, Cui Ding, Giovanni Acampa, Tiago Pimentel, Alex Warstadt, Tamar I Regev
- Evaluating LLMs for Portuguese Sentence Simplification with Linguistic Insights
Arthur Mariano Rocha de Azevedo Scalercio, Elvis A. de Souza, Maria José Bocorny Finatto, Aline Paes
- LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models
Hugo Pitorro, Marcos Vinicius Treviso
- Memorization Inheritance in Sequence-Level Knowledge Distillation for Neural Machine Translation
Verna Dankers, Vikas Raunak
- Improving Low-Resource Morphological Inflection via Self-Supervised Objectives
Adam Wiemerslage, Katharina von der Wense
- Don’t Reinvent the Wheel: Efficient Instruction-Following Text Embedding based on Guided Space Transformation
Yingchaojie Feng, Yiqun Sun, Yandong Sun, Minfeng Zhu, Qiang Huang, Anthony Kum Hoe Tung, Wei Chen
- BOOKCOREF: Coreference Resolution at Book Scale
Giuliano Martinelli, Tommaso Bonomo, Pere-Lluís Huguet Cabot, Roberto Navigli
- OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval
Wei Yang, Jingjing Fu, Rui Wang, Jinyu Wang, Lei Song, Jiang Bian
- Alleviating Hallucinations from Knowledge Misalignment in Large Language Models via Selective Abstention Learning
Lei Huang, Xiaocheng Feng, Weitao Ma, Yuchun Fan, Xiachong Feng, Yuxuan Gu, Yangfan Ye, Liang Zhao, Weihong Zhong, Baoxin Wang, Dayong Wu, Guoping Hu, Lingpeng Kong, Tong Xiao, Ting Liu, Bing Qin
- Retrospective Learning from Interactions
Zizhao Chen, Mustafa Omer Gul, Yiwei Chen, Gloria Geng, Anne Wu, Yoav Artzi
- Personalized Generation In Large Model Era: A Survey
Yiyan Xu, Jinghao Zhang, Alireza Salemi, Xinting Hu, Wenjie Wang, Fuli Feng, Hamed Zamani, Xiangnan He, Tat-Seng Chua
- Graph Counselor: Adaptive Graph Exploration via Multi-Agent Synergy to Enhance LLM Reasoning
Junqi Gao, Xiang Zou, Ying Ai, Dong Li, Yichen Niu, Biqing Qi, Jianxing Liu
- SOTOPIA-Ω: Dynamic Strategy Injection Learning and Social Instruction Following Evaluation for Social Agents
Wenyuan Zhang, Tianyun Liu, Mengxiao Song, Xiaodong Li, Tingwen Liu
- Can Language Models Replace Programmers? REPOCOD Says ‘Not Yet’
Shanchao Liang, Nan Jiang, Yiran Hu, Lin Tan
- Leveraging In-Context Learning for Political Bias Testing of LLMs
Patrick Haller, Jannis Vamvas, Rico Sennrich, Lena Ann Jäger
- CoRet: Improved Retriever for Code Editing
Fabio James Fehr, Prabhu Teja S, Luca Franceschi, Giovanni Zappella
- ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting
Steven H Wang, Maksim Zubkov, Kexin Fan, Sarah Harrell, Yuyang Sun, Wei Chen, Andreas Plesner, Roger Wattenhofer
- LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts
Qibing Ren, Hao Li, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao
- WAFFLE: Fine-tuning Multi-Modal Model for Automated Front-End Development
Shanchao Liang, Nan Jiang, Shangshu Qian, Lin Tan
- Math Neurosurgery: Isolating Language Models’ Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ, Zachary Gottesman, Jonathan Kropko, Thomas Hartvigsen
- Multiple LLM Agents Debate for Equitable Cultural Alignment
Dayeon Ki, Rachel Rudinger, Tianyi Zhou, Marine Carpuat
- RefreshKV: Updating Small KV Cache During Long-form Generation
Fangyuan Xu, Tanya Goyal, Eunsol Choi
- SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings
Weikai Lu, Hao Peng, Huiping Zhuang, Cen Chen, Ziqian Zeng
- Has Machine Translation Evaluation Achieved Human Parity? The Human Reference and the Limits of Progress
Lorenzo Proietti, Stefano Perrella, Roberto Navigli
- Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective
Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Mahmoud Khademi, Hany Hassan Awadalla, Junjie Wang, Yujiu Yang, Furu Wei
- Language Models Grow Less Humanlike beyond Phase Transition
Tatsuya Aoyama, Ethan Wilcox
- PCoT: Persuasion-Augmented Chain of Thought for Detecting Fake News and Social Media Disinformation
Arkadiusz Modzelewski, Witold Sosnowski, Tiziano Labruna, Adam Wierzbicki, Giovanni Da San Martino
- Coordinating Chaos: A Structured Review of Linguistic Coordination Methodologies
Benjamin Roger Litterer, David Jurgens, Dallas Card
- iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News
Tiancheng Hu, Nigel Collier
- Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures
Akhila Yerukola, Saadia Gabriel, Nanyun Peng, Maarten Sap
- 500xCompressor: Generalized Prompt Compression for Large Language Models
Zongqian Li, Yixuan Su, Nigel Collier
- Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models
James Flemings, Bo Jiang, Wanrong Zhang, Zafar Takhirov, Murali Annavaram
- Document-Level Event-Argument Data Augmentation for Challenging Role Types
Joseph Gatto, Omar Sharif, Parker Seegmiller, Sarah Masud Preum
- Mapping the Podcast Ecosystem with the Structured Podcast Research Corpus
Benjamin Roger Litterer, David Jurgens, Dallas Card
- Unravelling the Logic: Investigating the Generalisation of Transformers in Numerical Satisfiability Problems
Tharindu Madusanka, Marco Valentino, Iqra Zahid, Ian Pratt-Hartmann, Riza Batista-Navarro
- The Nature of NLP: Analyzing Contributions in NLP Papers
Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych
- $\mathtt{GeLLM^3O}$: Generalizing Large Language Models for Multi-property Molecule Optimization
Vishal Dey, Xiao Hu, Xia Ning
- Diffusion Directed Acyclic Transformer for Non-Autoregressive Machine Translation
Quan Nguyen-Tri, Cong Dao Tran, Hoang Thanh-Tung
- Follow-up Question Generation For Enhanced Patient-Provider Conversations
Joseph Gatto, Parker Seegmiller, Timothy E. Burdick, Inas S. Khayal, Sarah DeLozier, Sarah Masud Preum
- Unveiling Privacy Risks in LLM Agent Memory
Bo Wang, Weiyi He, Shenglai Zeng, Zhen Xiang, Yue Xing, Jiliang Tang, Pengfei He
- Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation
Emmanouil Zaranis, Giuseppe Attanasio, Sweta Agrawal, Andre Martins
- Language Constrained Multimodal Hyper Adapter For Many-to-Many Multimodal Summarization
Nayu Liu, Fanglong Yao, Haoran Luo, Yong Yang, Chen Tang, Bo Lv
- PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models
Mingyang Song, Zhaochen Su, Xiaoye Qu, Jiawei Zhou, Yu Cheng
- Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets
Dongyue Li, Ziniu Zhang, Lu Wang, Hongyang R. Zhang
- Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles
Munachiso S Nwadike, Zangir Iklassov, Toluwani Aremu, Tatsuya Hiraoka, Benjamin Heinzerling, Velibor Bojkovic, Hilal AlQuabeh, Martin Takáč, Kentaro Inui
- Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models
Lang Gao, Jiahui Geng, Xiangliang Zhang, Preslav Nakov, Xiuying Chen
- ASPERA: A Simulated Environment to Evaluate Planning for Complex Action Execution
Alexandru Coca, Mark Gaynor, Zhenxing Zhang, Jianpeng Cheng, Bo-Hsiang Tseng, Peter Boothroyd, Hector Martinez Alonso, Diarmuid O Seaghdha, Anders Johannsen
- ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework
Jiahao Yuan, Zixiang Di, Zhiqing Cui, Guisong Yang, Usman Naseem
- SARA: Salience-Aware Reinforced Adaptive Decoding for Large Language Models in Abstractive Summarization
Nayu Liu, Junnan Zhu, Yiming Ma, Zhicong Lu, Wenlei Xu, Yong Yang, Jiang Zhong, Kaiwen Wei
- Embedding-Converter: A Unified Framework for Cross-Model Embedding Transformation
Jinsung Yoon, Sercan O Arik
- Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge
Md Tahmid Rahman Laskar, Israt Jahan, Elham Dolatabadi, Chun Peng, Enamul Hoque, Jimmy Huang
- Answering Complex Geographic Questions by Adaptive Reasoning with Visual Context and External Commonsense Knowledge
Fan Li, Jianxing Yu, Jielong Tang, Wenqing Chen, Hanjiang Lai, Yanghui Rao, Jian Yin
- Efficient Knowledge Editing via Minimal Precomputation
Akshat Gupta, Maochuan Lu, Thomas Hartvigsen, Gopala Anumanchipalli
- Safety Alignment via Constrained Knowledge Unlearning
Zesheng Shi, Yucheng Zhou, Jing Li, Yuxin Jin, Yu Li, Daojing He, Fangming Liu, Saleh Alharbi, Jun Yu, Min Zhang
- Response Wide Shut: Surprising Observations in Basic Vision Language Model Capabilities
Shivam Chandhok, Wan-Cyuan Fan, Vered Shwartz, Vineeth N. Balasubramanian, Leonid Sigal
- EffiVLM-Bench: A Comprehensive Benchmark for Evaluating Training-Free Acceleration in Large Visual-Language Models
Zekun Wang, MingHua Ma, Zexin Wang, Rongchuan Mu, Hongtao Liu, Liping Shan, Ming Liu, Bing Qin
- Pre-Training Curriculum for Multi-Token Prediction in Language Models
Ansar Aynetdinov, Alan Akbik
- Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks
Xingxuan Li, Weiwen Xu, Ruochen Zhao, Fangkai Jiao, Shafiq Joty, Lidong Bing
- On Many-Shot In-Context Learning for Long-Context Evaluation
Kaijian Zou, Muhammad Khalifa, Lu Wang
- Meaning Variation and Data Quality in the Corpus of Founding Era American English
Dallas Card
- Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks
Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Daniel Egert, Ellie Evans, Hoo-Chang Shin, Felipe Soares, Yi Dong, Oleksii Kuchaiev
- CulturalBench: A Robust, Diverse and Challenging Benchmark for Measuring LMs’ Cultural Knowledge Through Human-AI Red-Teaming
Yu Ying Chiu, Liwei Jiang, Bill Yuchen Lin, Chan Young Park, Shuyue Stella Li, Sahithya Ravi, Mehar Bhatia, Maria Antoniak, Yulia Tsvetkov, Vered Shwartz, Yejin Choi
- Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning
Mohit Raghavendra, Junmo Kang, Alan Ritter
- All That Glitters is Not Novel: Plagiarism in AI Generated Research
Tarun Gupta, Danish Pruthi
- Writing Like the Best: Exemplar-Based Expository Text Generation
Yuxiang Liu, Kevin Chen-Chuan Chang
- Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach
Rochana Chaturvedi, Peyman Baghershahi, Sourav Medya, Barbara Di Eugenio
- Finding A Voice: Exploring the Potential of African American Dialect and Voice Generation for Chatbots
Sarah E. Finch, Ellie S. Paek, Ikseon Choi, Jinho D. Choi
- Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer’s Disease Detection
Chuyuan Li, Raymond Li, Thalia S. Field, Giuseppe Carenini
- Help Me Write a Story: Evaluating LLMs’ Ability to Generate Writing Feedback
Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata
- Language Fusion for Parameter-Efficient Cross-lingual Transfer
Philipp Borchert, Ivan Vulić, Marie-Francine Moens, Jochen De Weerdt
- Culture is Not Trivia: Sociocultural Theory for Cultural NLP
Naitian Zhou, David Bamman, Isaac L. Bleaman
- AAD-LLM: Neural Attention-Driven Auditory Scene Understanding
Xilin Jiang, Sukru Samet Dindar, Vishal Choudhari, Stephan Bickel, Ashesh Mehta, Guy M McKhann, Daniel Friedman, Adeen Flinker, Nima Mesgarani
- MindRef: Mimicking Human Memory for Hierarchical Reference Retrieval with Fine-Grained Location Awareness
Ye Wang, Xinrun Xu, Zhiming Ding
- Do Language Models Have Semantics? On the Five Standard Positions
Anders Søgaard
- Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems
Myra Cheng, Su Lin Blodgett, Alicia DeVrio, Lisa Egede, Alexandra Olteanu
- Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users
Antonia Karamolegkou, Malvina Nikandrou, Georgios Pantazopoulos, Danae Sanchez Villegas, Phillip Rust, Ruchira Dhar, Daniel Hershcovich, Anders Søgaard
- HumT DumT: Measuring and controlling human-like language in LLMs
Myra Cheng, Sunny Yu, Dan Jurafsky
- ChatBench: From Static Benchmarks to Human-AI Evaluation
Serina Chang, Ashton Anderson, Jake M. Hofman
- LLMs syntactically adapt their language use to their conversational partner
Florian Kandra, Vera Demberg, Alexander Koller
- Teaching an Old LLM Secure Coding: Localized Preference Optimization on Distilled Preferences
Mohammad Saqib Hasan, Saikat Chakraborty, Santu Karmaker, Niranjan Balasubramanian
- Anything Goes? A Crosslinguistic Study of (Im)possible Language Learning in LMs
Xiulin Yang, Tatsuya Aoyama, Yuekun Yao, Ethan Wilcox
- Ranking Unraveled: Recipes for LLM Rankings in Head-to-Head AI Combat
Roland Daynauth, Christopher Clarke, Krisztian Flautner, Lingjia Tang, Jason Mars
- LLM Agents Making Agent Tools
Georg Wölflein, Dyke Ferber, Daniel Truhn, Ognjen Arandjelovic, Jakob Nikolas Kather
- CrafText Benchmark: Advancing Language Grounding in Complex Multimodal Open-Ended World
Zoya Volovikova, Gregory Gorbov, Petr Kuderov, Aleksandr Panov, Alexey Skrynnik
- QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation
Bang Nguyen, Tingting Du, Mengxia Yu, Lawrence Angrave, Meng Jiang
- Causal Graph based Event Reasoning using Semantic Relation Experts
Mahnaz Koupaee, Xueying Bai, Mudan Chen, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian
- LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
Jin Jiang, Yuchen Yan, Yang Liu, Jianing Wang, Shuai Peng, Xunliang Cai, Yixin Cao, Mengdi Zhang, Liangcai Gao
- Do LLMs Understand Dialogues? A Case Study on Dialogue Acts
Ayesha Qamar, Jonathan Tong, Ruihong Huang
- Research Borderlands: Analysing Scientific Writing Across Research Cultures
Shaily Bhatt, Tal August, Maria Antoniak
- CEAES: Bidirectional Reinforcement Learning Optimization for Consistent and Explainable Essay Assessment
Xia Li, Wenjing Pan
- DeAL: Decoding-time Alignment Framework for Large Language Models
James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth
- Cultural Bias Matters: A Cross-Cultural Benchmark Dataset and Sentiment-Enriched Model for Understanding Multimodal Metaphors
Senqi Yang, Dongyu Zhang, Jing Ren, Ziqi Xu, Xiuzhen Zhang, Yiliao Song, Hongfei Lin, Feng Xia
- OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction
Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li
- Mixtures of In-Context Learners
Giwon Hong, Emile van Krieken, Edoardo Ponti, Nikolay Malkin, Pasquale Minervini
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation
Yuxuan Zhou, Margret Keuper, Mario Fritz
- RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection
Wenjun Hou, Yi Cheng, Kaishuai Xu, Heng Li, Yan Hu, Wenjie Li, Jiang Liu
- Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates
Jaewoo Ahn, Heeseung Yun, Dayoon Ko, Gunhee Kim
- Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models
Rishabh Adiga, Besmira Nushi, Varun Chandrasekaran
- MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming
Weiyang Guo, Jing Li, Wenya Wang, Yu Li, Daojing He, Jun Yu, Min Zhang
- The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit
Huixue Zhou, Hengrui Gu, Zaifu Zhan, Xi Liu, Kaixiong Zhou, Yongkang Xiao, Mingfu Liang, Srinivas Prasad Govindan, Piyush Chawla, Jiyan Yang, Xiangfei Meng, Huayu Li, Buyun Zhang, Liang Luo, Wen-Yen Chen, Yiping Han, Bo Long, Rui Zhang, Tianlong Chen
- Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging
Haobo Zhang, Jiayu Zhou
- BIG-Bench Extra Hard
Mehran Kazemi, Bahare Fatemi, Hritik Bansal, John Palowitch, Chrysovalantis Anastasiou, Sanket Vaibhav Mehta, Lalit K Jain, Virginia Aglietti, Disha Jindal, Peter Chen, Nishanth Dikkala, Gladys Tyen, Xin Liu, Uri Shalit, Silvia Chiappa, Kate Olszewska, Yi Tay, Vinh Q. Tran, Quoc V Le, Orhan Firat
- CSTree-SRI: Introspection-Driven Cognitive Semantic Tree for Multi-Turn Question Answering over Extra-Long Contexts
Zhaowen Wang, Xiang Wei, Kangshao Du, Yiting Zhang, Libo Qin, Yingjie Xia, Li Kuang
- TigerLLM - A Family of Bangla Large Language Models
Nishat Raihan, Marcos Zampieri
- InductionBench: LLMs Fail in the Simplest Complexity Class
Wenyue Hua, Tyler Wong, Fei Sun, Liangming Pan, Adam Jardine, William Yang Wang
- RATIONALYST: Pre-training Process-Supervision for Improving Reasoning
Dongwei Jiang, Guoxuan Wang, Yining Lu, Andrew Wang, Jingyu Zhang, Chuyu Liu, Benjamin Van Durme, Daniel Khashabi
- Make Imagination Clearer! Stable Diffusion-based Visual Imagination for Multimodal Machine Translation
Andong Chen, Yuchen Song, Kehai Chen, Xuefeng Bai, Muyun Yang, Liqiang Nie, Jie Liu, Tiejun Zhao, Min Zhang
- Advancing SMoE for Continuous Domain Adaptation of MLLMs: Adaptive Router and Domain-Specific Loss
Liang Zhang, Ziyao Lu, Fandong Meng, Hui Li, Jie Zhou, Jinsong Su
- Mitigating Media Bias through Multi-document Events Reasoning in LLMs
Yuanyuan Lei, Ruihong Huang
- Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection
Jiatao Li, Xiaojun Wan
- RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates
Md Kowsher, Tara Esmaeilbeig, Chun-Nam Yu, Chen Chen, Mojtaba Soltanalian, Niloofar Yousefi
- Scaling Laws and Efficient Inference for Ternary Language Models
Tejas Vaidhya, Ayush Kaushal, Vineet Jain, Francis Couture-Harpin, Prashant Shishodia, Majid Behbahani, Irina Rish, Yuriy Nevmyvaka
- Exploring the Impact of Instruction-Tuning on LLM’s Susceptibility to Misinformation
Kyubeen Han, Junseo Jang, Hongjin Kim, Geunyeong Jeong, Harksoo Kim
- Do Language Models Understand Honorific Systems in Javanese?
Mohammad Rifqi Farhansyah, Iwan Darmawan, Adryan Kusumawardhana, Genta Indra Winata, Alham Fikri Aji, Derry Tanti Wijaya
- Generative Reward Modeling via Synthetic Criteria Preference Learning
Xiaobo Liang, Haoke Zhang, Juntao Li, Kehai Chen, Qiaoming Zhu, Min Zhang
- Relation Extraction of Hierarchical Tables Using Multimodal Large Language Models
Xinyu Zhang, Aibo Song, Jingyi Qiu, Jiahui Jin, Tianbo Zhang, Xiaolin Fang
- A Self-Denoising Model for Robust Few-Shot Relation Extraction
Liang Zhang, Yang Zhang, Ziyao Lu, Fandong Meng, Jie Zhou, Jinsong Su
- QuASAR: A Question-Driven Structure-Aware Approach for Table-to-Text Generation
WeiJie Liu, Yibin Zheng, Fang Kong
- Automated Structured Radiology Report Generation
Jean-Benoit Delbrouck, Justin Xu, Johannes Moll, Alois Thomas, Zhihong Chen, Maya Varma, Asfandyar Azhar, Sophie Ostmeier, Andrew Johnston, Eduardo Pontes Reis, Christian Bluethgen, Mohamed S Muneer, Kelvin Zhenghao Li, Curtis Langlotz
- LPOI: Listwise Preference Optimization for Vision Language Models
Fatemeh Pesaran zadeh, Yoojin Oh, Gunhee Kim
- Predicting Through Generation: Why Generation Is Better for Prediction
Md Kowsher, Nusrat Jahan Prottasha, Prakash Bhat, Chun-Nam Yu, Mojtaba Soltanalian, Ivan Garibay, Ozlem Garibay, Chen Chen, Niloofar Yousefi
- “Give Me BF16 or Give Me Death”? Accuracy-Performance Trade-Offs in LLM Quantization
Eldar Kurtic, Alexandre Noll Marques, Shubhra Pandit, Mark Kurtz, Dan Alistarh
- StitchLLM: Serving LLMs, One Block at a Time
Bodun Hu, Shuozhe Li, Saurabh Agarwal, Myungjin Lee, Akshay Jajoo, Jiamin Li, Le Xu, Geon-Woo Kim, Donghyun Kim, Hong Xu, Amy Zhang, Aditya Akella
- Walk in Others’ Shoes with a Single Glance: Human-Centric Visual Grounding with Top-View Perspective Transformation
Yuqi Bu, Xin Wu, Zirui Zhao, Yi Cai, David Hsu, Qiong Liu
- From Citations to Criticality: Predicting Legal Decision Influence in the Multilingual Swiss Jurisprudence
Ronja Stern, Ken Kawamura, Matthias Stürmer, Ilias Chalkidis, Joel Niklaus
- Is linguistically-motivated data augmentation worth it?
Ray Groshan, Michael Ginn, Alexis Palmer
- From Lists to Emojis: How Format Bias Affects Model Alignment
Xuanchang Zhang, Wei Xiong, Lichang Chen, Tianyi Zhou, Heng Huang, Tong Zhang
- Colloquial Singaporean English Style Transfer with Fine-Grained Explainable Control
Jinggui Liang, Dung Vo, Yap Hong Xian, Hai Leong Chieu, Kian Ming A. Chai, Jing Jiang, Lizi Liao
- From Informal to Formal – Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs
Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung, Cong Tian
- CoAM: Corpus of All-Type Multiword Expressions
Yusuke Ide, Joshua Tanner, Adam Nohejl, Jacob Hoffman, Justin Vasselli, Hidetaka Kamigaito, Taro Watanabe
- SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation
Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Liu Weichuan, Lei Hou, Juanzi Li
- Exposing the Achilles’ Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning
Joykirat Singh, Akshay Nambi, Vibhav Vineet
- Revisiting LLMs as Zero-Shot Time Series Forecasters: Small Noise Can Break Large Models
Junwoo Park, Hyuck Lee, Dohyun Lee, Daehoon Gwak, Jaegul Choo
- Transferring Textual Preferences to Vision-Language Understanding through Model Merging
Chen-An Li, Tzu-Han Lin, Yun-Nung Chen, Hung-yi Lee
- Understanding the Dark Side of LLMs’ Intrinsic Self-Correction
Qingjie Zhang, Di Wang, Haoting Qian, Yiming Li, Tianwei Zhang, Minlie Huang, Han Qiu
- VideoVista2: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension
Xinyu Chen, Yunxin Li, Haoyuan Shi, Baotian Hu, Wenhan Luo, Yaowei Wang, Min Zhang
- What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices
Zhi Chen, Qiguang Chen, Libo Qin, Qipeng Guo, Haijun Lv, Yicheng Zou, Hang Yan, Kai Chen, Dahua Lin
- Knowledge Graph Retrieval-Augmented Generation for LLM-based Recommendation
Shijie Wang, Wenqi Fan, Yue Feng, Lin Shanru, Xinyu Ma, Shuaiqiang Wang, Dawei Yin
- SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment
Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen
- ProgCo: Program Helps Self-Correction of Large Language Models
Xiaoshuai Song, Yanan Wu, Weixun Wang, Jiaheng Liu, Wenbo Su, Bo Zheng
- I0T: Embedding Standardization Method Towards Zero Modality Gap
Na Min An, Eunki Kim, James Thorne, Hyunjung Shim
- Leveraging Self-Attention for Input-Dependent Soft Prompting in LLMs
Ananth Muppidi, Abhilash Nandy, Sambaran Bandyopadhyay
- Odysseus Navigates the Sirens’ Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation
Wen Luo, Feifan Song, Wei Li, Guangyue Peng, Shaohang Wei, Houfeng Wang
- Better Embeddings with Coupled Adam
Felix Stollenwerk, Tobias Stollenwerk
- Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation
Guofu Xie, Xiao Zhang, Ting Yao, Yunsheng Shi
- Controllable and Reliable Knowledge-Intensive Task Agents with Declarative GenieWorksheets
Harshit Joshi, Shicheng Liu, James Chen, Larsen Weigle, Monica Lam
- Benchmarking Long-Context Language Models on Long Code Understanding
Jia Li, Xuyuan Guo, Lei Li, Kechi Zhang, Ge Li, Jia Li, Zhengwei Tao, Fang Liu, Chongyang Tao, Yuqi Zhu, Zhi Jin
- MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities
Savya Khosla, Aditi Tiwari, Kushal Kafle, Simon Jenni, Handong Zhao, John Collomosse, Jing Shi
- Internal Value Alignment in Large Language Models through Controlled Value Vector Activation
Haoran Jin, Meng Li, Xiting Wang, Zhihao Xu, Minlie Huang, Yantao Jia, Defu Lian
- A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability
Xinyu Hu, Mingqi Gao, Li Lin, Zhenghan Yu, Xiaojun Wan
- Recurrent Knowledge Localization and Fusion for Language Model Continual Learning
Yujie Feng, Xujia Wang, Zexin Lu, Fu Shenghong, Guangyuan Shi, Yongxin Xu, Yasha Wang, Philip S. Yu, Xu Chu, Xiao-Ming Wu
- Data-Constrained Synthesis of Training Data for De-Identification
Thomas Vakili, Aron Henriksson, Hercules Dalianis
- Just a Scratch: Enhancing LLM Capabilities for Self-harm Detection through Intent Differentiation and Emoji Interpretation
Soumitra Ghosh, Gopendra Vikram Singh, Shambhavi, Sabarna Choudhury, Asif Ekbal
- Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing
Peiming Guo, Meishan Zhang, Jianling Li, Min Zhang, Yue Zhang
- MMDEND: Dendrite-Inspired Multi-Branch Multi-Compartment Parallel Spiking Neuron for Sequence Modeling
Kexin Wang, Yuhong Chou, Di Shang, Shijie Mei, Jiahong Zhang, Yanbin Huang, Man Yao, Bo Xu, Guoqi Li
- Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar
Andrew Gambardella, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
- Understanding Impact of Human Feedback via Influence Functions
Taywon Min, Haeone Lee, Yongchan Kwon, Kimin Lee
- T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts
Ziwei Huang, Wanggui He, Quanyu Long, Yandi Wang, Haoyuan Li, Zhelun Yu, Fangxun Shu, Weilong Dai, Hao Jiang, Leilei Gan, Fei Wu
- InspireDebate: Multi-Dimensional Subjective-Objective Evaluation-Guided Reasoning and Optimization for Debating
Fuyu Wang, Jiangtong Li, Kun Zhu, Changjun Jiang
- WAVE: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Hongming Zhang, Tianqing Fang, Zhenzhong Lan, Dong Yu
- FOCUS: Evaluating Pre-trained Vision-Language Models on Underspecification Reasoning
Kankan Zhou, Eason Lai, Kyriakos Mouratidis, Jing Jiang
- Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
Wan Ju Kang, Eunki Kim, Na Min An, Sangryul Kim, Haemin Choi, Ki Hoon Kwak, James Thorne
- Personal Travel Solver: A Preference-Driven LLM-Solver System for Travel Planning
Zijian Shao, Jiancan Wu, Weijian Chen, Xiang Wang
- Counterspeech the ultimate shield! Multi-Conditioned Counterspeech Generation through Attributed Prefix Learning
Aswini Kumar Padhi, Anil Bandhakavi, Tanmoy Chakraborty
- Unique Hard Attention: A Tale of Two Sides
Selim Jerad, Anej Svete, Jiaoda Li, Ryan Cotterell
- LLM$\times$MapReduce: Simplified Long-Sequence Processing using Large Language Models
Zihan Zhou, Chong Li, 陈昕怡, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun
- CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback
Dennis Hein, Zhihong Chen, Sophie Ostmeier, Justin Xu, Maya Varma, Eduardo Pontes Reis, Arne Edward Michalson MD, Christian Bluethgen, Hyun Joo Shin, Curtis Langlotz, Akshay S Chaudhari
- Knowledge Tracing in Programming Education Integrating Students’ Questions
Doyoun Kim, Suin Kim, Yohan Jo
- PRISM: A Framework for Producing Interpretable Political Bias Embeddings with Political-Aware Cross-Encoder
Yiqun Sun, Qiang Huang, Anthony Kum Hoe Tung, Jun Yu
- Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes
Meng Li, Michael Vrazitulis, David Schlangen
- Lexical Diversity-aware Relevance Assessment for Retrieval-Augmented Generation
Zhange Zhang, Yuqing Ma, Yulong Wang, Shan He, Tianbo Wang, Siqi He, Jiakai Wang, Xianglong Liu
- Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains
Juntian Zhang, Chuanqi Cheng, Yuhan Liu, Wei Liu, Jian Luan, Rui Yan
- Online Iterative Self-Alignment for Radiology Report Generation
Ting Xiao, Lei Shi, Yang Zhang, HaoFeng Yang, Zhe Wang, Chenjia Bai
- Chinese Inertial GAN for Handwriting Signal Generation and Recognition
Yifeng Wang, Yi Zhao
- LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges
Haoyang Li, Huan Gao, Zhiyuan Zhao, Zhiyu Lin, Junyu Gao, Xuelong Li
- Evaluating Sequence Labeling on the basis of Information Theory
Enrique Amigo, Elena Álvarez-Mellado, Julio Gonzalo, Jorge Carrillo-de-Albornoz
- GRAT: Guiding Retrieval-Augmented Reasoning through Process Rewards Tree Search
Xianshu Peng, Wei Wei
- T-REG: Preference Optimization with Token-Level Reward Regularization
Wenxuan Zhou, Shujian Zhang, Lingxiao Zhao, Tao Meng
- Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding
Keqin Peng, Liang Ding, Yuanxin Ouyang, Meng Fang, Yancheng Yuan, Dacheng Tao
- Gödel Agent: A Self-Referential Agent Framework for Recursively Self-Improvement
Xunjian Yin, Xinyi Wang, Liangming Pan, Li Lin, Xiaojun Wan, William Yang Wang
- AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments
Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
- Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
Yexiang Liu, Zekun Li, Zhi Fang, Nan Xu, Ran He, Tieniu Tan
- Learnability on the Information-Theoretic Continuum: Inductive Bias for Information Locality in Neural Language Models
Taiga Someya, Anej Svete, Brian DuSell, Timothy J. O’Donnell, Mario Giulianelli, Ryan Cotterell
- Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models
Adrián Bazaga, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert
- Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies
Massimiliano Pronesti, Joao H Bettencourt-Silva, Paul Flanagan, Alessandra Pascale, Oisín Redmond, Anya Belz, Yufang Hou
- Towards Robust Universal Information Extraction: Dataset, Evaluation, and Solution
Jizhao Zhu, Akang Shi, Zixuan Li, Long Bai, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
- Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation
Huiyuan Lai, Esther Ploeger, Rik van Noord, Antonio Toral
- Temporal reasoning for timeline summarisation in social media
Jiayu Song, Mahmud Elahi Akhter, Dana Atzil-Slonim, Maria Liakata
- Beyond Negative Stereotypes – Non-Negative Abusive Utterances about Identity Groups and Their Semantic Variants
Tina Lommel, Elisabeth Eder, Josef Ruppenhofer, Michael Wiegand
- Persistent Homology of Topic Networks for the Prediction of Reader Curiosity
Manuel D.S. Hopp, Vincent Labatut, Arthur Amalvy, Richard Dufour, Hannah Stone, Hayley K Jach, Kou Murayama
- Tokenisation is NP-Complete
Philip Whittington, Gregor Bachmann, Tiago Pimentel
- Understanding Language Model Scaling Laws in Terms of Training Dynamics via Loss Deceleration and Zero-Sum Learning
Andrei Mircea, Ekaterina Lobacheva, Supriyo Chakraborty, Nima Chitsazan, Irina Rish
- Parameter-Aware Contrastive Knowledge Editing: Tracing and Rectifying based on Critical Transmission Paths
Songlin Zhai, Yuan Meng, Yuxin Zhang, Guilin Qi
- Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System
Haoyang Su, Renqi Chen, Shixiang Tang, Zhenfei Yin, Xinzhe Zheng, Jinzhe Li, Biqing Qi, Qi Wu, Hui Li, Wanli Ouyang, Philip Torr, Bowen Zhou, Nanqing Dong
- Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking
Yilong Chen, Junyuan Shang, Zhenyu Zhang, Yanxi Xie, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang
- Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport
Yuu Jinnai
- Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport
Minseok Choi, Daniel Rim, Dohyun Lee, Jaegul Choo
- Mixture of Small and Large Models for Chinese Spelling Check
Ziheng Qiao, Houquan Zhou, Zhenghua Li
- DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check
Ziheng Qiao, Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang
- The Causal Effect of Merge Operations in Bottom-up Tokenisers
Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel
- Value Residual Learning
Zhanchao Zhou, Tianyi Wu, Zhiyun Jiang, Fares Obeid, Zhenzhong Lan
- SGIC: A Self-Guided Iterative Calibration Framework for RAG
Guanhua Chen, Yutong Yao, Lidia S. Chao, Xuebo Liu, Derek F. Wong
- NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts
Muhammad Farid Adilazuarda, Musa Izzanardi Wijanarko, Lucky Susanto, Khumaisa Nur’aini, Derry Tanti Wijaya, Alham Fikri Aji
- LLM-based Rumor Detection via Influence Guided Sample Selection and Game-based Perspective Analysis
Zhiliang Tian, Jingyuan Huang, Zejiang He, Zhen Huang, Menglong Lu, Linbo Qiao, Songzhu Mei, Yijie Wang, Dongsheng Li
- Hierarchical-Task-Aware Multi-modal Mixture of Incremental LoRA Experts for Embodied Continual Learning
Ziqi Jia, Anmin Wang, Xiaoyang Qu, Xiaowen Yang, Jianzong Wang
- SpindleKV: A Novel KV Cache Reduction Method Balancing Both Shallow and Deep Layers
Zicong Tang, Shi Luohe, Zuchao Li, Baoyuan Qi, Liu Guoming, Lefei Zhang, Ping Wang
- Medical Graph RAG: Evidence-based Medical Large Language Model via Graph Retrieval-Augmented Generation
Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, Yueming Jin, Vicente Grau
- Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models
Seungcheol Park, Jeongin Bae, Beomseok Kwon, Minjun Kim, Byeongwook Kim, Se Jung Kwon, U Kang, Dongsoo Lee
- Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools
Junde Wu, Jiayuan Zhu, Yuyuan Liu, Min Xu, Yueming Jin
- Probing Relative Interaction and Dynamic Calibration in Multi-modal Entity Alignment
Chenxiao Li, Jingwei Cheng, Qiang Tong, Fu Zhang, Cairui Wang
- Learn to Memorize: Scalable Continual Learning in Semiparametric Language Models with Mixture-of-Neighbors Induction Memory
Guangyue Peng, Tao Ge, Wen Luo, Wei Li, Houfeng Wang
- Adverse Event Extraction from Discharge Summaries: A New Dataset, Annotation Scheme, and Initial Findings
Imane Guellil, Salomé Andres, Atul Anand, Bruce Guthrie, Huayu Zhang, Abul Hasan, Honghan Wu, Beatrice Alex
- Speed Up Your Code: Progressive Code Acceleration Through Bidirectional Tree Editing
Longhui Zhang, Jiahao Wang, Meishan Zhang, GaoXiong Cao, Ensheng Shi, mayuchi, Jun Yu, Honghai Liu, Jing Li, Min Zhang
- Multi-Facet Blending for Faceted Query-by-Example Retrieval
Heejin Do, Sangwon Ryu, Jonghwi Kim, Gary Lee
- PIPER: Benchmarking and Prompting Event Reasoning Boundary of LLMs via Debiasing-Distillation Enhanced Tuning
Zhicong Lu, Changyuan Tian, Peiguang Li, Li Jin, Sirui Wang, Wei Jia, Ying Shen, Guangluan Xu
- MIR: Methodology Inspiration Retrieval for Scientific Research Problems
Aniketh Garikaparthi, Manasi Patwardhan, Aditya Sanjiv Kanade, Aman Hassan, Lovekesh Vig, Arman Cohan
- Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models
Kexin Chen, Dongxia Wang, Yi Liu, Haonan Zhang, Wenhai Wang
- Different Speech Translation Models Encode and Translate Speaker Gender Differently
Dennis Fucci, Marco Gaido, Matteo Negri, Luisa Bentivogli, Andre Martins, Giuseppe Attanasio
- Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning
Ruoxi Xu, Yunjie Ji, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Ben He, Yingfei Sun, Xiangang Li, Le Sun
- Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples
Haesung Pyun, Yoonah Park, Yohan Jo
- Pretraining Context Compressor for Large Language Models with Embedding-Based Memory
Yuhong Dai, Jianxun Lian, Yitian Huang, Wei Zhang, Mingyang Zhou, Mingqi Wu, Xing Xie, Hao Liao
- Dialogue Systems for Emotional Support via Value Reinforcement
Juhee Kim, Chunghu Mok, Jisun Lee, Hyang Sook Kim, Yohan Jo
- Length-Induced Embedding Collapse in PLM-based Models
Yuqi Zhou, Sunhao Dai, Zhanshuo Cao, Xiao Zhang, Jun Xu
- SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction
Shester Gueuwou, Xiaodan Du, Greg Shakhnarovich, Karen Livescu, Alexander H. Liu
- ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation
Lam Thanh Do, Aaditya Bodke, Pritom Saha Akash, Kevin Chen-Chuan Chang
- Know Your Mistakes: Towards Preventing Overreliance on Task-Oriented Conversational AI Through Accountability Modeling
Suvodip Dey, Yi-Jyun Sun, Gokhan Tur, Dilek Hakkani-Tür
- LLMs Trust Humans More, That’s a Problem! Unveiling and Mitigating the Authority Bias in Retrieval-Augmented Generation
Yuxuan Li, Xinwei Guo, Jiashi Gao, Guanhua Chen, Xiangyu Zhao, Jiaxin Zhang, Quanying Liu, Haiyan Wu, Xin Yao, Xuetao Wei
- Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation
Dongsheng Zhu, Weixian Shi, Zhengliang Shi, Zhaochun Ren, Shuaiqiang Wang, Lingyong Yan, Dawei Yin
- Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration
Yuyi Zhang, Peirong Zhang, Zhenhua Yang, Pengyu Yan, Yongxin Shi, Pengwei Liu, Fengjun Guo, Lianwen Jin
- PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment
Zekun Moore Wang, Shenzhi Wang, King Zhu, Jiaheng Liu, Ke Xu, Jie Fu, Wangchunshu Zhou, Wenhao Huang
- Robust Utility-Preserving Text Anonymization Based on Large Language Models
Tianyu Yang, Xiaodan Zhu, Iryna Gurevych
- SEAL: Scaling to Emphasize Attention for Long-Context Retrieval
Changhun Lee, Minsang Seok, Jun-gyu Jin, YoungHyun Cho, Eunhyeok Park
- From Neurons to Semantics: Evaluating Cross-Linguistic Alignment Capabilities of Large Language Models via Neurons Alignment
Chongxuan Huang, Yongshi Ye, Biao Fu, Qifeng Su, Xiaodong Shi
- $\mathcal{A}^3$: Automatic Alignment Framework for Attributed Text Generation
Yue Wang, Haoke Zhang, Juntao Li, Jinxiong Chang, Min Zhang
- Towards Better Value Principles for Large Language Model Alignment: A Systematic Evaluation and Enhancement
Bingbing Xu, Jing Yao, Xiaoyuan Yi, Aishan Maoliniyazi, Xing Xie, Xiaofeng Meng
- Rethinking Semantic Parsing for Large Language Models: Enhancing LLM Performance with Semantic Hints
Kaikai An, Shuzheng Si, Helan Hu, Haozhe Zhao, Yuchi Wang, Qingyan Guo, Baobao Chang
- Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
- Comprehensive Analysis of Minimum Bayes Risk Decoding through Bias and Diversity Decomposition
Hidetaka Kamigaito, Hiroyuki Deguchi, Yusuke Sakai, Katsuhiko Hayashi, Taro Watanabe
- Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models
Ido Cohen, Daniela Gottesman, Mor Geva, Raja Giryes
- SDD: Self-Degraded Defense against Malicious Fine-tuning
ZiXuan Chen, Weikai Lu, Xin Lin, Ziqian Zeng
- CoachMe: Decoding Sport Elements with a Reference-Based Coaching Instruction Generation Model
Wei-Hsin Yeh, Yu-An Su, Chih-Ning Chen, Yi-Hsueh Lin, Calvin Ku, Wenhsin Chiu, Min-Chun Hu, Lun-Wei Ku
- DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Jing Li, Min Zhang, Zhaopeng Tu
- How LLMs Comprehend Temporal Structure in Narratives: A Case Study in Cognitive Evaluation of LLMs
Karin De Langis, Jong Inn Park, Andreas Schramm, Bin Hu, Khanh Chi Le, Dongyeop Kang
- Data Caricatures: On the Representation of African American Language in Pretraining Corpora
Nicholas Deas, Blake Vente, Amith Ananthram, Jessica A Grieser, Desmond U. Patton, Shana Kleiner, James R. Shepard III, Kathleen McKeown
- Language Model Probabilities are $Not$ Calibrated in Numeric Contexts
Charles Lovering, Michael Krumdick, Viet Dac Lai, Varshini Reddy, Seth Ebner, Nilesh Kumar, Rik Koncel-Kedziorski, Chris Tanner
- MDCure: A Scalable Pipeline for Multi-Document Instruction-Following
Gabrielle Kaili-May Liu, Bowen Shi, Avi Caciularu, Idan Szpektor, Arman Cohan
- Misattribution Matters: Quantifying Unfairness in Authorship Attribution
Pegah Alipoormolabashi, Ajay Patel, Niranjan Balasubramanian
- Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs
Sumanth Doddapaneni, Mohammed Safi Ur Rahman Khan, Dilip Venkatesh, Raj Dabre, Anoop Kunchukuttan, Mitesh M Khapra
- DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process
Minjun Zhu, Yixuan Weng, Linyi Yang, Yue Zhang
- Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient
Yuan Gao, Zujing Liu, Weizhong Zhang, Bo Du, Gui-Song Xia
- Zero-Shot Text-to-Speech for Vietnamese
Thi Vu, Linh The Nguyen, Dat Quoc Nguyen
- Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis
Priyanka Kargupta, Ishika Agarwal, Tal August, Jiawei Han
- Hierarchical Memory Organization for Wikipedia Generation
Eugene J. Yu, Dawei Zhu, Yifan Song, Xiangyu Wong, Jiebin Zhang, Wenxuan Shi, Xiaoguang Li, Qun Liu, Sujian Li
- Class Distillation with Mahalanobis Contrast: An Efficient Training Paradigm for Pragmatic Language Understanding Tasks
Chenlu Wang, Weimin Lyu, Ritwik Banerjee
- Structure-aware Domain Knowledge Injection for Large Language Models
Kai Liu, Ze Chen, Zhihang Fu, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye
- FinMME: A Financial Multi-Modal Evaluation Dataset
Junyu Luo, Zhizhuo Kou, Liming Yang, Xiao Luo, Jinsheng Huang, Zhiping Xiao, Jingshu Peng, Chengzhong Liu, Jiaming Ji, Xuanzhe Liu, Sirui Han, Ming Zhang, Yike Guo
- Dialectal Coverage And Generalization in Arabic Speech Recognition
Amirbek Djanibekov, Hawau Olamide Toyin, Raghad Alshalan, Abdullah Alatir, Hanan Aldarmaki
- Can LLMs Generate High-Quality Test Cases for Algorithm Problems? TestCase-Eval: A Systematic Evaluation of Fault Coverage and Exposure
Zheyuan Yang, Zexi Kuang, Yilun Zhao
- EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
Ron Yosef, Yonatan Bitton, Dani Lischinski, Moran Yanuka
- Reconsidering LLM Uncertainty Estimation Methods in the Wild
Duygu Nur Yaldiz, Yavuz Faruk Bakman, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Sai Praneeth Karimireddy, Salman Avestimehr
- Are Optimal Algorithms Still Optimal? Rethinking Sorting in LLM-Based Pairwise Ranking with Batching and Caching
Juan Wisznia, Cecilia Bolaños, Juan Tollo, Giovanni Franco Gabriel Marraffini, Agustín Andrés Gianolini, Noe Fabian Hsueh, Luciano Del Corro
- Bregman Conditional Random Fields: Sequence Labeling with Parallelizable Inference Algorithms
Caio Corro, Mathieu Lacroix, Joseph Le Roux
- SEE: Strategic Exploration and Exploitation for Cohesive In-Context Prompt Optimization
Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Sricharan Kumar
- Programming by Example meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction
Atharva Naik, Darsh Agrawal, Hong Sng, Clayton Marr, Kexun Zhang, Nathaniel Romney Robinson, Kalvin Chang, Rebecca Byrnes, Aravind Mysore, Carolyn Rose, David R Mortensen
- Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events
Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han
- Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims
Priyanka Kargupta, Runchu Tian, Jiawei Han
- The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents
Feiran Jia, Tong Wu, Xin Qin, Anna Squicciarini
- Sandcastles in the Storm: Revisiting the (Im)possibility of Strong Watermarking
Fabrice Y Harel-Canada, Boran Erol, Connor Choi, Jason Liu, Gary Jiarui Song, Nanyun Peng, Amit Sahai
- Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement
Yaxuan Kong, Yiyuan Yang, Yoontae Hwang, Wenjie Du, Stefan Zohren, Zhangyang Wang, Ming Jin, Qingsong Wen
- From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs
Ruxiao Chen, Chenguang Wang, Yuran Sun, Xilei Zhao, Susu Xu
- GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning
Shikhhar Siingh, Abhinav Rawat, Chitta Baral, Vivek Gupta
- Hanging in the Balance: Pivotal Moments in Crisis Counseling Conversations
Vivian Nguyen, Lillian Lee, Cristian Danescu-Niculescu-Mizil
- Unveiling the Potential of BERT-family: A New Recipe for Building Scalable, General and Competitive Large Language Models
Yisheng Xiao, Juntao Li, Wenpeng Hu, Min Zhang
- TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora
Priyanka Kargupta, Nan Zhang, Yunyi Zhang, Rui Zhang, Prasenjit Mitra, Jiawei Han
- An Empirical Study of Iterative Refinements for Non-autoregressive Translation
Yisheng Xiao, Pei Guo, Zechen Sun, Juntao Li, Kai Song, Min Zhang
- Retrofitting Large Language Models with Dynamic Tokenization
Darius Feher, Ivan Vulić, Benjamin Minixhofer
- Principled Content Selection to Generate Diverse and Personalized Multi-Document Summaries
Vishakh Padmakumar, Zichao Wang, David Arbour, Jennifer Healey
- Bilingual Zero-Shot Stance Detection
Chenye Zhao, Cornelia Caragea
- GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning
Rita Ramos, Everlyn Asiko Chimoto, Maartje Ter Hoeve, Natalie Schluter
- Theorem Prover as a Judge for Synthetic Data Generation
Joshua Ong Jun Leang, Giwon Hong, Wenda Li, Shay B Cohen
- Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks
Ori Shapira, Shlomo Chazan, Amir David Nissan Cohen
- Can You Trust LLMs’ Judgements of the Validity of Simple Inferences With Partisan Conclusions? – No!
Reto Gubelmann, Ghassen Karray
- PARME: Parallel Corpora for Low-Resourced Middle Eastern Languages
Sina Ahmadi, Rico Sennrich, Erfan Karami, Ako Marani, Parviz Fekrazad, Gholamreza Akbarzadeh Baghban, Hanah Hadi, Semko Heidari, Mahîr Dogan, Pedram Asadi, Dashne Bashir, Mohammad Amin Ghodrati, Kourosh Amini, Zeynab Ashourinezhad, Mana Baladi, Farshid Ezzati, Alireza Ghasemifar, Daryoush Hosseinpour, Behrooz Abbaszadeh, Amin Hassanpour, Bahaddin Jalal Hamaamin, Saya Kamal Hama, Ardeshir Mousavi, Sarko Nazir Hussein, Isar Nejadgholi, Mehmet Ölmez, Horam Osmanpour, Rashid Roshan Ramezani, Aryan Sediq Aziz, Ali Salehi, Mohammadreza Yadegari, Kewyar Yadegari, Sedighe Zamani Roodsari
- METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling
Bingxuan Li, Yiwei Wang, Jiuxiang Gu, Kai-Wei Chang, Nanyun Peng
- ConLoan: A Contrastive Multilingual Dataset for Evaluating Loanwords
Sina Ahmadi, Micha David Hess, Elena Álvarez-Mellado, Alessia Battisti, Cui Ding, Anne Göhring, Yingqiang Gao, Zifan Jiang, Andrianos Michail, Peshmerge Morad, Joel Niklaus, Maria Christina Panagiotopoulou, Stefano Perrella, Juri Opitz, Anastassia Shaitarova, Rico Sennrich
- A Theory of LLM Sampling: Part Descriptive and Part Prescriptive
Sarath Sivaprasad, Pramod Kaushik, Sahar Abdelnabi, Mario Fritz
- MEraser: An Effective Fingerprint Erasure Approach for Large Language Models
Jingxuan Zhang, Zhenhua Xu, Rui Hu, Wenpeng Xing, Xuhong Zhang, Meng Han
- VISA: Retrieval Augmented Generation with Visual Source Attribution
Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin
- DRAMA: Diverse Augmentation from Large Language Models Towards Smaller Generalizable Dense Retrievers
Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen
- Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs
Ziling Cheng, Meng Cao, Marc-Antoine Rondeau, Jackie CK Cheung
- TreeCut: A Synthetic Unanswerable Math Word Problem Dataset for LLM Hallucination Evaluation
Jialin Ouyang
- MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning
Chanwoo Park, Seungju Han, Xingzhi Guo, Asuman E. Ozdaglar, Kaiqing Zhang, Joo-Kyung Kim
- Map&Make: Schema Guided Text to Table Generation
Naman Ahuja, Fenil Bardoliya, Chitta Baral, Vivek Gupta
- WinSpot: GUI Grounding Benchmark with Multimodal Large Language Models
Zheng Hui, Yinheng Li, Dan Zhao, Colby Banbury, Tianyi Chen, Kazuhito Koishida
- IRIS: Interpretable Retrieval-Augmented Classification for Long Interspersed Document Sequences
Fengnan Li, Elliot D. Hill, Jiang Shu, Jiaxin Gao, Matthew M. Engelhard
- Spurious Correlations and Beyond: Understanding and Mitigating Shortcut Learning in SDOH Extraction with Large Language Models
Fardin Ahsan Sakib, Ziwei Zhu, Karen Trister Grace, Meliha Yetisgen, Ozlem Uzuner
- Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images
Shengguang Wu, Fan-Yun Sun, Kaiyue Wen, Nick Haber
- Enhancing NER by Harnessing Multiple Datasets with Conditional Variational Autoencoders
Taku Oi, Makoto Miwa
- CHEER-Ekman: Fine-grained Embodied Emotion Classification
Phan Anh Duong, Cat Luong, Divyesh Bommana, Tianyu Jiang
- Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method
Peter Baile Chen, Yi Zhang, Mike Cafarella, Dan Roth
- R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory
Tenghao Huang, Kinjal Basu, Ibrahim Abdelaziz, Pavan Kapanipathi, Jonathan May, Muhao Chen
- ScanEZ: Integrating Cognitive Models with Self-Supervised Learning for Spatiotemporal Scanpath Prediction
Ekta Sood, Prajit Dhar, Enrica Troiano, Rosy Southwell, Sidney K. D’Mello
- FairI Tales: Evaluation of Fairness in Indian Contexts with a Focus on Bias and Stereotypes
Janki Atul Nawale, Mohammed Safi Ur Rahman Khan, Janani D, Danish Pruthi, Mitesh M Khapra
- SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models
Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Yu-Chiang Frank Wang, Sadao Kurohashi
- Predicting Implicit Arguments in Procedural Video Instructions
Anil Batra, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller
- InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models
Hao Li, Xiaogeng Liu, Ning Zhang, Chaowei Xiao
- CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP
Tianyu Yang, Lisen Dai, Xiangqi Wang, Minhao Cheng, Yapeng Tian, Xiangliang Zhang
- ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding
Austin Wang, ZeMing Gong, Angel X Chang
- The time scale of redundancy between prosody and linguistic context
Tamar I Regev, Chiebuka Ohams, Shaylee Xie, Lukas Wolf, Evelina Fedorenko, Alex Warstadt, Ethan Wilcox, Tiago Pimentel
- Improving Fairness of Large Language Models in Multi-document Summarization
Haoyuan Li, Rui Zhang, Snigdha Chaturvedi
- Basic Reading Distillation
Zhi Zhou, Sirui Miao, Xiangyu Duan, Hao Yang, Min Zhang
- Quantized Can Still Be Calibrated: A Unified Framework to Calibration in Quantized Large Language Models
Mingyu Zhong, Guanchu Wang, Yu-Neng Chuang, Na Zou
- Fine-Grained Spatio-Temporal Modeling of Reading Behavior
Francesco Ignazio Re, Andreas Opedal, Glib Manaiev, Mario Giulianelli, Ryan Cotterell
- More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives
Xiaoqing Zhang, Ang Lv, Yuhan Liu, Flood Sung, Wei Liu, Jian Luan, Shuo Shang, Xiuying Chen, Rui Yan
- Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models
Fei Wang, Xingchen Wan, Ruoxi Sun, Jiefeng Chen, Sercan O Arik
- SubLIME: Subset Selection via Rank Correlation Prediction for Data-Efficient LLM Evaluation
Gayathri Saranathan, Cong Xu, Mahammad Parwez Alam, Tarun Kumar, Martin Foltin, Soon Yee Wong, Suparna Bhattacharya
- $\text{M}^3\text{GQA}$: A Multi-Entity Multi-Hop Multi-Setting Graph Question Answering Benchmark
Boci Peng, Yongchao Liu, Xiaohe Bo, Jiaxin Guo, Yun Zhu, Xuanbo Fan, Chuntao Hong, Yan Zhang
- LSSF: Safety Alignment for Large Language Models through Low-Rank Safety Subspace Fusion
Guanghao Zhou, Panjia Qiu, Cen Chen, Hongyu Li, Jason Chu, Xin Zhang, Jun Zhou
- Should I Believe in What Medical AI Says? A Chinese Benchmark for Medication Based on Knowledge and Reasoning
Yue Wu, Yangmin Huang, Qianyun Du, Lixian Lai, Zhiyang He, Jiaxue Hu, Xiaodong Tao
- ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries
Kishan Maharaj, Vitobha Munigala, Srikanth G. Tamilselvam, Prince Kumar, Sayandeep Sen, Palani Kodeswaran, Abhijit Mishra, Pushpak Bhattacharyya
- Meta-Tool: Unleash Open-World Function Calling Capabilities of General-Purpose Large Language Models
Shengqian Qin, Yakun Zhu, Linjie Mu, Shaoting Zhang, Xiaofan Zhang
- Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning
Yingjie Zhu, Xuefeng Bai, Kehai Chen, Yang Xiang, Jun Yu, Min Zhang
- ISR: Self-Refining Referring Expressions for Entity Grounding
Zhuocheng Yu, Bingchan Zhao, Yifan Song, Sujian Li, Zhonghui He
- Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference
Siyuan Wang, Dianyi Wang, Chengxing Zhou, Zejun Li, Zhihao Fan, Xuanjing Huang, Zhongyu Wei
- CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models
Yongheng Zhang, Xu Liu, Ruoxi Zhou, Qiguang Chen, Hao Fei, Wenpeng Lu, Libo Qin
- TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency
Henry Peng Zou, Zhengyao Gu, Yue Zhou, Yankai Chen, Weizhi Zhang, Liancheng Fang, Yibo Wang, Yangning Li, Kay Liu, Philip S. Yu
- The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages
Jenalea Rajab, Anuoluwapo Aremu, Everlyn Asiko Chimoto, Graham Morrissey, Fadel Thior, Jessica Ojo, Atnafu Lambebo Tonja, Wilhelmina Nekoto, Pelonomi Moiloa, Jade Abbott, Vukosi Marivate, Benjamin Rosman
- Theoretical Analysis of Hierarchical Language Recognition and Generation by Transformers without Positional Encoding
Daichi Hayakawa, Issei Sato
- Less is More: Explainable and Efficient ICD Code Prediction with Clinical Entities
James Douglas, Yidong Gan, Ben Hachey, Jonathan K. Kummerfeld
- Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories
Alperen Yildiz, Sin G Teo, Yiling Lou, Yebo Feng, Chong Wang, Dinil Mon Divakaran
- Multi-Modality Expansion and Retention for LLMs through Parameter Merging and Decoupling
Junlin Li, Guodong Du, Jing Li, Sim Kuan Goh, Wenya Wang, Yequan Wang, Fangming Liu, Ho-Kin Tang, Saleh Alharbi, Daojing He, Min Zhang
- Serial Lifelong Editing via Mixture of Knowledge Experts
YuJu Cheng, Yu-Chu Yu, Kai-Po Chang, Yu-Chiang Frank Wang
- Towards Efficient LLM Post Training: A Data-centric Perspective
Junyu Luo, Bohan Wu, Xiao Luo, Zhiping Xiao, Yiqiao Jin, Rong-Cheng Tu, Nan Yin, Yifan Wang, Jingyang Yuan, Wei Ju, Ming Zhang
- IMOL: Incomplete-Modality-Tolerant Learning for Multi-Domain Fake News Video Detection
Zhi Zeng, Jiaying Wu, Minnan Luo, Herun Wan, Xiangzheng Kong, Zihan Ma, Guang Dai, Qinghua Zheng
- DDxTutor: Clinical Reasoning Tutoring System with Differential Diagnosis-Based Structured Reasoning
Qian Wu, Zheyao Gao, Longfei Gou, Qi Dou
- SocialEval: Evaluating Social Intelligence of Large Language Models
Jinfeng Zhou, Yuxuan Chen, Yihan Shi, Xuanming Zhang, Leqi Lei, Yi Feng, Zexuan Xiong, Miao Yan, Xunzhi Wang, Yaru Cao, Jianing Yin, Shuai Wang, Quanyu Dai, Zhenhua Dong, Hongning Wang, Minlie Huang
- Hidden in Plain Sight: Evaluation of the Deception Detection Capabilities of LLMs in Multimodal Settings
Md Messal Monem Miah, Adrita Anika, Xi Shi, Ruihong Huang
- Analyzing and Mitigating Inconsistency in Discrete Speech Tokens for Neural Codec Language Models
Wenrui Liu, Zhifang Guo, Jin Xu, Yuanjun Lv, Yunfei Chu, Zemin Liu, Junyang Lin
- PlanningArena: A Modular Benchmark for Multidimensional Evaluation of Planning and Tool Learning
Zihan Zheng, Tianle Cui, Chuwen Xie, Jiahui Pan, Qianglong Chen, Lewei He
- FocusLLM: Precise Understanding of Long Context by Dynamic Condensing
Zhenyu Li, Yike Zhang, Tengyu Pan, Yutao Sun, Zhichao Duan, Junjie Fang, Rong Han, Zixuan Wang, Jianyong Wang
- Negative Matters: Multi-Granularity Hard-Negative Synthesis and Anchor-Token-Aware Pooling for Enhanced Text Embeddings
Tengyu Pan, Zhichao Duan, Zhenyu Li, Bowen Dong, Ning Liu, Xiuxing Li, Jianyong Wang
- GPT-4 as a Homework Tutor Can Improve Student Engagement and Learning Outcomes
Alessandro Vanzo, Sankalan Pal Chowdhury, Mrinmaya Sachan
- Diffusion Models Through a Global Lens: Are They Culturally Inclusive?
Zahra Bayramli, Ayhan Suleymanzade, Na Min An, Huzama Ahmad, Eunsu Kim, Junyeong Park, James Thorne, Alice Oh
- Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model
Deng Qiyuan, Xuefeng Bai, Kehai Chen, Yaowei Wang, Liqiang Nie, Min Zhang
- English-based acoustic models perform well in the forced-alignment of two English-Based Pacific Creoles
Sam Passmore, Lila San Roque, Saurabh Nath, Keira Mullan, Kira Davey, Rosey Billington, Nick Thieberger, Danielle Barth
- Subtle Errors in Reasoning: Preference Learning via Error-injected Self-editing
Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Chak Tou Leong, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
- Truth Knows No Language: Evaluating Truthfulness Beyond English
Blanca Calvo Figueras, Eneko Sagarzazu, Julen Etxaniz, Jeremy Barnes, Pablo Gamallo, Iria de-Dios-Flores, Rodrigo Agerri
- Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability
Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe
- Batayan: A Filipino NLP benchmark for evaluating Large Language Models
Jann Railey Montalan, Jimson Paulo Layacan, David Demitri Africa, Richell Isaiah S. Flores, Michael T. Lopez II, Theresa Denise Magsajo, Anjanette Cayabyab, William Chandra Tjhi
- HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims
Michiel van der Meer, Pavel Korshunov, Sébastien Marcel, Lonneke van der Plas
- CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory
Weichen Zhang, Chen Gao, Shiquan Yu, Ruiying Peng, Baining Zhao, Qian Zhang, Jinqiang Cui, Xinlei Chen, Yong Li
- It’s Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems
Iuliia Zaitova, Badr M. Abdullah, Wei Xue, Dietrich Klakow, Bernd Möbius, Tania Avgustinova
- PolyNarrative: A Multilingual, Multilabel, Multi-domain Dataset for Narrative Extraction from News Articles
Nikolaos Nikolaidis, Nicolas Stefanovitch, Purificação Silvano, Dimitar Iliyanov Dimitrov, Roman Yangarber, Nuno Guimarães, Elisa Sartori, Ion Androutsopoulos, Preslav Nakov, Giovanni Da San Martino, Jakub Piskorski
- Rethinking Evaluation Metrics for Grammatical Error Correction: Why Use a Different Evaluation Process than Human?
Takumi Goto, Yusuke Sakai, Taro Watanabe
- A Parameter-Efficient and Fine-Grained Prompt Learning for Vision-Language Models
Yongbin Guo, Shuzhen Li, Zhulin Liu, Tong Zhang, C. L. Philip Chen
- Persona Dynamics: Unveiling the Impact of Persona Traits on Agents in Text-Based Games
Seungwon Lim, Seungbeen Lee, Dongjun Min, Youngjae Yu
- SeedBench: A Multi-task Benchmark for Evaluating Large Language Models in Seed Science
Jie Ying, Zihong Chen, Zhefan Wang, Wanli Jiang, Chenyang Wang, Zhonghang Yuan, Haoyang Su, Huanjun Kong, Fan Yang, Nanqing Dong
- 𝛿-Stance: A Large-Scale Real World Dataset of Stances in Legal Argumentation
Ankita Gupta, Douglas Rice, Brendan O’Connor
- Re$^{3}$Syn: A Dependency-Based Data Synthesis Framework for Long-Context Post-training
Zhiyang Zhang, Ziqiang Liu, Huiming Wang, Renke Shan, Li Kuang, Lu Wang
- Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions
Jihyoung Jang, Minwook Bae, Minji Kim, Dilek Hakkani-Tür, Hyounghun Kim
- Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation
Chengwei Qin, Wenxuan Zhou, Karthik Abinav Sankararaman, Nanshu Wang, Tengyu Xu, Alexander Radovic, Eryk Helenowski, Arya Talebzadeh, Aditya Tayade, Sinong Wang, Shafiq Joty, Han Fang, Hao Ma
- Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach
Xingyu Li, Chen Gong, Guohong Fu
- TACLR: A Scalable and Efficient Retrieval-based Method for Industrial Product Attribute Value Identification
Yindu Su, Huike Zou, Lin Sun, Ting Zhang, Haiyang Yang, Chen Li Yu, David Lo, Qingheng Zhang, Shuguang Han, Jufeng Chen
- A Review of Theory of Mind Capabilities in Large Language Models
Ruirui Chen, Weifeng Jiang, Chengwei Qin, Cheston Tan
- Completing A Systematic Review in Hours instead of Months with Interactive AI Agents
Rui Qiu, Shijie Chen, Yu Su, Po-Yin Yen, Han Wei Shen
- CMHKF: Cross-Modality Heterogeneous Knowledge Fusion for Weakly Supervised Video Anomaly Detection
Shengping Song, Yongsen Zheng, Wuchun He, Guohua Wang
- CLaSp: In-Context Layer Skip for Self-Speculative Decoding
Longze Chen, Renke Shan, Huiming Wang, Lu Wang, Ziqiang Liu, Run Luo, Jiawei Wang, Hamid Alinejad-Rokny, Min Yang
- Teaching Text Agents to Learn Sequential Decision Making from Failure
Canasai Kruengkrai, Koichiro Yoshino
- The Harmonic Structure of Information Contours
Eleftheria Tsipidi, Samuel Kiegeland, Franz Nowak, Tianyang Xu, Ethan Wilcox, Alex Warstadt, Ryan Cotterell, Mario Giulianelli
- REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark
Navve Wasserman, Roi Pony, Oshri Naparstek, Adi Raz Goldfarb, Eli Schwartz, Udi Barzelay, Leonid Karlinsky
- Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models
Mats Faulborn, Indira Sen, Max Pellert, Andreas Spitz, David Garcia
- LongSafety: Evaluating Long-Context Safety of Large Language Models
Yida Lu, Jiale Cheng, Zhexin Zhang, Shiyao Cui, Cunxiang Wang, Xiaotao Gu, Yuxiao Dong, Jie Tang, Hongning Wang, Minlie Huang
- Exploiting Contextual Knowledge in LLMs through $\mathcal{V}$-usable Information based Layer Enhancement
Xiaowei Yuan, Zhao Yang, Ziyang Huang, Yequan Wang, Siqi Fan, Yiming Ju, Jun Zhao, Kang Liu
- Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights
Sooyung Choi, Jaehyeok Lee, Xiaoyuan Yi, Jing Yao, Xing Xie, JinYeong Bak
- Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval
Hani Alomari, Anushka Sivakumar, Andrew Zhang, Chris Thomas
- The Noisy Path from Source to Citation: Measuring How Scholars Engage with Past Research
Hong Chen, Misha Teplitskiy, David Jurgens
- MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation
Ching-Wen Yang, Zhi-Quan Feng, Ying-Jia Lin, Che Wei Chen, Kun-da Wu, Hao Xu, Yao Jui-Feng, Hung-Yu Kao
- Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers
Clément Dumas, Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West
- Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey
Ivan Vegner, Sydelle de Souza, Valentin Forch, Martha Lewis, Leonidas A. A. Doumas
- Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models
Boheng Sheng, Jiacheng Yao, Meicong Zhang, Guoxiu He
- DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering
Rong Cheng, Jinyi Liu, Yan Zheng, Fei Ni, Jiazhen Du, Hangyu Mao, Fuzheng Zhang, Bo Wang, Jianye Hao
- Deliberate Reasoning for Language Models as Structure-aware Planning with Accurate World Model
Siheng Xiong, Ali Payani, Yuan Yang, Faramarz Fekri
- Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models
Xinxin Liu, Aaron Thomas, Cheng Zhang, Jianyi Cheng, Yiren Zhao, Xitong Gao
- Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention
Emily Xiao, Chin-Jou Li, Yilin Zhang, Graham Neubig, Amanda Bertsch
- ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang
- BeaverTails v2: Towards Multi-Level Safety Alignment for LLMs with Human Preference
Jiaming Ji, Donghai Hong, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Qiu, Jiayi Zhou, Kaile Wang, Boxun Li, Sirui Han, Yike Guo, Yaodong Yang
- What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective
Ming Li, Yanhong Li, Tianyi Zhou
- Beyond Text Compression: Evaluating Tokenizers Across Scales
Jonas F. Lotz, António V. Lopes, Stephan Peitz, Hendra Setiawan, Leonardo Emili
- WiCkeD: A Simple Method to Make Multiple Choice Benchmarks More Challenging
Ahmed Elhady, Eneko Agirre, Mikel Artetxe
- Emergent Abilities of Large Language Models under Continued Pre-training for Language Adaptation
Ahmed Elhady, Eneko Agirre, Mikel Artetxe
- R-Fairness: Assessing Fairness of Ranking in Subjective Data
Lorenzo Balzotti, Donatella Firmani, Jerin George Mathew, Riccardo Torlone, Sihem Amer-Yahia
- RePanda: Pandas-powered Tabular Verification and Reasoning
Atoosa Chegini, Keivan Rezaei, Hamid Eghbalzadeh, Soheil Feizi
- Towards Style Alignment in Cross-Cultural Translation
Shreya Havaldar, Adam Stein, Eric Wong, Lyle Ungar
- TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li, Mohammadreza Armandpour, Seyed Iman Mirzadeh, Sachin Mehta, Vaishaal Shankar, Raviteja Vemulapalli, Samy Bengio, Oncel Tuzel, Mehrdad Farajtabar, Hadi Pouransari, Fartash Faghri
- Entailed Between the Lines: Incorporating Implication into NLI
Shreya Havaldar, Hamidreza Alvari, John Palowitch, Mohammad Javad Hosseini, Senaka Buthpitiya, Alex Fabrikant
- Multi-Level Explanations for Generative Language Models
Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh
- A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems
Đorđe Klisura, Astrid R Bernaga Torres, Anna Karen Gárate-Escamilla, Rajesh Roshan Biswal, Ke Yang, Hilal Pataci, Anthony Rios
- Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu
- Enhancing User-Controlled Text-to-Image Generation with Layout-Aware Personalization
Hongliang Luo, Wei Xi
- LETS-C: Leveraging Text Embedding for Time Series Classification
Rachneet Kaur, Zhen Zeng, Tucker Balch, Manuela Veloso
- Benchmarking Video-Language Models for Embodied Motion Cognition in Urban Open-Ended Spaces
Baining Zhao, Jianjie Fang, Zichao Dai, Ziyou Wang, Jirong Zha, Weichen Zhang, Chen Gao, Yue Wang, Jinqiang Cui, Xinlei Chen, Yong Li
- HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval
Sungho Park, Joohyung Yun, Jongwuk Lee, Wook-Shin Han
- ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities
Adhiraj Ghosh, Sebastian Dziadzio, Ameya Prabhu, Vishaal Udandarao, Samuel Albanie, Matthias Bethge
- La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America
María Grandury, Javier Aula-Blasco, Júlia Falcão, Clémentine Fourrier, Miguel González Saiz, Gonzalo Martínez, Gonzalo Santamaria Gomez, Rodrigo Agerri, Nuria Aldama García, Luis Chiruzzo, Javier Conde, Helena Gomez Adorno, Marta Guerrero Nieto, Guido Ivetta, Natàlia López Fuertes, Flor Miriam Plaza-del-Arco, María-Teresa Martín-Valdivia, Helena Montoro Zamorano, Carmen Muñoz Sanz, Pedro Reviriego, Leire Rosado Plaza, Alejandro Vaca Serrano, Estrella Vallecillo-Rodríguez, Jorge Vallego, Irune Zubiaga
- Navigating the Prompt Space: Supervision Matters in CoT When Reasoning Misleads
Xiang Zhang, Juntai Cao, Chenyu You, Dujian Ding
- Energy Considerations of Large Language Model Inference and Efficiency Optimizations
Jared Fernandez, Clara Na, Vashisth Tiwari, Yonatan Bisk, Sasha Luccioni, Emma Strubell
- Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models
Lior Belenki, Alekh Agarwal, Tianze Shi, Kristina Toutanova
- BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving
Ran Xin, Chenguang Xi, Jie Yang, Feng Chen, Hang Wu, Xia Xiao, Yifan Sun, Shen Zheng, Ming Ding
- Tempest: Automatic Multi-Turn Jailbreaking of Large Language Models with Tree Search
Andy Zhou, Ron Arel
- Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation
Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, Yanfei Chen, Jindong Gu, Long Le, Kai-Wei Chang, Chen-Yu Lee, Hamid Palangi, Tomas Pfister
- Cross-Lingual Representation Alignment Through Contrastive Image-Caption Tuning
Nathaniel Krasner, Nicholas Lanuzo, Antonios Anastasopoulos
- Logic-Regularized Verifier Elicits Reasoning from LLMs
Xinyu Wang, Changzhi Sun, Lian Cheng, Yuanbin Wu, Dell Zhang, Xuelong Li, Xiaoling Wang
- Squeezed Attention: Fast Fixed Context Processing for Long Context Length LLM Applications
Coleman Richard Charles Hooper, Sehoon Kim, Hiva Mohammadzadeh, Monishwaran Maheswaran, Sebastian Zhao, June Paik, Michael W. Mahoney, Kurt Keutzer, Amir Gholami
- LangMark: A Multilingual Dataset for Automatic Post-Editing
Diego Velazquez, Mikaela Grace, Konstantinos Karageorgos, Lawrence Carin, Aaron Schliem, Dimitrios Zaikis, Roger Wechsler
- Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer
Guodong Du, Jing Li, Zitao Fang, Junlin Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Yangneng Chen, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Honghai Liu, Min Zhang
- Merge Hijacking: Backdoor Attacks to Model Merging of Large Language Models
Zenghui Yuan, Yangming Xu, Jiawen Shi, Pan Zhou, Lichao Sun
- LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering
Zhifan Ye, Zheng Wang, Kejing Xia, Jihoon Hong, Leshu Li, Lexington Whalen, Cheng Wan, Yonggan Fu, Yingyan Celine Lin, Souvik Kundu
- Where Are We? Evaluating LLM Performance on African Languages
Ife Adebara, Hawau Olamide Toyin, Nahom Tesfu Ghebremichael, AbdelRahim A. Elmadany, Muhammad Abdul-Mageed
- Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning
Chengwei Qin, Wenhan Xia, Fangkai Jiao, Chen Chen, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty
- CiteEval: Principle-Driven Citation Evaluation for Source Attribution
Yumo Xu, Peng Qi, Jifan Chen, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, Zhiguo Wang
- HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agentic Tasks with Large Language Models
Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo
- Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models
Jongho Kim, Seung-won Hwang
- EducationQ: Evaluating LLMs’ Teaching Capabilities Through Multi-Agent Dialogue Framework
Yao Shi, Rongkeng Liang, Yong Xu
- KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning
Peiqi Sui, Juan Diego Rodriguez, Philippe Laban, J. Dean Murphy, Joseph P. Dexter, Richard Jean So, Samuel Baker, Pramit Chaudhuri
- Efficient Domain Continual pretraining by Mitigating the Stability Gap
Yiduo Guo, Jie Fu, Huishuai Zhang, Dongyan Zhao
- Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs
Fakhraddin Alwajih, Abdellah El Mekki, Samar Mohamed Magdy, AbdelRahim A. Elmadany, Omer Nacar, El Moatez Billah Nagoudi, Reem Abdel-Salam, Hanin Atwany, Youssef Nafea, Abdulfattah Mohammed Yahya, Rahaf Alhamouri, Hamzah A. Alsayadi, Hiba Zayed, Sara Shatnawi, Serry Sibaee, Yasir Ech-Chammakhy, Walid Al-Dhabyani, Marwa Mohamed Ali, Imen Jarraya, Ahmed Oumar El-Shangiti, Aisha Alraeesi, Mohammed Anwar Al-Ghrawi, Abdulrahman S. Al-Batati, Elgizouli Mohamed, Noha Taha Elgindi, Muhammed Saeed, Houdaifa Atou, Issam Ait Yahia, Abdelhak Bouayad, Mohammed Machrouh, Amal Makouar, Dania Alkawi, Mukhtar Mohamed, Safaa Taher Abdelfadil, Amine Ziad Ounnoughene, Anfel Rouabhia, Rwaa Assi, Ahmed Sorkatti, Mohamedou Cheikh Tourad, Anis Koubaa, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed
- NewsInterview: a Dataset and a Playground to Evaluate LLMs’ Grounding Gap via Informational Interviews
Alexander Spangher, Michael Lu, Sriya Kalyan, Hyundong Justin Cho, Tenghao Huang, Weiyan Shi, Jonathan May
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
Tao Zhang, Chenglin Zhu, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Tao Zhang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin CUI, Wentao Zhang, Zenan Zhou
- Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 13 Indian Languages
Ashwin Sankar, Sparsh Jain, Nikhil Narasimhan, Devilal Choudhary, Dhairya Suman, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Mitesh M Khapra, Raj Dabre
- CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG
Yang Tian, Fan Liu, Jingyuan Zhang, V. W., Yupeng Hu, Liqiang Nie
- Mapping 1,000+ Language Models via the Log-Likelihood Vector
Momose Oyama, Hiroaki Yamagiwa, Yusuke Takase, Hidetoshi Shimodaira
- ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities
Zhaochen Hong, Haofei Yu, Jiaxuan You
- Robust Estimation of Population-Level Effects in Repeated-Measures NLP Experimental Designs
Alejandro Benito-Santos, Adrian Ghajari, Víctor Fresno
- FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation
Farima Fatahi Bayat, Lechen Zhang, Sheza Munir, Lu Wang
- Training-free LLM Merging for Multi-task Learning
Zichuan Fu, Xian Wu, Yejing Wang, Wanyu Wang, Shanshan Ye, Hongzhi Yin, Yi Chang, Yefeng Zheng, Xiangyu Zhao
- Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection
Mingyu Derek Ma, Yanna Ding, Zijie Huang, Jianxi Gao, Yizhou Sun, Wei Wang
- Comparison-based Active Preference Learning for Multi-dimensional Personalization
Minhyeon Oh, Seungjoon Lee, Jungseul Ok
- OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Siming Huang, Tianhao Cheng, Jason Klein Liu, Weidi Xu, JIARAN HAO, Liuyihan Song, Yang Xu, Jian Yang, Jiaheng Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Xianzhen Luo, Qiufeng Wang, YuanTao Fan, Qingfu Zhu, Zhaoxiang Zhang, Yang Gao, Jie Fu, Qian Liu, Houyi Li, Ge Zhang, Yuan Qi, Xu Yinghui, Wei Chu, Zili Wang
- LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs
Chansung Park, Juyong Jiang, Fan Wang, Sayak Paul, Jing Tang
- AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment
Anastasia Ivanova, Zoya Volovikova, Bakaeva Eva, Alexey Kovalev, Aleksandr Panov
- SocialDuolingo: Interactive Evaluation for Cultural Competence in Language Agents
Jincenzi Wu, Jianxun Lian, Dingdong WANG, Helen M. Meng
- Scalable Vision Language Model Training via High Quality Data Curation
Hongyuan Dong, Zijian Kang, Weijie Yin, Liang Xiao, Chao Feng, Ran Jiao
- GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion
Sunkyung Lee, Minjin Choi, Eunseong Choi, Hye-young Kim, Jongwuk Lee
- Towards Economical Inference: Enabling DeepSeek’s Multi-Head Latent Attention in Any Transformer-based LLMs
Tao Ji, Bin Guo, Yuanbin Wu, Qipeng Guo, Lixing Shen, Zhan Chen, Xipeng Qiu, Qi Zhang, Tao Gui
- TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding
Zhaoxuan Wu, Zijian Zhou, Arun Verma, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low
- Introducing Verification Task of Set Consistency with Set-Consistency Energy Networks
Mooho Song, Hye Ryung Son, Jay-Yoon Lee
- A subtle deception beyond lying: LLMs for strategic phrasing in legislation
Atharvan Dogra, Krishna Pillutla, Ameet Deshpande, Ananya B. Sai, John J Nay, Tanmay Rajpurohit, Ashwin Kalyan, Balaraman Ravindran
- AfroCS-xs: Creating a Compact, High-Quality, Human-Validated Code-Switched Dataset for African Languages
Kayode Olaleye, Arturo Oncevay, Mathieu Sibue, Nombuyiselo Zondi, Michelle Terblanche, Sibongile Mapikitla, Richard Lastrucci, Charese Smiley, Vukosi Marivate
- Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Muhammad Reza Qorib, Junyi Li, Hwee Tou Ng
- Design Choices for Extending the Context Length of Visual Language Models
Mukai Li, Lei Li, Shansan Gong, Qi Liu