- Explicit vs. Implicit: Investigating Social Bias in Large Language Models through Self-Reflection
Yachao Zhao, Bo Wang, Yan Wang, Dongming Zhao, Ruifang He, Yuexian Hou
- Beyond Perception: Evaluating Abstract Visual Reasoning through Multi-Stage Task
Yanbei Jiang, Yihao Ding, Chao Lei, Jiayang Ao, Jey Han Lau, Krista A. Ehinger
- How Numerical Precision Affects Arithmetical Reasoning Capabilities of LLMs
Guhao Feng, Kai Yang, Yuntian Gu, Xinyue Ai, Shengjie Luo, Jiacheng Sun, Di He, Zhenguo Li, Liwei Wang
- Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts
Zeliang Zhang, Xiaodong Liu, Hao Cheng, Chenliang Xu, Jianfeng Gao
- A Persona-Aware LLM-Enhanced Framework for Multi-Session Personalized Dialogue Generation
Dongshuo Liu, Zhijing Wu, Dandan Song, Heyan Huang
- Exploring In-Image Machine Translation with Real-World Background
Yanzhi Tian, Zeming Liu, Zhengyang Liu, Yuhang Guo
- BayesKD: Bayesian Knowledge Distillation for Compact LLMs in Constrained Fine-tuning Scenarios
Wei Li, Lujun Li, Mark G. Lee, Shengjie Sun, Lei Zhang, Wei Xue, Yike Guo
- GOLFer: Smaller LMs-Generated Documents Hallucination Filter & Combiner for Query Expansion in Information Retrieval
Lingyuan Liu, Mengxiang Zhang
- Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Models-based Query Expansion
Lingyuan Liu, Mengxiang Zhang
- Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification
Alexander Shvets
- Multi-Prompting Decoder Helps Better Language Understanding
Zifeng Cheng, Zhaoling Chen, Zhiwei Jiang, Yafeng Yin, Cong Wang, Shiping Ge, Qing Gu
- Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction
Sam O’Connor Russell, Naomi Harte
- The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning
Bingxiang He, Ning Ding, Cheng Qian, Jia Deng, Ganqu Cui, Lifan Yuan, Haiwen Hong, Huan-ang Gao, Longtao Huang, Hui Xue, Huimin Chen, Zhiyuan Liu, Maosong Sun
- M$^3$FinMeeting: A Multilingual, Multi-Sector, and Multi-Task Financial Meeting Understanding Evaluation Dataset
Jie Zhu, Junhui Li, Yalong Wen, Xiandong Li, Lifan Guo, Feng Chen
- ODDA: An OODA-Driven Diverse Data Augmentation Framework for Low-Resource Relation Extraction
Yijie Zhong, Yunfan Gao, Xiaolian Zhang, Haofen Wang
- Detecting and Mitigating Challenges in Zero-Shot Video Summarization with Video LLMs
Luca Cagliero, Lorenzo Vaiani, Eliana Pastor, Alkis Koudounas, Elena Baralis, Vittorio Mazzia, Sandro Pollastrini, Thomas Gueudre, Manuel Giollo, Daniele Amberti, Yue Wu
- Entity Framing and Role Portrayal in the News
Tarek Mahmoud, Zhuohan Xie, Dimitar Iliyanov Dimitrov, Nikolaos Nikolaidis, Purificação Silvano, Roman Yangarber, Shivam Sharma, Elisa Sartori, Nicolas Stefanovitch, Giovanni Da San Martino, Jakub Piskorski, Preslav Nakov
- Derailer-Rerailer: Adaptive Verification for Efficient and Reliable Language Model Reasoning
Guangya Wan, Yuqi Wu, Hao Wang, Shengming Zhao, Jie Chen, Sheng Li
- Leveraging Large Language Models for Conversational Multi-Doc Question Answering: The First Place of WSDM Cup 2024
Yiming Li, Zhao Zhang
- TreeRAG: Unleashing the Power of Hierarchical Storage for Enhanced Knowledge Retrieval in Long Documents
Wenyu Tao, Xiaofen Xing, Yirong Chen, Linyi Huang, Xiangmin Xu
- Attention with Dependency Parsing Augmentation for Fine-Grained Attribution
Qiang Ding, Lvzhou Luo, Yixuan Cao, Ping Luo
- ASTRO: Automatic Strategy Optimization For Non-Cooperative Dialogues
Yikuan Hu, Chen Huang, Wenqiang Lei
- Defensive Prompt Patch: A Robust and Generalizable Defense of Large Language Models against Jailbreak Attacks
Chen Xiong, Xiangyu Qi, Pin-Yu Chen, Tsung-Yi Ho
- GUM-SAGE: A Novel Dataset and Approach for Graded Entity Salience Prediction
Jessica Lin, Amir Zeldes
- Verifying the Steps of Deductive Reasoning Chains
Zacchary Sadeddine, Fabian M. Suchanek
- Translate With Care: Addressing Gender Bias, Neutrality, and Reasoning in Large Language Model Translations
Pardis Sadat Zahraei, Ali Emami
- Utilizing Semantic Textual Similarity for Clinical Survey Data Feature Selection
Benjamin C Warner, Ziqi Xu, Simon Haroutounian, Thomas Kannampallil, Chenyan Lu
- Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs
Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu
- Variable Layerwise Quantization: A Simple and Effective Approach to Quantize LLMs
Razvan-Gabriel Dumitru, Vikas Yadav, Rishabh Maheshwary, Paul Ioan Clotan, Sathwik Tejaswi Madhusudhan, Mihai Surdeanu
- Why Are Positional Encodings Nonessential for Deep Autoregressive Transformers? A Petroglyph Revisited
Kazuki Irie
- CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation
Guofeng Cui, Pichao WANG, Yang Liu, Zemian Ke, Zhu Liu, Vimal Bhat
- Talking Point based Ideological Discourse Analysis in News Events
Nishanth Sridhar Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel Hopkins, Dan Roth, Dan Goldwasser
- FlashBack: Efficient Retrieval-Augmented Language Modeling for Fast Inference
Runheng Liu, Xingchen Xiao, Heyan Huang, Zewen Chi, Zhijing Wu
- CMQCIC-Bench: A Chinese Benchmark for Evaluating Large Language Models in Medical Quality Control Indicator Calculation
Guangya Yu, Yanhao Li, Zongying Jiang, Yuxiong Jin, Li Dai, Yupian Lin, Ruihui Hou, Weiyan Zhang, Yongqi Fan, Qi Ye, Jingping Liu, Tong Ruan
- ConceptEdit: Conceptualization-Augmented Knowledge Editing in Large Language Models for Commonsense Reasoning
Liyu Zhang, Weiqi Wang, Tianqing Fang, Yangqiu Song
- Exploring Multi-Modal Integration with Tool-Augmented LLM Agents for Precise Causal Discovery
ChengAo Shen, Zhengzhang Chen, Dongsheng Luo, Dongkuan Xu, Haifeng Chen, Jingchao Ni
- PARSQL: Enhancing Text-to-SQL through SQL Parsing and Reasoning
Yaxun Dai, Haiqin Yang, Mou Hao, Pingfu Chao
- Revisiting “The Geometry of Truth”: Emergent Consistent Linear Representation of Truthfulness in Capable Language Models
Yuntai Bao, Tianyu Du, Xuhong Zhang, Xinkui Zhao, Zhengwen Feng, Jianwei Yin
- Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
- TestAgent: An Adaptive and Intelligent Expert for Human Assessment
Junhao Yu, Yan Zhuang, Yuxuan Sun, Weibo Gao, Qi Liu, Mingyue Cheng, Zhenya Huang, Enhong Chen
- SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment
Quan Ze Chen, Kevin Feng, Chan Young Park, Amy X Zhang
- First-Step Advantage: Importance of Starting Right in Multi-Step Math Reasoning
Kushal Jain, Moritz Miller, Niket Tandon, Kumar Shridhar
- Evaluating Instructively Generated Statement by Large Language Models for Directional Event Causality Identification
Wei Xiang, Chuanhong Zhan, Qing Zhang, Bang Wang
- CoinMath: Harnessing the Power of Coding Instruction for Math LLM
Chengwei Wei, Bin Wang, Jung-jae Kim, Guimei Liu, Nancy F. Chen
- Profiling News Media for Factuality and Bias Using LLMs and the Fact-Checking Methodology of Human Experts
Zain Muhammad Mujahid, Dilshod Azizov, Maha Tufail Agro, Preslav Nakov
- Structured Discourse Representation for Factual Consistency Verification
Kun Zhang, Oana Balalau, Ioana Manolescu
- SHARP: Unlocking Interactive Hallucination via Stance Transfer in Role-Playing LLMs
Chuyi Kong, Ziyang Luo, Hongzhan Lin, Zhiyuan Fan, Yaxin Fan, Yuxi SUN, Jing Ma
- Understanding the Gap: an Empirical Study of Research Collaborations in NLP and Language Documentation
Luke Gessler, Alexis Palmer, Katharina von der Wense
- PersonaBench: Evaluating AI Models on Understanding Personal Information through Accessing (Synthetic) Private User Data
Juntao Tan, Liangwei Yang, Zuxin Liu, Zhiwei Liu, Rithesh R N, Tulika Manoj Awalgaonkar, Jianguo Zhang, Weiran Yao, Ming Zhu, Shirley Kokane, Silvio Savarese, Huan Wang, Caiming Xiong, Shelby Heinecke
- Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning
Simret A Gebreegziabher, Kuangshi Ai, Zheng Zhang, Elena Glassman, Toby Jia-Jun Li
- ORBIT: Cost-Effective Dataset Curation for Large Language Model Domain Adaptation with an Astronomy Case Study
Eric Modesitt, Ke Yang, Spencer Hulsey, Xin Liu, ChengXiang Zhai, Volodymyr Kindratenko
- Serial Position Effects of Large Language Models
Xiaobo Guo, Soroush Vosoughi
- scRAG: Hybrid Retrieval-Augmented Generation for LLM-based Cross-Tissue Single-Cell Annotation
Zhiyin Yu, Chao Zheng, Chong Chen, Xian-Sheng Hua, Xiao Luo
- Can Large Language Models Address Open-Target Stance Detection?
Abu Ubaida Akash, Ahmed Fahmy, Amine Trabelsi
- Improve Language Model and Brain Alignment via Associative Memory
Congchi Yin, Yongpeng Zhang, Xuyun Wen, Piji Li
- Towards Reliable Large Audio Language Model
Ziyang Ma, Xiquan Li, Yakun Song, Wenxi Chen, Chenpeng Du, Jian Wu, Yuanzhe Chen, Zhuo Chen, Yuping Wang, Yuxuan Wang, Xie Chen
- Large Vocabulary Size Improves Large Language Models
Sho Takase, Ryokan Ri, Shun Kiyono, Takuya Kato
- MUSE: A Multimodal Conversational Recommendation Dataset with Scenario-Grounded User Profiles
Zihan Wang, Xiaocui Yang, YongKang Liu, Shi Feng, Daling Wang, Yifei Zhang
- Machine Translation Models are Zero-Shot Detectors of Translation Direction
Michelle Wastl, Jannis Vamvas, Rico Sennrich
- Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination
Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar
- GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation
Jie He, Jennifer Neville, Mengting Wan, Longqi Yang, Hui Liu, Xiaofeng Xu, Xia Song, Jeff Z. Pan, Pei Zhou
- SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution
Chengxing Xie, Bowen Li, Chang Gao, He Du, Wai Lam, Difan Zou, Kai Chen
- GlyphPattern: An Abstract Pattern Recognition for Vision-Language Models
Zixuan Wu, Yoolim Kim, Carolyn Jane Anderson
- FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation
Qianli Wang, Nils Feldhus, Simon Ostermann, Luis Felipe Villa-Arenas, Sebastian Möller, Vera Schmitt
- From Misleading Queries to Accurate Answers: A Three-Stage Fine-Tuning Method for LLMs
Guocong Li, Weize Liu, Yihang Wu, Ping Wang, Shuaihan Huang, Hongxia Xu, Jian Wu
- Separate the Wheat from the Chaff: A Post-Hoc Approach to Safety Re-Alignment for Fine-Tuned Language Models
Di Wu, Xin Lu, Yanyan Zhao, Bing Qin
- Nuclear Deployed!: Analyzing Catastrophic Risks in Decision-making of Autonomous LLM Agents
Rongwu Xu, Xiaojian Li, Shuo Chen, Wei Xu
- MoRE: A Mixture of Low-Rank Experts for Adaptive Multi-Task Learning
Dacao Zhang, Kun Zhang, Shimao Chu, Le Wu, Xin Li, Si Wei
- Lunar Twins: We Choose to Go to the Moon with Large Language Models
Xin-Yu Xiao, Erwei Yin, Xiangyu Liu, Zengrui Li, Yalei Liu, Qianchen Xia
- SPHERE: An Evaluation Card for Human-AI Systems
Dora Zhao, Qianou Ma, Xinran Zhao, Chenglei Si, Chenyang Yang, Ryan Louie, Ehud Reiter, Diyi Yang, Tongshuang Wu
- Data-Centric Improvements for Enhancing Multi-Modal Understanding in Spoken Conversation Modeling
Maximillian Chen, Ruoxi Sun, Sercan O Arik
- Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models
Haochen Liu, Song Wang, Chen Chen, Jundong Li
- UQ-Merge: Uncertainty Guided Multimodal Large Language Model Merging
Huaizhi Qu, Xinyu Zhao, Jie Peng, Kwonjoon Lee, Behzad Dariush, Tianlong Chen
- AQuAECHR: Attributed Question Answering for European Court of Human Rights
Korbinian Q. Weidinger, Santosh T.Y.S.S, Oana Ichim, Matthias Grabmair
- Leveraging Unit Language Guidance to Advance Speech Modeling in Textless Speech-to-Speech Translation
Yuhao Zhang, Xiangnan Ma, Kaiqi Kou, Peizhuo Liu, Weiqiao Shan, Benyou Wang, Tong Xiao, Yuxin Huang, Zhengtao Yu, JingBo Zhu
- Ponder & Press: Advancing Visual GUI Agent towards General Computer Control
Yiqin Wang, Haoji Zhang, Jingqi Tian, Yansong Tang
- LogicGame: Benchmarking Rule-Based Reasoning Abilities of Large Language Models
Jiayi Gui, Yiming Liu, Jiale Cheng, Xiaotao Gu, Xiao Liu, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang
- LLM-Based Multi-Agent Systems are Scalable Graph Generative Models
Jiarui Ji, Runlin Lei, Jialing Bi, Zhewei Wei, Xu Chen, Yankai Lin, Xuchen Pan, Yaliang Li, Bolin Ding
- AD-LLM: Benchmarking Large Language Models for Anomaly Detection
Tiankai Yang, Yi Nian, Li Li, Ruiyao Xu, Yuangang Li, Jiaqi Li, Zhuo Xiao, Xiyang Hu, Ryan A. Rossi, Kaize Ding, Xia Hu, Yue Zhao
- RTADev: Intention Aligned Multi-Agent Framework for Software Development
Jie Liu, Guohua Wang, Ronghui Yang, Jiajie Zeng, Mengchen Zhao, Yi Cai
- TACO-RL: Task Aware Prompt Compression Optimization with Reinforcement Learning
Shivam Shandilya, Menglin Xia, Supriyo Ghosh, Huiqiang Jiang, Jue Zhang, Qianhui Wu, Victor Rühle, Saravan Rajmohan
- A Character-Centric Creative Story Generation via Imagination
Kyeongman Park, Minbeom Kim, Kyomin Jung
- Proverbs Run in Pairs: Evaluating Proverb Translation Capability of Large Language Model
Minghan Wang, Viet Thanh Pham, Farhad Moghimifar, Thuy-Trang Vu
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration
Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li
- Evaluating LLMs’ Factual Knowledge Utilization on Unanswerable Questions
Chuanyuan Tan, Wenbiao Shao, Hao Xiong, Tong Zhu, Zhenhua Liu, Kai Shi, Wenliang Chen
- Exploring Knowledge Filtering for Retrieval-Augmented Discriminative Tasks
Minjie Qiang, Zhongqing Wang, Xiaoyi Bao, HaoYuan Ma, Shoushan Li, Guodong Zhou
- Group then Scale: Dynamic Mixture-of-Experts Multilingual Language Model
Chong Li, Yingzhuo Deng, Jiajun Zhang, Chengqing Zong
- Beyond Verbal Cues: Emotional Contagion Graph Network for Causal Emotion Entailment
Fangxu Yu, Junjie Guo, Zhen Wu, Xinyu Dai
- Critic-CoT: Boosting the Reasoning Abilities of Large Language Model via Chain-of-Thought Critic
Xin Zheng, Jie Lou, Boxi Cao, Xueru Wen, Yuqiu Ji, Hongyu Lin, Yaojie Lu, Xianpei Han, Debing Zhang, Le Sun
- Systematic Generalization in Language Models Scales with Information Entropy
Sondre Wold, Lucas Georges Gabriel Charpentier, Étienne Simon
- Assessing the Leakage of Naturalistic Reading Time Corpora in Language Model Pre-Training Datasets
Byung-Doh Oh, Hongao Zhu, William Schuler
- Logical Consistency is Vital: Neural-Symbolic Information Retrieval for Negative-Constraint Queries
Ganlin Xu, Zhoujia Zhang, Wangyi Mei, Jiaqing Liang, Weijia Lu, Xiaodong Zhang, ZHIFEI YANG, Xiaofeng Ma, Yanghua Xiao, Deqing Yang
- ‘No’ Matters: Out-of-Distribution Detection in Multimodality Multi-Turn Interactive Dialogue
Rena Wei Gao, Xuetong Wu, Siwen Luo, Caren Han, Feng Liu
- Event Pattern-Instance Graph: A Multi-Round Role Representation Learning Strategy for Document-Level Event Argument Extraction
Qizhi Wan, Liu Tao, Changxuan Wan, Rong Hu, Keli Xiao, Yuxin Shuai
- EXECUTE: A Multilingual Benchmark for LLM Token Understanding
Lukas Edman, Helmut Schmid, Alexander Fraser
- Explainable Hallucination through Natural Language Inference Mapping
Wei-Fan Chen, Zhixue Zhao, Akbar Karimi, Lucie Flek
- HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval Augmented Generation
Hao Liu, Zhengren Wang, Xi Chen, Zhiyu Li, Feiyu Xiong, Qinhan Yu, Wentao Zhang
- Double Entendre: Robust Audio-Based AI-Generated Lyrics Detection via Multi-View Fusion
Markus Frohmann, Gabriel Meseguer-Brocal, Markus Schedl, Elena V. Epure
- Don’t Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models
Sangmin Woo, Donguk Kim, Jaehyuk Jang, Yubin Choi, Changick Kim
- SATA: A Paradigm for LLM Jailbreak via Simple Assistive Task Linkage
Xiaoning Dong, Wenbo Hu, Wei Xu, Tianxing He
- Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis
Yifan Hu, Rui Liu, Yi Ren, Xiang Yin, Haizhou Li
- Parameter-Efficient Fine-Tuning via Circular Convolution
Aochuan Chen, Jiashun Cheng, Zijing Liu, Ziqi Gao, Fugee Tsung, Yu Li, Jia Li
- Alleviating Hallucinations in Large Language Models via Truthfulness-driven Rank-adaptive LoRA
Jiahao Li, Zhendong Mao, Quan Wang
- ScEdit: Script-based Assessment of Knowledge Editing
Xinye Li, Zunwen Zheng, Qian Zhang, Dekai Zhuang, Jiabao Kang, Liyan Xu, Qingbin Liu, Xi Chen, Zhiying Tu, Dianhui Chu, Dianbo Sui
- SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
Seanie Lee, Dong Bok Lee, Dominik Wagner, Minki Kang, Haebin Seong, Tobias Bocklet, Juho Lee, Sung Ju Hwang
- Moderation Matters: Measuring Conversational Moderation Impact in English as a Second Language Group Discussion
Rena Wei Gao, Ming-Bin Chen, Lea Frermann, Jey Han Lau
- Measuring Bias and Agreement in Large Language Model Presupposition Judgments
Katherine Atwell
- Harnessing PDF Data for Improving Japanese Large Multimodal Models
Jeonghun Baek, Akiko Aizawa, Kiyoharu Aizawa
- EnerGIZAr: Leveraging GIZA++ for Effective Tokenizer Initialization
Pranaydeep Singh, Eneko Agirre, Gorka Azkune, Orphee De Clercq, Els Lefever
- AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents
Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Guozhi Wang, Dingyu Zhang, Shuai Ren, Hongsheng Li
- Drop Dropout on Single Epoch Language Model Pretraining
Houjun Liu, John Bauer, Christopher D Manning
- Robust and Minimally Invasive Watermarking for EaaS
Zongqi Wang, Baoyuan Wu, Jingyuan Deng, Yujiu Yang
- Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text
Jarca Andrei, Florinel Alin Croitoru, Radu Tudor Ionescu
- CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling
Taneesh Gupta, Shivam Shandilya, Xuchao Zhang, Rahul Madhavan, Supriyo Ghosh, Chetan Bansal, Huaxiu Yao, Saravan Rajmohan
- SLAM-Omni: Timbre-Controllable Voice Interaction System with Single-Stage Training
Wenxi Chen, Ziyang Ma, Ruiqi Yan, Yuzhe Liang, Xiquan Li, Ruiyang Xu, Zhikang Niu, Yanqiao Zhu, Yifan Yang, Zhanxun Liu, Kai Yu, Yuxuan Hu, Jinyu Li, Yan Lu, Shujie LIU, Xie Chen
- C$^2$LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation
Yanyang Li, Wong Tin Long, Cheung To Hung, Jianqiao Zhao, Duo Zheng, Liu Ka Wai, Michael Lyu, Liwei Wang
- Texts or Images? A Fine-grained Analysis on the Effectiveness of Input Representations and Models for Table Question Answering
Wei Zhou, Mohsen Mesgar, Heike Adel, Annemarie Friedrich
- Adaptive-VP: A Framework for LLM-Based Virtual Patients that Adapts to Trainees’ Dialogue to Facilitate Nurse Communication Training
Keyeun Lee, Seolhee Lee, Esther Hehsun Kim, Yena Ko, Jinsu Eun, Dahee Kim, Hyewon Cho, Haiyi Zhu, Robert E. Kraut, Eunyoung E. Suh, Eun-mee Kim, Hajin Lim
- Enhancing Multimodal Unified Representations for Cross Modal Generalization
Hai Huang, Yan Xia, Shengpeng Ji, Shulei Wang, Hanting Wang, Minghui Fang, Jieming Zhu, Zhenhua Dong, Sashuai Zhou, Zhou Zhao
- Domain Regeneration: How well do LLMs match syntactic properties of text domains?
Da JU, Hagen Blix, Adina Williams
- Structural Deep Encoding for Table Question Answering
Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier
- MPL: Multiple Programming Languages with Large Language Models for Information Extraction
Bo Li, Gexiang Fang, Wei Ye, Zhenghua Xu, Jinglei Zhang, Hao Cheng, Shikun Zhang
- Self-Critique Guided Iterative Reasoning for Multi-hop Question Answering
Zheng Chu, Huiming Fan, Jingchang Chen, Qianyu Wang, Mingda Yang, Jiafeng Liang, Zhongjie Wang, Hao Li, Guo Tang, Ming Liu, Bing Qin
- Anchored Answers: Unravelling Positional Bias in GPT-2’s Multiple-Choice Questions
Ruizhe Li, Yanjun Gao
- Failing Forward: Improving Generative Error Correction for ASR with Synthetic Data and Retrieval Augmentation
Sreyan Ghosh, Mohammad Sadegh Rasooli, Michael Levit, Peidong Wang, Jian Xue, Dinesh Manocha, Jinyu Li
- LTRAG: Enhancing autoformalization and self-refinement for logical reasoning with Thought-Guided RAG
Ruikang Hu, Shaoyu Lin, Yeliang Xiu, Yongmei Liu
- Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation
Giuseppe Ruggiero, Matteo Testa, Jurgen Van de Walle, Luigi Di Caro
- MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Ke Wang, Junting Pan, Linda Wei, Aojun Zhou, Weikang Shi, Zimu Lu, Han Xiao, Yunqiao Yang, Houxing Ren, Mingjie Zhan, Hongsheng Li
- MlingConf: A Comprehensive Study of Multilingual Confidence Estimation on Large Language Models
Boyang XUE, Hongru WANG, Rui Wang, Sheng Wang, Zezhong WANG, Yiming Du, Bin Liang, Wenxuan Zhang, Kam-Fai Wong
- COMPKE: Complex Question Answering under Knowledge Editing
Keyuan Cheng, Zijian Kan, Zhuoran Zhang, Muhammad Asif Ali, Lijie Hu, Di Wang
- RaaS: Reasoning-Aware Attention Sparsity for Efficient LLM Inference with Long Decoding Chains
Junhao Hu, Wenrui Huang, Weidong Wang, Zhenwen Li, Tiancheng Hu, Zhixia Liu, Xusheng Chen, Tao Xie, Yizhou Shan
- One-for-All Pruning: A Universal Model for Customized Compression of Large Language Models
Rongguang Ye, Ming Tang
- CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
Shangda Wu, Guo Zhancheng, Ruibin Yuan, Junyan Jiang, SeungHeon Doh, Gus Xia, Juhan Nam, Xiaobing Li, Feng Yu, Maosong Sun
- PFDial: A Structured Dialogue Instruction Fine-tuning Method Based on UML Flowcharts
Ming Zhang, Yuhui Wang, Yujiong Shen, Tingyi Yang, Changhao Jiang, Yilong Wu, Shihan Dou, Qinhao Chen, Zhiheng Xi, Zhihao Zhang, Yi Dong, Zhen Wang, Zhihui Fei, Mingyang Wan, Tao Liang, Guojun Ma, Qi Zhang, Tao Gui, Xuanjing Huang
- Listening to Patients: Detecting and Mitigating Patient Misreport in Medical Dialogue System
Lang Qin, YAO ZHANG, Hongru Liang, Adam Jatowt, Zhenglu Yang
- Do Language Models Understand the Cognitive Tasks Given to Them? Investigations with the N-Back Paradigm
Xiaoyang Hu, Richard Lewis
- Graph-guided Cross-composition Feature Disentanglement for Compositional Zero-shot Learning
Yuxia Geng, Runkai Zhu, Jiaoyan Chen, Jintai Chen, Xiang Chen, Zhuo Chen, Shuofei Qiao, Yuxiang Wang, Xiaoliang Xu, Sheng-Jun Huang
- Training Long-Context LLMs Efficiently via Chunk-wise Optimization
Wenhao Li, Yuxin Zhang, Gen Luo, Daohai Yu, Rongrong Ji
- Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps
Jiashun Cheng, Aochuan Chen, Nuo Chen, Ziqi Gao, Yuhan Li, Jia Li, Fugee Tsung
- Can Language Models Be Used for Code Migration?
Keyuan Cheng, Xudong Shen, Yihao Yang, Tengyue Wang, Yang Cao, Muhammad Asif Ali, Hanbin Wang, Lijie Hu, Di Wang
- A Case Study of Cross-Lingual Zero-Shot Generalization for Classical Languages in LLMs
V.S.D.S.Mahesh Akavarapu, Hrishikesh Terdalkar, Pramit Bhattacharyya, Shubhangi Agarwal, Dr. Vishakha Deulgaonkar, Chaitali Dangarikar, PRALAY MANNA, Arnab Bhattacharya
- BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation
Jilong Li, Zhenxi Song, Jiaqi Wang, Meishan Zhang, Honghai LIU, Min Zhang, Zhiguo Zhang
- Progressive LoRA for Multimodal Continual Instruction Tuning
Yahan Yu, Duzhen Zhang, Yong Ren, Xuanle Zhao, Xiuyi Chen, Chenhui Chu
- ARC ‘Challenge’ Is Not That Challenging
Łukasz Borchmann
- Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek, Arianna Bisazza, Raquel Fernández
- Tracr-Injection: Distilling Algorithms into Pre-trained Language Models
Tomás Vergara Browne, Alvaro Soto
- Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization
Ximing Dong, Shaowei Wang, Dayi Lin, Ahmed Hassan
- Revisiting Weak-to-Strong Generalization in Theory and Practice: Reverse KL vs. Forward KL
Wei Yao, Wenkai Yang, Ziqiao Wang, Yankai Lin, Yong Liu
- Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups
Felix Drinkall, Stefan Zohren, Michael McMahon, Janet B. Pierrehumbert
- NetSafe: Exploring the Topological Safety of Multi-agent System
Miao Yu, Shilong Wang, Guibin Zhang, Junyuan Mao, Chenlong Yin, Qijiong Liu, Kun Wang, Qingsong Wen, Yang Wang
- Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation
Qiji Zhou, YiFan Gong, Guangsheng Bao, Hongjie Qiu, Jinqiang Li, Xiangrong Zhu, Huajian Zhang, Yue Zhang
- Initializing and Retrofitting Key-Value Adaptors for Traceable Model Editing
Hanlun Zhu, Yunshi Lan, Xiang Li, Weining Qian
- Know the Unknown: An Uncertainty-Sensitive Method for LLM Instruction Tuning
Jiaqi Li, Yixuan Tang, Yi Yang
- Position-Aware Depth Decay Decoding ($D^3$): Boosting Large Language Model Inference Efficiency
Siqi Fan, Xuezhi Fang, Xingrun Xing, Peng Han, Shuo Shang, Yequan Wang
- Explaining Puzzle Solutions in Natural Language: An Exploratory Study on 6x6 Sudoku
Anirudh Maiya, Razan Alghamdi, Maria Leonor Pacheco, Ashutosh Trivedi, Fabio Somenzi
- Stress-testing Machine Generated Text Detection: Shifting Language Models Writing Style to Fool Detectors
Andrea Pedrotti, Michele Papucci, Cristiano Ciaccio, Alessio Miaschi, Giovanni Puccetti, Felice Dell’Orletta, Andrea Esuli
- InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model
Siqi Ouyang, Xi Xu, Lei Li
- VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration
Jiahui Geng, Qing Li, Zongxiong Chen, Yuxia Wang, Derui Zhu, Zhuohan Xie, Chenyang Lyu, Xiuying Chen, Preslav Nakov, Fakhri Karray
- Learning Autonomous Code Integration for Math Language Models
Haozhe Wang, Long Li, Chao Qu, Weidi Xu, Fengming ZHU, Wei Chu, Fangzhen Lin
- GOODLIAR: A Reinforcement Learning-Based Deceptive Agent for Disrupting LLM Beliefs on Foundational Principles
Soo Kyung Kim, Hyunsoo Cho
- How Does Generation Length Affect Long-Form Factuality
James Xu Zhao, Jimmy Z.J. Liu, Bryan Hooi, See-Kiong Ng
- Scaling LLMs’ Social Reasoning: Sprinkle Cognitive “Aha Moment” into Fundamental Long-thought Logical Capabilities
Guiyang Hou, Wenqi Zhang, Zhe Zheng, Yongliang Shen, Weiming Lu
- SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation
Yuzheng Cai, Zhenyue Guo, YiWen Pei, WanRui Bian, Weiguo Zheng
- RuleEdit: Towards Rule-Level Knowledge Generalization to Mitigate Over-Editing in Large Language Models
Bihan Zhou, HaoPeng Ren, Li Yuan, Yi Cai, Liuwen Cao, Zikun Deng
- Eliciting In-context Retrieval and Reasoning for Long-context Language Models
Yifu QIU, Varun R. Embar, Yizhe Zhang, Navdeep Jaitly, Shay B Cohen, Benjamin Han
- GeAR: Generation Augmented Retrieval
Haoyu Liu, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Weiwei Deng, Feng Sun, Furu Wei, Qi Zhang
- A Unified Taxonomy-Guided Instruction Tuning Framework for Entity Set Expansion and Taxonomy Expansion
Yanzhen Shen, Yu Zhang, Yunyi Zhang, Jiawei Han
- Zero-Shot Conversational Stance Detection: Dataset and Approaches
Yuzhe Ding, Kang He, Bobo Li, Li Zheng, Haijun He, Fei Li, Chong Teng, Donghong Ji
- LongFaith: Enhancing Long-Context Reasoning in LLMs with Faithful Synthetic Data
Cehao Yang, Xueyuan Lin, Chengjin Xu, Xuhui Jiang, Shengjie Ma, Aofan Liu, Hui Xiong, Jian Guo
- SYNTHVERIFY: Unleashing the Power of LLMs for Zero-Shot Claim Verification via Step-by-Step Synthetic Data Generation
Rongwen Zhao, Jeffrey Flanigan
- Domain$o1$s: Guiding LLM Reasoning for Explainable Answers in High-Stakes Domains
Xu Chu, Zhijie Tan, Hanlin Xue, Guanyu Wang, Tong Mo, Weiping Li
- Dynamic Prefix as Instructor for Incremental Named Entity Recognition: A Unified Seq2Seq Generation Framework
Zihao Wu, YongXiang Hua, Yongxin Zhu, Fang Zhang, Linli Xu
- Who Taught You That? Tracing Teachers in Model Distillation
Somin Wadhwa, Chantal Shaib, Silvio Amir, Byron C Wallace
- D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Models
Grace Byun, Jinho D. Choi
- HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Assistant Scenarios
Jun Wang, Jiamu Zhou, Xihuai Wang, Xiaoyun Mo, Haoyu Zhang, Qiqiang Lin, jincheng, Muning Wen, Weinan Zhang, Qiuying Peng, Jun Wang
- Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines
Do Xuan Long, Duong Ngoc Yen, Do Xuan Trong, Anh Tuan Luu, Kenji Kawaguchi, Shafiq Joty, Min-Yen Kan, Nancy F. Chen
- Grammar-Constrained Natural Language Generation
Gabriele Tuccio, Luana Bulla, Maria Madonia, Aldo Gangemi, Misael Mongiovì
- MANBench: Is Your Multimodal Model Smarter than Human?
Han Zhou, Qitong Xu, Yiheng Dong, Xin Yang
- BanStereoSet: A Dataset to Measure Stereotypical Social Biases in LLMs for Bangla
Mahammed Kamruzzaman, Abdullah Al Monsur, Shrabon Kumar Das, Enamul Hassan, Gene Louis Kim
- mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus
Matthieu Futeral, Armel Randy Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot
- NorEval: A Norwegian Language Understanding and Generation Evaluation Benchmark
Vladislav Mikhailov, Tita Enstad, David Samuel, Hans Christian Farsethås, Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
- Massively Multilingual Instruction-Following Information Extraction
Thang Le, Huy Huu Nguyen, Anh Tuan Luu, Thien Huu Nguyen
- DALR: Dual-level Alignment Learning for Multimodal Sentence Representation Learning
Kang He, Yuzhe Ding, Haining Wang, Fei Li, Chong Teng, Donghong Ji
- Large Language Models in Bioinformatics: A Survey
Zhenyu Wang, Zikang Wang, Jiyue Jiang, Pengan CHEN, Xiangyu Shi, Yu Li
- ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs’ Capability via Chart Editing
Xuanle Zhao, Xuexin Liu, Yang Haoyue, Xianzhen Luo, Fanhu Zeng, Jianling Li, Qi Shi, Chi Chen
- Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models
Qin Liu, Chao Shang, Ling Liu, Nikolaos Pappas, Jie Ma, Neha Anna John, Srikanth Doss, Lluis Marquez, Miguel Ballesteros, Yassine Benajiba
- Turbocharging Web Automation: The Impact of Compressed History States
Xiyue Zhu, Peng Tang, Haofu Liao, Srikar Appalaraju
- Making RALM Robust to Irrelevant Contexts via Layer Knowledge Guided Attention
Weijie Shi, Hao Chen, Jiaming Li, Yao Zhao, Yazhong Zhang, Qijin Chen, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Xiaofang Zhou
- Rewrite to Jailbreak: Discover Learnable and Transferable Implicit Harmfulness Instruction
Yuting Huang, Chengyuan Liu, Yifeng Feng, Chao Wu, Fei Wu, Kun Kuang
- Large Language Models as Sign Language Interfaces: Mitigating the Requests of Deaf Users of LLMs in a Hearing-Centric World
Mert Inan, Anthony Sicilia, Malihe Alikhani
- NegVQA: Can Vision Language Models Understand Negation?
Yuhui Zhang, Yuchang Su, Yiming Liu, Serena Yeung-Levy
- Natural Language Reasoning in Large Language Models: Analysis and Evaluation
Debela Gemechu, Ramon Ruiz-Dolz, Henrike Beyer, Chris Reed
- SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling
Haoran Wang, Zhenyu Hou, Yao Wei, Jie Tang, Yuxiao Dong
- The Two Paradigms of LLM Detection: Authorship Attribution versus Authorship Verification
Janek Bevendorff, Matti Wiegmann, Emmelie Richter, Martin Potthast, Benno Stein
- Unveiling Confirmation Bias in Chain-of-Thought Reasoning
Yue Wan, Xiaowei Jia, Xiang Lorraine Li
- GRNFormer: A Biologically-Guided Framework for Integrating Gene Regulatory Networks into RNA Foundation Models
Mufan Qiu, Xinyu Hu, Fengwei Zhan, Sukwon Yun, Jie Peng, Ruichen Zhang, Bhavya Kailkhura, Jiekun Yang, Tianlong Chen
- RemoteRAG: A Privacy-Preserving LLM Cloud RAG Service
Yihang Cheng, Lan Zhang, Junyang Wang, Mu Yuan, Yunhao Yao
- “My life is miserable, have to sign 500 autographs everyday”: Exposing Humblebragging, the Brags in Disguise
Sharath Naganna, Saprativa Bhattacharjee, Biplab Banerjee, Pushpak Bhattacharyya
- SCITAT: A Question Answering Benchmark for Scientific Tables and Text Covering Diverse Reasoning Types
Xuanliang Zhang, Dingzirui Wang, Baoxin Wang, Longxu Dou, Xinyuan Lu, Keyan Xu, Dayong Wu, Qingfu Zhu
- TokenShapley: Token Level Context Attribution with Shapley Value
Yingtai Xiao, Yuqing Zhu, Sirat Samyoun, Wanrong Zhang, Jiachen T. Wang, Jian Du
- Entropy-based Exploration Conduction for Multi-step Reasoning
Jinghan Zhang, Xiting Wang, Fengran Mo, Yeyang Zhou, Wanfu Gao, Kunpeng Liu
- Taxonomizing Representational Harms using Speech Act Theory
Hannah Washington, Emily Corvi, Stefanie Reed, Chad Atalla, Alexandra Chouldechova, P. Alex Dow, Jean Garcia-Gathright, Nicholas J Pangakis, Emily Sheng, Dan Vann, Matthew Vogel, Hanna Wallach
- Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents
Prafulla Kumar Choubey, Xiangyu Peng, Shilpa Bhagavath, Caiming Xiong, Shiva Kumar Pentyala, Chien-Sheng Wu
- Statistical inference on black-box generative models in the data kernel perspective space
Hayden Helm, Aranyak Acharyya, Youngser Park, Brandon Duderstadt, Carey Priebe
- Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
Sohee Yang, Nora Kassner, Elena Gribovskaya, Sebastian Riedel, Mor Geva
- AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Zihan Liu, Yang Chen, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
- WXImpactBench: A Disruptive Weather Impact Understanding Benchmark for Evaluating Large Language Models
Yongan Yu, Qingchen Hu, Xianda Du, Jiayin Wang, Fengran Mo, Renée Sieber
- MeMoTune: A Measure and Moment-Driven Fine-Tuning Framework for Quantized Large Language Models
Yun Zhang, Xue Geng, Lizi Liao, Jintong Sun, Minghe Yu, Ge Yu
- MALAMUTE: A Multilingual, Highly-granular, Template-free, Education-based Probing Dataset
Sagi Shaier, George Arthur Baker, Chiranthan Sridhar, Lawrence Hunter, Katharina von der Wense
- Sentimental Image Generation for Aspect-based Sentiment Analysis
Xiaoyi Bao, Jinghang Gu, Zhongqing Wang, Chu-Ren Huang
- Long-form Hallucination Detection with Self-elicitation
Zihang Liu, Jiawei Guo, Hao Zhang, Hongyang Chen, Jiajun Bu, Haishuai Wang
- ComparisonQA: Evaluating Factuality Robustness of LLMs Through Knowledge Frequency Control and Uncertainty
Qing Zong, Zhaowei Wang, Tianshi Zheng, Xiyu Ren, Yangqiu Song
- One-Dimensional Object Detection for Streaming Text Segmentation of Meeting Dialogue
Rui He, Zhongqing Wang, Minjie Qiang, Hongling Wang, Yifan Zhang, Hua Xu, Shuai Fan, Guodong Zhou
- CodeTaxo: Enhancing Taxonomy Expansion with Limited Examples via Code Language Prompts
Qingkai Zeng, Yuyang Bai, Zhaoxuan Tan, Zhenyu Wu, Shangbin Feng, Meng Jiang
- Predicate-Conditional Conformalized Answer Sets for Knowledge Graph Embeddings
Yuqicheng Zhu, Daniel Hernández, Yuan He, Zifeng Ding, Bo Xiong, Evgeny Kharlamov, Steffen Staab
- Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts
Yifan Zhang, Yifan Luo, Yang Yuan, Andrew C Yao
- Learning from Committee: Reasoning Distillation from a Mixture of Teachers with Peer-Review
Zhuochun Li, Yuelyu Ji, Rui Meng, Daqing He
- Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution
Orchid Chetia Phukan, Drishti Singh, Swarup Ranjan Behera, Arun Balaji Buduru, Rajesh Sharma
- Multilingual Retrieval Augmented Generation for Culturally-Sensitive Tasks: A Benchmark for Cross-lingual Robustness
Bryan Li, Fiona Luo, Samar Haider, Adwait Agashe, Siyu Li, Runqi Liu, Muqing Miao, Shriya Ramakrishnan, Yuan Yuan, Chris Callison-Burch
- Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation
Pengyue Jia, Derong Xu, Xiaopeng Li, Zhaocheng Du, Xiangyang Li, Yichao Wang, Yuhao Wang, Qidong Liu, Maolin Wang, Huifeng Guo, Ruiming Tang, Xiangyu Zhao
- Scaling Laws for Multilingual Language Models
Yifei He, Alon Benhaim, Barun Patra, Praneetha Vaddamanu, Sanchit Ahuja, Parul Chopra, Vishrav Chaudhary, Han Zhao, Xia Song
- Corpus Poisoning via Approximate Greedy Gradient Descent
Jinyan Su, Preslav Nakov, Claire Cardie
- Taxonomy-Driven Knowledge Graph Construction for Domain-Specific Scientific Applications
Huitong Pan, Qi Zhang, Mustapha Adamu, Eduard Dragut, Longin Jan Latecki
- Wanda++: Pruning Large Language Models via Regional Gradients
Yifan Yang, Kai Zhen, Bhavana Ganesh, Aram Galstyan, Goeric Huybrechts, Markus Müller, Jonas M. Kübler, Rupak Vignesh Swaminathan, Athanasios Mouchtaris, Sravan Babu Bodapati, Nathan Susanj, Zheng Zhang, Jack FitzGerald, Abhishek Kumar
- MATCHED: Multimodal Authorship-Attribution To Combat Human Trafficking in Escort-Advertisement Data
Vageesh Kumar Saxena, Benjamin Ashpole, Gijs van Dijck, Gerasimos Spanakis
- Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements
Shu Yang, Shenzhe Zhu, Zeyu Wu, Keyu Wang, Junchi Yao, Junchao Wu, Lijie Hu, Mengdi Li, Derek F. Wong, Di Wang
- Mitigating Paraphrase Attacks on Machine-Text Detection via Paraphrase Inversion
Rafael Alberto Rivera Soto, Barry Y. Chen, Nicholas Andrews
- SANSKRITI: A Comprehensive Benchmark for Evaluating Language Models’ Knowledge of Indian Culture
Arijit Maji, Raghvendra Kumar, Akash Ghosh, Anushka, Sriparna Saha
- System Prompt Hijacking via Permutation Triggers in LLM Supply Chains
Lu Yan, Siyuan Cheng, Xuan Chen, Kaiyuan Zhang, Guangyu Shen, Xiangyu Zhang
- Frequency matters: Modeling irregular morphological patterns in Spanish with Transformers
Akhilesh Kakolu Ramarao, Kevin Tang, Dinah Baer-Henney
- From Heart to Words: Generating Empathetic Responses via Integrated Figurative Language and Semantic Context Signals
Gyeongeun Lee, Zhu Wang, Sathya N. Ravi, Natalie Parde
- There’s No Such Thing as Simple Reasoning for LLMs
Nurul Fajrin Ariyani, Zied Bouraoui, Richard Booth, Steven Schockaert
- CLIX: Cross-Lingual Explanations of Idiomatic Expressions
Aaron Gluck, Katharina von der Wense, Maria Leonor Pacheco
- Beyond Semantic Entropy: Boosting LLM Uncertainty Quantification with Pairwise Semantic Similarity
Dang Nguyen, Ali Payani, Baharan Mirzasoleiman
- R$^3$Mem: Bridging Memory Retention and Retrieval via Reversible Compression
Xiaoqiang Wang, Suyuchen Wang, Yun Zhu, Bang Liu
- Vision Language Model Helps Private Information De-Identification in Vision Data
Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei
- Unveiling Privacy Risks in Multi-modal Large Language Models: Task-specific Vulnerabilities and Mitigation Challenges
Tiejin Chen, Pingzhi Li, Kaixiong Zhou, Tianlong Chen, Hua Wei
- DeFine: Enhancing LLM Decision-Making with Factor Profiles and Analogical Reasoning
Yebowen Hu, Xiaoyang Wang, Wenlin Yao, Yiming Lu, Daoan Zhang, Hassan Foroosh, Dong Yu, Fei Liu
- SMART: Self-Aware Agent for Tool Overuse Mitigation
Cheng Qian, Emre Can Acikgoz, Hongru Wang, Xiusi Chen, Avirup Sil, Dilek Hakkani-Tür, Gokhan Tur, Heng Ji
- Enhancing Multilingual Large Language Models: Continued Pretraining and Comprehensive Evaluation for Underrepresented Languages
Pablo Rodríguez, Silvia Paniagua Suárez, Pablo Gamallo, Susana Sotelo Docio
- TC-Bench: Benchmarking Temporal Compositionality in Conditional Video Generation
Weixi Feng, Jiachen Li, Michael Saxon, Tsu-Jui Fu, Wenhu Chen, William Yang Wang
- DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration
Hanzhi Zhang, Heng Fan, Kewei Sha, Yan Huang, Yunhe Feng
- Arbiters of Ambivalence: Challenges of using LLMs in No-Consensus tasks
Bhaktipriya Radharapu, Manon Revel, Megan Ung, Sebastian Ruder, Adina Williams
- Beyond Text: Characterizing Domain Expert Needs in Document Research
Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, Emma Strubell
- Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack
Murong Yue, Ziyu Yao
- MM-R$^3$: On (In-)Consistency of Vision-Language Models (VLMs)
Shih-Han Chou, Shivam Chandhok, Jim Little, Leonid Sigal
- Investigating Context Faithfulness in Large Language Models: The Roles of Memory Strength and Evidence Style
Yuepei Li, Kang Zhou, Qiao Qiao, Bach Nguyen, Qing Wang, Qi Li
- Shadow-Activated Backdoor Attacks on Multimodal Large Language Models
Ziyi Yin, Muchao Ye, Yuanpu Cao, Jiaqi Wang, Aofei Chang, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma
- Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding
Kung-Hsiang Huang, Can Qin, Haoyi Qiu, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Wu
- K-order Ranking Preference Optimization for Large Language Models
Shihao Cai, Chongming Gao, Yang Zhang, Wentao Shi, Jizhi Zhang, Keqin Bao, Qifan Wang, Fuli Feng
- Spectral Insights into Data-Oblivious Critical Layers in Large Language Models
Xuyuan Liu, Lei Hsiung, Yaoqing Yang, Yujun Yan
- SynFix: Synchronous Repair for Codebase via RelationGraph
Xunzhu Tang, Jiechao Gao, Jin Xu, Tiezhu Sun, Yewei Song, Saad Ezzini, Wendkûuni C. Ouédraogo, Jacques Klein, Tegawendé F. Bissyandé
- EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation
Taeho Hwang, Sukmin Cho, Soyeong Jeong, Hoyun Song, SeungYoon Han, Jong C. Park
- Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives
Zhihu Wang, Shiwan Zhao, Yu Wang, Heyuan Huang, Sitao Xie, Yubo Zhang, Jiaxin Shi, Zhixing Wang, Hongyan Li, Junchi Yan
- Unlearning Backdoor Attacks for LLMs with Weak-to-Strong Knowledge Distillation
Shuai Zhao, Xiaobao Wu, Cong-Duy T Nguyen, Yanhao Jia, Meihuizi Jia, Feng Yichao, Anh Tuan Luu
- Packing Analysis: Packing Is More Appropriate for Large Models or Datasets in Supervised Fine-tuning
Shuhe Wang, Guoyin Wang, Yizhong Wang, Jiwei Li, Eduard Hovy, Chen Guo
- Better Red Teaming via Searching with Large Language Model
Yongkang Chen, Chongyang Zhao, Jianwen Tian, Guiling Cao, Hu Li, Xiaohui Kuang
- AdaV: Adaptive Text-visual Redirection for Vision-Language Models
Jiayi Han, Liang Du, Yiwen Wu, Guanming Liang, Xiangguo Zhou, Weibo Zheng, Donghong Han, Zixun Sun
- MegaAgent: Dynamic Agent Coordination Without SOPs
Qian Wang, Tianyu Wang, Zhenheng Tang, Qinbin Li, Nuo Chen, Jingsheng Liang, Bingsheng He
- Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
Xiaotian Zhang, Ruizhe Chen, Yang Feng, Zuozhu Liu
- A Self-Distillation Recipe for Neural Machine Translation
Hongfei Xu, Zhuofei Liang, Qiuhui Liu, Lingling Mu
- BlockPruner: Fine-grained Pruning for Large Language Models
Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li
- Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen, Keping Bi, Wei Chen, Jiafeng Guo, Xueqi Cheng
- LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-Context QA
Jiajie Zhang, Yushi Bai, Xin Lv, Wanjun Gu, Danqing Liu, Minhao Zou, Shulin Cao, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li
- An Empirical Study of Group Conformity in Multi-Agent Systems
Min Choi, Keonwoo Kim, Sungwon Chae, Sangyeop Baek
- Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation
Zhanglin Wu, Daimeng Wei, Xiaoyu Chen, Hengchao Shang, Jiaxin Guo, Zongyao Li, Yuanchang Luo, Jinlong Yang, Zhiqiang Rao, Hao Yang
- ASPO: Adaptive Sentence-Level Preference Optimization for Fine-Grained Multimodal Reasoning
Yeyuan Wang, Dehong Gao, Rujiao Long, Lei Yi, Linbo Jin, Libin Yang, Xiaoyan Cai
- NovelCR: A Large-Scale Bilingual Dataset Tailored for Long-Span Coreference Resolution
MeiHan Tong, Shuai Wang
- Dynamic Attention-Guided Context Decoding for Mitigating Context Faithfulness Hallucinations in Large Language Models
Huangyw, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao
- Exploring the Choice Behavior of Large Language Models
Weidong Wu, Qinlin Zhao, Hao Chen, Lexin Zhou, Defu Lian, Hong Xie
- On-Policy Self-Alignment with Fine-grained Knowledge Feedback for Hallucination Mitigation
Xueru Wen, Jie Lou, Xinyu Lu, Yuqiu Ji, Xinyan Guan, Yaojie Lu, Hongyu Lin, Ben He, Xianpei Han, Debing Zhang, Le Sun
- From Phrases to Subgraphs: Fine-Grained Semantic Parsing with Knowledge Graphs
Yurun Song, Xiangqing Shen, Rui Xia
- StableToolBench-MirrorAPI: Modeling Tool Environments as Mirrors of 7,000+ Real-World APIs
Zhicheng Guo, Sijie Cheng, Yuchen Niu, Hao Wang, Sicheng Zhou, Wenbing Huang, Yang Liu
- ClaimPKG: Enhancing Claim Verification via Pseudo-Subgraph Generation with Lightweight Specialized LLM
Hoang Pham, Thanh-Do Nguyen, Khac-Hoai Nam Bui
- TriEmbed: Bridge the Gap between Text and Token Indices with Embedding Reparameterization
Baizhou Huang, Xiaojun Wan
- Chain of Methodologies: Scaling Test Time Computation without Training
Cong Liu, Jie Wu, Weigang Wu, Xu Chen, Liang Lin, Wei-Shi Zheng
- A Survey on Personalized Alignment—The Missing Piece for Large Language Models in Real-World Applications
Jian Guan, Junfei Wu, Jia-Nan Li, Chuanqi Cheng, Wei Wu
- SuLoRA: Subspace Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
Chenhao Ding, Jiangyang Li, SongLin Dong, Xinyuan Gao, Yuhang He, Yihong Gong
- MIRe: Enhancing Multimodal Queries Representation via Fusion-Free Modality Interaction for Multimodal Retrieval
Yeong-Joon Ju, Ho-Joong Kim, Seong-Whan Lee
- Correcting on Graph: Faithful Semantic Parsing over Knowledge Graphs with Large Language Models
Ruilin Zhao, Feng Zhao, Hong Zhang
- COPR: Continual Human Preference Learning via Optimal Policy Regularization
Han Zhang, Lin Gui, Yu Lei, Yuanzhao Zhai, Yehong Zhang, Zhuo Zhang, Yulan He, Hui Wang, Yue Yu, Kam-Fai Wong, Bin Liang, Ruifeng Xu
- Robust Preference Optimization via Dynamic Target Margins
Jie Sun, Junkang Wu, Jiancan Wu, Lintao Ma, Zhibo Zhu, Xingyu Lu, Jun Zhou, Xiang Wang
- AdaReTaKe: Adaptive Redundancy Reduction to Perceive Longer for Video-language Understanding
Xiao Wang, Qingyi Si, Shiyu Zhu, Jianlong Wu, Li Cao, Liqiang Nie
- Rethinking Stateful Tool Use in Multi-Turn Dialogues: Benchmarks and Challenges
Hongru Wang, Wenyu Huang, Yufei Wang, Yuanhao Xi, Jianqiao Lu, Huan Zhang, Nan Hu, Zeming Liu, Jeff Z. Pan, Kam-Fai Wong
- Open-Set Living Need Prediction with Large Language Models
Xiaochong Lan, Jie Feng, Yizhou Sun, Chen Gao, Jiahuan Lei, Xinlei Shi, Hengliang Luo, Yong Li
- Improve Rule Retrieval and Reasoning with Self-Induction and Relevance ReEstimate
Ziyang Huang, Wangtao Sun, Jun Zhao, Kang Liu
- Beyond Words: Integrating Theory of Mind into Conversational Agents for Human-Like Belief, Desire, and Intention Alignment
Mehdi Jafari, Yuncheng Hua, Hao Xue, Flora D. Salim
- Multimodal Causal Reasoning Benchmark: Challenging Multimodal Large Language Models to Discern Causal Links Across Modalities
Zhiyuan Li, Heng Wang, Dongnan Liu, Chaoyi Zhang, Ao Ma, Jieting Long, Weidong Cai
- Context-Aware Hierarchical Merging for Long Document Summarization
Litu Ou, Mirella Lapata
- VCD: A Dataset for Visual Commonsense Discovery in Images
Xiangqing Shen, Fanfan Wang, Siwei Wu, Rui Xia
- Self-Reasoning Language Models: Unfold Hidden Reasoning Chains with Few Reasoning Catalyst
Hongru Wang, Deng Cai, Wanjun Zhong, Shijue Huang, Jeff Z. Pan, Zeming Liu, Kam-Fai Wong
- HyperCRS: Hypergraph-Aware Multi-Grained Preference Learning to Burst Filter Bubbles in Conversational Recommendation System
Yongsen Zheng, Mingjie Qian, Guohua Wang, Yang Liu, Ziliang Chen, Mingzhi Mao, Liang Lin, Kwok-Yan Lam
- Is LLM an Overconfident Judge? Unveiling the Capabilities of LLMs in Detecting Offensive Language with Annotation Disagreement
Junyu Lu, Kai Ma, Kaichun Wang, Kelaiti Xiao, Roy Ka-Wei Lee, Bo Xu, Liang Yang, Hongfei Lin
- Language Repository for Long Video Understanding
Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S Ryoo
- Investigating Language Preference of Multilingual RAG Systems
Jeonghyun Park, Hwanhee Lee
- FGDGNN: Fine-Grained Dynamic Graph Neural Network for Rumor Detection on Social Media
Mei Guo, Chen Chen, Chunyan Hou, Yike Wu, Xiaojie Yuan
- Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching
Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Yipeng Zhang, Haitao Mi, Helen M. Meng
- QueryAttack: Jailbreaking Aligned Large Language Models Using Structured Non-natural Query Language
Qingsong Zou, Jingyu Xiao, Qing Li, Zhi Yan, Yuhang Wang, Li Xu, Wenxuan Wang, Kuofeng Gao, Ruoyu Li, Yong Jiang
- Memory or Reasoning? Explore How LLMs Compute Mixed Arithmetic Expressions
Chengzhi Li, Heyan Huang, Ping Jian, Zhen Yang, Chenxu Wang, Yifan Wang
- PersonaX: A Recommendation Agent-Oriented User Modeling Framework for Long Behavior Sequence
Yunxiao Shi, Wujiang Xu, Zhang Zeqi, Xing Zi, Qiang Wu, Min Xu
- Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models
Shuliang Liu, Xinze Li, Zhenghao Liu, Yukun Yan, Cheng Yang, Zheni Zeng, Zhiyuan Liu, Maosong Sun, Ge Yu
- Rationales Are Not Silver Bullets: Measuring the Impact of Rationales on Model Performance and Reliability
Chiwei Zhu, Benfeng Xu, An Yang, Junyang Lin, Quan Wang, Chang Zhou, Zhendong Mao
- CA-GAR: Context-Aware Alignment of LLM Generation for Document Retrieval
Heng Yu, Junfeng Kang, Rui Li, Qi Liu, Liyang He, Zhenya Huang, Shuanghong Shen, Junyu Lu
- AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents
Guhong Chen, Liyang Fan, Zihan Gong, Nan Xie, Zixuan Li, Ziqiang Liu, Chengming Li, Qiang Qu, Hamid Alinejad-Rokny, Shiwen Ni, Min Yang
- MLDebugging: Towards Benchmarking Code Debugging Across Multi-Library Scenarios
JinYang Huang, Xiachong Feng, Qiguang Chen, Hanjie Zhao, Zihui Cheng, Jiesong Bai, Jingxuan Zhou, Min Li, Libo Qin
- An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Model is not a General Substitute for GPT-4
Hui Huang, Xingyuan Bu, Hongli Zhou, Yingqi Qu, Jing Liu, Muyun Yang, Bing Xu, Tiejun Zhao
- Expectation Confirmation Preference Optimization for Multi-Turn Conversational Recommendation Agent
Xueyang Feng, Jingsen Zhang, Jiakai Tang, Wei Li, Guohao Cai, Xu Chen, Quanyu Dai, Yue Zhu, Zhenhua Dong
- ProMedTS: A Self-Supervised, Prompt-Guided Multimodal Approach for Integrating Medical Text and Time Series
Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, V. W., Richard Yi Da Xu, Guo Li, Xian Yang
- CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenge
Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He, Lijun Wu
- Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning
Hwan Chang, Hwanhee Lee
- Tell Me What You Don’t Know: Enhancing Refusal Capabilities of Role-Playing Agents via Representation Space Analysis and Editing
Wenhao Liu, Siyu An, Junru Lu, Muling Wu, Tianlong Li, Xiaohua Wang, Changze Lv, Xiaoqing Zheng, Di Yin, Xing Sun, Xuanjing Huang
- LR$^2$Bench: Evaluating Long-chain Reflective Reasoning Capabilities of Large Language Models via Constraint Satisfaction Problems
Jianghao Chen, Zhenlin Wei, Zhenjiang Ren, Ziyong Li, Jiajun Zhang
- McBE: A Multi-task Chinese Bias Evaluation Benchmark for Large Language Models
Tian Lan, Xiangdong Su, Xu Liu, Ruirui Wang, Ke Chang, Jiang Li, Guanglai Gao
- MARK: Multi-agent Collaboration with Ranking Guidance for Text-attributed Graph Clustering
Yiwei Fu, Yuxing Zhang, Chunchun Chen, Jianwen Ma, Quan Yuan, Rong-Cheng Tu, Xinli Huang, Wei Ye, Xiao Luo, Minghua Deng
- Can Language Models Capture Human Writing Preferences on Text Summarization?
Jingbao Luo, Ming Liu, Ran Liu, Yongpan Sheng, Gang Li, Xin Hu, WupengNjust
- Mitigate Position Bias in LLMs via Scaling a Single Hidden States Channel
Yijiong Yu, Huiqiang Jiang, Xufang Luo, Qianhui Wu, Chin-Yew Lin, Dongsheng Li, Yuqing Yang, Yongfeng Huang, Lili Qiu
- Self-attention-based Graph-of-Thought for Math Problem Solving
Ruiqiao Bai, Xue Han, Shuo Lei, Junlan Feng, Yanyan Luo, Chao Deng
- BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks
Weihong Du, Wenrui Liao, Binyu Yan, Hongru Liang, Anthony G Cohn, Wenqiang Lei
- KAPA: A Deliberative Agent Framework with Tree-Structured Knowledge Base for Multi-Domain User Intent Understanding
Jiakai Tang, Shiqi Shen, Zhipeng Wang, Gong Zhi, Xueyang Feng, Zexu Sun, Haoran Tan, Xu Chen
- RASD: Retrieval-Augmented Speculative Decoding
Guofeng Quan, Wenfeng Feng, Chuzhan Hao, Guochao Jiang, Yuewei Zhang, Hao Henry Wang
- FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs
Zengyi Gao, Yukun Cao, Hairu Wang, Ao Ke, Yuan Feng, S Kevin Zhou, Xike Xie
- Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models
Kening Zheng, Junkai Chen, Yibo Yan, Xin Zou, Huiyu Zhou, Xuming Hu
- Blessing of Multilinguality: A Systematic Analysis of Multilingual In-Context Learning
Yilei Tu, Andrew Xue, Freda Shi
- SEK: Self-Explained Keywords Empower Large Language Models for Code Generation
Lishui Fan, Mouxiang Chen, Zhongxin Liu
- Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement
Peng Ding, Jun Kuang, ZongYu Wang, Xuezhi Cao, Xunliang Cai, Jiajun Chen, Shujian Huang
- Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Vardaan Pahuja, Yadong Lu, Corby Rosset, Boyu Gou, Arindam Mitra, Spencer Whitehead, Yu Su, Ahmed Hassan Awadallah
- Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding
Zhanpeng Chen, Mingxiao Li, Ziyang Chen, Nan Du, Xiaolong Li, Yuexian Zou
- P-React: Synthesizing Topic-Adaptive Reactions of Personality Traits via Mixture of Specialized LoRA Experts
Yuhao Dan, Jie Zhou, Qin Chen, Junfeng Tian, Liang He
- EssayJudge: A Multi-Granular Benchmark for Assessing Automated Essay Scoring Capabilities of Multimodal Large Language Models
Jiamin Su, Yibo Yan, Fangteng Fu, Zhang Han, Jingheng Ye, Xiang Liu, Jiahao Huo, Huiyu Zhou, Xuming Hu
- Streamlining the Collaborative Chain of Models into A Single Forward Pass in Generation-Based Tasks
Yuanjie Lyu, Chao Zhang, Yuhao Chen, Yong Chen, Tong Xu
- Self-Correction is More than Refinement: A Learning Framework for Visual and Language Reasoning Tasks
Jiayi He, Hehai Lin, Qingyun Wang, Yi R. Fung, Heng Ji
- Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation
Chenkai Sun, Denghui Zhang, ChengXiang Zhai, Heng Ji
- Probability-Consistent Preference Optimization for Enhanced LLM Reasoning
Yunqiao Yang, Houxing Ren, Zimu Lu, Ke Wang, Weikang Shi, Aojun Zhou, Junting Pan, Mingjie Zhan, Hongsheng Li
- IW-Bench: Evaluating Large Multimodal Models for Converting Image-to-Web
Hongcheng Guo, Wei Zhang, Junhao Chen, Yaonan Gu, Jian Yang, Junjia Du, Shaosheng Cao, Binyuan Hui, Tianyu Liu, Jianxin Ma, Chang Zhou, Zhoujun Li
- TDCSA: LLM-Guided Top-Down Approach for Robust Citation Sentiment Analysis
Fan Gao, Jieyang Peng, Xiaoming Tao, Wang Youzheng
- DeepRTL2: A Versatile Model for RTL-Related Tasks
Yi Liu, Hongji Zhang, Yunhao Zhou, Zhengyuan Shi, Changran Xu, Qiang Xu
- The Self-Improvement Paradox: Can Language Models Bootstrap Reasoning Capabilities without External Scaffolding?
Yutao Sun, Mingshuai Chen, Tiancheng Zhao, Ruochen Xu, Zilun Zhang, Jianwei Yin
- Cross-lingual Multimodal Sentiment Analysis for Low-Resource Languages via Language Family Disentanglement and Rethinking Transfer
Long Chen, Shuoyu Guan, Xiaohua Huang, Wen-Jing Wang, Cai Xu, Ziyu Guan, Wei Zhao
- Does Chain-of-Thought Reasoning Really Reduce Harmfulness from Jailbreaking?
Chengda Lu, Xiaoyu Fan, Yu Huang, Rongwu Xu, Jijie Li, Wei Xu
- InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Yuhang Zang, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Ziyu Liu, Shengyuan Ding, Shenxi Wu, Yubo Ma, Haodong Duan, Wenwei Zhang, Kai Chen, Dahua Lin, Jiaqi Wang
- RATE-Nav: Region-Aware Termination Enhancement for Zero-shot Object Navigation with Vision-Language Models
Junjie Li, Nan Zhang, Xiaoyang Qu, Kai Lu, Guokuan Li, Jiguang Wan, Jianzong Wang
- RMoA: Optimizing Mixture-of-Agents through Diversity Maximization and Residual Compensation
Zhentao Xie, Chengcheng Han, Jinxin Shi, Wenjun Cui, Xin Zhao, Xingjiao Wu, Jiabao Zhao
- Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction
Yuxin Jiang, Yufei Wang, Chuhan Wu, Xinyi Dai, Yan Xu, Weinan Gan, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang
- RLKGF: Reinforcement Learning from Knowledge Graph Feedback Without Human Annotations
Lian Yan, Chen Tang, Yi Guan, Haotian Wang, Songyuan Wang, Haifeng Liu, Yang Yang, Jingchi Jiang
- Learning Task Representations from In-Context Learning
Baturay Saglam, Xinyang Hu, Zhuoran Yang, Dionysis Kalogerias, Amin Karbasi
- CAVGAN: Unifying Jailbreak and Defense of LLMs via Generative Adversarial Attacks on their Internal Representations
Xiaohu Li, Yunfeng Ning, Zepeng Bao, Mayi Xu, Jianhao Chen, Tieyun Qian
- Firm or Fickle? Evaluating Large Language Models Consistency in Sequential Interactions
Yubo Li, Yidi Miao, Xueying Ding, Ramayya Krishnan, Rema Padman
- OS-Kairos: Adaptive Interaction for MLLM-Powered GUI Agents
Pengzhou Cheng, Zheng Wu, Zongru Wu, Tianjie Ju, Aston Zhang, Zhuosheng Zhang, Gongshen Liu
- Red-Teaming LLM Multi-Agent Systems via Communication Attacks
Pengfei He, Yuping Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu
- Can We Trust AI Doctors? A Survey of Medical Hallucination in Large Language and Large Vision-Language Models
Zhihong Zhu, Yunyan Zhang, Xianwei Zhuang, Fan Zhang, Zhongwei Wan, Yuyan Chen, Qingqing Long, Yefeng Zheng, Xian Wu
- DRT: Deep Reasoning Translation via Long Chain-of-Thought
Jiaan Wang, Fandong Meng, Yunlong Liang, Jie Zhou
- CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis
Fuying Wang, Feng Wu, Yihan Tang, Lequan Yu
- Vision-aided Unsupervised Constituency Parsing with Multi-MLLM Debating
Dong Zhang, Haiyan Tian, Qingying Sun, Shoushan Li
- Inter-Passage Verification for Multi-evidence Multi-answer QA
Bingsen Chen, Shengjie Wang, Xi Ye, Chen Zhao
- PROMTEC: Fast LLM Inference Decoding using Prompt Multi-Lookup with Template Database and Common Sequences
Alan Chi-Man Lee, Wing-Sun Cheng, Calvin Chun-Kit Chan
- Logical DA: Enhancing Data Augmentation for Logical Reasoning via a Multi-Agent System
Haoqi Zheng, Dong Wang, Silin Yang, Yunpeng Qi, Ruochun Jin
- Adapting General-Purpose Embedding Models to Private Datasets Using Keyword-based Retrieval
Yubai Wei, Jiale Han, Yi Yang
- SQL Injection Jailbreak: A Structural Disaster of Large Language Models
Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu
- TAMP: Token-Adaptive Layerwise Pruning in Multimodal Large Language Models
Jaewoo Lee, Keyang Xuan, Chanakya Ekbote, Sandeep Polisetty, Yi R. Fung, Paul Pu Liang
- Generative Music Models’ Alignment with Professional and Amateur Users’ Expectations
Zihao Wang, Jiaxing Yu, Haoxuan Liu, Zehui Zheng, Yuhang Jin, Shuyu Li, Shulei Ji, Kejun Zhang
- LLM-Forest: Ensemble Learning of LLMs with Graph-Augmented Prompts for Data Imputation
Xinrui He, Yikun Ban, Jiaru Zou, Tianxin Wei, Curtiss Cook, Jingrui He
- Task Calibration: Calibrating Large Language Models on Inference Tasks
Yingjie Li, Yun Luo, Xiaotian Xie, Yue Zhang
- MiniELM: A Lightweight and Adaptive Query Rewriting Framework for E-Commerce Search Optimization
Duy A. Nguyen, Rishi Kesav Mohan, Shimeng Yang, Pritom Saha Akash, Kevin Chen-Chuan Chang
- Visibility as Survival: Generalizing NLP for Native Alaskan Language Identification
Ivory Yang, Chunhui Zhang, Yuxin Wang, Zhongyu Ouyang, Soroush Vosoughi
- KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding
Zhangchen Xu, Yang Liu, Yueqin Yin, Mingyuan Zhou, Radha Poovendran
- Select, Read, and Write: A Multi-Agent Framework of Full-Text-based Related Work Generation
Xiaochuan Liu, Ruihua Song, Xiting Wang, Xu Chen
- Graph-Assisted Culturally Adaptable Idiomatic Translation for Indic languages
Pratik Rakesh Singh, Kritarth Prasad, Mohammadi Zaki, Pankaj Wasnik
- Question answering in Climate Adaptation: Model Development and Evaluation with Expert Feedback
Vincent Nguyen, Sarvnaz Karimi, Willow Hallgren, Mahesh Prakash
- AGRec: Adapting Autoregressive Decoders with Graph Reasoning for LLM-based Sequential Recommendation
Xinfeng Wang, Jin Cui, Fumiyo Fukumoto, Yoshimi Suzuki
- Causal Denoising Prototypical Network for Few-Shot Multi-label Aspect Category Detection
Jin Cui, Xinfeng Wang, Yoshimi Suzuki, Fumiyo Fukumoto
- RealHiTBench: A Comprehensive Realistic Hierarchical Table Benchmark for Evaluating LLM-Based Table Analysis
Pengzuo Wu, Yuhang Yang, Guangcheng Zhu, Chao Ye, Hong Gu, Xu Lu, Ruixuan Xiao, Bowen Bao, Yijing He, Liangyu Zha, Wentao Ye, Junbo Zhao, Haobo Wang
- A Query-Response Framework for Whole-Page Complex-Layout Document Image Translation with Relevant Regional Concentration
Zhiyang Zhang, Yaping Zhang, Yupu Liang, Zhiyuan Chen, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong
- DependEval: Benchmarking LLMs for Repository Dependency Understanding
Junjia Du, Yadi Liu, Hongcheng Guo, Jiawei Wang, Haojian Huang, Yunyi Ni, Zhoujun Li
- A General Knowledge Injection Framework for ICD Coding
Xu Zhang, Kun Zhang, Wenxin Ma, Rongsheng Wang, Chenxu Wu, Yingtai Li, S Kevin Zhou
- MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models
Jiahao Huo, Yibo Yan, Xu Zheng, Yuanhuiyi Lyu, Xin Zou, Zhihua Wei, Xuming Hu
- Generating Questions, Answers, and Distractors for Videos: Exploring Semantic Uncertainty of Object Motions
Wenjian Ding, Yao Zhang, Jun Wang, Adam Jatowt, Zhenglu Yang
- DiffSkip: Differential Layer Skipping in Large Language Models
Xuan Luo, Weizhi Wang, Xifeng Yan
- Towards Explainable Temporal Reasoning in Large Language Models: A Structure-Aware Generative Framework
Zihao Jiang, Ben Liu, Miao Peng, Wenjie Xu, Yao Xiao, Zhenyan Shan, Min Peng
- A Bounding Box is Worth One Token - Interleaving Layout and Text in a Large Language Model for Document Understanding
Jinghui Lu, Haiyang Yu, Yanjie Wang, Yongjie Ye, Jingqun Tang, Ziwei Yang, Binghong Wu, Qi Liu, Hao Feng, Han Wang, Hao Liu, Can Huang
- Self-Foveate: Enhancing Diversity and Difficulty of Synthesized Instructions from Unsupervised Text via Multi-Level Foveation
Mingzhe Li, Xin Lu, Yanyan Zhao
- TableDreamer: Progressive and Weakness-guided Data Synthesis from Scratch for Table Instruction Tuning
Mingyu Zheng, Zhifan Feng, Jia Wang, Lanrui Wang, Zheng Lin, Hao Yang, Weiping Wang
- Konooz: Cross-domain Cross-dialect Corpora for Named Entity Recognition
Nagham Hamad, Mohammed Khalilia, Mustafa Jarrar
- Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation
Hongji Yang, Yucheng Zhou, Wencheng Han, Jianbing Shen
- CodeV: Issue Resolving with Visual Data
Linhao Zhang, Daoguang Zan, Quanshun Yang, Zhirong Huang, Dong Chen, Bo Shen, Tianyu Liu, Yongshun Gong, Huang Pengjie, Xudong Lu, Guangtai Liang, Lizhen Cui, Qianxiang Wang
- A Survey of Large Language Models in Psychotherapy: Current Landscape and Future Directions
Hongbin Na, Yining Hua, Zimu Wang, Tao Shen, Beibei Yu, Lilin Wang, Wei Wang, John Torous, Ling Chen
- Breaking the Reasoning Barrier: A Survey on LLM Complex Reasoning through the Lens of Self-Evolution
Tao He, Hao Li, Jingchang Chen, Runxuan Liu, Yixin Cao, Lizi Liao, Zihao Zheng, Zheng Chu, Jiafeng Liang, Ming Liu, Bing Qin
- SEE: Continual Fine-tuning with Sequential Ensemble of Experts
Zhilin Wang, Yafu Li, Xiaoye Qu, Yu Cheng
- Boosting Policy and Process Reward Models with Monte Carlo Tree Search in Open-Domain QA
Chi-Min Chan, Chunpu Xu, Junqi Zhu, Jiaming Ji, Donghai Hong, Pengcheng Wen, Chunyang Jiang, Zhen Ye, Yaodong Yang, Wei Xue, Sirui Han, Yike Guo
- Investigating and Enhancing Vision-Audio Capability in Omnimodal Large Language Models
Rui Hu, Delai Qiu, Shuyu Wei, Jiaming Zhang, Yining Wang, Shengping Liu, Jitao Sang
- OpenHuEval: Evaluating Large Language Model on Hungarian Specifics
Haote Yang, Xingjian Wei, Jiang Wu, Noémi Ligeti-Nagy, Jiaxing Sun, Yingfan Wang, Győző Zijian Yang, Junyuan Gao, Jingchao Wang, Bowen Jiang, Shasha Wang, Nanjun Yu, Zihao Zhang, Shixin Hong, Hongwei Liu, Wei Li, Songyang Zhang, Dahua Lin, Lijun Wu, Gábor Prószéky, Conghui He
- StructFact: Reasoning Factual Knowledge from Structured Data with Large Language Models
Sirui Huang, Yanggan Gu, Zhonghao Li, Xuming Hu, Li Qing, Guandong Xu
- From Imitation to Introspection: Probing Self-Consciousness in Language Models
Sirui Chen, Shu Yu, Shengjie Zhao, Chaochao Lu
- DocFusion: A Unified Framework for Document Parsing Tasks
Mingxu Chai, Ziyu Shen, Chong Zhang, Yue Zhang, Xiao Wang, Shihan Dou, Jihua Kang, Jiazheng Zhang, Qi Zhang
- Hierarchical Safety Realignment: Lightweight Restoration of Safety in Pruned Large Vision-Language Models
Yue Li, Xin Yi, Dongsheng Shi, Gerard de Melo, Xiaoling Wang, Linlin Wang
- LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
Bowen Ping, Jiali Zeng, Fandong Meng, Shuo Wang, Jie Zhou, Shanghang Zhang
- Reinforcing Compositional Retrieval: Retrieving Step-by-Step for Composing Informative Contexts
Quanyu Long, Jianda Chen, Zhengyuan Liu, Nancy F. Chen, Wenya Wang, Sinno Jialin Pan
- Towards A Better Initial Policy Model For Scalable Long-CoT Reinforcement Learning
Bofei Gao, Yejie Wang, Yibo Miao, Feifan Song, Longhui Yu, Tianyu Liu, Baobao Chang
- Topic Modeling for Short Texts via Optimal Transport-Based Clustering
Tu Vu, Manh Do, Tung Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen
- Lemmatisation & Morphological Analysis of Unedited Greek: Do Simple Tasks Need Complex Solutions?
Colin Swaelens, Ilse De Vos, Els Lefever
- FRAME: Feedback-Refined Agent Methodology for Enhancing Medical Research Insights
Chengzhang Yu, Yiming Zhang, Zhixin Liu, Zenghui Ding, Yining Sun, Zhanpeng Jin
- Chain-of-Scrutiny: Detecting Backdoor Attacks for Large Language Models
Xi Li, Ruofan Mao, Yusen Zhang, Renze Lou, Chen Wu, Jiaqi Wang
- Relevance Scores Calibration for Ranked List Truncation via TMP Adapter
Pavel Posokhov, Sergei Masliukhin, Skrylnikov Stepan, Danil Tirskikh, Olesia Makhnytkina
- Neuron Activation Modulation for Text Style Transfer: Guiding Large Language Models
Chaona Kong, Jianyi Liu, Yifan Tang, Ru Zhang
- MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, An-Lan Wang, Chunhui Lin, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang
- HICD: Hallucination-Inducing via Attention Dispersion for Contrastive Decoding to Mitigate Hallucinations in Large Language Models
Xinyan Jiang, Hang Ye, Yongxin Zhu, Xiaoying Zheng, Zikang Chen, Jun Gong
- Understanding the Repeat Curse in Large Language Models from a Feature Perspective
Junchi Yao, Shu Yang, Jianhua Xu, Lijie Hu, Mengdi Li, Di Wang
- Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
Haneul Yoo, Cheonbok Park, Sangdoo Yun, Alice Oh, Hwaran Lee
- A Mousetrap: Fooling Large Reasoning Models for Jailbreak with Chain of Iterative Chaos
Yang Yao, Xuan Tong, Ruofan Wang, Yixu Wang, Lujundong Li, Liang Liu, Yan Teng, Yingchun Wang
- Tag-Evol: Achieving Efficient Instruction Evolving via Tag Injection
Yixuan Wang, Shiqi Zhou, Chuanzhe Guo, Qingfu Zhu
- Breaking the Ceiling: Exploring the Potential of Jailbreak Attacks through Expanding Strategy Space
Yao Huang, Yitong Sun, Shouwei Ruan, Yichi Zhang, Yinpeng Dong, Xingxing Wei
- GeNRe: a French Gender-Neutral Rewriting System Using Collective Nouns
Enzo Doyen, Amalia Todirascu
- LGAR: Zero-Shot LLM-Guided Neural Ranking for Abstract Screening in Systematic Literature Reviews
Christian Jaumann, Andreas Wiedholz, Annemarie Friedrich
- LCHAIM - Investigating Long Context Reasoning in Hebrew
Ehud Malul, Oriel Perets, Ziv Mor, Yigal Kassel, Elior Sulem
- CLeVeR: Multi-modal Contrastive Learning for Vulnerability Code Representation
Jiayuan Li, Lei Cui, Sen Zhao, Yun Yang, Lun Li, Hongsong Zhu
- MEMIT-Merge: Addressing MEMIT’s Key-Value Conflicts in Same-Subject Batch Editing for LLMs
Zilu Dong, Xiangqing Shen, Rui Xia
- Large Language Models for Predictive Analysis: How Far Are They?
Qin Chen, Yuanyi Ren, Xiaojun Ma, Yuyang Shi
- Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking
Xiaoxue Cheng, Junyi Li, Xin Zhao, Ji-Rong Wen
- Towards Adaptive Memory-Based Optimization for Enhanced Retrieval-Augmented Generation
Qitao Qin, Yucong Luo, Yihang Lu, Zhibo Chu, Xiaoman Liu, Xianwei Meng
- Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping
Yijie Chen, Yijin Liu, Fandong Meng, Yufeng Chen, Jinan Xu, Jie Zhou
- A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models
Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
- CNNSum: Exploring Long-Context Summarization with Large Language Models in Chinese Novels
Lingxiao Wei, He Yan, Lu Xiangju, Junmin Zhu, Jun Wang, Wei Zhang
- Document Segmentation Matters for Retrieval-Augmented Generation
Zhitong Wang, Cheng Gao, Yufei Huang, Shuzheng Si, Kangyang Luo, Yuzhuo Bai, Wenhao Li, Tangjian Duan, Chuancheng Lv, Guoshan Lu, Gang Chen, Fanchao Qi, Chaojun Xiao, Maosong Sun
- UBench: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions
Xunzhi Wang, Zhuowei Zhang, Gaonan Chen, Qiongyu Li, Bitong Luo, Zhixin Han, Haotian Wang, Zhiyu Li, Hang Gao, Mengting Hu
- Embracing Large Language Models in Traffic Flow Forecasting
Yusheng Zhao, Xiao Luo, Haomin Wen, Zhiping Xiao, Wei Ju, Ming Zhang
- Flow2Code: Evaluating Large Language Models for Flowchart-based Code Generation Capability
Mengliang He, Jiayi Zeng, Yankai Jiang, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou
- Smarter, Not Harder: Training-Free Adaptive Computation for Transformers
Romain Storaï, Jaeseong Lee, seung-won hwang
- UCS-SQL: Uniting Content and Structure for Enhanced Semantic Bridging In Text-to-SQL
Zhenhe Wu, Zhongqiu Li, Jie Zhang, Zhongjiang He, Jian Yang, Yu Zhao, Ruiyu Fang, Bing Wang, Hongyan Xie, Shuangyong Song, Zhoujun Li
- CodePRM: Execution Feedback-enhanced Process Reward Model for Code Generation
Qingyao Li, Xinyi Dai, Xiangyang Li, Weinan Zhang, Yasheng Wang, Ruiming Tang, Yong Yu
- STEM-POM: Evaluating Language Models Math-Symbol Reasoning in Document Parsing
Jiaru Zou, Qing Wang, Pratyush Thakur, Nickvash Kani
- Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models
Jihoon Lee, Min Song
- Leveraging LLMs for Bangla Grammar Error Correction: Error Categorization, Synthetic Data, and Model Evaluation
Pramit Bhattacharyya, Arnab Bhattacharya
- Think Both Ways: Teacher-Student Bidirectional Reasoning Enhances MCQ Generation and Distractor Quality
Yimiao Qiu, Yang Deng, Quanming Yao, Zhimeng Zhang, Zhiang Dong, Chang Yao, Jingyuan Chen
- mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data
Haonan Chen, Liang Wang, Nan Yang, Yutao Zhu, Ziliang Zhao, Furu Wei, Zhicheng Dou
- Word2Passage: Word-level Importance Re-weighting for Query Expansion
Jeonghwan Choi, Minjeong Ban, Minseok Kim, Hwanjun Song
- MECoT: Markov Emotional Chain-of-Thought for Personality-Consistent Role-Playing
Yangbo Wei, Zhen Huang, Fangzhou Zhao, Qi Feng, WEI W. XING
- FiDeLiS: Faithful Reasoning in Large Language Models for Knowledge Graph Question Answering
Yuan Sui, Yufei He, Nian Liu, Xiaoxin He, Kun Wang, Bryan Hooi
- REALM: A Dataset of Real-World LLM Use Cases
Jingwen Cheng, Kshitish Ghate, Wenyue Hua, William Yang Wang, Hong Shen, Fei Fang
- BABELEDITS: A Benchmark and a Modular Approach for Robust Cross-lingual Knowledge Editing of Large Language Models
Tommaso Green, Félix Gaschi, Fabian David Schmidt, Simone Paolo Ponzetto, Goran Glavaš
- CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory
Haokun Zhao, Jinyi Han, Jiaqing Liang, Yanghua Xiao, Xiaojun Meng, Jiansheng Wei
- Problem-Solving Logic Guided Curriculum In-Context Learning for LLMs Complex Reasoning
Xuetao Ma, Wenbin Jiang, Hua Huang
- BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English
Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia
- NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM
Zihan Wang, Yaohui Zhu, Gim Hee Lee, Yachun Fan
- SQLForge: Synthesizing Reliable and Diverse Data to Enhance Text-to-SQL Reasoning in LLMs
Yu Guo, Dong Jin, Shenghao Ye, Shuangwu Chen, Jian Yang, Xiaobin Tan
- Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning
Jiachen Zhu, Congmin Zheng, Jianghao Lin, Kounianhua Du, Ying Wen, Yong Yu, Jun Wang, Weinan Zhang
- Contrastive Learning for Task-Independent SpeechLLM-Pretraining
Maike Züfle, Jan Niehues
- SOTA Attention Operator is generated by SOTA Attention Algorithm
Qirui Zhou, Shaohui Peng, Weiqiang Xiong, Haixin Chen, Yuanbo Wen, Haochen Li, Ling Li, Qi Guo, Yongwei Zhao, KE GAO, Ruizhi Chen, Yanjun Wu, Zhao Chen, Yunji Chen
- ALW: Adaptive Layer-Wise contrastive decoding enhancing reasoning ability in Large Language Models
Yuechi Zhou, Chuyue Zhou, Jianxin Zhang, Juntao Li, Min Zhang
- Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
Xinlong Chen, Yuanxing Zhang, Qiang Liu, Junfei Wu, Fuzheng Zhang, Tieniu Tan
- VidCapBench: A Comprehensive Benchmark of Video Captioning for Controllable Text-to-Video Generation
Xinlong Chen, Yuanxing Zhang, Chongling Rao, Yushuo Guan, Jiaheng Liu, Fuzheng Zhang, Chengru Song, Qiang Liu, Di ZHANG, Tieniu Tan
- Mitigating Demonstration Bias through Global Coevolutionary Reasoning
Chuan Gou, Bangwei Li, Jianhua Dai, Xiaoyang Han, Ming Cai
- A Representation Level Analysis of NMT Model Robustness to Grammatical Errors
Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis
- $T^2DR$: A Two-Tier Deficiency-Resistant Framework for Incomplete Multimodal Learning
Han Lin, Xiu Tang, Huan Li, Wenxue Cao, Sai Wu, Chang Yao, Lidan Shou, Gang Chen
- From Specific-MLLMs to Omni-MLLMs: A Survey on MLLMs Aligned with Multi-modalities
Shixin Jiang, Jiafeng Liang, Jiyuan Wang, Xuan Dong, Heng Chang, Weijiang Yu, Jinhua Du, Ming Liu, Bing Qin
- Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter
Verena Blaschke, Masha Fedzechkina, Maartje Ter Hoeve
- Agents generalize to novel levels of abstraction by using adaptive linguistic strategies
Kristina Kobrock, Xenia Ohmer, Elia Bruni, Nicole Gotzner
- The Linguistic Connectivities Within Large Language Models
Dan Wang, Boxi Cao, Ning Bian, Xuanang Chen, Yaojie Lu, Hongyu Lin, Jia Zheng, Le Sun, Shanshan Jiang, Bin Dong, Xianpei Han
- XFinBench: Benchmarking LLMs in Complex Financial Problem Solving and Reasoning
Zhihan Zhang, Yixin Cao, Lizi Liao
- Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation
Hongzhe Huang, Jiang Liu, Zhewen Yu, Li Cai, Dian Jiao, Wenqiao Zhang, Siliang Tang, Juncheng Li, Hao Jiang, Haoyuan Li, Yueting Zhuang
- Achieving binary weight and activation for LLMs using Post-Training Quantization
Siqing Song, Chuang Wang, Rui-Qi Wang, Yi Yang, Xu-Yao Zhang
- Mitigating Negative Interference in Multilingual Knowledge Editing through Null-Space Constraints
Wei Sun, Tingyu Qu, Mingxiao Li, Jesse Davis, Marie-Francine Moens
- From Awareness to Adaptability: Enhancing Tool Utilization for Scientific Reasoning
Wenjing Xie, Xiaobo Liang, Juntao Li, Wanfu Wang, Kehai Chen, Qiaoming Zhu, Min Zhang
- AMoPO: Adaptive Multi-objective Preference Optimization without Reward Models and Reference Models
Qi Liu, Jingqing Ruan, Hao Li, Haodong Zhao, Desheng Wang, Jiansong Chen, Wan Guanglu, Xunliang Cai, Zhi Zheng, Tong Xu
- Supervised Optimism Correction: Be Confident When LLMs Are Sure
Junjie Zhang, Rushuai Yang, Shunyu Liu, Ting-En Lin, Fei Huang, Yi Chen, Yongbin Li, Dacheng Tao
- Offline Reinforcement Learning for LLM Multi-step Reasoning
Huaijie Wang, Shibo Hao, Hanze Dong, Shenao Zhang, Yilin Bao, Ziran Yang, Yi Wu
- Sampling-based Pseudo-Likelihood for Membership Inference Attacks
Masahiro Kaneko, Youmi Ma, Yuki Wata, Naoaki Okazaki
- AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
Chengyou Jia, Minnan Luo, Zhuohang Dang, Qiushi Sun, Fangzhi Xu, Junlin Hu, Tianbao Xie, Zhiyong Wu
- Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data
Xin-Cheng Wen, Yijun Yang, Cuiyun Gao, Yang Xiao, Qing Liao, Deheng Ye
- $GA-S^3$: Comprehensive Social Network Simulation with Group Agents
Yunyao Zhang, Zikai Song, Hang Zhou, Wenfeng Ren, Yi-Ping Phoebe Chen, Junqing Yu, Wei Yang
- M-RangeDetector: Enhancing Generalization in Machine-Generated Text Detection through Multi-Range Attention Masks
Kaijie Jiao, Quan Wang, Licheng Zhang, Zikang Guo, Zhendong Mao
- Does Your Voice Assistant Remember? Analyzing Conversational Context Recall and Utilization in Voice Interaction Models
Heeseung Kim, Che Hyun Lee, Sangkwon Park, Jiheum Yeom, Nohil Park, Sangwon Yu, Sungroh Yoon
- NeuronMerge: Merging Models via Functional Neuron Groups
Wangyun Gu, Qianghua Gao, Zhang Li-Xin, Xu Shen, Jieping Ye
- HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning
Xiaoyuan Li, Moxin Li, Rui Men, Yichang Zhang, Keqin Bao, Wenjie Wang, Fuli Feng, Dayiheng Liu, Junyang Lin
- Self-Steering Optimization: Autonomous Preference Optimization for Large Language Models
Hao Xiang, Bowen Yu, Hongyu Lin, Keming Lu, Yaojie Lu, Xianpei Han, Ben He, Le Sun, Jingren Zhou, Junyang Lin
- LIME: Less Is More for MLLM Evaluation
King Zhu, Qianbo Zang, Shian Jia, Siwei Wu, Feiteng Fang, Yizhi LI, Shuyue Guo, Tianyu Zheng, Jiawei Guo, Bo Li, Haoning Wu, Xingwei Qu, Jian Yang, Ruibo Liu, Xiang Yue, Jiaheng Liu, Chenghua Lin, Hamid Alinejad-Rokny, Min Yang, Shiwen Ni, Wenhao Huang, Ge Zhang
- Debate, Reflect, and Distill: Multi-Agent Feedback with Tree-Structured Preference Optimization for Efficient Language Model Enhancement
Xiaofeng Zhou, Heyan Huang, Lizi Liao
- The Code Review Comprehension Assessment for Language Models
Hong Yi Lin, Chunhua Liu, Haoyu Gao, Patanamon Thongtanunam, Christoph Treude
- A Framework of Narrative Media Framing in Political Discourse
Yulia Otmakhova, Lea Frermann
- MHALO: Evaluating MLLMs as Fine-grained Hallucination Detectors
Yishuo Cai, Renjie Gu, Jiaxu Li, Xuancheng Huang, Junzhe Chen, Xiaotao Gu, Minlie Huang
- Semantic Topology: a New Perspective for Communication Style Characterization
Barbara Scalvini, Alireza Mashaghi
- Decoding LLM Personality Measurement: Forced-Choice vs. Likert
Xiaoyu Li, Haoran Shi, Zengyi Yu, Yukun Tu, Chanjin Zheng
- MultiMSD: A Corpus for Multilingual Medical Text Simplification from Online Medical References
Koki Horiguchi, Tomoyuki Kajiwara, Takashi Ninomiya, Shoko Wakamiya, Eiji Aramaki
- BadWindtunnel: Defending Backdoor in High-noise Simulated Training with Confidence Variance
Ruyi Zhang, Songlei Jian, Yusong Tan, Heng Gao, Haifang Zhou, Kai Lu
- Multimodal Machine Translation with Text-Image In-depth Questioning
Yue Gao, Jing Zhao, Shiliang Sun, Xiaosong Qiao, Tengfei Song, Hao Yang
- ReKG-MCTS: Reinforcing LLM Reasoning on Knowledge Graphs via Training-Free Monte Carlo Tree Search
Xiaozhuang Song, Shufei Zhang, Tianshu Yu
- HTML: Hierarchical Topology Multi-task Learning for Semantic Parsing in Knowledge Base Question Answering
Aziguli Wulamu, Lyu Zhengyu, Kaiyuan Gong, Yu Han, Zewen Wang, Zhihong Zhu, Bowen Xing
- StructFlowBench: A Structured Flow Benchmark for Multi-turn Instruction Following
Jinnan Li, Jinzhe Li, Yue Wang, Yi Chang, Yuan Wu
- CMIE: Combining MLLM Insights with External Evidence for Explainable Out-of-Context Misinformation Detection
Fanxiao Li, Jiaying Wu, Canyuan He, Wei Zhou
- Towards Understanding Etiquettical Bias in LLMs
Ashutosh Dwivedi, Siddhant Shivdutt Singh, Ashutosh Modi
- FinRipple: Aligning Large Language Models with Financial Market for Event Ripple Effect Awareness
Yuanjian Xu, Jianing Hao, Kunsheng Tang, Jingnan Chen, Anxian Liu, Peng LIU, Guang Zhang
- Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation
Yingfeng Luo, Tong Zheng, Yongyu Mu, Bei Li, Qinghong Zhang, Yongqi Gao, Ziqiang Xu, Peinan Feng, Xiaoqian Liu, Tong Xiao, JingBo Zhu
- EC-RAFT: Automated Generation of Clinical Trial Eligibility Criteria through Retrieval-Augmented Fine-Tuning
Nopporn Lekuthai, Nattawit Pewngam, Supitcha Sokrai, Titipat Achakulvisut
- Pitfalls of Scale: Investigating the Inverse Task of Redefinition in Large Language Models
Elena Stringli, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou
- Implicit Reasoning in Transformers is Reasoning through Shortcuts
Tianhe Lin, Jian Xie, Siyu Yuan, Deqing Yang
- Learning to Align Multi-Faceted Evaluation: A Unified and Robust Framework
Kaishuai Xu, Tiezheng YU, Yi Cheng, Wenjun Hou, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
- CortexDebate: Debating Sparsely and Equally for Multi-Agent Debate
Yiliu Sun, Zicheng Zhao, Sheng Wan, Chen Gong
- PAP2PAT: Benchmarking Outline-Guided Long-Text Patent Generation with Patent-Paper Pairs
Valentin Knappich, Anna Hätty, Simon Razniewski, Annemarie Friedrich
- Debt Collection Negotiations with Large Language Models: An Evaluation System and Optimizing Decision Making with Multi-Agent
Xiaofeng Wang, Zhixin Zhang, Jin Guang Zheng, Yiming Ai, Rui Wang
- Focused-DPO: Enhancing Code Generation Through Focused Preference Optimization on Error-Prone Points
Kechi Zhang, Ge Li, Jia Li, Yihong Dong, Jia Li, Zhi Jin
- Supervised and Unsupervised Probing of Shortcut Learning: Case Study on the Emergence and Evolution of Syntactic Heuristics in BERT
Elke Vandermeerschen, Miryam de Lhoneux
- GIMMICK: Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking
Florian Schneider, Carolin Holtermann, Chris Biemann, Anne Lauscher
- R-VLM: Region-Aware Vision Language Model for Precise GUI Grounding
Joonhyung Park, Peng Tang, Sagnik Das, Srikar Appalaraju, Kunwar Yashraj Singh, R. Manmatha, Shabnam Ghadar
- Perspective Transition of Large Language Models for Solving Subjective Tasks
Xiaolong Wang, Yuanchi Zhang, Ziyue Wang, Yuzhuang Xu, Fuwen Luo, Yile Wang, Peng Li, Yang Liu
- TripTailor: A Real-World Benchmark for Personalized Travel Planning
Kaimin Wang, Yuanzhe Shen, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
- Random Splitting Negatively Impacts NER Evaluation: Quantifying and Eliminating the Overestimation of NER Performance
Florian Babl, Moritz Hennen, Jakob Murauer, Michaela Geierhos
- Structure-adaptive Adversarial Contrastive Learning for Multi-Domain Fake News Detection
Lingwei Wei, Dou Hu, Wei Zhou, Philip S. Yu, Songlin Hu
- BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
Zhiting Fan, Ruizhe Chen, Zuozhu Liu
- Qorǵau: Evaluating Safety in Kazakh-Russian Bilingual Contexts
Maiya Goloburda, Nurkhan Laiyk, Diana Turmakhan, Yuxia Wang, Mukhammed Togmanov, Jonibek Mansurov, Askhat Sametov, Nurdaulet Mukhituly, Minghan Wang, Daniil Orel, Zain Muhammad Mujahid, Fajri Koto, Timothy Baldwin, Preslav Nakov
- MMXU: A Multi-Modal and Multi-X-ray Understanding Dataset for Disease Progression
Linjie Mu, Zhongzhen Huang, Shengqian Qin, Yakun Zhu, Shaoting Zhang, Xiaofan Zhang
- Tree-of-Code: A Self-Growing Tree Framework for End-to-End Code Generation and Execution in Complex Tasks
Ziyi Ni, YIFAN LI, Ning Yang, Dou Shen, Pin Lyu, Daxiang Dong
- Akan Cinematic Emotions (ACE): A Multimodal Multi-party Dataset for Emotion Recognition in Movie Dialogues
David Sasu, Zehui Wu, Ziwei Gong, Run Chen, Pengyuan Shi, Lin Ai, Julia Hirschberg, Natalie Schluter
- A Cognitive Writing Perspective for Constrained Long-Form Text Generation
Kaiyang Wan, Honglin Mu, Rui Hao, Haoran Luo, Tianle Gu, Xiuying Chen
- Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models
You Li, Heyu Huang, Chi Chen, Kaiyu Huang, Chao Huang, Zonghao Guo, Zhiyuan Liu, Jinan Xu, Yuhua Li, Ruixuan Li, Maosong Sun
- SIKeD: Self-guided Iterative Knowledge Distillation for Mathematical Reasoning
Shivam Adarsh, Kumar Shridhar, Caglar Gulcehre, Nicholas Monath, Mrinmaya Sachan
- Chain of Attack: Hide Your Intention through Multi-Turn Interrogation
Xikang Yang, Biyu Zhou, Xuehai Tang, Jizhong Han, Songlin Hu
- MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space
Yicheng Chen, Yining Li, Kai Hu, Ma Zerun, Haochen Ye, Kai Chen
- Enhancing Automatic Term Extraction in Large Language Models via Syntactic Retrieval
Yongchan Chun, Minhyuk Kim, Dongjun Kim, Chanjun Park, Heuiseok Lim
- Explainable Depression Detection in Clinical Interviews with Personalized Retrieval-Augmented Generation
Linhai Zhang, Ziyang Gao, Deyu Zhou, Yulan He
- EMPEC: A Comprehensive Benchmark for Evaluating Large Language Models Across Diverse Healthcare Professions
Zheheng Luo, Chenhan Yuan, Qianqian Xie, Sophia Ananiadou
- Beyond Numeric Rewards: In-Context Dueling Bandits with LLM Agents
Fanzeng Xia, Hao Liu, Yisong Yue, Tongxin Li
- “Well, Keep Thinking”: Enhancing LLM Reasoning with Adaptive Injection Decoding
Hyunbin Jin, Je Won Yeom, Seunghyun Bae, Taesup Kim
- SpeechT-RAG: Reliable Depression Detection in LLMs with Retrieval-Augmented Generation Using Speech Timing Information
Xiangyu Zhang, Hexin Liu, Qiquan Zhang, Beena Ahmed, Julien Epps
- Fine-grained Knowledge Enhancement for Retrieval-Augmented Generation
Jingxuan Han, Zhendong Mao, Yi Liu, Yexuan Che, Zheren Fu, Quan Wang
- Bayesian Optimization for Controlled Image Editing via LLMs
Chengkun Cai, Haoliang Liu, Xu Zhao, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang, Serge Belongie, Lei Li
- SPOT: Zero-Shot Semantic Parsing Over Property Graphs
Francesco Cazzaro, Justin Kleindienst, Sofia Márquez Gomez, Ariadna Quattoni
- Reasoning Circuits in Language Models: A Mechanistic Interpretation of Syllogistic Inference
Geonhee Kim, Marco Valentino, Andre Freitas
- Multi-Hop Question Generation via Dual-Perspective Keyword Guidance
Maodong Li, Longyin Zhang, Fang Kong
- LoRMA: Low Rank Multiplicative Adaptation for LLMs
Harsh Bihany, Shubham Patel, Ashutosh Modi
- DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale
Linghao Zhang, Junhao Wang, Shilin He, Chaoyun Zhang, Yu Kang, Bowen Li, Jiaheng Wen, Chengxing Xie, Maoquan Wang, Yufan Huang, Elsie Nallipogu, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
- Weak-to-Strong Honesty Alignment via Learning-to-Rank Supervision
Yunfan Xie, Lixin Zou, Dan Luo, Min Tang, Chenliang Li
- MultiHoax: A Dataset of Multi-hop False-premise questions
Mohammadamin Shafiei, Hamidreza Saffari, Nafise Sadat Moosavi
- Learning to Play Like Humans: A Framework for LLM Adaptation in Interactive Fiction Games
Jinming Zhang, Yunfei Long
- STATE ToxiCN: A Benchmark for Span-level Target-Aware Toxicity Extraction in Chinese Hate Speech Detection
Zewen Bai, Shengdi Yin, Junyu Lu, Jingjie Zeng, Haohao Zhu, Yuanyuan Sun, Liang Yang, Hongfei Lin
- RelEdit: Evaluating Conceptual Knowledge Editing in Language Models via Relational Reasoning
Yifan Niu, Miao Peng, Nuo Chen, Yatao Bian, Tingyang Xu, Jia Li
- Unlocking Speech Instruction Data Potential with Query Rewriting
Yonghua Hei, Yibo Yan, Shuliang Liu, Huiyu Zhou, Linfeng Zhang, Xuming Hu
- From Evasion to Concealment: Stealthy Knowledge Unlearning for LLMs
Tianle Gu, Kexin Huang, Ruilin Luo, Yuanqi Yao, Xiuying Chen, Yujiu Yang, Yan Teng, Yingchun Wang
- Context-DPO: Aligning Language Models for Context-Faithfulness
Baolong Bi, Shaohan Huang, Yiwei Wang, Tianchi Yang, Zihan Zhang, Haizhen Huang, Lingrui Mei, Junfeng Fang, Zehao Li, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang, Shenghua Liu
- Reasoning Does Not Necessarily Improve Role-Playing Ability
Xiachong Feng, Longxu Dou, Lingpeng Kong
- TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios
Xiaokang Zhang, Sijia Luo, Bohan Zhang, Zeyao Ma, Jing Zhang, Yang Li, Guanlin Li, Zijun Yao, Kangli Xu, Jinchang Zhou, Daniel Zhang-Li, Jifan Yu, Shu Zhao, Juanzi Li, Jie Tang
- A Survey of LLM-based Agents in Medicine
Wenxuan Wang, Zizhan Ma, Zheng WANG, Chenghan Wu, Jiaming Ji, Wenting Chen, Xiang Li, Yixuan Yuan
- Context Robust Knowledge Editing for Language Models
Haewon Park, Gyubin Choi, Minjun Kim, Yohan Jo
- Multi-Agent Collaboration via Cross-Team Orchestration
Zhuoyun Du, Chen Qian, Wei Liu, Zihao Xie, YiFei Wang, Rennai Qiu, Yufan Dang, Weize Chen, Cheng Yang, Ye Tian, Xuantang Xiong, Lei Han
- Semantic Evaluation of Multilingual Data-to-Text Generation via NLI Fine-Tuning: Precision, Recall and F1 scores
William Soto Martinez, Yannick Parmentier, Claire Gardent
- Optimized Text Embedding Models and Benchmarks for Amharic Passage Retrieval
Kidist Amde Mekonnen, Yosef Worku Alemneh, Maarten de Rijke
- Enhancing Transformation from Natural Language to Signal Temporal Logic Using LLMs with Diverse External Knowledge
Yue Fang, Zhi Jin, Jie An, Hongshen Chen, Xiaohong Chen, Naijun Zhan
- DAGS: A Dependency-Based Dual-Attention and Global Semantic Improvement Framework for Metaphor Recognition
Puli Chen, Cheng Yang, Xingmao Zhang, Qingbao Huang
- ESF: Efficient Sensitive Fingerprinting for Black-Box Tamper Detection of Large Language Models
Xiaofan Bai, Pingyi Hu, Xiaojing Ma, Bin Benjamin Zhu, Linchen Yu, Dongmei Zhang, Qi Zhang
- The Lessons of Developing Process Reward Models in Mathematical Reasoning
Zhenru Zhang, Chujie Zheng, Yangzhen Wu, Beichen Zhang, Runji Lin, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin
- MinosEval: Distinguishing Factoid and Non-Factoid for Tailored Open-Ended QA Evaluation with LLMs
Yongqi Fan, Yating Wang, Guandong Wang, Zhai Jie, Jingping Liu, Qi Ye, Tong Ruan
- Towards Conditioning Clinical Text Generation for User Control
Osman Alperen Koraş, Rabi Bahnan, Jens Kleesiek, Amin Dada
- CoDet-M4: Detecting Machine-Generated Code in Multi-Lingual, Multi-Generator and Multi-Domain Settings
Daniil Orel, Dilshod Azizov, Preslav Nakov
- Q-Mamba: Towards more efficient Mamba models via post-training quantization
Chen Tianqi, Yuanteng Chen, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng
- P²Net: Parallel Pointer-based Network for Key Information Extraction with Complex Layouts
Kaiwen Wei, Jie Yao, Jiang Zhong, Yangyang Kang, Jingyuan Zhang, Changlong Sun, Xin Zhang, Fengmao Lv, Li Jin
- Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models
Liyang He, Chenglong Liu, Rui Li, Zhenya Huang, Shulan Ruan, JUN ZHOU, Enhong Chen
- RQT: Hierarchical Residual Quantization for Multi-Model Compression
Chen Tianqi, Peisong Wang, Weixiang Xu, Zeyu Zhu, Jian Cheng
- taz2024full: Analysing German Newspapers for Gender Bias and Discrimination across Decades
Stefanie Urchs, Veronika Thurner, Matthias Aßenmacher, Christian Heumann, Stephanie Thiemichen
- LCFO: Long Context and Long Form Output Dataset and Benchmarking
Marta R. Costa-jussà, Pierre Andrews, Mariano Coria Meglioli, Joy Chen, Joe Chuang, David Dale, Christophe Ropers, Alexandre Mourachko, Eduardo Sánchez, Holger Schwenk, Tuan A. Tran, Arina Turkatenko, Carleigh Wood
- Span-based Semantic Role Labeling as Lexicalized Constituency Tree Parsing
Yang Hou, Zhenghua Li
- Learning from Negative Samples in Biomedical Generative Entity Linking
Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang
- Self-play through Computational Runtimes improves Chart Reasoning
Tautvydas Misiūnas, Hassan Mansoor, Jasper Uijlings, Oriana Riva, Victor Carbune
- Towards Better Chain-of-Thought: A Reflection on Effectiveness and Faithfulness
Jiachun Li, Pengfei Cao, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, Jun Zhao
- A Couch Potato is not a Potato on a Couch: Prompting Strategies, Image Generation, and Compositionality Prediction for Noun Compounds
Sinan Kurtyigit, Diego Frassinelli, Carina Silberer, Sabine Schulte im Walde
- A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI
Beiduo Chen, Siyao Peng, Anna Korhonen, Barbara Plank
- Measuring What Matters: Evaluating Ensemble LLMs with Label Refinement in Inductive Coding
Angelina Parfenova, Jürgen Pfeffer
- Dynamic Evil Score-Guided Decoding: An Efficient Decoding Framework For Red-Team Model
Cong Gao, Bo Zhang, Linkang Yang, Minghao Hu, Zhunchen Luo, Xiaoying Bai, Guotong Geng, Jun Zhang, Yunhua XUE
- CoMuMDR: Code-mixed Multi-modal Multi-domain corpus for Discourse paRsing in conversations
Divyaksh Shukla, Ritesh Baviskar, Dwijesh Gohil, Aniket Tiwari, Atul Shree, Ashutosh Modi
- Multi-word Measures: Modeling Semantic Change in Compound Nouns
Chris Jenkins, Filip Miletić, Sabine Schulte im Walde
- Bridge-Coder: Transferring Model Capabilities from High-Resource to Low-Resource Programming Language
Jipeng Zhang, Jianshu Zhang, Yuanzhe LI, Renjie Pi, Rui Pan, Runtao Liu, Zheng Ziqiang, Tong Zhang
- ProBench: Judging Multimodal Foundation Models on Open-ended Multi-domain Expert Tasks
Yan Yang, Dongxu Li, Haoning Wu, Bei Chen, Liu Liu, Liyuan Pan, Junnan Li
- 2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset
Marta R. Costa-jussà, Bokai YU, Pierre Andrews, Belen Alastruey, Necati Cihan Camgoz, Joe Chuang, Jean Maillard, Christophe Ropers, Arina Turkatenko, Carleigh Wood
- A General Framework to Evaluate Methods for Assessing Dimensions of Lexical Semantic Change Using LLM-Generated Synthetic Data
Naomi Baes, Raphael Merx, Nick Haslam, Ekaterina Vylomova, Haim Dubossarsky
- Chain-of-Jailbreak Attack for Image Generation Models via Step by Step Editing
Wenxuan Wang, Kuiyi Gao, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Shuai Wang, Wenxiang Jiao, Zhaopeng Tu
- Tokenization is Sensitive to Language Variation
Anna Wegmann, Dong Nguyen, David Jurgens
- WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications
Xin Li, Mengbing Liu, LI WEI, Jiancheng An, Merouane Abdelkader DEBBAH, Chau Yuen
- Self-Improvement Towards Pareto Optimality: Mitigating Preference Conflicts in Multi-Objective Alignment
Moxin Li, Yuantao Zhang, Wenjie Wang, Wentao Shi, Zhuo Liu, Fuli Feng, Tat-Seng Chua
- Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training
Zhijun Wang, Jiahuan Li, Hao Zhou, Rongxiang Weng, Jingang Wang, Xin Huang, Xue Han, Junlan Feng, Chao Deng, Shujian Huang
- User Behavior Prediction as a Generic, Robust, Scalable, and Low-Cost Evaluation Strategy for Estimating Generalization in LLMs
Sougata Saha, Monojit Choudhury
- Beyond Browsing: API-Based Web Agents
Yueqi Song, Frank F. Xu, Shuyan Zhou, Graham Neubig
- MiLiC-Eval: Benchmarking Multilingual LLMs for China’s Minority Languages
Chen Zhang, Mingxu Tao, Zhiyuan Liao, Yansong Feng
- ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation
Maja Stahl, Timon Ziegenbein, Joonsuk Park, Henning Wachsmuth
- Crabs: Consuming Resource via Auto-generation for LLM-DoS Attack under Black-box Settings
Yuanhe Zhang, Zhenhong Zhou, Wei Zhang, Xinyue Wang, Xiaojun Jia, Yang Liu, Sen Su
- Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models
Chenchen Yuan, Zheyu Zhang, Shuo Yang, Bardh Prenkaj, Gjergji Kasneci
- Unlocking Recursive Thinking of LLMs: Alignment via Refinement
Haoke Zhang, Xiaobo Liang, Cunxiang Wang, Juntao Li, Min Zhang
- CitaLaw: Enhancing LLM with Citations in Legal Domain
Kepu Zhang, Weijie Yu, Sunhao Dai, Jun Xu
- MEGen: Generative Backdoor into Large Language Models via Model Editing
Jiyang Qiu, Xinbei Ma, Zhuosheng Zhang, Hai Zhao, Yun Li, Qianren Wang
- Social Bias Benchmark for Generation: A Comparison of Generation and QA-Based Evaluations
Jiho Jin, Woosung Kang, Junho Myung, Alice Oh
- Math2Visual: A Framework for Generating Pedagogically Meaningful Visuals for Teaching Math Word Problems
Junling Wang, Anna Rutkiewicz, April Wang, Mrinmaya Sachan
- RASPberry: Retrieval-Augmented Monte Carlo Tree Self-Play with Reasoning Consistency for Multi-Hop Question Answering
Baixuan Li, Yunlong Fan, Tianyi Ma, Miao Gao, Chuanqi Shi, Zhiqiang Gao
- All That Glitters is Not Gold: Improving Robust Retrieval-Augmented Language Models with Fact-Centric Preference Alignment
Jia Hao, Chunhong Zhang, Jiarun Liu, Haiyu Zhao, Zhiqiang Zhan, Zheng Hu
- FairSteer: Inference Time Debiasing for LLMs with Dynamic Activation Steering
Yichen Li, Zhiting Fan, Ruizhe Chen, Xiaotang Gai, Luqi Gong, Yan Zhang, Zuozhu Liu
- Listen, Watch, and Learn to Feel: Retrieval-Augmented Emotion Reasoning for Compound Emotion Generation
Zhuofan Wen, Zheng Lian, Shun Chen, Hailiang Yao, Longjiang Yang, Bin Liu, Jianhua Tao
- GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion
Kangyang Luo, Yuzhuo Bai, Cheng Gao, Shuzheng Si, Zhu Liu, Yingli Shen, Zhitong Wang, Cunliang Kong, Wenhao Li, Yufei Huang, Ye Tian, Xuantang Xiong, Lei Han, Maosong Sun
- Learning to Select In-Context Demonstration Preferred by Large Language Model
Zheng Zhang, Shaocheng Lan, Lei Song, Jiang Bian, Yexin Li, Kan Ren
- Beyond the Spelling Miracle: Investigating Substring Awareness in Character-Blind Language Models
Cristiano Ciaccio, Marta Sartor, Alessio Miaschi, Felice Dell’Orletta
- DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
Minzheng Wang, Xinghua Zhang, Kun Chen, Nan Xu, Haiyang Yu, Fei Huang, Wenji Mao, Yongbin Li
- InfiniteICL: Breaking the Limit of Context Window Size via Long Short-term Memory Transformation
Bowen Cao, Deng Cai, Wai Lam
- M3HG: Multimodal, Multi-scale, and Multi-type Node Heterogeneous Graph for Emotion Cause Triplet Extraction in Conversations
Qiao Liang, Ying Shen, Tiantian Chen, Lin Zhang
- Large Language Models Are Natural Video Popularity Predictors
Pratik Kayal, Pascal Mettes, Nima Dehmamy, Minsu Park
- DELMAN: Dynamic Defense Against Large Language Model Jailbreaking with Model Editing
Yi Wang, Fenghua Weng, Sibei Yang, Zhan Qin, Minlie Huang, Wenjie Wang
- You need to MIMIC to get FAME: Solving Meeting Transcript Scarcity with a Multi-Agent Conversations
Frederic Kirstein, Muneeb Khan, Jan Philip Wahle, Terry Ruas, Bela Gipp
- Code-Switching and Syntax: A Large-Scale Experiment
Igor Sterner, Simone Teufel
- Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System
Weize Chen, Jiarui Yuan, Chen Qian, Cheng Yang, Zhiyuan Liu, Maosong Sun
- Generating Domain-Specific Knowledge Graphs from Large Language Models
Marinela Parović, Ze Li, Jinhua Du
- Large Language Models are Miscalibrated In-Context Learners
Chengzu Li, Han Zhou, Goran Glavaš, Anna Korhonen, Ivan Vulić
- STeCa: Step-level Trajectory Calibration for LLM Agent Learning
Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li
- LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan, Yu Li, Honglin Lin, Qizhi Pei, Zinan Tang, Wei Wu, Chenlin Ming, H. Vicky Zhao, Conghui He, Lijun Wu
- Voting or Consensus? Decision-Making in Multi-Agent Debate
Lars Benedikt Kaesberg, Jonas Becker, Jan Philip Wahle, Terry Ruas, Bela Gipp
- Rhetorical Device-Aware Sarcasm Detection with Counterfactual Data Augmentation
Qingqing Hong, Dongyu Zhang, Jiayi Lin, Dapeng Yin, Shuyue Zhu, Junli Wang
- Selecting Demonstrations for Many-Shot In-Context Learning via Gradient Matching
Jianfei Zhang, Bei Li, Jun Bai, Rumei Li, Yanmeng Wang, Chenghua Lin, Wenge Rong
- Cheap Character Noise for OCR-Robust Multilingual Embeddings
Andrianos Michail, Juri Opitz, Yining Wang, Robin Meister, Rico Sennrich, Simon Clematide
- Physics: Benchmarking Foundation Models for PhD-Qualifying Exam Physics Problem Solving
Kaiyue Feng, Yilun Zhao, Yixin Liu, Tianyu Yang, Chen Zhao, John Sous, Arman Cohan
- DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation
Eliya Habba, Ofir Arviv, Itay Itzhak, Yotam Perlitz, Elron Bandel, Leshem Choshen, Michal Shmueli-Scheuer, Gabriel Stanovsky
- ALPS: Attention Localization and Pruning Strategy for Efficient Adaptation of Large Language Models
Hao Chen, Haoze Li, Zhiqing Xiao, Lirong Gao, Qi Zhang, Xiaomeng Hu, NINGTAO WANG, Xing Fu, Junbo Zhao
- DeTAM: Defending LLMs Against Jailbreak Attacks via Targeted Attention Modification
Yu Li, Han Jiang, Zhihua Wei
- A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges
Yibo Yan, Jiamin Su, Jianxiang He, Fangteng FU, Xu Zheng, Yuanhuiyi Lyu, Kun Wang, Shen Wang, Qingsong Wen, Xuming Hu
- Fast-and-Frugal Text-Graph Transformers are Effective Link Predictors
Andrei Catalin Coman, Christos Theodoropoulos, Marie-Francine Moens, James Henderson
- NeoQA: Evidence-based Question Answering with Generated News Events
Max Glockner, Xiang Jiang, Leonardo F. R. Ribeiro, Iryna Gurevych, Markus Dreyer
- ChatMap: Mining Human Thought Processes for Customer Service Chatbots via Multi-Agent Collaboration
Xinyi Jiang, Tianyi Hu, Yuheng Qin, Guoming Wang, Zhou Huan, Kehan Chen, Gang Huang, Rongxing Lu, Siliang Tang
- P3: Prompts Promote Prompting
Xinyu Zhang, Yuanquan Hu, Fangchao Liu, Zhicheng Dou
- VAQUUM: Are Vague Quantifiers Grounded in Visual Data?
Hugh Mee Wong, Rick Nouwen, Albert Gatt
- Forgotten Polygons: Multimodal Large Language Models are Shape-Blind
William Rudman, Michal Golovanevsky, Amir Bar, Vedant Palit, Yann LeCun, Carsten Eickhoff, Ritambhara Singh
- MindBridge: Scalable and Cross-Model Knowledge Editing via Memory-Augmented Modality
Shuaike Li, Kai Zhang, Qi Liu, Enhong Chen
- FIHA: Automated Fine-grained Hallucinations Evaluations in Large Vision Language Models with Davidson Scene Graphs
Bowen Yan, Zhengsong Zhang, Liqiang Jing, Eftekhar Hossain, Xinya Du
- On the Role of Semantic Proto-roles in Semantic Analysis: What do LLMs know about agency?
Elizabeth Spaulding, Shafiuddin Rehan Ahmed, James Martin
- GeAR: Graph-enhanced Agent for Retrieval-augmented Generation
Zhili Shen, Chenxin Diao, Pavlos Vougiouklis, Pascual Merita, Shriram Piramanayagam, Enting Chen, Damien Graux, Andre Melo, Ruofei Lai, Zeren Jiang, Zhongyang Li, YE QI, Yang Ren, Dandan Tu, Jeff Z. Pan
- WebNLG-IT: Construction of an aligned RDF-Italian corpus through Machine Translation techniques
Michael Oliverio, Pier Felice Balestrucci, Alessandro Mazzei, Valerio Basile
- Towards Adapting Open-Source Large Language Models for Expert-Level Clinical Note Generation
Hanyin Wang, Chufan Gao, Bolun Liu, Qiping Xu, Guleid Hussein, Mohamad El Labban, Kingsley Iheasirim, Hariprasad Reddy Korsapati, Chuck Outcalt, Jimeng Sun
- Bridging Robustness and Generalization Against Word Substitution Attacks in NLP via the Growth Bound Matrix Approach
Mohammed Bouri, Adnane Saoud
- Neuro-Symbolic Query Compiler
Yuyao Zhang, Zhicheng Dou, Xiaoxi Li, Jiajie Jin, Yongkang Wu, Zhonghua Li, YE QI, Ji-Rong Wen
- Revealing and Mitigating the Local Pattern Shortcuts of Mamba
WangJie You, Zecheng Tang, Juntao Li, Lili Yao, Min Zhang
- Forget the Token and Pixel: Rethinking Gradient Ascent for Concept Unlearning in Multimodal Generative Models
Jiaqi Li, Chuanyi Zhang, Miaozeng Du, Hui Zhang, Yongrui Chen, Qianshan Wei, Junfeng Fang, Ruipeng Wang, Sheng Bi, Guilin Qi
- Slamming: Training a Speech Language Model on One GPU in a Day
Gallil Maimon, Avishai Elmakies, Yossi Adi
- Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation
Junhong Wu, Yang Zhao, Yangyifan Xu, Bing Liu, Chengqing Zong
- Clarifying Underspecified Discourse Relations in Instructional Texts
Berfin Aktas, Michael Roth
- WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects
Daniel Deutsch, Eleftheria Briakou, Isaac Rayburn Caswell, Mara Finkelstein, Rebecca Galor, Juraj Juraska, Geza Kovacs, Alison Lui, Ricardo Rei, Jason Riesa, Shruti Rijhwani, Parker Riley, Elizabeth Salesky, Firas Trabelsi, Stephanie Winkler, Biao Zhang, Markus Freitag
- Exploring Graph Representations of Logical Forms for Language Modeling
Michael Sullivan
- SEA-HELM: Southeast Asian Holistic Evaluation of Language Models
Yosephine Susanto, Adithya Venkatadri Hulagadri, Jann Railey Montalan, Jian Gang Ngui, Xianbin Yong, Wei Qi Leong, Hamsawardhini Rengarajan, Peerat Limkonchotiwat, Yifan Mai, William Chandra Tjhi
- TRANS-ZERO: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data
Wei Zou, Sen Yang, Yu Bao, Shujian Huang, Jiajun Chen, Shanbo Cheng
- A Conformal Risk Control Framework for Granular Word Assessment and Uncertainty Calibration of CLIPScore Quality Estimates
Goncalo Emanuel Cavaco Gomes, Bruno Martins, Chrysoula Zerva
- SGDPO: Self-Guided Direct Preference Optimization for Language Model Alignment
Wenqiao Zhu, Ji Liu, Lulu Wang, Jun Wu, Yulun Zhang
- Socratic Style Chain-of-Thoughts Help LLMs to be a Better Reasoner
Jiangbo Pei, Peiyu Liu, Xin Zhao, Aidong Men, Yang Liu
- Quantile Regression with Large Language Models for Price Prediction
Nikhita Vedula, Dushyanta Dhyani, Laleh Jalali, Boris N. Oreshkin, Mohsen Bayati, Shervin Malmasi
- Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors
Jian Wang, Yinpei Dai, Yichi Zhang, Ziqiao Ma, Wenjie Li, Joyce Chai
- AIGuard: A Benchmark and Lightweight Detection for E-commerce AIGC Risks
Wenhua Zhang, Lixin Zou, Xuanrong Rao, Weicheng Li, Xiangyang Luo, Chubin Zhuang, Yongjie Hong, Zhen Qin, Hengyu Chang, Chenliang Li, Bo Zheng
- A$^2$ATS: Retrieval-Based KV Cache Reduction via Windowed Rotary Position Embedding and Query-Aware Vector Quantization
Junhui He, Junna Xing, Nan Wang, Rui Xu, Shangyu Wu, Peng Zhou, Qiang Liu, Chun Jason Xue, Qingan Li
- TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments
Yuheng Lu, Qian Yu, Hongru WANG, Zeming Liu, Wei Su, Yanping Liu, Yuhang Guo, Maocheng Liang, Yunhong Wang, Haifeng Wang
- Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following
Jie Zeng, Qianyu He, Qingyu Ren, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao
- CoT-VTM: Visual-to-Music Generation with Chain-of-Thought Reasoning
Xikang Guan, Zheng Gu, Jing Huo, Tianyu Ding, Yang Gao
- A Tale of Evaluating Factual Consistency: Case Study on Long Document Summarization Evaluation
Yang Zhong, Diane Litman
- Evaluating Pretrained Causal Language Models for Synonymy
Ioana Ivan, Carlos Ramisch, Alexis Nasr
- MDIT-Bench: Evaluating the Dual-Implicit Toxicity in Large Multimodal Models
Bohan Jin, Shuhan Qi, Kehai Chen, Xinyi Guo, Xuan Wang
- CoVE: Compressed Vocabulary Expansion Makes Better LLM-based Recommender Systems
Haochen Zhang, Tianyi Zhang, Junze Yin, Oren Gal, Anshumali Shrivastava, Vladimir Braverman
- CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control
Liu Huanshuo, Hao Zhang, Zhijiang Guo, Jing Wang, Kuicai Dong, Xiangyang Li, Yi Quan Lee, Cong Zhang, Yong Liu
- Maximum Score Routing For Mixture-of-Experts
Bowen Dong, Yilong Fan, Yutao Sun, Zhenyu Li, Tengyu Pan, Zhou Xun, Jianyong Wang
- Time Course MechInterp: Analyzing the Evolution of Components and Knowledge in Large Language Models
Ahmad Dawar Hakimi, Ali Modarressi, Philipp Wicke, Hinrich Schuetze
- Well Begun is Half Done: Low-resource Preference Alignment by Weak-to-Strong Decoding
Feifan Song, Shaohang Wei, Wen Luo, Yuxuan Fan, Tianyu Liu, Guoyin Wang, Houfeng Wang
- Disentangling Text and Math in Word Problems: Evidence for the Bidimensional Structure of Large Language Models’ Reasoning
Pedro Calais, Gabriel Franco, Zilu Tang, Themistoklis Nikas, Wagner Meira Jr., Evimaria Terzi, Mark Crovella
- Human-LLM Coevolution: Evidence from Academic Writing
Mingmeng Geng, Roberto Trotta
- Disentangled Multi-span Evolutionary Network against Temporal Knowledge Graph Reasoning
Hao Dong, Ziyue Qiao, Zhiyuan Ning, Qi Hao, Yi Du, Pengyang Wang, Yuanchun Zhou
- GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering
Cristian-George Craciun, Răzvan-Alexandru Smădu, Dumitru-Clementin Cercel, Mihaela-Claudia Cercel
- Express 💬 What You See 👀: Can Multimodal LLMs Decode Visual Ciphers with Intuitive Semiosis Comprehension?
Jiayi Kuang, Yinghui Li, Chen Wang, Haohao Luo, Ying Shen, Wenhao Jiang
- ConFit v2: Improving Resume-Job Matching using Hypothetical Resume Embedding and Runner-Up Hard-Negative Mining
Xiao Yu, Ruize Xu, Chengyuan Xue, Jinzhong Zhang, Xu Ma, Zhou Yu
- Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion
Anum Afzal, Florian Matthes, Gal Chechik, Yftah Ziser
- Grounding Task Assistance with Multimodal Cues from a Single Demonstration
Gabriel Herbert Sarch, Balasaravanan Thoravi Kumaravel, Sahithya Ravi, Vibhav Vineet, Andy Wilson
- Awes, Laws, and Flaws From Today’s LLM Research
Adrian de Wynter
- Dual Debiasing for Noisy In-Context Learning for Text Generation
Siqi Liang, Sumyeong Ahn, Paramveer Dhillon, Jiayu Zhou
- DRS: Deep Question Reformulation With Structured Output
Zhecheng Li, Yiwei Wang, Bryan Hooi, Yujun Cai, Nanyun Peng, Kai-Wei Chang
- A Step Towards Explainable Hate Speech Detection
Happy Khairunnisa Sariyanto, Diclehan Ulucan, Oguzhan Ulucan, Marc Ebner
- BioHopR: A Benchmark for Multi-Hop, Multi-Answer Reasoning in Biomedical Domain
Yunsoo Kim, Yusuf Abdulle, Honghan Wu
- PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding
Bradley McDanel, Sai Qian Zhang, Yunhai Hu, Zining Liu
- LAM SIMULATOR: Advancing Data Generation for Large Action Models Trainings via Online Exploration and Feedback Simulation
Thai Quoc Hoang, Kung-Hsiang Huang, Shirley Kokane, Jianguo Zhang, Zuxin Liu, Ming Zhu, Jake Grigsby, Tian Lan, Michael S Ryoo, Chien-Sheng Wu, Shelby Heinecke, Huan Wang, Silvio Savarese, Caiming Xiong, Juan Carlos Niebles
- Rank, Chunk and Expand: Lineage-Oriented Reasoning for Taxonomy Expansion
Sahil Mishra, Kumar Arjun, Tanmoy Chakraborty
- Probing Subphonemes in Morphology Models
Gal Astrach, Yuval Pinter
- Exploiting Instruction-Following Retrievers for Malicious Information Retrieval
Parishad BehnamGhader, Nicholas Meade, Siva Reddy
- Improving Causal Interventions in Amnesic Probing with Mean Projection
Alicja Dobrzeniecka, Antske Fokkens, Pia Sommerauer
- The Threat of PROMPTS in Large Language Models: A System and User Prompt Perspective
Zixuan Xia, Haifeng Sun, Jingyu Wang, Qi Qi, Huazheng Wang, Xiaoyuan Fu, Jianxin Liao
- RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization
Tianci Liu, Haoxiang Jiang, Tianze Wang, Ran Xu, Yue Yu, Linjun Zhang, Tuo Zhao, Haoyu Wang
- Instruction-Tuning LLMs for Event Extraction with Annotation Guidelines
Saurabh Srivastava, Sweta Pati, Ziyu Yao
- mRAKL: Multilingual Retrieval-Augmented Knowledge Graph Construction for Low-Resourced Languages
Hellina Hailu Nigatu, Min Li, Maartje Ter Hoeve, Saloni Potdar, Sarah Chasins
- Mechanistic Interpretability of Emotion Inference in Large Language Models
Ala N. Tak, Amin Banayeeanzade, Anahita Bolourani, Mina Kian, Robin Jia, Jonathan Gratch
- RL-Guider: Leveraging Historical Decisions and Feedback for Drug Editing with Large Language Models
Xufeng Liu, Yixuan Ding, Jingxiang Qu, Yichi Zhang, Wenhan Gao, Yi Liu
- BriefMe: A Legal NLP Benchmark for Assisting with Legal Briefs
Jesse Woo, Fateme Hashemi Chaleshtori, Ana Marasovic, Kenneth Marino
- I see what you mean: Co-Speech Gestures for Reference Resolution in Multimodal Dialogue
Esam Ghaleb, Bulat Khaertdinov, Asli Ozyurek, Raquel Fernández
- World Knowledge Resolves Aspectual Ambiguity
Katarzyna Pruś, Mark Steedman, Adam Lopez
- ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness
Dren Fazlija, Arkadij Orlov, Sandipan Sikdar
- Spatial Coordinates as a Cell Language: A Multi-Sentence Framework for Imaging Mass Cytometry Analysis
Chi-Jane Chen, Yuhang Chen, Sukwon Yun, Natalie Stanley, Tianlong Chen
- HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation Task
Zhaojian Yu, Yilun Zhao, Arman Cohan, Xiao-Ping Zhang
- TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
Yu Zhang, Wenxiang Guo, Changhao Pan, Dongyu Yao, Zhiyuan Zhu, Ziyue Jiang, Yuhan Wang, Tao Jin, Zhou Zhao
- Compute Optimal Scaling of Skills: Knowledge vs Reasoning
Nicholas Roberts, Niladri S. Chatterji, Sharan Narang, Mike Lewis, Dieuwke Hupkes
- PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA
Xinyu Wang, Yanzheng Xiang, Lin Gui, Yulan He
- Lifelong Model Editing with Graph-Based External Memory
Yash Kumar Atri, Ahmed Alaa, Thomas Hartvigsen
- Multi-Sense Embeddings for Language Models and Knowledge Distillation
Qitong Wang, Mohammed J Zaki, Georgios Kollias, Vasileios Kalantzis
- CodeScientist: End-to-End Semi-Automated Scientific Discovery with Code-based Experimentation
Peter Jansen, Oyvind Tafjord, Marissa Radensky, Pao Siangliulue, Tom Hope, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Daniel S Weld, Peter Clark
- Beyond Factual Accuracy: Evaluating Coverage of Diverse Factual Information in Long-form Text Generation
Chris Samarinas, Alexander Krubner, Alireza Salemi, Youngwoo Kim, Hamed Zamani
- Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?
Jacob Nielsen, Peter Schneider-Kamp, Lukas Galke
- When Detection Fails: The Power of Fine-Tuned Models to Generate Human-Like Social Media Text
Hillary Dawkins, Kathleen C. Fraser, Svetlana Kiritchenko
- Not quite Sherlock Holmes: Pretrained language models cannot reliably differentiate impossible from improbable events
James A. Michaelov, Reeka Estacio, Zhien Zhang, Ben Bergen
- The Rotary Position Embedding May Cause Dimension Inefficiency in Attention Heads for Long-Distance Retrieval
Ting-Rui Chiang, Dani Yogatama
- IDEA: Enhancing the Rule Learning Ability of Large Language Model Agent through Induction, Deduction, and Abduction
Kaiyu He, Mian Zhang, Shuo Yan, Peilin Wu, Zhiyu Chen
- EnigmaToM: Improve LLMs’ Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States
Hainiu Xu, Siya Qi, Jiazheng Li, Yuxiang Zhou, Jinhua Du, Caroline Catmur, Yulan He
- ReasonerRank: Redefining Language Model Evaluation with Ground-Truth-Free Ranking Frameworks
Jiamu Zhang, Jiayi Yuan, Andrew Wen, Hoang Anh Duy Le, Yu-Neng Chuang, Soo-Hyun Choi, Rui Chen, Xia Hu
- HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation
Weizhi Tang, Yixuan Li, Chris Sypherd, Elizabeth Polgreen, Vaishak Belle
- Can Large Language Models Understand Argument Schemes?
Elfia Bezou-Vrakatseli, Oana Cocarascu, Sanjay Modgil
- MMInA: Benchmarking Multihop Multimodal Internet Agents
Shulin Tian, Ziniu Zhang, Liangyu Chen, Ziwei Liu
- ThinkGuard: Deliberative Slow Thinking Leads to Cautious Guardrails
Xiaofei Wen, Wenxuan Zhou, Wenjie Jacky Mo, Muhao Chen
- Neutralizing Bias in LLM Reasoning using Entailment Graphs
Liang Cheng, Tianyi Li, Zhaowei Wang, Tianyang Liu, Mark Steedman
- Dynamic Steering With Episodic Memory For Large Language Models
Van Dai Do, Quan Hung Tran, Svetha Venkatesh, Hung Le
- Eeyore: Realistic Depression Simulation via Supervised and Preference Optimization
Siyang Liu, Bianca Brie, Wenda Li, Laura Biester, Andrew Lee, James Pennebaker, Rada Mihalcea
- Lost in Translation: Benchmarking Commercial Machine Translation Models for Dyslexic-Style Text
Gregory Price, Shaomei Wu
- Divide-Verify-Refine: Can LLMs Self-align with Complex Instructions?
Xianren Zhang, Xianfeng Tang, Hui Liu, Zongyu Wu, Qi He, Dongwon Lee, Suhang Wang
- LlamaPIE: Proactive In-Ear Conversation Assistants
Tuochao Chen, Nicholas Scott Batchelder, Alisa Liu, Noah A. Smith, Shyamnath Gollakota
- Task-Oriented Automatic Fact-Checking with Frame-Semantics
Jacob Devasier, Akshith Reddy Putta, Rishabh Mediratta, Chengkai Li
- Craw4LLM: Efficient Web Crawling for LLM Pretraining
Shi Yu, Zhiyuan Liu, Chenyan Xiong
- Be Cautious When Merging Unfamiliar LLMs: A Phishing Model Capable of Stealing Privacy
GUO Zhenyuan, Yi Shi, Wenlong Meng, Chen GONG, Chengkun Wei, Wenzhi CHEN
- Understand User Opinions of Large Language Models via LLM-Powered In-the-Moment User Experience Interviews
Mengqiao Liu, Tevin Wang, Cassandra A. Cohen, Sarah Li, Chenyan Xiong
- HiCOT: Improving Neural Topic Models via Optimal Transport and Contrastive Learning
Hoang Tran Vuong, Tue Le, Tu Vu, Tung Nguyen, Linh Ngo Van, Sang Dinh, Thien Huu Nguyen
- FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
GUOJUN XIONG, Zhiyang Deng, Keyi Wang, Yupeng Cao, Haohang Li, Yangyang Yu, Xueqing Peng, MINGQUAN LIN, Kaleb E Smith, Xiao-Yang Liu, Jimin Huang, Sophia Ananiadou, Qianqian Xie
- The Silent Saboteur: Imperceptible Adversarial Attacks against Black-Box Retrieval-Augmented Generation Systems
Hongru Song, Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Jianming Lv, Maarten de Rijke, Xueqi Cheng
- CROSSAGENTIE: Cross-Type and Cross-Task Multi-Agent LLM Collaboration for Zero-Shot Information Extraction
Meng Lu, Yuzhang Xie, Zhenyu Bi, Shuxiang Cao, Xuan Wang
- Decoupling Memories, Muting Neurons: Towards Practical Machine Unlearning for Large Language Models
Lishuai Hou, Zixiong Wang, Gaoyang Liu, Chen Wang, Wei Liu, Kai Peng
- Assimilation and Accommodation: Task-Adaptive Hierarchical Abstraction for Solving Web Tasks
Xinyu Pang, Ruixin Hong, Hongming Zhang, Changshui Zhang
- SafeLawBench: Towards Safe Alignment of Large Language Models
Chuxue Cao, Han Zhu, Jiaming Ji, Qichao Sun, Zhenghao Zhu, WU YINYU, Josef Dai, Yaodong Yang, Sirui Han, Yike Guo
- 3DM: Distill, Dynamic Drop, and Merge for Debiasing Multi-modal Large Language Models
Zhaoxi Zhang, Sanwoo Lee, Zhixiang Wang, Yunfang Wu
- CausalAbstain: Enhancing Multilingual LLMs with Causal Reasoning for Trustworthy Abstention
Yuxi SUN, Aoqi Zuo, Wei Gao, Jing Ma
- CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era
Kanzhi Cheng, Wenpo Song, Jiaxin Fan, Zheng Ma, Qiushi Sun, Fangzhi Xu, Chenyang Yan, Nuo Chen, Jianbing Zhang, Jiajun Chen
- LLM-Empowered Class Imbalanced Graph Prompt Learning for Online Drug Trafficking Detection
Tianyi Ma, Yiyue Qian, Zehong Wang, Zheyuan Zhang, Chuxu Zhang, Yanfang Ye
- CoLA: Collaborative Low-Rank Adaptation
Yiyun Zhou, Chang Yao, Jingyuan Chen
- GLiM: Integrating Graph Transformer and LLM for Document-Level Biomedical Relation Extraction with Incomplete Labeling
Hao Fang, Yuejie Zhang, Rui Feng, Yingwen Wang, Qing Wang, Wen He, Xiaobo Zhang, Tao Zhang, Shang Gao
- AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting
Yang Xiao, Peng Tianyi, Rohan Kumar Das, Yuchen Hu, Huiping Zhuang
- Sleepless Nights, Sugary Days: Creating Synthetic Users with Health Conditions for Realistic Coaching Agent Interactions
Taedong Yun, Eric Yang, Mustafa Safdari, Jong Ha Lee, Vaishnavi Vinod Kumar, S. Sara Mahdavi, Jonathan Amar, Derek Peyton, Reut Aharony, Andreas Michaelides, Logan Douglas Schneider, Isaac Galatzer-Levy, Yugang Jia, John Canny, Arthur Gretton, Maja Mataric
- Imagine to Hear: Auditory Knowledge Generation can be an Effective Assistant for Language Models
Suho Yoo, Hyunjong Ok, Jaeho Lee
- SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning
Junkai Chen, Zhijie Deng, Kening Zheng, Yibo Yan, Shuliang Liu, PeiJun WU, Peijie Jiang, Jia Liu, Xuming Hu
- Prediction-Augmented Generation for Automatic Diagnosis Tasks
Chan-Yang Ju, Dong-Ho Lee
- FedLEKE: Federated Locate-then-Edit Knowledge Editing for Multi-Client Collaboration
Zongkai Zhao, Guozeng Xu, Xiuhua Li, Kaiwen Wei, Jiang Zhong
- DiSCo: Device-Server Collaborative LLM-based Text Streaming Services
Ting Sun, Penghan Wang, Fan Lai
- Customizing In-context Learning for Dynamic Interest Adaption in LLM-based Recommendation
Keqin Bao, Ming Yan, Yang Zhang, Jizhi Zhang, Wenjie Wang, Fuli Feng, Xiangnan He
- Robust Data Watermarking in Language Models by Injecting Fictitious Knowledge
Xinyue Cui, Johnny Wei, Swabha Swayamdipta, Robin Jia
- LLM-Enhanced Query Generation and Retrieval Preservation for Task-Oriented Dialogue
Jiale Chen, Xuelian Dong, Wenxiu Xie, Ru Peng, Kun Zeng, Tianyong Hao
- ClozeMath: Improving Mathematical Reasoning in Language Models by Learning to Fill Equations
Quang Hieu Pham, Thuy-Duong Nguyen, Tung Pham, Anh Tuan Luu, Dat Quoc Nguyen
- Low-Entropy Watermark Detection via Bayes’ Rule Derived Detector
Beining Huang, Du Su, Fei Sun, Qi Cao, Huawei Shen, Xueqi Cheng
- CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis
Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang
- DaNet: Dual-Aware Enhanced Alignment Network for Multimodal Aspect-Based Sentiment Analysis
Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An
- Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings
Shujian Yang, Shiyao Cui, Haicheng Wang, Tianwei Zhang, Minlie Huang, Jialiang LU, Han Qiu
- LDIR: Low-Dimensional Dense and Interpretable Text Embeddings with Relative Representations
Yile Wang, Zhanyu Shen, Hui Huang
- Ranked Voting based Self-Consistency of Large Language Models
Weiqin Wang, Yile Wang, Hui Huang
- SemanticCamo: Jailbreaking Large Language Models through Semantic Camouflage
Jihui Yan, Xiaocui Yang, Daling Wang, Shi Feng, Yifei Zhang, Yinzhi Zhao
- Assigning Distinct Roles to Quantized and Low-Rank Matrices Toward Optimal Weight Decomposition
Yoonjun Cho, Soeun Kim, Dongjae Jeon, Kyelim Lee, Beomsoo Lee, Albert No
- Better Process Supervision with Bi-directional Rewarding Signals
Wenxiang Chen, Wei He, Zhiheng Xi, Honglin Guo, Boyang Hong, Jiazheng Zhang, Nijun Li, Tao Gui, Yun Li, Qi Zhang, Xuanjing Huang
- AlignXIE: Boosting Cross-Lingual Universal Information Extraction by Multilingual Alignment
Yuxin Zuo, Wenxuan Jiang, Wenxuan Liu, Zixuan Li, Long Bai, Hanbin Wang, Yutao Zeng, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
- MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
Zhongwei Wan, Che Liu, Xin Wang, Chaofan Tao, Hui Shen, Jing Xiong, Rossella Arcucci, Huaxiu Yao, Mi Zhang
- Harnessing Large Language Models for Disaster Management: A Survey
Zhenyu Lei, Yushun Dong, Weiyu Li, Rong Ding, Qi R. Wang, Jundong Li
- Towards Medical Complex Reasoning with LLMs through Medical Verifiable Problems
Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Benyou Wang
- Monitoring Decoding: Mitigating Hallucination via Evaluating the Factuality of Partial Response during Generation
Yurui Chang, Bochuan Cao, Lu Lin
- LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback
Bofei Gao, Zefan Cai, Runxin Xu, Peiyi Wang, Ce Zheng, Runji Lin, Keming Lu, Dayiheng Liu, Chang Zhou, Wen Xiao, Tianyu Liu, Baobao Chang
- EvoBench: Towards Real-world LLM-Generated Text Detection Benchmarking for Evolving Large Language Models
Xiao Yu, Yi Yu, Dongrui Liu, Kejiang Chen, Weiming Zhang, Nenghai Yu, Jing Shao
- MMSciBench: Benchmarking Language Models on Multimodal Scientific Problems
Xinwu Ye, Chengfan Li, Siming Chen, Xiangru Tang, Wei Wei
- Lightweight Query Checkpoint: Classifying Faulty User Queries to Mitigate Hallucinations in Large Language Model Question Answering
Jonghak Jang, Minjoo Son, Misuk Kim
- Exploring LLM Annotation for Adaptation of Clinical Information Extraction Models under Data-sharing Restrictions
Seiji Shimizu, HISADA Shohei, Yutaka Uno, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki
- Enhancing the Comprehensibility of Text Explanations via Unsupervised Concept Discovery
Yifan Sun, Danding Wang, Qiang Sheng, Juan Cao, Jintao Li
- RecordTwin: Towards Creating Safe Synthetic Clinical Corpora
Seiji Shimizu, Ibrahim Baroud, Lisa Raithel, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki
- Beyond Surface-Level Patterns: An Essence-Driven Defense Framework Against Jailbreak Attacks in LLMs
Shiyu Xiang, Ansen Zhang, Yanfei Cao, Fan Yang, Ronghao Chen
- Multimodal Invariant Sentiment Representation Learning
Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An
- ChuLo: Chunk-Level Key Information Representation for Long Document Understanding
Yan Li, Caren Han, Yue Dai, Feiqi Cao
- REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space
Tomer Ashuach, Martin Tutek, Yonatan Belinkov
- Is External Information Useful for Stance Detection with LLMs?
Quang Minh Nguyen, Taegyoon Kim
- Benchmarking Query-Conditioned Natural Language Inference
Marc E. Canby, Xinchi Chen, Xing Niu, Jifan Chen, Bonan Min, Sergul Aydore, Vittorio Castelli
- Flowchart-Based Decision Making with Large Language Models
Yuuki Yamanaka, Hiroshi Takahashi, Tomoya Yamashita
- NarGINA: Towards Accurate and Interpretable Children’s Narrative Ability Assessment via Narrative Graphs
Jun Zhong, Longwei Xu, Li Kong, Xianzhuo Li, Dandan Liang, Junsheng Zhou
- Improving Efficiency in Large Language Models via Extendable Block Floating Point Representation
Dongyang Li, Zeyang Li, Bosheng Liu, Jigang Wu
- EpiCoDe: Boosting Model Performance Beyond Training with Extrapolation and Contrastive Decoding
Mingxu Tao, Jie Hu, Mingchuan Yang, Yunhuai Liu, Dongyan Zhao, Yansong Feng
- NativQA: Multilingual Culturally-Aligned Natural Query for LLMs
Md. Arid Hasan, Maram Hasanain, Fatema Ahmad, Sahinur Rahman Laskar, Sunaya Upadhyay, Vrunda N Sukhadia, Mucahid Kutlu, Shammur Absar Chowdhury, Firoj Alam
- DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation
Xinglin Lyu, Wei Tang, Yuang Li, Xiaofeng Zhao, Ming Zhu, Junhui Li, Yunfei Lu, Min Zhang, Daimeng Wei, Hao Yang, Min Zhang
- RISE: Reasoning Enhancement via Iterative Self-Exploration in Multi-hop Question Answering
Bolei He, Xinran He, Mengke Chen, Xianwei Xue, Ying Zhu, Zhen-Hua Ling
- VADE: Visual Attention Guided Hallucination Detection and Elimination
Vishnu Prabhakaran, Purav Aggarwal, Vinay Kumar Verma, Gokul Swamy, Anoop Saladi
- PGPO: Enhancing Agent Reasoning via Pseudocode-style Planning Guided Preference Optimization
Zouying Cao, Runze Wang, Yifei Yang, Xinbei Ma, Xiaoyong Zhu, Bo Zheng, Hai Zhao
- The Effectiveness of Uncased Tokenization for Clinical Notes
Cory Paik, Katharina von der Wense
- AMXFP4: Taming Activation Outliers with Asymmetric Microscaling Floating-Point for 4-bit LLM Inference
Janghwan Lee, Jiwoong Park, Jinseok Kim, Yongjik Kim, Jungju Oh, Jinwook Oh, Jungwook Choi
- Improving Continual Pre-training Through Seamless Data Packing
Ruicheng Yin, Xuan Gao, Changze Lv, Xiaohua Wang, Xiaoqing Zheng, Xuanjing Huang
- The Impact of Name Age Perception on Job Recommendations in LLMs
Mahammed Kamruzzaman, Gene Louis Kim
- DAPI: Domain Adaptive Toxicity Probe Vector Intervention, for Fine-Grained Detoxification
Cho Hyeonsu, Dooyoung Kim, Youngjoong Ko
- Task Knowledge Injection via Interpolations and Reinstatement for Large Language Model Generalization
Yukun Zhao, Lingyong Yan, Zhenyang Li, Shuaiqiang Wang, Zhumin Chen, Zhaochun Ren, Dawei Yin
- STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation
Wenxiang Guo, Yu Zhang, Changhao Pan, Zhiyuan Zhu, Ruiqi Li, ZheTao Chen, Wenhao Xu, Fei Wu, Zhou Zhao
- Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning
Xinghao Chen, Zhijing Sun, Guo Wenjin, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen
- INT: Establishing Information Transfer for Multilingual Intent Detection and Slot Filling
Di Wu, Liting Jiang, Bohui Mao, Hongyan Xie, Haoxiang Su, Zhongjiang He, Ruiyu Fang, Shuangyong Song, Hao Huang, Xuelong Li
- Enhancing LLM Agent Safety via Causal Influence Prompting
Dongyoon Hahm, Woogyeol Jin, June Suk Choi, Sungsoo Ahn, Kimin Lee
- Position Paper: MeMo: Towards Language Models with Associative Memory Mechanisms
Fabio Massimo Zanzotto, Elena Sofia Ruzzetti, Giancarlo A. Xompero, Leonardo Ranaldi, Davide Venditti, Federico Ranaldi, Cristina Giannone, Andrea Favalli, Raniero Romagnoli
- DeRAGEC: Denoising Named Entity Candidates with Synthetic Rationale for ASR Error Correction
Solee Im, Wonjun Lee, JinMyeong An, Yunsu Kim, Jungseul Ok, Gary Lee
- Rehearse With User: Personalized Opinion Summarization via Role-Playing based on Large Language Models
Yanyue Zhang, Yulan He, Deyu Zhou
- AdParaphrase v2.0: Generating Attractive Ad Texts Using a Preference-Annotated Paraphrase Dataset
Soichiro Murakami, Peinan Zhang, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura
- Beyond the Average Reader: the Reader Embedding Approach
Calogero Jerik Scozzaro, Matteo Delsanto, Daniele Paolo Radicioni
- PredictaBoard: Benchmarking LLM Score Predictability
Lorenzo Pacchiardi, Konstantinos Voudouris, Ben Slater, Fernando Martínez-Plumed, Jose Hernandez-Orallo, Lexin Zhou, Wout Schellaert
- FedDQC: Data Quality Control in Federated Instruction-tuning of Large Language Models
Yaxin Du, Rui Ye, Fengting Yuchi, Wanru Zhao, Jingjing Qu, Yanfeng Wang, Siheng Chen
- Weed Out, Then Harvest: Dual Low-Rank Adaptation is an Effective Noisy Label Detector for Noise-Robust Learning
Bo Yuan, Yulin Chen, Yin Zhang
- “I understand your perspective”: LLM Persuasion through the Lens of Communicative Action Theory
Esra Dönmez, Agnieszka Falenska
- Nunchi-Bench: Benchmarking Language Models on Cultural Reasoning with a Focus on Korean Superstition
Kyu Hee Kim, Sangah Lee
- Let’s Be Self-generated via Step by Step: A Curriculum Learning Approach to Automated Reasoning with Large Language Models
Kangyang Luo, Zichen Ding, Zhenmin Weng, Lingfeng Qiao, Meng Zhao, Xiang Li, Di Yin, Jinlong Shu
- daDPO: Distribution-Aware DPO for Distilling Conversational Abilities
Zhengze Zhang, Shiqi Wang, Yiqun Shen, Simin Guo, Dahua Lin, Wang Xiaoliang, Nguyen Cam-Tu, Fei Tan
- Consultant Decoding: Yet Another Synergistic Mechanism
Chuanghao Ding, Jiaping Wang, Ziqing Yang, Wang Xiaoliang, Dahua Lin, Nguyen Cam-Tu, Fei Tan
- IntelliCockpitBench: A Comprehensive Benchmark to Evaluate VLMs for Intelligent Cockpit
Liang Lin, Siyuan Chai, Jiahao Wu, Hongbing Hu, Xiaotao Gu, Hao Hu, Fan Zhang, Wei Wang, Dan Zhang
- Analyzing Political Bias in LLMs via Target-Oriented Sentiment Classification
Akram Elbouanani, Evan Dufraisse, Adrian Popescu
- PISCO: Pretty Simple Compression for Retrieval-Augmented Generation
Maxime Louis, Hervé Déjean, Stéphane Clinchant
- AnchorCoT: Anchors Pave the Way for Multi-hop Reasoning
Tianshi Ming, Xian Wu, Yingying Zhang, Zichuan Fu, Dawei Cheng
- Token Pruning in Multimodal Large Language Models: Are We Solving the Right Problem?
Zichen Wen, Yifeng Gao, Weijia Li, Conghui He, Linfeng Zhang
- Federated Data-Efficient Instruction Tuning for Large Language Models
Zhen Qin, Zhaomin Wu, Bingsheng He, Shuiguang Deng
- They want to pretend not to understand: The Limits of Current LLMs in Interpreting Implicit Content of Political Discourse
Walter Paci, Alessandro Panunzi, Sandro Pezzelle
- ZeroNER: Fueling Zero-Shot Named Entity Recognition via Entity Type Descriptions
Alessio Cocchieri, Marcos Martínez Galindo, Giacomo Frisoni, Gianluca Moro, Claudio Sartori, Giuseppe Tagliavini
- Do Large Language Models Have “Emotion Neurons”? Investigating the Existence and Role
Jaewook Lee, Woojin Lee, Oh-Woog Kwon, Harksoo Kim
- Grammar-Based Code Representation: Is It a Worthy Pursuit for LLMs?
Qingyuan Liang, Zhao Zhang, Zeyu Sun, Zheng Lin, Qi Luo, Xiao Yueyi, Yizhou Chen, Yuqun Zhang, Haotian Zhang, Lu Zhang, chenbin, Yingfei Xiong
- Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study
Yujie Lin, Ante Wang, Moye Chen, Jingyao Liu, Hao Liu, Jinsong Su, Xinyan Xiao
- E2I-Synth: Advancing GUI Grounding with Large-Scale Instruction Synthesis
Xinyi Liu, Xiaoyi Zhang, Ziyun Zhang, Yan Lu
- A Study into Investigating Temporal Robustness of LLMs
Jonas Wallat, Abdelrahman Abdallah, Adam Jatowt, Avishek Anand
- ToolExpNet: Optimizing Multi-Tool Selection in LLMs with Similarity and Dependency-Aware Experience Networks
Zijing Zhang, Zhanpeng Chen, He Zhu, Ziyang Chen, Nan Du, Xiaolong Li
- SPILL: Zero-shot Intent Clustering based on Selection and Pooling with Large Language Models
I-Fan Lin, Faegheh Hasibi, Suzan Verberne
- How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation
Rui Li, Heming Xia, Xinfeng Yuan, Qingxiu Dong, Lei Sha, Wenjie Li, Zhifang Sui
- GRI-QA: a Comprehensive Benchmark for Table Question Answering over Environmental Data
Michele Luca Contalbo, Sara Pederzoli, Francesco Del Buono, Venturelli Valeria, Francesco Guerra, Matteo Paganelli
- WebUIBench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in WebUI-to-Code
Zhiyu Lin, Zhengda Zhou, Zhiyuan Zhao, Tianrui Wan, Yilun Ma, Junyu Gao, Xuelong Li
- Optimizing Multi-Hop Document Retrieval Through Intermediate Representations
Linjiaen, Jingyu Liu
- Towards Better Understanding of Program-of-Thought Reasoning in Cross-Lingual and Multilingual Environments
Patomporn Payoungkhamdee, Pume Tuchinda, Jinheon Baek, Samuel Cahyawijaya, Can Udomcharoenchaikit, Potsawee Manakul, Peerat Limkonchotiwat, Ekapol Chuangsuwanich, Sarana Nutanong
- A Fully Automated Pipeline for Conversational Discourse Annotation: Tree Scheme Generation and Labeling with Large Language Models
Kseniia Petukhova, Ekaterina Kochmar
- Can Language Models Serve as Analogy Annotators?
Xiaojing Zhang, Bochen Lyu
- Reward Generalization in RLHF: A Topological Perspective
Tianyi Qiu, Fanzhi Zeng, Jiaming Ji, Dong Yan, Kaile Wang, Jiayi Zhou, Yang Han, Josef Dai, Xuehai Pan, Yaodong Yang
- Enhanced Data Synthesis for LLM through Reasoning Structures Generated by Hierarchical GFlowNet
Tianpeng Bu, Minying Zhang, Hongtao Duan, Shurui Li, Lulu Hu, Yu Li
- Capturing Nuanced Preferences: Preference-Aligned Distillation for Small Language Models
Yanggan Gu, Junzhuo Li, Sirui Huang, Xin Zou, Zhenghua Li, Xuming Hu
- Token-level Preference Self-Alignment Optimization for Multi-style Outline Controllable Generation
Zihao Li, Xuekong Xu, Ziyao Chen, Lixin Zou, Ethanhjwu, Qiang Chen, Chenliang Li
- Bridging Policies, Platforms and Research: Advancing NLP for Hate Speech Proactive Mitigation
Naquee Rizwan, Seid Muhie Yimam, Daryna Dementieva, Florian Skupin, Tim Fischer, Daniil Moskovskiy, Aarushi Ajay Borkar, Robert Geislinger, Punyajoy Saha, Sarthak Roy, Martin Semmann, Alexander Panchenko, Chris Biemann, Animesh Mukherjee
- Local Look-Ahead Guidance via Verifier-in-the-Loop for Automated Theorem Proving
Sara Rajaee, Kumar Pratik, Gabriele Cesa, Arash Behboodi
- Generalizable Cross-Lingual Cognitive Distortion Detection with Standardized Annotations and Multi-Task Learning
Hongzhi Qi, Nan Bai, Jianqiang Li, Wei Zhai, Qing Zhao, Qi Gao, Bing Xiang Yang, Guanghui Fu
- How Do Multilingual Language Models Remember Facts?
Constanza Fierro, Negar Foroutan, Desmond Elliott, Anders Søgaard
- SeqPO-SiMT: Sequential Policy Optimization for Simultaneous Machine Translation
Ting Xu, Zhichao Huang, Jiankai Sun, Shanbo Cheng, Wai Lam
- Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales
Ayuto Tsutsumi, Yuu Jinnai
- BOSE: A Systematic Evaluation Method Optimized for Base Models
Hongzhi Luan, Changxin Tian, Zhaoxin Huan, Xiaolu Zhang, Kunlong Chen, Zhiqiang Zhang, Jun Zhou
- DPGA-TextSyn: Differentially Private Genetic Algorithm for Synthetic Text Generation
Zhonghao Sun, Zhiliang Tian, Yiping Song, Yuyi Si, Juhua Zhang, Minlie Huang, Kai Lu, Zeyu Xiong, Xinwang Liu, Dongsheng Li
- Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer
Seungyoon Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim
- Boost, Disentangle, and Customize: A Robust System2-to-System1 Pipeline for Code Generation
Kounianhua Du, Hanjing Wang, Jianxing Liu, Jizheng Chen, Xinyi Dai, Yasheng Wang, Ruiming Tang, Yong Yu, Jun Wang, Weinan Zhang
- On the Consistency of Commonsense in Large Language Models
Guozheng Li, Peng Wang, Wenjun Ke, Zijie Xu, Jiajun Liu, Ziyu Shang
- Statement-Tuning Enables Efficient Cross-lingual Generalization in Encoder-only Models
Ahmed Elshabrawy, Thanh-Nhi Nguyen, Yeeun Kang, Lihan Feng, Annant Jain, Faadil Abdullah Shaikh, Jonibek Mansurov, Mohamed Fazli Mohamed Imam, Jesus-German Ortiz-Barajas, Rendi Chevi, Alham Fikri Aji
- Evaluating Large Language Models for Confidence-based Check Set Selection
Jane Arleth dela Cruz, Iris Hendrickx, Martha Larson
- Training Multi-Modal LLMs through Dialogue Planning for HRI
Claudiu Daniel Hromei, Federico Borazio, Andrea Sensi, Elisa Passone, Danilo Croce, Roberto Basili
- MVL-SIB: A Massively Multilingual Vision-Language Benchmark for Cross-Modal Topical Matching
Fabian David Schmidt, Florian Schneider, Chris Biemann, Goran Glavaš
- The Rise of Darkness: Safety-Utility Trade-Offs in Role-Playing Dialogue Agents
Yihong Tang, Kehai Chen, Xuefeng Bai, Zheng-Yu Niu, Bo Wang, Jie Liu, Min Zhang
- SynGraph: A Dynamic Graph-LLM Synthesis Framework for Sparse Streaming User Sentiment Modeling
Xin Zhang, Qiyu Wei, Yingjie Zhu, Linhai Zhang, Deyu Zhou, Sophia Ananiadou
- Enhancing Tool Learning in Large Language Models with Hierarchical Error Checklists
Yue Cui, Liuyi Yao, Shuchang Tao, Weijie Shi, Yaliang Li, Bolin Ding, Xiaofang Zhou
- A Large and Balanced Corpus for Fine-grained Arabic Readability Assessment
Khalid Elmadani, Nizar Habash, Hanada Taha
- Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?
Che Liu, Zhongwei Wan, Haozhe Wang, Yinda Chen, Talha Qaiser, Chen Jin, Nikolay Burlutskiy, Fariba Yousefi, Rossella Arcucci
- See the World, Discover Knowledge: A Chinese Factuality Evaluation for Large Vision Language Models
Jihao Gu, Yingyao Wang, Pi Bu, Chen Wang, Ziming Wang, Tengtao Song, Donglai Wei, Jiale Yuan, Yingxiu Zhao, Jun Song, Bo Zheng, Yancheng He, Shilong Li, Jiaheng Liu, Meng Cao, Xiaoyong Zhu, Yingshui Tan, Xiang Li, Wenbo Su
- Argus: Benchmarking and Enhancing Vision-Language Models for 3D Radiology Report Generation
Che Liu, Zhongwei Wan, Yuqi Wang, Hui Shen, Haozhe Wang, Kangyu Zheng, Mi Zhang, Rossella Arcucci
- Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering
Binquan Ji, Haibo Luo, Yifei Lu, Lei Hei, Jiaqi Wang, Tingjing Liao, Wang Lingyu, Shichao Wang, Feiliang Ren
- Evaluating LLMs’ Assessment of Mixed-Context Hallucination Through the Lens of Summarization
Siya Qi, Rui Cao, Yulan He, Zheng Yuan
- TUBA: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning
Xuanli He, Jun Wang, Qiongkai Xu, Pasquale Minervini, Pontus Stenetorp, Benjamin I. P. Rubinstein, Trevor Cohn
- Eliciting Textual Descriptions from Representations of Continuous Prompts
Daniela Gottesman, Mor Geva, Dana Ramati
- Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
Yuhan Fu, Ruobing Xie, Xingwu Sun, Zhanhui Kang, Xirong Li
- Review-Instruct: A Review-Driven Multi-Turn Conversations Generation Method for Large Language Models
Jiangxu Wu, Cong Wang, Tianhuang Su, Lin Haozhi, Jun Yang, Zhangchao, Binqiang Pan, Songpan Yang, Mingpeng, Kai Shi, Zixian Li
- Axiomatic Analysis of Uncertainty Estimation For Retrieval Augmented Generation
Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi
- EuroVerdict: A Multilingual Dataset for Verdict Generation Against Misinformation
Daniel Russo, Fariba Sadeghi, Stefano Menini, Marco Guerini
- LoFTI: Localization and Factuality Transfer to Indian Locales
Sona Elza Simon, Soumen Kumar Mondal, Abhishek Singhania, Sayambhu Sen, Preethi Jyothi
- Hierarchical Retrieval with Evidence Curation for Open-Domain Financial Question Answering on Standardized Documents
Jaeyoung Choe, Jihoon Kim, Woohwan Jung
- GNN-RAG: Graph Neural Retrieval for Efficient Large Language Model Reasoning on Knowledge Graphs
Costas Mavromatis, George Karypis
- ASTRID - An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems
Yajie Vera He, Mohita Chowdhury, Jared Joselowitz, Aisling Higham, Ernest Lim
- On Entity Identification in Language Models
Masaki Sakata, Sho Yokoi, Benjamin Heinzerling, Takumi Ito, Kentaro Inui
- RAPID: Efficient Retrieval-Augmented Long Text Generation with Writing Planning and Information Discovery
Hongchao Gu, Dexun Li, Kuicai Dong, Hao Zhang, Hang Lv, Hao Wang, Defu Lian, Yong Liu, Enhong Chen
- CHARPEVAL: Benchmarking Large Language Models’ Contextual Reasoning in Knowledge-Grounded Dialogue
Abbas Ghaddar, David Alfonso-Hermelo, Philippe Langlais, Boxing Chen, Prasanna Parthasarathi
- Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi, Amirhosein Zobeiri, Mahdi Dehghani, Mohammadali Mohammadkhani, Bardia Mohammadi, Omid Ghahroodi, Mahdieh Soleymani Baghshah, Ehsaneddin Asgari
- Debate4MATH: Multi-Agent Debate for Fine-Grained Reasoning in Math
Shaowei Zhang, Deyi Xiong
- Disambiguate First Parse Later: Generating Interpretations for Ambiguity Resolution in Semantic Parsing
Irina Saparina, Mirella Lapata
- The anatomy of evidence: An investigation into explainable ICD coding
Katharina Beckh, Felix Studeny, Sujan Sai Gannamaneni, Dario Antweiler, Stefan Rueping
- AVG-LLaVA: An Efficient Large Multimodal Model with Adaptive Visual Granularity
Zhibin Lan, Liqiang Niu, Fandong Meng, Wenbo Li, Jie Zhou, Jinsong Su
- Word Form Matters: LLMs’ Semantic Reconstruction under Typoglycemia
Chenxi Wang, Tianle Gu, Zhongyu Wei, Lang Gao, Zirui Song, Xiuying Chen
- LLM-based Translation Inference with Iterative Bilingual Understanding
Andong Chen, Kehai Chen, Yang Xiang, Xuefeng Bai, Muyun Yang, Yang Feng, Tiejun Zhao, Min Zhang
- Vulnerability of Text-to-Image Models to Prompt Template Stealing: A Differential Evolution Approach
Yurong Wu, Fangwen Mu, Qiuhong Zhang, Jinjing Zhao, Xinrun Xu, Lingrui Mei, Yang Wu, Lin Shi, Junjie Wang, Zhiming Ding, Yiwei Wang
- mStyleDistance: Multilingual Style Embeddings and their Evaluation
Justin Qiu, Jiacheng Zhu, Ajay Patel, Marianna Apidianaki, Chris Callison-Burch
- SeqMMR: Sequential Model Merging and LLM Routing for Enhanced Batched Sequential Knowledge Editing
Shanbao Qiao, Xuebing Liu, Akshat Gupta, Seung-Hoon Na
- Improving Meta Introspection of Small LLMs by Learning Self-Reflection from Self-Generated Data
Jiaqi Li, Xinyi Dong, Yang Liu, Zhizhuo Yang, Quansen Wang, Xiaobo Wang, Song-Chun Zhu, Zixia Jia, Zilong Zheng
- MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering
Shuo Yang, Siwen Luo, Caren Han, Eduard Hovy
- Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models
Injae Na, Keonwoong Noh, Woohwan Jung
- Low-Rank Interconnected Adaptation across Layers
Yibo Zhong, Jinman Zhao, Yao Zhou
- GaRAGe: A Benchmark with Grounding Annotations for RAG Evaluation
Ionut Teodor Sorodoc, Leonardo F. R. Ribeiro, Rexhina Blloshmi, Christopher Davis, Adrià de Gispert
- Change Entity-guided Heterogeneous Representation Disentangling for Change Captioning
Yi Li, Yunbin Tu, Liang Li, Li Su, Qingming Huang
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
Zhuoran Jin, Hongbang Yuan, Tianyi Men, Pengfei Cao, Yubo Chen, Jiexin Xu, Huaijun Li, Xiaojian Jiang, Kang Liu, Jun Zhao
- Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution
Kun Li, Tianhua Zhang, Yunxiang Li, Hongyin Luo, Abdalla Mohamed Salama Sayed Moustafa, Xixin Wu, James R. Glass, Helen M. Meng
- PAM: Paraphrase AMR-Centric Evaluation Metric
Afonso Sousa, Henrique Lopes Cardoso
- VP-MEL: Visual Prompts Guided Multimodal Entity Linking
Hongze Mi, Jinyuan Li, Zhangxuying, Haoran Cheng, Jiahao Wang, Di Sun, Gang Pan
- FADE: Why Bad Descriptions Happen to Good Features
Bruno Puri, Aakriti Jain, Elena Golimblevskaia, Patrick Kahardipraja, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin
- In the LLM era, Word Sense Induction remains unsolved
Anna Mosolova, Marie Candito, Carlos Ramisch
- Navigating the Political Compass: Evaluating Multilingual LLMs across Languages and Nationalities
Chadi Helwe, Oana Balalau, Davide Ceolin
- Who Can Withstand Chat-Audio Attacks? An Evaluation Benchmark for Large Language Models
Wanqi Yang, Yanda Li, Meng Fang, Yunchao Wei, Ling Chen
- Beyond the Tip of Efficiency: Uncovering the Submerged Threats of Jailbreak Attacks in Small Language Models
Sibo Yi, Tianshuo Cong, Xinlei He, Qi Li, Jiaxing Song
- EMRs2CSP: Mining Clinical Status Pathway from Electronic Medical Records
Yifei Chen, Ruihui Hou, Jingping Liu, Tong Ruan
- A Law Reasoning Benchmark for LLM with Tree-Organized Structures including Factum Probandum, Evidence and Experiences
Jiaxin Shen, Jinan Xu, Huiqi Hu, Luyi Lin, Guoyang Ma, Fei Zheng, Fandong Meng, Jie Zhou, Wenjuan Han
- Libra: Leveraging Temporal Images for Biomedical Radiology Analysis
Xi Zhang, Zaiqiao Meng, Jake Lever, Edmond S. L. Ho
- Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach
Aditya Tomar, Rudra Murthy, Pushpak Bhattacharyya
- Filling the Temporal Void: Recovering Missing Publication Years in the Project Gutenberg Corpus Using LLMs
Omar Momen, Manuel Schaaf, Alexander Mehler
- ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models
Martina Miliani, Serena Auriemma, Alessandro Bondielli, Emmanuele Chersoni, Lucia Passaro, Irene Sucameli, Alessandro Lenci
- Are Dialects Better Prompters? A Case Study on Arabic Subjective Text Classification
Leila Moudjari, Farah Benamara
- Natural Logic at the Core: Dynamic Rewards for Entailment Tree Generation
Jihao Shi, Xiao Ding, Kai Xiong, Hengwei Zhao, Bing Qin, Ting Liu
- R.R.: Unveiling LLM Training Privacy through Recollection and Ranking
Wenlong Meng, Guo Zhenyuan, Lenan Wu, Chen Gong, Wenyan Liu, Weixian Li, Chengkun Wei, Wenzhi Chen
- Nested-Refinement Metamorphosis: Reflective Evolution of Prompts and Code for Efficient Algorithm Design
Shuhan Guo, Nan Yin, James Kwok, Quanming Yao
- MC-MKE: A Fine-Grained Multimodal Knowledge Editing Benchmark Emphasizing Modality Consistency
Junzhe Zhang, Huixuan Zhang, Xunjian Yin, Baizhou Huang, Xu Zhang, Xinyu Hu, Xiaojun Wan
- Visualising Policy-Reward Interplay to Inform Zeroth-Order Preference Optimisation of Large Language Models
Alessio Galatolo, Zhenbang Dai, Meriem Beloucif, Katie Winkle
- Metaphor and Large Language Models: When Surface Features Matter More than Deep Understanding
Elisa Sanchez-Bayona, Rodrigo Agerri
- AskQE: Question Answering as Automatic Evaluation for Machine Translation
Dayeon Ki, Kevin Duh, Marine Carpuat
- ExPerT: Effective and Explainable Evaluation of Personalized Long-Form Text Generation
Alireza Salemi, Julian Killingback, Hamed Zamani
- Bridging Intuitive Associations and Deliberate Recall: Empowering LLM Personal Assistant with Graph-Structured Long-term Memory
Yujie Zhang, Weikang Yuan, Zhuoren Jiang
- Each graph is a new language: Graph Learning with LLMs
Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang
- 100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?
Van Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han
- Multimodal Fusion and Coherence Modeling for Video Topic Segmentation
Hai Yu, Chong Deng, Qinglin Zhang, Jiaqing Liu, Qian Chen, Wen Wang
- Are Your LLMs Capable of Stable Reasoning?
Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen
- FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only
He Zhu, Yifan Ding, Yicheng Tao, Zhiwen Ruan, Yixia Li, Wenjia Zhang, Yun Chen, Guanhua Chen
- JEBS: A Fine-grained Biomedical Lexical Simplification Task
William Xia, Ishita Unde, Brian David Ondov, Dina Demner-Fushman
- Enhancing Multi-Hop Reasoning for Question Answering with Hyperbolic Representations
Simon Welz, Lucie Flek, Akbar Karimi
- Look & Mark: Leveraging Radiologist Eye Fixations and Bounding boxes in Multimodal Large Language Models for Chest X-ray Report Generation
Yunsoo Kim, Jinge Wu, Su Hwan Kim, Pardeep Vasudev, Jiashu Shen, Honghan Wu
- Hatevolution: What Static Benchmarks Don’t Tell Us
Chiara Di Bonaventura, Barbara McGillivray, Yulan He, Albert Meroño-Peñuela
- Tag-Instruct: Controlled Instruction Complexity Enhancement through Structure-based Augmentation
He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Wenjia Zhang, Yun Chen, Guanhua Chen
- Code-SPA: Style Preference Alignment to Large Language Models for Effective and Robust Code Debugging
Tengfei Wen, Xuanang Chen, Ben He, Le Sun
- Open-World Authorship Attribution
Xinhao Tan, Songhua Liu, Xia Cong, Kunjun Li, Xinchao Wang
- What is in a name? Mitigating Name Bias in Text Embedding via Anonymization
Sahil Manchanda, Pannaga Shivaswamy
- BenNumEval: A Benchmark to Assess LLM’s Numerical Reasoning Capabilities in Bengali
Kawsar Ahmed, Md Osama, Omar Sharif, Eftekhar Hossain, Mohammed Moshiul Hoque
- LM Agents for Coordinating Multi-User Information Gathering
Harsh Jhamtani, Jacob Andreas, Benjamin Van Durme
- $\text{C}^{2}$KD: Cross-layer and Cross-head Knowledge Distillation for Small Language Model-based Recommendation
Xiao Chen, Changyi Ma, Wenqi Fan, Zhaoxiang Zhang, Li Qing
- Sign2Vis: Automated Data Visualization from Sign Language
Yao Wan, Yang Wu, Zhen Li, Guobiao Zhang, Hongyu Zhang, Zhou Zhao, Hai Jin, April Wang
- Transparentize the Internal and External Knowledge Utilization in LLMs with Trustworthy Citation
Jiajun Shen, Tong Zhou, Yubo Chen, Delai Qiu, Shengping Liu, Kang Liu, Jun Zhao
- Learning to Better Act by Post-training on Vision Language Tasks
Li Muyao, Zihao Wang, Kaichen He, Xiaojian Ma, Yitao Liang
- Generative Frame Sampler for Long Video Understanding
Linli Yao, Haoning Wu, Kun Ouyang, Yuanxing Zhang, Caiming Xiong, Bei Chen, Xu Sun, Junnan Li
- Annotating the Annotators: Analysis, Insights and Modelling from an Annotation Campaign on Persuasion Techniques Detection
Davide Bassi, Dimitar Iliyanov Dimitrov, Bernardo D’Auria, Firoj Alam, Maram Hasanain, Christian Moro, Luisa Orrù, Gian Piero Turchi, Preslav Nakov, Giovanni Da San Martino
- On the Generalization vs Fidelity Paradox in Knowledge Distillation
Suhas Kamasetty Ramesh, Ayan Sengupta, Tanmoy Chakraborty
- BEDAA: Bayesian Enhanced DeBERTa for Uncertainty-Aware Authorship Attribution
Iqra Zahid, Youcheng Sun, Riza Batista-Navarro
- Benchmarking the Benchmarks: Reproducing Climate-Related NLP Tasks
Tom Calamai, Oana Balalau, Fabian M. Suchanek
- Exploring Anthropomorphic Language in the Reporting of NLP Findings
Matthew Shardlow, Ashley Williams, Charlie Roadhouse, Filippos Ventirozos, Piotr Przybyła
- PersonaLens: A Benchmark for Personalization Evaluation in Conversational AI Assistants
Zheng Zhao, Clara Vania, Subhradeep Kayal, Naila Khan, Shay B Cohen, Emine Yilmaz
- iAgent: LLM Agent as a Shield between User and Recommender Systems
Wujiang Xu, Yunxiao Shi, Zujie Liang, Xuying Ning, Kai Mei, Kun Wang, Xi Zhu, Min Xu, Yongfeng Zhang
- FactLens: Benchmarking Fine-Grained Fact Verification
Kushan Mitra, Dan Zhang, Sajjadur Rahman, Estevam Hruschka
- Process-based Self-Rewarding Language Models
Shimao Zhang, Xiao Liu, Xin Zhang, Junxiao Liu, Zheheng Luo, Shujian Huang, Yeyun Gong
- The Devil Is in the Word Alignment Details: On Translation-Based Cross-Lingual Transfer for Token Classification Tasks
Benedikt Ebing, Goran Glavaš
- ShieldHead: Decoding-time Safeguard for Large Language Models
Zitao Xuan, Xiaofeng Mao, Da Chen, Xin Zhang, Yuhan Dong, Jun Zhou
- A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models
Shuliang Liu, Hongyi Liu, Aiwei Liu, Duan Bingchen, Zheng Qi, Yibo Yan, He Geng, Peijie Jiang, Jia Liu, Xuming Hu
- Smotrom tvoja på ander drogoj verden! Resurrecting Dead Pidgin with Generative Models: Russenorsk Case Study
Ivan P. Yamshchikov, Sergei Shteiner, Anna Bykova, Alexey Tikhonov
- PromptCoT: Synthesizing Olympiad-level Problems for Mathematical Reasoning in Large Language Models
Xueliang Zhao, Wei Wu, Jian Guan, Lingpeng Kong
- Speculative Decoding via Exponential Races
Szymon Kobus, Deniz Gunduz
- Going Beyond Your Expectations in Latency Metrics for Simultaneous Speech Translation
Jorge Iranzo-Sánchez, Javier Iranzo-Sánchez, Adrià Giménez, Jorge Civera
- Towards a Design Guideline for RPA Evaluation: A Survey of Large Language Model-Based Role-Playing Agents
Chaoran Chen, Bingsheng Yao, Ruishi Zou, Wenyue Hua, Weimin Lyu, Toby Jia-Jun Li, Dakuo Wang
- Recursive Question Understanding for Complex Question Answering over Heterogeneous Personal Data
Philipp Christmann, Gerhard Weikum
- PreSumm: Predicting Summarization Performance Without Summarizing
Steven Koniaev, Ori Ernst, Jackie CK Cheung
- Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases
Yongjia Lei, Haoyu Han, Ryan A. Rossi, Franck Dernoncourt, Nedim Lipka, Mahantesh M Halappanavar, Jiliang Tang, Yu Wang
- Fact Recall, Heuristics or Pure Guesswork? Precise Interpretations of Language Models for Fact Completion
Denitsa Saynova, Lovisa Hagström, Moa Johansson, Richard Johansson, Marco Kuhlmann
- FPE2M2: Approaching Lossless and Efficient Quantization with Native Floating Point
Ke Yi, Jianwei Zhang, Zhiying Xu, Xinlong Yang, Yang Zhou, Minmin Sun, Zengke Liu, Tong Zhang, Junyang Lin, Jingren Zhou
- Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation
Tong Zheng, Yan Wen, Huiwen Bao, Junfeng Guo, Heng Huang
- VISIAR: Empower MLLM for Visual Story Ideation
Zhaoyang Xia, Somdeb Sarkhel, Mehrab Tanjim, Stefano Petrangeli, Ishita Dasgupta, Yuxiao Chen, Jinxuan Xu, Di Liu, Saayan Mitra, Dimitris N. Metaxas
- Same Company, Same Signal: The Role of Identity in Earnings Call Transcripts
Ding Yu, Zhuo Liu, Hangfeng He
- Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems
Emma Harvey, Emily Sheng, Su Lin Blodgett, Alexandra Chouldechova, Jean Garcia-Gathright, Alexandra Olteanu, Hanna Wallach
- Mind the (Belief) Gap: Group Identity in the World of LLMs
Angana Borah, Marwa Houalla, Rada Mihalcea
- SemCSE: Semantic Contrastive Sentence Embeddings Using LLM-Generated Summaries For Scientific Abstracts
Marc Felix Brinner, Sina Zarrieß
- A General Framework to Enhance Fine-tuning-based LLM Unlearning
Jie Ren, Zhenwei Dai, Xianfeng Tang, Hui Liu, Jingying Zeng, Zhen Li, Rahul Goutam, Suhang Wang, Yue Xing, Qi He, Hui Liu
- Right Answer, Wrong Score: Uncovering the Inconsistencies of LLM Evaluation in Multiple-Choice Question Answering
Francesco Maria Molfese, Luca Moroni, Luca Gioffré, Alessandro Scirè, Simone Conia, Roberto Navigli
- Human Validation Is Not Enough for Theory of Mind Benchmarks
Adil Soubki, Owen Rambow
- MiniKV: Pushing the Limits of 2-Bit KV Cache via Compression and System Co-Design for Efficient Long Context Inference
Akshat Sharma, Hangliang Ding, Jianping Li, Neel Dani, Minjia Zhang
- Sci-LoRA: Mixture of Scientific LoRAs for Cross-Domain Lay Paraphrasing
Ming Cheng, Jiaying Gong, Hoda Eldardiry
- Chameleon LLMs: User Personas Influence Chatbot Personality Shifts
Jane Xing, Tianyi Niu, Shashank Srivastava
- Trick or Neat: Adversarial Ambiguity and Language Model Evaluation
Antonia Karamolegkou, Oliver Eberle, Phillip Rust, Carina Kauf, Anders Søgaard
- Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes
Kshitish Ghate, Tessa Charlesworth, Mona T. Diab, Aylin Caliskan
- Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
Yingqian Cui, Pengfei He, Jingying Zeng, Hui Liu, Xianfeng Tang, Zhenwei Dai, Yan Han, Chen Luo, Jing Huang, Zhen Li, Suhang Wang, Yue Xing, Jiliang Tang, Qi He
- Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers
Yilun Zhao, Chengye Wang, Chuhan Li, Arman Cohan
- MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs
Kaustubh Deshpande, Ved Sirdeshmukh, Johannes Baptist Mols, Lifeng Jin, Ed-Yeremai Hernandez-Cardona, Dean Lee, Jeremy Kritz, Willow E. Primack, Summer Yue, Chen Xing
- Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
Jaydeep Borkar, Matthew Jagielski, Katherine Lee, Niloofar Mireshghallah, David A. Smith, Christopher A. Choquette-Choo
- Safety is Not Only About Refusal: Reasoning-Enhanced Fine-tuning for Interpretable LLM Safety
Yuyou Zhang, Miao Li, William Han, Yihang Yao, Zhepeng Cen, Ding Zhao
- A puyfred feels less of a puyfred if you say it’s cute, but it still feels bad: context-dependent form-meaning systematicity in LLMs
Giovanni Cassani, Jaïr A. Waal
- MetaSynth: Meta-Prompting-Driven Agentic Scaffolds for Diverse Synthetic Data Generation
Haris Riaz, Sourav Sanjukta Bhabesh, Vinayak Arannil, Miguel Ballesteros, Graham Horwood
- MVTamperBench: Evaluating Robustness of Vision-Language Models
Amit Agarwal, Srikant Panda, Angeline Charles, Hitesh Laxmichand Patel, Bhargava Kumar, Priyaranjan Pattnayak, Taki Hasan Rafi, Tejaswini Kumar, Hansa Meghwani, Karan Gupta, Dong-Kyu Chae
- Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models
Qianqi Yan, Yue Fan, Hongquan Li, Shan Jiang, Yang Zhao, Xinze Guan, Ching-Chen Kuo, Xin Eric Wang
- Vision-Language Models Struggle to Align Entities across Modalities
Iñigo Alonso, Gorka Azkune, Ander Salaberria, Jeremy Barnes, Oier Lopez de Lacalle
- A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information
Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Zilu Tang, Fariz Akyas, Traci Hong, Ika Karlina Idris, Alham Fikri Aji, Derry Tanti Wijaya
- MedCite: Can Language Models Generate Verifiable Text for Medicine?
Xiao Wang, Mengjue Tan, Qiao Jin, Guangzhi Xiong, Yu Hu, Aidong Zhang, Zhiyong Lu, Minjia Zhang
- Let The Jury Decide: Fair Demonstration Selection for In-Context Learning through Incremental Greedy Evaluation
Sadaf MD Halim, Chen Zhao, Xintao Wu, Latifur Khan, Christan Grant, Fariha Ishrat Rahman, Feng Chen
- The Lies Characters Tell: Utilizing Large Language Models to Normalize Adversarial Unicode Perturbations
Portia Cooper, Eduardo Blanco, Mihai Surdeanu
- Speech Act Patterns for Improving Generalizability of Explainable Politeness Detection Models
Ahmad Aljanaideh
- A Systematic Evaluation of Transformer-LM Representations for Capturing Author States and Traits
Khushboo Singh, Vasudha Varadarajan, Adithya V Ganesan, August Håkan Nilsson, Nikita Soni, Syeda Mahwish, Pranav Chitale, Ryan L. Boyd, Lyle Ungar, Richard N Rosenthal, H. Schwartz
- TReMu: Towards Neuro-Symbolic Temporal Reasoning for LLM-Agents with Memory in Multi-Session Dialogues
Yubin Ge, Salvatore Romeo, Jason Cai, Raphael Shu, Monica Sunkara, Yassine Benajiba, Yi Zhang
- Conservative Bias in Large Language Models: Measuring Relation Predictions
Toyin Aguda, Erik Wilson, Allan Anzagira, Simerjot Kaur, Charese Smiley
- Mitigating Bias in RAG: Controlling the Embedder
Taeyoun Kim, Jacob Mitchell Springer, Aditi Raghunathan, Maarten Sap
- V-ALPHASOCIAL: Benchmark and Self-Reflective Chain-of-Thought Generation for Visual Social Commonsense Reasoning
Zongyu Lin, Zhikun Xu, Xiaohan Song, Yixin Wan, Xingcheng Yao, Tsung-Han Lin, Selina Song, Pranav Subbaraman, Ben Zhou, Kai-Wei Chang, Yizhou Sun
- AfroBench: How Good are Large Language Models on African Languages?
Jessica Ojo, Odunayo Ogundepo, Akintunde Oladipo, Kelechi Ogueji, Jimmy Lin, Pontus Stenetorp, David Ifeoluwa Adelani
- Training Bilingual LMs with Data Constraints in the Targeted Language
Skyler Seto, Maartje Ter Hoeve, Richard He Bai, Natalie Schluter, David Grangier
- ChartQAPro: A More Diverse and Challenging Benchmark for Chart Question Answering
Ahmed Masry, Mohammed Saidul Islam, Mahir Ahmed, Aayush Bajaj, Firoz Kabir, Aaryaman Kartha, Md Tahmid Rahman Laskar, Mizanur Rahman, Shadikur Rahman, Mehrad Shahmohammadi, Megh Thakkar, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty
- From Observation to Understanding: Front-Door Adjustments with Uncertainty Calibration for Enhancing Egocentric Reasoning in LVLMs
Shenshen Li, Wenxin Meng, Lei Wang, Hao Yang, Chong Peng, Peng Yan, Fumin Shen, Jingkuan Song, Heng Tao Shen, Xing Xu
- Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park
- Worse than Random? An Embarrassingly Simple Probing Evaluation of Large Multimodal Models in Medical VQA
Qianqi Yan, Xuehai He, Xiang Yue, Xin Eric Wang
- Optimizing Reasoning for Text-to-SQL with Execution Feedback
Bohan Zhai, Canwen Xu, Zhewei Yao, Yuxiong He
- Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities
Wenyue Hua, Kaijie Zhu, Lingyao Li, Lizhou Fan, Mingyu Jin, Shuhang Lin, Haochen Xue, Zelong Li, Jindong Wang, Yongfeng Zhang
- Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models
Shuqi Liu, Han Wu, Bowei He, Xiongwei Han, Mingxuan Yuan, Linqi Song
- EgoNormia: Benchmarking Physical Social Norm Understanding
MohammadHossein Rezaei, Yicheng Fu, Phil Cuvin, Caleb Ziems, Yanzhe Zhang, Hao Zhu, Diyi Yang
- Large Language Models as Neurolinguistic Subjects: Discrepancy in Performance and Competence for Form and Meaning
Linyang He, Ercong Nie, Helmut Schmid, Hinrich Schuetze, Nima Mesgarani
- The Impact of Large Language Models in Academia: from Writing to Speaking
Mingmeng Geng, Caixi Chen, Yanru Wu, Yao Wan, Pan Zhou, Dongping Chen
- X-WebAgentBench: A Multilingual Interactive Web Benchmark for Evaluating Global Agentic System
Peng Wang, Ruihan Tao, Qiguang Chen, Mengkang Hu, Libo Qin
- MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents
Haoran Tan, Zeyu Zhang, Quanyu Dai, Chen Ma, Xu Chen, Zhenhua Dong
- Adaptive LoRA Merge with Parameter Pruning for Low-Resource Generation
Ryota Miyano, Yuki Arase
- LongAttn: Selecting Long-context Training Data via Token-level Attention
Longyun Wu, Dawei Zhu, Guangxiang Zhao, Zhuocheng Yu, Junfeng Ran, Xiangyu Wong, Lin Sun, Sujian Li
- CoRE: Condition-based Reasoning for Identifying Outcome Variance in Complex Events
Sai P Vallurupalli, Francis Ferraro
- FaVe: Factored and Verified Search Rationale for Long-form Answer
Jihyuk Kim, Sungjin Lee, Seung-won Hwang, Yang Liu
- UnrealLLM: Towards Highly Controllable and Interactable 3D Scene Generation by LLM-powered Procedural Content Generation
Song Tang, Kaiyong Zhao, Lei Wang, Yuliang Li, Xuebo Liu, Junyi Zou, Qiang Wang, Xiaowen Chu
- Tree-of-Prompts: Abstracting Control-Flow for Prompt Optimization
Jihyuk Kim, Shubham Garg, Lahari Poddar, Seung-won Hwang, Chris Hench
- Outlier-weighed Layerwise Sampling for LLM Fine-tuning
Pengxiang Li, Lu Yin, Xiaowei Gao, Shiwei Liu
- KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation
Chaoyi Jiang, Lei Gao, Hossein Entezari Zarch, Murali Annavaram
- Direct Behavior Optimization: Unlocking the Potential of Lightweight LLMs
Hongming Yang, Shi Lin, Jun Shao, Changting Lin, Donghai Zhu, Meng Han, Qinglei Kong
- Whether LLMs Know If They Know: Identifying Knowledge Boundaries via Debiased Historical In-Context Learning
Bo Lv, Nayu Liu, Yang Shen, Xin Liu, Ping Luo, Yue Yu
- How do LLMs’ Preferences Affect Event Argument Extraction? CAT: Addressing Preference Traps in Unsupervised EAE
Yunhao Wei, Kai Shuang, Zhiyi Li, Chenrui Mao
- Out-of-Distribution Detection via LLM-Guided Outlier Generation for Text-attributed Graph
Xiangwei Lv, Mengze Li, Jingyuan Chen, Zhiang Dong, Sirui Han, Beishui Liao
- Document-Level Relation Extraction with Global Relations and Entity Pair Reasoning
Fu Zhang, Yi Yan, Jingwei Cheng
- Towards Storage-Efficient Visual Document Retrieval: An Empirical Study on Reducing Patch-Level Embeddings
Yubo Ma, Jinsong Li, Yuhang Zang, Xiaobao Wu, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Haodong Duan, Jiaqi Wang, Yixin Cao, Aixin Sun
- Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models
Qingyu Ren, Jie Zeng, Qianyu He, Jiaqing Liang, Yanghua Xiao, Weikang Zhou, Zeye Sun, Fei Yu
- ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models
Hwiyeol Jo, Hyunwoo Lee, Taiwoo Park
- Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations
Chunyang Li, Weiqi Wang, Tianshi Zheng, Yangqiu Song
- LLMTaxo: Leveraging Large Language Models for Constructing Taxonomy of Factual Claims from Social Media
Haiqi Zhang, Zhengyuan Zhu, Zeyu Zhang, Chengkai Li
- AnCast++: Document-Level Evaluation of Graph-based Meaning Representations
Haibo Sun, Jayeol Chun, Nianwen Xue
- MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct
Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Yongbin Li, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Hamid Alinejad-Rokny, Xiaobo Xia, Jingkuan Song, Fei Huang
- SciVerse: Unveiling the Knowledge Comprehension and Visual Reasoning of LMMs on Multi-modal Scientific Problems
Ziyu Guo, Renrui Zhang, Hao Chen, Jialin Gao, Dongzhi Jiang, Jiaze Wang, Pheng-Ann Heng
- Exploring Layer-wise Representations of English and Chinese Homonymy in Pre-trained Language Models
Matthew King-Hang Ma, XIE Chenwei, Wenbo Wang, William Shiyuan Wang
- DocMEdit: Towards Document-Level Model Editing
Li Zeng, Zeming Liu, Chong Feng, Heyan Huang, Yuhang Guo
- Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing
Yifan Lu, Jing Li, Yigeng Zhou, Yihui Zhang, Wenya Wang, Xiucheng Li, Meishan Zhang, Fangming Liu, Jun Yu, Min Zhang
- Evaluating the Long-Term Memory of Large Language Models
Zixi Jia, Qinghua Liu, Hexiao Li, Yuyan Chen, Jiqiang Liu
- Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments
Russell Scheinberg, Ameeta Agrawal, Amber Shore, So Young Lee
- Data Interpreter: An LLM Agent for Data Science
Sirui Hong
- DReSD: Dense Retrieval for Speculative Decoding
Milan Gritta, Huiyin Xue, Gerasimos Lampouras
- Core: Robust Factual Precision with Informative Sub-Claim Identification
Zhengping Jiang, Jingyu Zhang, Nathaniel Weir, Seth Ebner, Miriam Wanner, Kate Sanders, Daniel Khashabi, Anqi Liu, Benjamin Van Durme
- Rethinking Diverse Human Preference Learning through Principal Component Analysis
Feng Luo, Rui Yang, Hao Sun, Chunyuan Deng, Jiarui Yao, Jingyan Shen, Huan Zhang, Hanjie Chen
- Improving Word Alignment Using Semi-Supervised Learning
Zhongtao Miao, Qiyu Wu, Masaaki Nagata, Yoshimasa Tsuruoka
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
Yixin Ou, Yunzhi Yao, Ningyu Zhang, Hui Jin, Jiacheng Sun, Shumin Deng, Zhenguo Li, Huajun Chen
- LLM-Symbolic Integration for Robust Temporal Tabular Reasoning
Atharv Kulkarni, Kushagra Dixit, Vivek Srikumar, Dan Roth, Vivek Gupta
- Multimodal Large Language Models for Text-rich Image Understanding: A Comprehensive Review
Pei Fu, Tongkun Guan, Zining Wang, Zhentao Guo, Chen Duan, Hao Sun, Boming Chen, Qianyi Jiang, Jiayao Ma, Kai Zhou, Junfeng Luo
- PruneVid: Visual Token Pruning for Efficient Video Large Language Models
Xiaohu Huang, Hao Zhou, Kai Han
- PromptWizard: Optimizing Prompts via Task-Aware, Feedback-Driven Self-Evolution
Eshaan Agarwal, Raghav Magazine, Joykirat Singh, Vivek Dani, Tanuja Ganu, Akshay Nambi
- Exposing Numeracy Gaps: A Benchmark to Evaluate Fundamental Numerical Abilities in Large Language Models
Haoyang Li, Xuejia Chen, Zhanchao Xu, Darian Li, Nicole Hu, Fei Teng, Yiming Li, Luyu Qiu, Chen Jason Zhang, Li Qing, Lei Chen
- TABGEN-ICL: Residual-Aware In-Context Example Selection for Tabular Data Generation
Liancheng Fang, Aiwei Liu, Hengrui Zhang, Henry Peng Zou, Weizhi Zhang, Philip S. Yu
- Benchmarking Multi-National Value Alignment for Large Language Models
Chengyi Ju, Weijie Shi, Chengzhong Liu, Jiaming Ji, Jipeng Zhang, Ruiyuan Zhang, Jiajie Xu, Yaodong Yang, Sirui Han, Yike Guo
- MotiveBench: How Far Are We From Human-Like Motivational Reasoning in Large Language Models?
Xixian Yong, Jianxun Lian, Xiaoyuan Yi, Xiao Zhou, Xing Xie
- Confidence Improves Self-Consistency in LLMs
Amir Taubenfeld, Tom Sheffer, Eran Ofek, Amir Feder, Ariel Goldstein, Zorik Gekhman, Gal Yona
- None of the Above, Less of the Right: Parallel Patterns in Human and LLM Performance on Multi-Choice Questions Answering
Zhi Rui Tam, Cheng-Kuang Wu, Chieh-Yen Lin, Yun-Nung Chen
- In Search of the Lost Arch in Dialogue: A Dependency Dialogue Acts Corpus for Multi-Party Dialogues
Jon Cai, Brendan King, Peyton Cameron, Susan Windisch Brown, Miriam Eckert, Dananjay Srinivas, George Arthur Baker, V Kate Everson, Martha Palmer, James Martin, Jeffrey Flanigan
- ProMind-LLM: Proactive Mental Health Care via Causal Reasoning with Sensor Data
Xinzhe Zheng, Sijie Ji, Jiawei Sun, Renqi Chen, Wei Gao, Mani Srivastava
- Debiasing Online Preference Learning via Preference Feature Preservation
Dongyoung Kim, Jinsung Yoon, Jinwoo Shin, Jaehyung Kim
- ShortGPT: Layers in Large Language Models are More Redundant Than You Expect
Xin Men, Mingyu Xu, Qingyu Zhang, Qianhao Yuan, Bingning Wang, Hongyu Lin, Yaojie Lu, Xianpei Han, Weipeng Chen
- ProjectEval: A Benchmark for Programming Agents Automated Evaluation on Project-Level Code Generation
Kaiyuan Liu, Youcheng Pan, Yang Xiang, Daojing He, Jing Li, Yexing Du, Tianrun Gao
- V²R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations
Zhiyuan Fan, Yumeng Wang, Sandeep Polisetty, Yi R. Fung
- DYNTEXT: Semantic-Aware Dynamic Text Sanitization for Privacy-Preserving LLM Inference
Juhua Zhang, Zhiliang Tian, Minghang Zhu, Yiping Song, Minlie Huang, Siyi Yang, Qiunan Du, Xinwang Liu, Taishu Sheng, Dongsheng Li
- InImageTrans: Multimodal LLM-based Text Image Machine Translation
Fei Zuo, Kehai Chen, Yu Zhang, Zhengshan Xue, Min Zhang
- FRAME: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy
Xuemiao Zhang, Feiyu Duan, Xu Liangyu, Yongwei Zhou, Sirui Wang, Rongxiang Weng, Jingang Wang, Xunliang Cai
- When Large Language Models Meet Speech: A Survey on Integration Approaches
Zhengdong Yang, Shuichiro Shimizu, Yahan Yu, Chenhui Chu
- KE-MHISTO: Towards a Multilingual Historical Knowledge Extraction Benchmark for Addressing the Long-Tail Problem
Arianna Graciotti, Leonardo Piano, Nicolas Lazzari, Enrico Daga, Rocco Tripodi, Valentina Presutti, Livio Pompianu
- TailorKV: A Hybrid Framework for Long-Context Inference via Tailored KV Cache Optimization
Dingyu Yao, Bowen Shen, Zheng Lin, Wei Liu, Jian Luan, Bin Wang, Weiping Wang
- The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing
Xinwei Guo, Jiashi Gao, Junlei Zhou, Jiaxin Zhang, Guanhua Chen, Xiangyu Zhao, Quanying Liu, Haiyan Wu, Xin Yao, Xuetao Wei
- LLMs Can Achieve High-quality Simultaneous Machine Translation as Efficiently as Offline
Biao Fu, Minpeng Liao, Kai Fan, Chengxi Li, Liang Zhang, Yidong Chen, Xiaodong Shi
- Beyond Completion: A Foundation Model for General Knowledge Graph Reasoning
Yin Hua, Zhiqiang Liu, Mingyang Chen, Zheng Fang, Chi Man Wong, Lingxiao Li, Chi Man Vong, Huajun Chen, Wen Zhang
- Generative Error Correction for Emotion-aware Speech-to-text Translation
Zhengdong Yang, Sheng Li, Chenhui Chu
- SynapticRAG: Enhancing Temporal Memory Retrieval in Large Language Models through Synaptic Mechanisms
Yuki Hou, Haruki Tamoto, Qinghua Zhao, Homei Miyashita
- Localizing and Mitigating Errors in Long-form Question Answering
Rachneet Singh Sachdeva, Yixiao Song, Mohit Iyyer, Iryna Gurevych
- EMGLLM: Data-to-Text Alignment for Electromyogram Diagnosis Generation with Medical Numerical Data Encoding
Zefei Long, Zhenbiao Cao, Wei Chen, Zhongyu Wei
- LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM
Sambal Shikhar, Mohammed Irfan Kurpath, Sahal Shaji Mullappilly, Jean Lahoud, Fahad Shahbaz Khan, Rao Muhammad Anwer, Salman Khan, Hisham Cholakkal
- Act2P: LLM-Driven Online Dialogue Act Classification for Power Analysis
Zhang Wenbo, Wang Yuhan
- Benchmarking Large Language Models against Smaller Fine-Tuned Models for Low-Resource Maltese NLP
Kurt Micallef, Claudia Borg
- TRATES: Trait-Specific Rubric-Assisted Cross-Prompt Essay Scoring
Sohaila Eltanbouly, Salam Albatarni, Tamer Elsayed
- DAST: Context-Aware Compression in LLMs via Dynamic Allocation of Soft Tokens
Shaoshen Chen, Yangning Li, Zishan Xu, Yongqin Zeng, Shunlong Wu, Xinshuo Hu, Zifei Shan, Xin Su, Jiwei Tang, Yinghui Li, Hai-Tao Zheng
- A Multi-Expert Structural-Semantic Hybrid Framework for Unveiling Historical Patterns in Temporal Knowledge Graphs
Yimin Deng, Yuxia Wu, Yejing Wang, Guoshuai Zhao, Li Zhu, Qidong Liu, Derong Xu, Zichuan Fu, Xian Wu, Yefeng Zheng, Xiangyu Zhao, Xueming Qian
- MWPO: Enhancing LLMs Performance through Multi-Weight Preference Strength and Length Optimization
Shiyue Xu, Fu Zhang, Jingwei Cheng, Linfeng Zhou
- CLEAR: Character Unlearning in Textual and Visual Modalities
Alexey Dontsov, Dmitrii Korzh, Alexey Zhavoronkin, Boris Mikheev, Denis Bobkov, Aibek Alanov, Oleg Rogov, Ivan Oseledets, Elena Tutubalina
- Assessing the Reasoning Capabilities of LLMs in the context of Evidence-based Claim Verification
John Dougrez-Lewis, Mahmud Elahi Akhter, Federico Ruggeri, Sebastian Löbbers, Yulan He, Maria Liakata
- Temporal Generalizability in the Realm of Event Detection: The Role of Multilingual Models and Stochasticity
Stella Verkijk, Piek Vossen, Pia Sommerauer
- DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
Ying Zhou, Xinyao Wang, Yulei Niu, Yaojie Shen, Lexin Tang, Fan Chen, Ben He, Le Sun, Longyin Wen
- Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models?
Yifei Wang, Yu Sheng, Linjing Li, Daniel Dajun Zeng
- ToolSpectrum: Towards Personalized Tool Utilization for Large Language Models
Zihao Cheng, Hongru Wang, Zeming Liu, Yuhang Guo, Yuanfang Guo, Yunhong Wang, Haifeng Wang
- Reverse Preference Optimization for Complex Instruction Following
Xiang Huang, Ting-En Lin, Feiteng Fang, Yuchuan Wu, Hangyu Li, Yuzhong Qu, Fei Huang, Yongbin Li
- MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens
Jeong Hun Yeo, Hyeongseop Rha, Se Jin Park, Yong Man Ro
- Def-DTS: Deductive Reasoning for Open-domain Dialogue Topic Segmentation
Seungmin Lee, Yongsang Yoo, Minhwa Jung, Min Song
- Exploring Jailbreak Attacks on LLMs through Intent Concealment and Diversion
Tiehan Cui, Yanxu Mao, Peipei Liu, Congying Liu, Datao You
- Verbosity-Aware Rationale Reduction: Sentence-Level Rationale Reduction for Efficient and Effective Reasoning
Joonwon Jang, Jaehee Kim, Wonbin Kweon, Seonghyeon Lee, Hwanjo Yu
- Exploring the Role of Mental Health Conversational Agents in Training Medical Students and Professionals: A Systematic Literature Review
Thushari Atapattu, Menasha Thilakaratne, Duc Nhan Do, Mahen Herath, Katrina E. Falkner
- Bandit-Based Prompt Design Strategy Selection Improves Prompt Optimizers
Rin Ashizawa, Yoichi Hirose, Nozomu Yoshinari, Kento Uchida, Shinichi Shirakawa
- STORYTELLER: An Enhanced Plot-Planning Framework for Coherent and Cohesive Story Generation
Jiaming Li, Yukun Chen, Ziqiang Liu, Minghuan Tan, Lei Zhang, Yunshui Li, Run Luo, Longze Chen, Jing Luo, Ahmadreza Argha, Hamid Alinejad-Rokny, Wei Zhou, Min Yang
- SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models
Kaushal Kumar Maurya, KV Aditya Srivatsa, Ekaterina Kochmar
- SkyLLM: Cross-LLM-APIs Federation for Cost-effective Query Processing
Heng Zhao, Yifei Zhu
- Matina: A Culturally-Aligned Persian Language Model Using Multiple LoRA Experts
Sara Bourbour Hosseinbeigi, MohammadAli SeifKashani, Javad Seraj, Fatemeh Taherinezhad, Ali Nafisi, Fatemeh Nadi, Iman Barati, Hosein Hasani, Mostafa Amiri, Mostafa Masoudi
- PM3-KIE: A Probabilistic Multi-Task Meta-Model for Document Key Information Extraction
Birgit Kirsch, Héctor Allende-Cid, Stefan Rueping
- TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text
Ahmed Lekssays, Utsav Shukla, Husrev Taha Sencar, Md Rizwan Parvez
- G2S: A General-to-Specific Learning Framework for Temporal Knowledge Graph Forecasting with Large Language Models
Long Bai, Zixuan Li, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng, Tat-Seng Chua
- Disentangling Reasoning Tokens and Boilerplate Tokens For Language Model Fine-tuning
Ziang Ye, Zhenru Zhang, Yang Zhang, Jianxing Ma, Junyang Lin, Fuli Feng
- APT: Improving Specialist LLM Performance with Weakness Case Acquisition and Iterative Preference Training
Jun Rao, Zepeng Lin, Xuebo Liu, Xiaopeng Ke, Lian Lian, Dong Jin, Shengjun Cheng, Jun Yu, Min Zhang
- EasyEA: Large Language Model is All You Need in Entity Alignment Between Knowledge Graphs
Jingwei Cheng, Chenglong Lu, Linyan Yang, Guoqing Chen, Fu Zhang
- An Adaptive Multi-Threshold Loss and a General Framework for Collaborating Losses in Document-Level Relation Extraction
Huangming Xu, Fu Zhang, Jingwei Cheng
- RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following
Junru Lu, Jiazheng Li, Guodong Shen, Lin Gui, Siyu An, Yulan He, Di Yin, Xing Sun
- C²RBench: A Chinese Complex Reasoning Benchmark for Large Language Models
Junru Wu, Tianhao Shen, Linxi Su, Deyi Xiong
- Unlocking LLMs’ Self-Improvement Capacity with Autonomous Learning for Domain Adaptation
Ke Ji, Junying Chen, Anningzhe Gao, Wenya Xie, Xiang Wan, Benyou Wang
- How Personality Traits Shape LLM Risk-Taking Behaviour
John Hartley, Conor Brian Hamill, Dale Seddon, Devesh Batra, Ramin Okhrati, Raad Khraishi
- Word-Level Detection of Code-Mixed Hate Speech with Multilingual Domain Transfer
Karin Niederreiter, Dagmar Gromann
- Evaluation of Attribution Bias in Generator-Informed Retrieval-Augmented Large Language Models
Amin Abolghasemi, Leif Azzopardi, Seyyed Hadi Hashemi, Maarten de Rijke, Suzan Verberne
- Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment
Wen Yang, Junhong Wu, Chen Wang, Chengqing Zong, Jiajun Zhang
- Diagnosing Failures in Large Language Models’ Answers: Integrating Error Attribution into Evaluation Framework
Zishan Xu, Shuyi Xie, Qingsong Lv, Shupei Xiao, Linlin Song, Sui Wenjuan, Fan Lin
- Encode Errors: Representational Retrieval of In-Context Demonstrations for Multilingual Grammatical Error Correction
Guangyue Peng, Wei Li, Wen Luo, Houfeng Wang
- Preference Curriculum: LLMs Should Always Be Pretrained on Their Preferred Data
Xuemiao Zhang, Xu Liangyu, Feiyu Duan, Yongwei Zhou, Sirui Wang, Rongxiang Weng, Jingang Wang, Xunliang Cai
- Can Input Attributions Interpret the Inductive Reasoning Process in In-Context Learning?
Mengyu Ye, Tatsuki Kuribayashi, Goro Kobayashi, Jun Suzuki
- Modal Dependency Parsing via Biaffine Attention with Self-Loop
Jayeol Chun, Nianwen Xue
- Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs
Zixiao Wang, Duzhen Zhang, Ishita Agarwal, Shen Gao, Le Song, Xiuying Chen
- Measuring What Makes You Unique: Difference-Aware User Modeling for Enhancing LLM Personalization
Yilun Qiu, Xiaoyan Zhao, Yang Zhang, Yimeng Bai, Wenjie Wang, Hong Cheng, Fuli Feng, Tat-Seng Chua
- VideoRAG: Retrieval-Augmented Generation over Video Corpus
Soyeong Jeong, Kangsan Kim, Jinheon Baek, Sung Ju Hwang
- Synergistic Augmentation: Enhancing Cross-Domain Zero-Shot Slot Filling with Small Model-Assisted Large Language Models
Weizhen Li, Junbao Huang, Peijie Huang, Yuhong Xu, Jiekun Fan
- A Classifier of Word-Level Variants in Witnesses of Biblical Hebrew Manuscripts
Iglika Nikolova-Stoupak, Maxime Amblard, Sophie Robert-Hayek, Davide D’Amico, Frédérique Rey
- NOVA: An Iterative Planning Framework for Enhancing Scientific Innovation with Large Language Models
Xiang Hu, Hongyu Fu, Jinge Wang, Yifeng Wang, Zhikun Li, Renjun Xu, Yu Lu, Yaochu Jin, Lili Pan, Zhenzhong Lan
- Query-Driven Multimodal GraphRAG: Dynamic Local Knowledge Graph Construction for Online Reasoning
Chenyang Bu, Guojie Chang, Zihao Chen, CunYuan Dang, Zhize Wu, Yi He, Xindong Wu
- A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia, Jinxuan Xu, Yuqian Zhang, Hang Liu
- Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis
Yicheng Lang, Kehan Guo, Yue Huang, Yujun Zhou, Haomin Zhuang, Tianyu Yang, Yao Su, Xiangliang Zhang
- Natural Language Processing in Support of Evidence-based Medicine: A Scoping Review
Zihan Xu, Haotian Ma, Yihao Ding, Gongbo Zhang, Chunhua Weng, Yifan Peng
- How do Transformer Embeddings Represent Compositions? A Functional Analysis
Aishik Nagar, Ishaan Singh Rawal, Mansi Dhanania, Cheston Tan
- Entriever: Energy-based Retriever for Knowledge-Grounded Dialog Systems
Yucheng Cai, Ke Li, Yi Huang, Junlan Feng, Zhijian Ou
- MONTROSE: LLM-driven Monte Carlo Tree Search Self-Refinement for Cross-Domain Rumor Detection
Shanshan Liu, Menglong Lu, Zhen Huang, Zejiang He, Liu Liu, Zhigang Sun, Dongsheng Li
- PEToolLLM: Towards Personalized Tool Learning in Large Language Models
Qiancheng Xu, Yongqi Li, Heming Xia, Fan Liu, Min Yang, Wenjie Li
- A Comprehensive Graph Framework for Question Answering with Mode-Seeking Preference Alignment
Quanwei Tang, Sophia Yat Mei Lee, Junshuang Wu, Dong Zhang, Shoushan Li, Erik Cambria, Guodong Zhou
- A MISMATCHED Benchmark for Scientific Natural Language Inference
Firoz Shaik, Mobashir Sadat, Nikita Gautam, Doina Caragea, Cornelia Caragea
- TagRouter: Learning Route to LLMs through Tags for Open-Domain Text Generation Tasks
Zhou Chen, Zhiqiang Wei, Yuqi Bai, Xue Xiong, Jianmin Wu
- The Reasoning-Memorization Interplay in Language Models Is Mediated by a Single Direction
Yihuai Hong, Meng Cao, Dian Zhou, Lei Yu, Zhijing Jin
- MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification
Xu Zhao Pan, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao, Hongxun Yao, Kaipeng Zhang
- CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents
Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian, Anjie Yang, Zhaoxuan Jin, Jianbo Deng, Philip Torr, Bernard Ghanem, Guohao Li
- Towards A “Novel” Benchmark: Evaluating Literary Fiction with Large Language Models
Wenqing Wang, Mingqi Gao, Xinyu Hu, Xiaojun Wan
- A Reinforcement Learning Framework for Cross-Lingual Stance Detection Using Chain-of-Thought Alignment
Binghui Li, Minghui Zou, Xiaowang Zhang, Shizhan Chen, Zhiyong Feng
- CARE-STaR: Constraint-aware Self-taught Reasoner
Zhiliang Li, Bo Tang, Yijun Niu, Beihong Jin, Qiwen Shi, Yuchen Feng, Zhiyu Li, Jie Hu, Mingchuan Yang, Feiyu Xiong
- Is It JUST Semantics? A Case Study of Discourse Particle Understanding in LLMs
William Berkeley Sheffield, Kanishka Misra, Valentina Pyatkin, Ashwini Deo, Kyle Mahowald, Junyi Jessy Li
- War of Thoughts: Competition Stimulates Stronger Reasoning in Large Language Models
Yibin Chen, Yan Zheng, Yifu Yuan, Jinyi Liu, Jianye Hao
- Does Rationale Quality Matter? Enhancing Mental Disorder Detection via Selective Reasoning Distillation
Hoyun Song, Huije Lee, Jisu Shin, Sukmin Cho, Changgeon Ko, Jong C. Park
- Rethinking Table Instruction Tuning
Naihao Deng, Rada Mihalcea
- CliniDial: A Naturally Emerged Multimodal Dialogue Dataset for Team Reflection During Clinical Operation
Naihao Deng, Kapotaksha Das, Rada Mihalcea, Vitaliy Popov, Mohamed Abouelenien
- Chumor: Towards Benchmarking Chinese Humor Understanding
Ruiqi He, Yushu He, Longju Bai, Jiarui Liu, Zhenjie Sun, Zenghao Tang, He Wang, Hanchen Xia, Rada Mihalcea, Naihao Deng
- Explicit Bayesian Inference to Uncover the Latent Themes of Large Language Models
Raymond Li, Chuyuan Li, Gabriel Murray, Giuseppe Carenini
- Multi-Modal Framing Analysis of News
Arnav Arora, Srishti Yadav, Maria Antoniak, Serge Belongie, Isabelle Augenstein
- Improving Occupational ISCO Classification of Multilingual Swiss Job Postings with LLM-Refined Training Data
Ann-Sophie Gnehm, Simon Clematide
- Brevity is the soul of sustainability: Characterizing LLM response lengths
Soham Poddar, Paramita Koley, Janardan Misra, Niloy Ganguly, Saptarshi Ghosh
- Adversarial Preference Learning for Robust LLM Alignment
Yuanfu Wang, Pengyu Wang, Chenyang Xi, Bo Tang, Junyi Zhu, Wenqiang Wei, Chen Chen, Chao Yang, Jingfeng Zhang, Chaochao Lu, Yijun Niu, Keming Mao, Zhiyu Li, Feiyu Xiong, Jie Hu, Mingchuan Yang
- gMBA: Expression Semantic Guided Mixed Boolean-Arithmetic Deobfuscation Using Transformer Architectures
Youjeong Roh, Joon-Young Paik, Jingun Kwon, Eun-Sun Cho
- READoc: A Unified Benchmark for Realistic Document Structured Extraction
Zichao Li, Aizier Abulaiti, Yaojie Lu, Xuanang Chen, Jia Zheng, Hongyu Lin, Xianpei Han, Shanshan Jiang, Bin Dong, Le Sun
- TicTac: Temporal-aware Supervised Fine-tuning for Automatic Text Dating
Minna Peng, Han Ren
- Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting
Hao Feng, Shu Wei, Xiang Fei, Wei Shi, Yingdong Han, Lei Liao, Jinghui Lu, Binghong Wu, Qi Liu, Chunhui Lin, Jingqun Tang, Hao Liu, Can Huang
- FanChuan: A Multilingual and Graph-Structured Benchmark For Parody Detection and Analysis
Yilun Zheng, Sha Li, Fangkun Wu, Yang Ziyi, Lin Hongchao, Zhichao Hu, Cai Xinjun, Ziming Wang, Jinxuan Chen, Sitao Luan, Jiahao Xu, Lihui Chen
- P-CoT: A Pedagogically-motivated Participatory Chain-of-Thought Prompting for Phonological Reasoning in LLMs
Dongjun Jang, Youngchae Ahn, Hyopil Shin
- DynaCode: A Dynamic Complexity-Aware Code Benchmark for Evaluating Large Language Models in Code Generation
Wenhao Hu, Jinhao Duan, Chunchen Wei, Li Zhang, Yue Zhang, Kaidi Xu
- Small Encoders Can Rival Large Decoders in Detecting Groundedness
Istabrak Abbes, Gabriele Prato, Quentin Fournier, Fernando Rodriguez, Alaa Boukhary, Adam Elwood, Sarath Chandar
- KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
Ahmed Heakl, Muhammad Abdullah Sohail, Mukul Ranjan, Rania Elbadry, Ghazi Shazan Ahmad, Mohamed El-Geish, Omar Maher, Zhiqiang Shen, Fahad Shahbaz Khan, Salman Khan
- Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness
Shayan Alipour, Indira Sen, Mattia Samory, Tanu Mitra
- AL-QASIDA: Analyzing LLM Quality and Accuracy Systematically in Dialectal Arabic
Nathaniel Romney Robinson, Shahd Abdelmoneim, Kelly Marchisio, Sebastian Ruder
- Is Large Language Model Performance on Reasoning Tasks Impacted by Different Ways Questions Are Asked?
Seok Hwan Song, Mohna Chakraborty, Qi Li, Wallapak Tavanapong
- MutantPrompt: Prompt Optimization via Mutation Under a Budget on Modest-sized LMs
Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti
- Heuristic-based Search Algorithm in Automatic Instruction-focused Prompt Optimization: A Survey
Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Sricharan Kumar
- CONSENSAGENT: Towards Efficient and Effective Consensus in Multi-Agent LLM Interactions Through Sycophancy Mitigation
Priya Pitre, Naren Ramakrishnan, Xuan Wang
- The Structural Safety Generalization Problem
Julius Broomfield, Tom Gibbs, George Ingebretsen, Ethan Kosak-Hine, Tia Nasir, Jason Zhang, Reihaneh Iranmanesh, Sara Pieri, Reihaneh Rabbany, Kellin Pelrine
- DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization
Amitava Das, Suranjana Trivedy, Danush Khanna, Yaswanth Narsupalli, Basab Ghosh, Rajarshi Roy, Gurpreet Singh, Vinija Jain, Vasu Sharma, Aishwarya Naresh Reganti, Aman Chadha
- Model-Dependent Moderation: Inconsistencies in Hate Speech Detection Across LLM-based Systems
Neil Fasching, Yphtach Lelkes
- Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification
Subhendu Khatuya, Shashwat Naidu, Saptarshi Ghosh, Pawan Goyal, Niloy Ganguly
- Unsupervised Morphological Tree Tokenizer
Qingyang Zhu, Xiang Hu, Pengyu Ji, Wei Wu, Kewei Tu
- CausalLink: An Interactive Evaluation Framework for Causal Reasoning
Jinyue Feng, Frank Rudzicz
- Toward Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset (GIST)
Jiarui Liu, Iman Ouzzani, Wenkai Li, Lechen Zhang, Tianyue Ou, Houda Bouamor, Zhijing Jin, Mona T. Diab
- A Joint Optimization Framework for Enhancing Efficiency of Tool Utilization in LLM Agents
Bin Wu, Edgar Meij, Emine Yilmaz
- When Claims Evolve: Evaluating and Enhancing the Robustness of Embedding Models Against Misinformation Edits
Jabez Magomere, Emanuele La Malfa, Manuel Tonneau, Ashkan Kazemi, Scott A. Hale
- Splintering Nonconcatenative Languages for Better Tokenization
Bar Gazit, Shaltiel Shmidman, Avi Shmidman, Yuval Pinter
- Aria-UI: Visual Grounding for GUI Instructions
Yuhao Yang, Yue Wang, Dongxu Li, Ziyang Luo, Bei Chen, Chao Huang, Junnan Li
- Revealing Hidden Mechanisms of Cross-Country Content Moderation with Natural Language Processing
Neemesh Yadav, Jiarui Liu, Francesco Ortu, Roya Ensafi, Zhijing Jin, Rada Mihalcea
- Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation
Stefan Vasilev, Christian Herold, Baohao Liao, Seyyed Hadi Hashemi, Shahram Khadivi, Christof Monz
- Creating a Lens of Chinese Culture: A Multimodal Dataset for Chinese Pun Rebus Art Understanding
Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Kiana Avestimehr, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth Narayanan, Salman Avestimehr
- FastDraft: How to Train Your Draft
Ofir Zafrir, Igor Margulis, Dorin Shteyman, Shira Guskin, Guy Boudoukh
- SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale
Shester Gueuwou, Xiaodan Du, Greg Shakhnarovich, Karen Livescu
- Click, Type, Repeat: A Comprehensive Survey on GUI Agents
Dang Nguyen, Jian Chen, Yu Wang, Gang Wu, Namyong Park, Zhengmian Hu, Hanjia Lyu, Junda Wu, Ryan Aponte, Yu Xia, Xintong Li, Jing Shi, Hongjie Chen, Viet Dac Lai, Zhouhang Xie, Sungchul Kim, Ruiyi Zhang, Tong Yu, Mehrab Tanjim, Nesreen K. Ahmed, Puneet Mathur, Seunghyun Yoon, Lina Yao, Branislav Kveton, Jihyung Kil, Thien Huu Nguyen, Trung Bui, Tianyi Zhou, Ryan A. Rossi, Franck Dernoncourt
- MEDEC: A Benchmark for Medical Error Detection and Correction in Clinical Notes
Asma Ben Abacha, Wen-wai Yim, Yujuan Fu, Zhaoyi Sun, Meliha Yetisgen, Fei Xia, Thomas Lin
- Understanding the Influence of Synthetic Data for Text Embedders
Jacob Mitchell Springer, Vaibhav Adlakha, Siva Reddy, Aditi Raghunathan, Marius Mosbach
- Dynamic Knowledge Integration for Evidence-Driven Counter-Argument Generation with Large Language Models
Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri
- Tell, Don’t Show: Leveraging Language Models’ Abstractive Retellings to Model Literary Themes
Li Lucy, Camilla Griffiths, Sarah Levine, Jennifer L Eberhardt, Dorottya Demszky, David Bamman
- BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle
EunJeong Hwang, Peter West, Vered Shwartz
- Financial Language Model Evaluation (FLaME)
Glenn Matlin, Mika Okamoto, Huzaifa Pardawala, Yang Yang, Sudheer Chava
- CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation
Nengbo Wang, Xiaotian Han, Jagdip Singh, Jing Ma, Vipin Chaudhary
- Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation
Tharindu Kumarage, Ninareh Mehrabi, Anil Ramakrishna, Xinyan Zhao, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, Charith Peris
- Explain then Rank: Scale Calibration of Neural Rankers Using Natural Language Explanations from LLMs
Puxuan Yu, Daniel Cohen, Hemank Lamba, Joel R. Tetreault, Alejandro Jaimes
- Beyond instruction-conditioning, MoTE: Mixture of Task Experts for Multi-task Embedding Models
Miguel Romero Calvo, Shuoyang Ding, Corey D Barrett, Georgiana Dinu, George Karypis
- Metagent-P: A Neuro-Symbolic Planning Agent with Metacognition for Open Worlds
Yanfang Zhou, Yuntao Liu, Xiaodong Li, Yongqiang Zhao, Xintong Wang, Jinlong Tian, Zhenyu Li, Xinhai Xu
- Q-STRUM Debate: Query-Driven Contrastive Summarization for Recommendation Comparison
George-Kirollos Saad, Scott Sanner
- Inductive Linguistic Reasoning with Large Language Models
Raghav Ramji, Keshav Ramji
- Evaluating LLMs’ Mathematical and Coding Competency through Ontology-guided Interventions
Pengfei Hong, Navonil Majumder, Deepanway Ghosal, Somak Aditya, Rada Mihalcea, Soujanya Poria
- Exploiting Phonetics and Glyph Representation at Radical-level for Classical Chinese Understanding
Junyi Xiang, Maofu Liu
- Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training
Toan Tran, Ruixuan Liu, Li Xiong
- Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics
Ameya Godbole, Robin Jia
- TabXEval: Why this is a Bad Table? An eXhaustive Rubric for Table Evaluation
Vihang Pancholi, Jainit Sushil Bafna, Tejas Anvekar, Manish Shrivastava, Vivek Gupta
- LADDER: Language-Driven Slice Discovery and Error Rectification in Vision Classifiers
Shantanu Ghosh, Rayan Syed, Chenyu Wang, Vaibhav Choudhary, Binxu Li, Clare B Poynton, Shyam Visweswaran, Kayhan Batmanghelich
- GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning
Sifan Zhou, Shuo Wang, Zhihang Yuan, Mingjia Shi, Yuzhang Shang, Dawei Yang
- Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings
Gunjan Balde, Soumyadeep Roy, Mainack Mondal, Niloy Ganguly
- UniT: One Document, Many Revisions, Too Many Edit Intention Taxonomies
Fangping Lan, Abdullah Aljebreen, Eduard Dragut
- Predicting Depression in Screening Interviews from Interactive Multi-Theme Collaboration
Xianbing Zhao, Yiqing Lyu, Di Wang, Buzhou Tang
- Your Language Model May Think Too Rigidly: Achieving Reasoning Consistency with Symmetry-Enhanced Training
Yihang Yao, Zhepeng Cen, Miao Li, William Han, Yuyou Zhang, Emerson Liu, Zuxin Liu, Chuang Gan, Ding Zhao
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
Jianling Li, ShangZhan Li, Zhenye Gao, Qi Shi, Yuxuan Li, Zefan Wang, Jiacheng Huang, Wang Haojie, Jianrong Wang, Xu Han, Zhiyuan Liu, Maosong Sun
- Just KIDDIN’: Knowledge Infusion and Distillation for Detection of INdecent Memes
Rahul Garg, Trilok Padhi, Hemang Jain, Ugur Kursuncu, Ponnurangam Kumaraguru
- Dynamic Personality in LLM Agents: A Framework for Evolutionary Modeling and Behavioral Analysis in the Prisoner’s Dilemma
Weiqi Zeng, Bo Wang, Dongming Zhao, Zongfeng Qu, Ruifang He, Yuexian Hou, Qinghua Hu
- Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
Dylan Zhang, Justin Wang, Tianran Sun
- On the Robust Approximation of ASR Metrics
Abdul Waheed, Hanin Atwany, Rita Singh, Bhiksha Raj
- Are the Values of LLMs Structurally Aligned with Humans? A Causal Perspective
Yipeng Kang, Junqi Wang, Yexin Li, Mengmeng Wang, Wenming Tu, Quansen Wang, Hengli Li, Tingjun Wu, Xue Feng, Fangwei Zhong, Zilong Zheng
- LLMs Can Also Do Well! Breaking Barriers in Semantic Role Labeling via Large Language Models
Xinxin Li, Huiyao Chen, Chengjun Liu, Jing Li, Meishan Zhang, Jun Yu, Min Zhang
- Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models
Hanin Atwany, Abdul Waheed, Rita Singh, Monojit Choudhury, Bhiksha Raj
- M2PA: A Multi-Memory Planning Agent for Open Worlds Inspired by Cognitive Theory
Yanfang Zhou, Xiaodong Li, Yuntao Liu, Yongqiang Zhao, Xintong Wang, Zhenyu Li, Jinlong Tian, Xinhai Xu
- AnnaAgent: Dynamic Evolution Agent System with Multi-Session Memory for Realistic Seeker Simulation
Ming Wang, Peidong Wang, Lin Wu, Xiaocui Yang, Daling Wang, Shi Feng, Yuxin Chen, Bixuan Wang, Yifei Zhang
- Diversification Catalyzes Language Models’ Instruction Generalization To Unseen Semantics
Dylan Zhang, Justin Wang, Francois Charton
- DecompileBench: A Comprehensive Benchmark for Evaluating Decompilers in Real-World Scenarios
Zeyu Gao, Yuxin Cui, Hao Wang, Siliang Qin, Yuanda Wang, Zhang Bolun, Chao Zhang
- Thinking Before Running! Efficient Code Generation with Thorough Exploration and Optimal Refinement
Xiaoqing Zhang, Yuhan Liu, Flood Sung, Xiuying Chen, Shuo Shang, Rui Yan
- Edit Once, Update Everywhere: A Simple Framework for Cross-Lingual Knowledge Synchronization in LLMs
Yuchen Wu, Liang Ding, Li Shen, Dacheng Tao
- SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities
Fengqing Jiang, Zhangchen Xu, Yuetai Li, Luyao Niu, Zhen Xiang, Bo Li, Bill Yuchen Lin, Radha Poovendran
- ETRQA: A Comprehensive Benchmark for Evaluating Event Temporal Reasoning Abilities of Large Language Models
Sigang Luo, Yinan Liu, Dongying Lin, Yingying Zhai, Bin Wang, Xiaochun Yang, Junpeng Liu
- The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination
Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Yi R. Fung, Kathleen McKeown, ChengXiang Zhai, Manling Li, Heng Ji
- EQS: Unified Entity-Query-Sentence Contrastive Learning for Multimodal Temporal Knowledge Graph Completion
Ying Zhang, Li Zhang, Yu Zhao, Baohang Zhou, Xinying Qian, Xuhui Sui, Kehui Song
- LegoMT2: Selective Asynchronous Sharded Data Parallel Training for Massive Neural Machine Translation
Fei Yuan, Yinquan Lu, Lei Li, Jingjing Xu
- Pruning General Large Language Models into Customized Expert Models
Yiran Zhao, Guizhen Chen, Kenji Kawaguchi, Lidong Bing, Wenxuan Zhang
- Enhance Multimodal Consistency and Coherence for Text-Image Plan Generation
Xiaoxin Lu, Ranran Haoran Zhang, Yusen Zhang, Rui Zhang
- Un-considering Contextual Information: Assessing LLMs’ Understanding of Indexical Elements
Metehan Oğuz, Yavuz Faruk Bakman, Duygu Nur Yaldiz
- Behavioral Analysis of Information Salience in Large Language Models
Jan Trienes, Jörg Schlötterer, Junyi Jessy Li, Christin Seifert
- The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
Avinash Baidya, Kamalika Das, Xiang Gao
- Task Facet Learning: A Structured Approach To Prompt Optimization
Gurusha Juneja, Gautam Jajoo, Hua Li, Jian Jiao, Nagarajan Natarajan, Amit Sharma
- LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding
Junlong Tong, Jinlan Fu, Zixuan Lin, Yingqi Fan, Anhao Zhao, Hui Su, Xiaoyu Shen
- YinYang-Align: A new Benchmark for Competing Objectives and Introducing Multi-Objective Preference based Text-to-Image Alignment
Amitava Das, Yaswanth Narsupalli, Gurpreet Singh, Vinija Jain, Vasu Sharma, Suranjana Trivedy, Aman Chadha, Amit Sheth
- FREE: Fast and Robust Vision Language Models with Early Exits
Divya Jyoti Bajpai, Manjesh Kumar Hanawal
- REPRO-Bench: Can AI Agents Assess the Reproducibility of Social Science Papers?
Chuxuan Hu, Liyun Zhang, Yeji Lim, Aum Wadhwani, Austin Peters, Daniel Kang
- Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts
Sara Ghaboura, Ketan Pravin More, Ritesh Thawkar, Wafa Al Ghallabi, Omkar Thawakar, Fahad Shahbaz Khan, Hisham Cholakkal, Salman Khan, Rao Muhammad Anwer
- Unveiling and Addressing Pseudo Forgetting in Large Language Models
Huashan Sun, Yizhe Yang, Yinghao Li, Jiawei Li, Yang Gao
- Improving MLLM’s Document Image Machine Translation via Synchronously Self-reviewing Its OCR Proficiency
Yupu Liang, Yaping Zhang, Zhiyang Zhang, Zhiyuan Chen, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou
- HG-InsightLog: Context Prioritization and Reduction for Question Answering with Non-Natural Language Construct Log Data
Supriya Bajpai, Athira Gopal, Chandrakant Harjpal, Niraj Kumar
- Dialect Normalization using Large Language Models and Morphological Rules
Antonios Dimakis, John Pavlopoulos, Antonios Anastasopoulos
- USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations
Mounika Marreddy, Subba Reddy Oota, Venkata Charan Chinni, Manish Gupta, Lucie Flek
- Learning to Insert [PAUSE] Tokens for Better Reasoning
Eunki Kim, Sangryul Kim, James Thorne
- EventRAG: Supportive Event Retrieval on Hypergraph for Future Forecasting
Zhengwei Tao, Zhi Jin, Pu Wu, Xiaoying Bai, Haiyan Zhao, Jia Li, Xiancai Chen, Linyu Li, Chongyang Tao
- Understand the Implication: Learning to Think for Pragmatic Understanding
Settaluri Lakshmi Sravanthi, Kishan Maharaj, Sravani Gunnu, Abhijit Mishra, Pushpak Bhattacharyya
- Source Attribution for Large Language Model-Generated Data
Xinyang Lu, Jingtan Wang, Zitong Zhao, Zhongxiang Dai, Chuan-Sheng Foo, See-Kiong Ng, Bryan Kian Hsiang Low
- Dense Retrieval with Quantity Comparison Intent
Prayas Agrawal, Nandeesh Kumar K M, Muthusamy Chelliah, Surender Kumar, Soumen Chakrabarti
- Reflection on Knowledge Graph for Large Language Models Reasoning
Yigeng Zhou, Wu Li, Yifan Lu, Jing Li, Fangming Liu, Meishan Zhang, Yequan Wang, Daojing He, Honghai Liu, Min Zhang
- Revisiting 3D LLM Benchmarks: Are We Really Testing 3D Capabilities?
Jiahe Jin, Yanheng He, Mingyan Yang
- DIESEL - Dynamic Inference-Guidance via Evasion of Semantic Embeddings in LLMs
Ben Ganon, Alon Zolfi, Omer Hofman, Inderjeet Singh, Hisashi Kojima, Yuval Elovici, Asaf Shabtai
- Toward Structured Knowledge Reasoning: Contrastive Retrieval-Augmented Generation on Experience
Jiawei Gu, Ziting Xian, Yuanzhen Xie, Ye Liu, Enjie Liu, Ruichao Zhong, Mochi Gao, Yunzhi Tan, Bo Hu, Zang Li
- Structured Pruning for Diverse Best-of-$N$ Reasoning Optimization
Hieu Trung Nguyen, Bao Nguyen, Viet Anh Nguyen
- RAND: Disrupting Adversarial Synchronization in Transformers via Redundancy-Aware Noise Defense
Lian Duan, Hanzhang Wang, Yuchun Fang
- PodAgent: A Comprehensive Framework for Podcast Generation
Yujia Xiao, Lei He, Haohan Guo, Feng-Long Xie, Tan Lee
- STORM-BORN: A Challenging Mathematical Derivations Dataset Curated via a Human-in-the-Loop Multi-Agent Framework
Wenhao Liu, Zhenyi Lu, Xinyu Hu, Jerry Zhang, Dailin Li, Jiacheng Cen, Huilin Cao, Haiteng Wang, Yuhan Li, Xie Kun, Dandan Li, Pei Zhang, Chengbo Zhang, Yuxiang Ren, Xiaohong Huang, Yan Ma
- iMOVE: Instance-Motion-Aware Video Understanding
Jiaze Li, Yaya Shi, Zongyang Ma, Haoran Xu, cheng.feng, Huihui Xiao, Ruiwen Kang, Fan Yang, Tingting Gao, Di Zhang
- SceneGram: Conceptualizing and Describing Tangrams in Scene Context
Simeon Junker, Sina Zarrieß
- Relevant or Random: Can LLMs Truly Perform Analogical Reasoning?
Chengwei Qin, Wenhan Xia, Tan Wang, Fangkai Jiao, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty
- MERIT: Multi-Agent Collaboration for Unsupervised Time Series Representation Learning
Shu Zhou, Yunyang Xuan, Yuxuan Ao, Xin Wang, Tao Fan, Hao Wang
- JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
Chang Gao, Wenxuan Zhang, Guizhen Chen, Wai Lam
- RedundancyLens: Revealing and Exploiting Visual Token Processing Redundancy for Efficient Decoder-Only MLLMs
Hongliang Li, Jiaxin Zhang, Wenhui Liao, Dezhi Peng, Kai Ding, Lianwen Jin
- Memory-augmented Query Reconstruction for LLM-based Knowledge Graph Reasoning
Mufan Xu, Gewen Liang, Kehai Chen, Wei Wang, Xun Zhou, Muyun Yang, Tiejun Zhao, Min Zhang
- KaFT: Knowledge-aware Fine-tuning for Boosting LLMs’ Domain-specific Question-Answering Performance
Qihuang Zhong, Liang Ding, Xiantao Cai, Juhua Liu, Bo Du, Dacheng Tao
- Are Multimodal Large Language Models Pragmatically Competent Listeners in Simple Reference Resolution Tasks?
Simeon Junker, Manar Ali, Larissa Koch, Sina Zarrieß, Hendrik Buschmeier
- Removing Prompt-template Bias in Reinforcement Learning from Human Feedback
Chaojie Wang, Haonan Shi, Long Tian, Bo An, Shuicheng Yan
- Latent Distribution Decouple for Uncertain-Aware Multimodal Multi-label Emotion Recognition
Jingwang Huang, Jiang Zhong, Qin Lei, gaojinpeng, ymyang, Sirui Wang, Peiguang Li, Kaiwen Wei
- Are LLMs Rational Investors? A Study on the Financial Bias in LLMs
Yuhang Zhou, Yuchen Ni, Zhiheng Xi, Zhangyue Yin, Yu He, Gan Yunhui, Xiang Liu, Zhang Jian, Sen Liu, Xipeng Qiu, Yixin Cao, Guangnan Ye, Hongfeng Chai
- Seeing What Tastes Good: Revisiting Multimodal Distributional Semantics in the Billion Parameter Era
Dan Oneata, Desmond Elliott, Stella Frank
- Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models
Sajjad Ghiasvand, Yifan Yang, Zhiyu Xue, Mahnoosh Alizadeh, Zheng Zhang, Ramtin Pedarsani
- A rebuttal of two common deflationary stances against LLM cognition
Zak Hussain, Rui Mata, Dirk U. Wulff
- COVER: Context-Driven Over-Refusal Verification in LLMs
Giovanni Sullutrone, Riccardo A. Vigliermo, Sonia Bergamaschi, Luca Sala
- MOSAIC: Multiple Observers Spotting AI Content
Matthieu Dubois, François Yvon, Pablo Piantanida
- GUIDEX: Guided Synthetic Data Generation for Zero-Shot Information Extraction
Neil De La Fuente, Oscar Sainz, Iker García-Ferrero, Eneko Agirre
- Do Large Language Models Represent the People? A Systematic Literature Review on the Demographic Representativeness of Large Language Models
Indira Sen, Marlene Lutz, Elisa Rogers, David Garcia, Markus Strohmaier
- LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Omkar Thawakar, Dinura Dissanayake, Ketan Pravin More, Ritesh Thawkar, Ahmed Heakl, Noor Ahsan, Yuhao Li, Ilmuz Zaman Mohammed Zumri, Jean Lahoud, Rao Muhammad Anwer, Hisham Cholakkal, Ivan Laptev, Mubarak Shah, Fahad Shahbaz Khan, Salman Khan
- Burn After Reading: Do Multimodal Large Language Models Truly Capture Order of Events in Image Sequences?
Yingjin Song, Yupei Du, Denis Paperno, Albert Gatt
- Full-Step-DPO: Self-Supervised Preference Optimization with Step-wise Rewards for Mathematical Reasoning
Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu
- Do Emotions Really Affect Argument Convincingness? A Dynamic Approach with LLM-based Manipulation Checks
Yanran Chen, Steffen Eger
- SCOPE: Compress Mathematical Reasoning Steps for Efficient Automated Process Annotation
Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu
- Compositional Syntactico-SemBanking for English as a Second or Foreign Language
Wenxi Li, Xihao Wang, Weiwei Sun
- Semantics-aware prompting for translating NOtices To AirMen
Minal Nitin Dani, Aishwarya Maheswaran, Maunendra Sankar Desarkar
- Stereotype or Personalization? User Identity Biases Chatbot Recommendations
Anjali Kantharuban, Jeremiah Milbauer, Maarten Sap, Emma Strubell, Graham Neubig
- Automated main concept generation for narrative discourse assessment in aphasia
Ankita Gupta, Marisa Hudspeth, Jacquie Kurland, Brendan O’Connor
- Can VLMs Actually See and Read? A Survey on Modality Collapse in Vision-Language Models
Mong Yuan Sim, Wei Emma Zhang, Xiang Dai, Biaoyan Fang
- “You are Beautiful, Body Image Stereotypes are Ugly!” BIStereo: A Benchmark to Measure Body Image Stereotypes in Language Models
Narjis Asad, Nihar Ranjan Sahoo, Rudra Murthy, Swaprava Nath, Pushpak Bhattacharyya
- Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for Large Language Models
Zhengliang Shi, Yuhan Wang, Lingyong Yan, Pengjie Ren, Shuaiqiang Wang, Dawei Yin, Zhaochun Ren
- FINECITE: A Novel Approach on Fine-Grained Citation Context Analysis
Lasse M. Jantsch, Dong-Jae Koh, Seonghwan Yoon, Jisu Lee, Anne Lauscher, Young-Kyoon Suh
- Decoupling Reasoning and Knowledge Injection for In-Context Editing
Changyue Wang, Weihang Su, Qingyao Ai, Yujia Zhou, Yiqun Liu
- Entrospect: Information-Theoretic Self-Reflection Elicits Better Response Refinement of Small Language Models
Tianqiang Yan, Ziqiao Lin, Zhenglong Sun, Lin Zhang, Yuan Gao
- An Iterative Repair Approach for Few-shot Transfer in KBQA with Unanswerability
Riya Sawhney, Samrat Yadav, Indrajit Bhattacharya, Mausam
- Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection
San Kim, Jonghwi Kim, Yejin Jeon, Gary Lee
- EnSToM: Enhancing Dialogue Systems with Entropy-Scaled Steering Vectors for Topic Maintenance
Heejae Suh, Yejin Jeon, Deokhyung Kang, Taehee Park, Yejin Min, Gary Lee
- MultiTEND: A Multilingual Benchmark for Natural Language to NoSQL Query Translation
Zhiqian Qin, Yuanfeng Song, Jinwei Lu, Yuanwei Song, Shuaimin Li, Chen Jason Zhang
- Tool learning via Inference-time Scaling and Cycle Verifier
Xiaobo Liang, Wenjin Xie, Juntao Li, Wanfu Wang, Yibin Chen, Kehai Chen, Min Zhang
- When Benchmarks Talk: Re-Evaluating Code LLMs with Interactive Feedback
Jane Pan, Ryan Shar, Jacob Pfau, Ameet Talwalkar, He He, Valerie Chen
- Reranking-based Generation for Unbiased Perspective Summarization
Narutatsu Ri, Nicholas Deas, Kathleen McKeown
- KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model’s Reasoning Path Aggregation
Siyuan Fang, Kaijing Ma, Tianyu Zheng, Xeron Du, Ningxuan Lu, Ge Zhang, Qingkun Tang
- Enhancing LLM-based Hatred and Toxicity Detection with Meta-Toxic Knowledge Graph
Yibo Zhao, Jiapeng Zhu, Can Xu, Yao Liu, Xiang Li
- Mixture-of-Personas Language Models for Population Simulation
Ngoc Bui, Hieu Trung Nguyen, Shantanu Kumar, Julian Theodore, Weikang Qiu, Viet Anh Nguyen, Rex Ying
- ClusComp: A Simple Paradigm for Model Compression and Efficient Finetuning
Baohao Liao, Christian Herold, Seyyed Hadi Hashemi, Stefan Vasilev, Shahram Khadivi, Christof Monz
- Aspect-Aware Decomposition for Opinion Summarization
Miao Li, Jey Han Lau, Eduard Hovy, Mirella Lapata
- Token-Budget-Aware LLM Reasoning
Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen
- HATA: Trainable and Hardware-Efficient Hash-Aware Top-$k$ Attention for Scalable Large Model Inference
Ping Gong, Jiawei Yi, Shengnan Wang, Juncheng Zhang, Zewen Jin, Ouxiang Zhou, Ruibo Liu, Guanbin Xu, Youhui Bai, Bowen Ye, Kun Yuan, Tong Yang, Gong Zhang, Renhai Chen, Feng Wu, Cheng Li
- Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning
Shota Takashiro, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo
- LIST: Linearly Incremental SQL Translator for Single-Hop Reasoning, Generation and Verification
Kaiyuan Guan, Ruoxin Li, Xudong Guo, Zhenning Huang, Xudong Weng, Hehuan Liu, Zheng Wei, Zang Li
- MAGI: Multi-Agent Guided Interview for Psychiatric Assessment
Guanqun Bi, Zhuang Chen, Zhoufu Liu, Hongkai Wang, Xiyao Xiao, Yuqiang Xie, Wen Zhang, Yongkang Huang, Yuxuan Chen, Libiao Peng, Minlie Huang
- TituLLMs: A Family of Bangla LLMs with Comprehensive Benchmarking
Shahriar Kabir Nahin, Rabindra Nath Nandi, Sagor Sarker, Quazi Sarwar Muhtaseem, Md Kowsher, Apu Chandraw Shill, Md Ibrahim, Mehadi Hasan Menon, Tareq Al Muntasir, Firoj Alam
- WikiMixQA: A Multimodal Benchmark for Question Answering over Tables and Charts
Negar Foroutan, Angelika Romanou, Matin Ansaripour, Julian Martin Eisenschlos, Karl Aberer, Rémi Lebret
- Let’s Fuse Step by Step: Generative Fusion Decoding Algorithm with LLMs to Enhance Speech Recognition and Text Image Recognition
Chan-Jan Hsu, Yi-Chang Chen, FengTing Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-Chun Hsu, Da-shan Shiu
- HPSS: Heuristic Prompting Strategy Search for LLM Evaluators
Bosi Wen, Pei Ke, Yufei Sun, Cunxiang Wang, Xiaotao Gu, Jinfeng Zhou, Jie Tang, Hongning Wang, Minlie Huang
- A Fully Generative Motivational Interviewing Counsellor Chatbot for Moving Smokers Towards the Decision to Quit
Zafarullah Mahmood, Soliman Ali, Jiading Zhu, Mohamed Abdelwahab, Michelle Yu Collins, Sihan Chen, Yi Cheng Zhao, Jodi Wolff, Osnat C. Melamed, Nadia Minian, Marta Maslej, Carolynne Cooper, Matt Ratto, Peter Selby, Jonathan Rose
- LegalCore: A Dataset for Event Coreference Resolution in Legal Documents
Kangda Wei, Xi Shi, Jonathan Tong, Sai Ramana Reddy, Anandhavelu Natarajan, Rajiv Jain, Aparna Garimella, Ruihong Huang
- Rectifying Belief Space via Unlearning to Harness LLMs’ Reasoning
Ayana Niwa, Masahiro Kaneko, Kentaro Inui
- MemeDetoxNet: Balancing Toxicity Reduction and Context Preservation
Gitanjali Kumari, Jitendra Solanki, Asif Ekbal
- Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL
Wichayaporn Wongkamjan, Yanze Wang, Feng Gu, Denis Peskoff, Jonathan K. Kummerfeld, Jonathan May, Jordan Lee Boyd-Graber
- Multi-matrix Factorization Attention
Jingcheng Hu, Houyi Li, Yinmin Zhang, Zili Wang, Shuigeng Zhou, Xiangyu Zhang, Heung-Yeung Shum
- Self-Training Elicits Concise Reasoning in Large Language Models
Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun
- Reason from Future: Reverse Thought Chain Enhances LLM Reasoning
Yinlong Xu, Yanzhao Zheng, Shuoshuo Sun, Shuaihan Huang, Baohua Dong, Zhu Hangcheng, Ruohui Huang, Gang Yu, Hongxia Xu, Jian Wu
- LLMs as Planning Modelers: A Survey for Leveraging Large Language Models to Construct Automated Planning Models
Marcus Tantakoun, Christian Muise, Xiaodan Zhu
- From Conversation to Automation: Leveraging LLMs for Problem-Solving Therapy Analysis
Elham Aghakhani, Lu Wang, Karla T. Washington, George Demiris, Jina Huh-Yoo, Rezvaneh Rezapour
- Revisiting Self-Consistency from Dynamic Distributional Alignment Perspective on Answer Aggregation
Yiwei Li, Ji Zhang, Shaoxiong Feng, Peiwen Yuan, Xinglin Wang, Jiayi Shi, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li
- Don’t Say No: Jailbreaking LLM by Suppressing Refusal
Yukai Zhou, Jian Lou, Zhijie Huang, Zhan Qin, Sibei Yang, Wenjie Wang
- From Perception to Reasoning: Enhancing Vision-Language Models for Mobile UI Understanding
Settaluri Lakshmi Sravanthi, Ankit Mishra, Debjyoti Mondal, Subhadarshi Panda, Rituraj Singh, Pushpak Bhattacharyya
- Lemmas Matter, But Not Like That: Predictors of Lemma-Based Generalization in Morphological Inflection
Sarah Ruth Brogden Payne, Jordan Kodner
- Mosaic-IT: Cost-Free Compositional Data Synthesis for Instruction Tuning
Ming Li, Pei Chen, Chenguang Wang, Hongyu Zhao, Yijun Liang, YuPeng Hou, Fuxiao Liu, Tianyi Zhou
- MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration
Yucheng Zhou, Lingran Song, Jianbing Shen
- ATLAS: Agent Tuning via Learning Critical Steps
Zhixun Chen, Ming Li, Yuxuan Huang, Yali Du, Meng Fang, Tianyi Zhou
- TagGen: Enforcing Syntactic Structures with Tag-Based Control
Vicky Xefteri, Afra Amini, Tim Vieira, Ryan Cotterell
- Small Models Struggle to Learn from Strong Reasoners
Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran
- Sparse Rewards Can Self-Train Dialogue Agents
Barrett Martin Lattimer, Varun Prashant Gangal, Ryan McDonald, Yi Yang
- REUBEN: REsampling-based Uncertainty Bounds for Evaluating NLP
Jonne Sälevä, Duygu Ataman, Constantine Lignos
- Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing
Shoumik Saha, Soheil Feizi
- When AI Writes Like Humans: Capturing the Emergent Patterns of Literary Judgment via Intrinsic Textual Metrics
Guillermo Marco, Julio Gonzalo, Víctor Fresno, Juan-Luis Suárez
- Summary Factual Inconsistency Detection Based on LLMs Enhanced by Universal Information Extraction
Anguo Li, Lei Yu
- ELI-Why: Evaluating the Pedagogical Utility of LLM Explanations
Brihi Joshi, Keyu He, Sahana Ramnath, Sadra Sabouri, Kaitlyn Zhou, Souti Chattopadhyay, Swabha Swayamdipta, Xiang Ren
- Beyond Generation: Leveraging LLM Creativity to Overcome Label Bias in Classification
Xiaoyue Wang, Xin Liu
- CogSteer: Cognition-Inspired Selective Layer Intervention for Efficiently Steering Large Language Models
Xintong Wang, Jingheng Pan, Liang Ding, Longyue Wang, Longqin Jiang, Xingshan Li, Chris Biemann
- PASTEL: Polarity-Aware Sentiment Triplet Extraction with LLM-as-a-Judge
Aaditya Bodke, Avinoor Singh Kohli, Hemant Subhash Pardeshi, Prathamesh Bhosale
- COSMIC: Generalized Refusal Identification in LLM Activations
Vincent Siu, Nicholas Crispino, Zihao Yu, Sam Pan, Zhun Wang, Yang Liu, Dawn Song, Chenguang Wang
- Red Queen: Exposing Latent Multi-Turn Risks in Large Language Models
Yifan Jiang, Kriti Aggarwal, Tanmay Laud, Kashif Munir, Jay Pujara, Subhabrata Mukherjee
- MDBench: A Synthetic Multi-Document Reasoning Benchmark Generated with Knowledge Guidance
Joseph J Peper, Wenzhao Qiu, Ali Payani, Lu Wang
- DiaLLM: EHR-Enhanced Clinical Conversational System for Clinical Test Recommendation and Diagnosis Prediction
Weijieying Ren, Tianxiang Zhao, Lei Wang, Tianchun Wang, Vasant G Honavar
- Can Hallucination Correction Improve Video-Language Alignment?
Lingjun Zhao, Mingyang Xie, Paola Cascante-Bonilla, Hal Daumé III, Kwonjoon Lee
- IMPARA-GED: Grammatical Error Detection is Boosting Reference-free Grammatical Error Quality Estimator
Yusuke Sakai, Takumi Goto, Taro Watanabe
- Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs
Chenjun Xu, Bingbing Wen, Bin Han, Robert Wolfe, Lucy Lu Wang, Bill Howe
- Why Multi-Interest Fairness Matters: Hypergraph Contrastive Multi-Interest Learning for Fair Conversational Recommender System
Yongsen Zheng, Zongxuan Xie, Guohua Wang, Ziyao Liu, Liang Lin, Kwok-Yan Lam
- Cautious Next Token Prediction
Yizhou Wang, Lingzhi Zhang, Yue Bai, Mang Tik Chiu, Zhengmian Hu, Mingyuan Zhang, Qihua Dong, Yu Yin, Sohrab Amirghodsi, Yun Fu
- Reasoning with Graphs: Structuring Implicit Knowledge to Enhance LLMs Reasoning
Haoyu Han, Yaochen Xie, Hui Liu, Xianfeng Tang, Sreyashi Nag, William Headden, Yang Li, Chen Luo, Shuiwang Ji, Qi He, Jiliang Tang
- Enhancing Medical Dialogue Generation through Knowledge Refinement and Dynamic Prompt Adjustment
Hongda Sun, Jiaren Peng, Wenzhong Yang, Liang He, Bo Du, Rui Yan
- Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders
Kristian Kuznetsov, Laida Kushnareva, Anton Razzhigaev, Polina Druzhinina, Anastasia Voznyuk, Irina Piontkovskaya, Evgeny Burnaev, Serguei Barannikov
- Low-Resource Grammatical Error Correction: Selective Data Augmentation with Round-Trip Machine Translation
Frank Palma Gomez, Alla Rozovskaya
- Just Put a Human in the Loop? Investigating LLM-Assisted Annotation for Subjective Tasks
Hope Schroeder, Deb Roy, Jad Kabbara
- Research Community Perspectives on “Intelligence” and Large Language Models
Bertram Højer, Terne Sasha Thorn Jakobsen, Anna Rogers, Stefan Heinrich
- ACLED-DS: A Large Multilingual Expert-Annotated Event Dataset for the Real World
Sina Semnani, Pingyue Zhang, Wanyue Zhai, Haozhuo Li, Ryan Beauchamp, Trey Billing, Katayoun Kishi, Manling Li, Monica Lam
- Benchmarking and Improving New Knowledge Acquisition during Continued Pre-training
Aochong Oliver Li, Tanya Goyal
- CourtEval: A Courtroom-Based Multi-Agent Evaluation Framework
Sandeep Kumar, Abhijit A Nargund, Vivek Sridhar
- Multilingual Definition Modeling
Edison Marrese-Taylor, Erica K. Shimomoto, A. Solano, Enrique Reid
- Human Bias in the Face of AI: The Role of Human Judgement in AI Generated Text Evaluation
Tiffany Zhu, Iain Weissburg, Kexun Zhang, William Yang Wang
- Redundancy, Isotropy, and Intrinsic Dimensionality of Prompt-based Text Embeddings
Hayato Tsukagoshi, Ryohei Sasano
- Harnessing Whisper for Prosodic Stress Analysis
Samuel S. Sohn, Sten Knutsen, Karin Stromswold
- Can You Share Your Story? Modeling Clients’ Metacognition and Openness for LLM Therapist Evaluation
Minju Kim, Dongje Yoo, Yeonjun Hwang, Minseok Kang, Namyoung Kim, Minju Gwak, Beong-woo Kwak, Hyungjoo Chae, Harim Kim, Yunjoong Lee, Min Hee Kim, Dayi Jung, Kyong-Mee Chung, Jinyoung Yeo
- Dictionaries to the Rescue: Cross-Lingual Vocabulary Transfer for Low-Resource Languages Using Bilingual Dictionaries
Haruki Sakajo, Yusuke Ide, Justin Vasselli, Yusuke Sakai, Yingtao Tian, Hidetaka Kamigaito, Taro Watanabe
- GradNormIR: When Should We Update the Dense Retriever in Evolving Corpora?
Dayoon Ko, Jinyoung Kim, Sohyeon Kim, Jinhyuk Kim, Jaehoon Lee, Seonghak Song, Minyoung Lee, Gunhee Kim
- The Million Authors Corpus: A Cross-Lingual and Cross-Domain Wikipedia Dataset for Authorship Verification
Abraham Israeli, Shuai Liu, Jonathan May, David Jurgens
- BridG MT: Enhancing LLMs’ Machine Translation Capabilities with Sentence Bridging and Gradual MT
Seungwoo Choi, Gahyun Yoo, Jay-Yoon Lee
- Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
Mengkang Hu, Tianxing Chen, Yude Zou, Yuheng Lei, Qiguang Chen, Ming Li, Yao Mu, Hongyuan Zhang, Wenqi Shao, Ping Luo
- Shiny Inputs, Shiny Outputs? Unveiling the Halo Effect in MultiModal LLMs
Kyusik Kim, Jeongwoo Ryu, Hyeonseok Jeon, Bongwon Suh
- CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang, Ruqi Zhang
- ADO: Automatic Data Optimization for Inputs in LLM Prompts
Sam Lin, Wenyue Hua, Lingyao Li, Zhenting Wang, Yongfeng Zhang
- Large Language Models Still Exhibit Bias in Long Text
Wonje Jeung, Dongjae Jeon, Ashkan Yousefpour, Jonghyun Choi
- Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation
Qiyue Gao, Xinyu Pi, Kevin Liu, Junrong Chen, Ruolan Yang, Xinqi Huang, Xinyu Fang, Lu Sun, Gautham Kishore, Bo Ai, Stone Tao, Mengyang Liu, Jiaxi Yang, Chao-Jung Lai, Chuanyang Jin, Jiannan Xiang, Benhao Huang, Zeming Chen, David Danks, Hao Su, Tianmin Shu, Ziqiao Ma, Lianhui Qin, Zhiting Hu
- Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents
Ivoline C. Ngong, Swanand Ravindra Kadhe, Hao Wang, Keerthiram Murugesan, Justin D. Weisz, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy
- Enhancing Persona Consistency for LLMs’ Role-Playing using Persona-Aware Contrastive Learning
Ke Ji, Yixin Lian, Linxu Li, Jingsheng Gao, Weiyuan Li, Bin Dai
- M$^{2}$-TabFact: Multi-Document Multi-Modal Fact Verification with Visual and Textual Representations of Tabular Data
Mingyang Zhou, Lingyu Zhang, Sophia Horng, Maximillian Chen, Kung-Hsiang Huang, Shih-Fu Chang
- Fuzzy Speculative Decoding for a Tunable Accuracy-Runtime Tradeoff
Maximilian Holsman, Yukun Huang, Bhuwan Dhingra
- PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play
Wei Fang, Yang Zhang, Kaizhi Qian, James R. Glass, Yada Zhu
- Towards the Pedagogical Steering of Large Language Models for Tutoring: A Case Study with Modeling Productive Failure
Romain Puech, Jakub Macina, Julia Chatain, Mrinmaya Sachan, Manu Kapur
- Spotting Out-of-Character Behavior: Atomic-Level Evaluation of Persona Fidelity in Open-Ended Generation
Jisu Shin, Juhyun Oh, Eunsu Kim, Hoyun Song, Alice Oh
- What Language Do Non-English-Centric Large Language Models Think in?
Chengzhi Zhong, Qianying Liu, Fei Cheng, Junfeng Jiang, Zhen Wan, Chenhui Chu, Yugo Murawaki, Sadao Kurohashi
- $T^5$Score: A Methodology for Automatically Assessing the Quality of LLM Generated Multi-Document Topic Sets
Itamar Trainin, Omri Abend
- Uncertainty Aware Contrastive Decoding
Hakyung Lee, Subeen Park, Joowang Kim, Sungjun Lim, Kyungwoo Song
- GEMS: Generation-Based Event Argument Extraction via Multi-perspective Prompts and Ontology Steering
Run Lin, Yao Liu, Yanglei Gan, Yuxiang Cai, Tian Lan, Qiao Liu
- RomanLens: The Role Of Latent Romanization In Multilinguality In LLMs
Alan Saji, Jaavid Aktar Husain, Thanmay Jayakumar, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully
- 7 Points to Tsinghua but 10 Points to 清华? Assessing Large Language Models in Agentic Multilingual National Bias
Qianying Liu, Katrina Qiyao Wang, Fei Cheng, Sadao Kurohashi
- V-HATE: Voting-based Implicit Hate Speech Detection
Yejin Lee, Hyeseon Ahn, Yo-Sub Han
- Search-in-Context: Efficient Multi-Hop QA over Long Contexts via Monte Carlo Tree Search with Dynamic KV Retrieval
Jiabei Chen, Guang Liu, Shizhu He, Kun Luo, Yao Xu, Jun Zhao, Kang Liu
- LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation
Eunsu Kim, Juyoung Suk, Seungone Kim, Niklas Muennighoff, Dongkwan Kim, Alice Oh
- IntentionESC: An Intention-Centered Framework for Enhancing Emotional Support in Dialogue Systems
Xinjie Zhang, Wenxuan Wang, Qin Jin
- Beyond Context to Cognitive Appraisal: Emotion Reasoning as a Theory of Mind Benchmark for Large Language Models
Gerard Christopher Yeo, Kokil Jaidka
- CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization
Mst. Fahmida Sultana Naznin, Adnan Ibney Faruq, Mostafa Rifat Tazwar, Md Jobayer, Md. Mehedi Hasan Shawon, Md Rakibul Hasan
- Rethinking Prompt-based Debiasing in Large Language Model
Xinyi Yang, Runzhe Zhan, Shu Yang, Junchao Wu, Lidia S. Chao, Derek F. Wong
- Exploring In-context Example Generation for Machine Translation
Dohyun Lee, Seungil Chad Lee, Chanwoo Yang, Yujin Baek, Jaegul Choo
- Knowledge Base Construction for Knowledge-Augmented Text-to-SQL
Jinheon Baek, Horst Samulowitz, Oktie Hassanzadeh, Dharmashankar Subramanian, Sola Shirai, Alfio Gliozzo, Debarun Bhattacharjya
- NBDESCRIB: A Dataset for Text Description Generation from Tables and Code in Jupyter Notebooks with Guidelines
Xuye Liu, Tengfei Ma, Yimu Wang, Fengjie Wang, Jian Zhao
- ECoRAG: Evidentiality-guided Compression for Long Context RAG
Yeonseok Jeong, Jinsu Kim, Dohyeon Lee, Seung-won Hwang
- From Complexity to Clarity: AI/NLP’s Role in Regulatory Compliance
Jivitesh Jain, Nivedhitha Dhanasekaran, Mona T. Diab
- EXPERT: An Explainable Image Captioning Evaluation Metric with Structured Explanations
Hyunjong Kim, Sangyeop Kim, Jongheon Jeong, Yeongjae Cho, Sungzoon Cho
- Mind Your Theory: Theory of Mind Goes Deeper Than Reasoning
Eitan Wagner, Nitay Alon, Joseph M Barnby, Omri Abend
- LLMs are Biased Evaluators But Not Biased for Retrieval Augmented Generation
Yen-Shan Chen, Jin Jing, Peng-Ting Kuo, Chao-Wei Huang, Yun-Nung Chen
- Standard Quality Criteria Derived from Current NLP Evaluations for Guiding Evaluation Design and Grounding Comparability and AI Compliance Assessments
Anya Belz, Simon Mille, Craig Thomson
- skLEP: A Slovak General Language Understanding Benchmark
Marek Suppa, Andrej Ridzik, Daniel Hládek, Tomáš Javůrek, Viktória Ondrejová, Kristína Sásiková, Martin Tamajka, Marian Simko
- Can Vision Language Models Understand Mimes?
Hyundong Justin Cho, Spencer Lin, Tejas Srinivasan, Michael Saxon, Deuksin Kwon, Natali T. Chavez, Jonathan May
- Training Language Model to Critique for Better Refinement
Tianshu Yu, Chao Xiang, Mingchuan Yang, Pei Ke, Bosi Wen, Cunxiang Wang, Jiale Cheng, Li Zhang, Xinyu Mu, Chuxiong Sun, Minlie Huang
- Dynamic Task Vector Grouping for Efficient Multi-Task Prompt Tuning
Peiyi Zhang, Richong Zhang, Zhijie Nie, Ziqiao Wang
- DICE-BENCH: Evaluating the Tool-Use Capabilities of Large Language Models in Multi-Round, Multi-Party Dialogues
Kyochul Jang, Donghyeon Lee, Kyusik Kim, Dongseok Heo, Taewhoo Lee, Woojeong Kim, Bongwon Suh
- HASH-RAG: Bridging Deep Hashing with Retriever for Efficient, Fine Retrieval and Augmented Generation
Jinyu Guo, Xunlei Chen, Qiyang Xia, Zhaokun Wang, Jie Ou, Libo Qin, Shunyu Yao, Wenhong Tian
- A Constrained Text Revision Agent via Iterative Planning and Searching
Hannan Cao, Hwee Tou Ng
- MMRefine: Unveiling the Obstacles to Robust Refinement in Multimodal Large Language Models
Gio Paik, Geewook Kim, Jinbae Im
- How Programming Concepts and Neurons Are Shared in Code Language Models
Amir Hossein Kargaran, Yihong Liu, François Yvon, Hinrich Schuetze
- DynaQuest: A Dynamic Question Answering Dataset Reflecting Real-World Knowledge Updates
Qian Lin, Junyi Li, Hwee Tou Ng
- ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations
Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba
- Revisiting In-Context Learning with Long Context Language Models
Jinheon Baek, Sun Jae Lee, Prakhar Gupta, Geunseob Oh, Siddharth Dalmia, Prateek Kolhar
- Rationalize and Align: Enhancing Writing Assistance with Rationale via Self-Training for Improved Alignment
Hannan Cao, Hai Ye, Hwee Tou Ng
- Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps
Jie Ou, Jinyu Guo, Shuaihong Jiang, Zhaokun Wang, Libo Qin, Shunyu Yao, Wenhong Tian
- MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment
Amir Hossein Kargaran, Ali Modarressi, Nafiseh Nikeghbal, Jana Diesner, François Yvon, Hinrich Schuetze
- Automated Fine-Grained Mixture-of-Experts Quantization
Zhanhao Xie, Yuexiao Ma, Xiawu Zheng, Fei Chao, Wanchen Sui, Shen Li, Yong Li, Rongrong Ji
- Enhancing Complex Reasoning in Knowledge Graph Question Answering Through Query Graph Approximation
Hongjun Jeong, Minji Kim, Heesoo Jung, Ko Keun Kim, Hogun Park