(Paper Review) DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models ICLR 2024 #Paper Review #Large Language Models (LLMs) #LLM Safety #Hallucination #Factuality #Decoding #Text Generation
(Paper Review) Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors ICLR 2025 #Paper Review #Activation Steering #Activation Intervention #Large Language Models (LLMs) #LLM Safety
(Paper Review) Programming Refusal with Conditional Activation Steering ICLR 2025 Spotlight #Paper Review #LLM Safety #Large Language Models (LLMs) #Activation Steering #Refusal
(Paper Review) Alleviating Hallucinations of Large Language Models through Induced Hallucinations ACL 2025 #Paper Review #LLM Safety #Large Language Models (LLMs) #Hallucination