The Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) will take place next week. We’re excited to share all the work from SAIL that will be presented, and you’ll find links to papers, videos and blogs below. Feel free to reach out to the contact authors directly to learn more about the work that’s happening at Stanford!

List of Accepted Papers

Fixing Model Bugs with Natural Language Patches

Authors: Shikhar Murty, Christopher D. Manning, Scott Lundberg, Marco Tulio Ribeiro
Contact: jsmurty@stanford.edu
Links: Paper
Keywords: language models, test time model corrections, post-hoc model patching


On Measuring the Intrinsic Few-Shot Hardness of Datasets

Authors: Xinran Zhao*; Shikhar Murty*; Christopher D. Manning
Contact: xzhaoar@stanford.edu
Keywords: few-shot learning, dataset hardness, lightweight metric


Enhancing Self-Consistency and Performance of Pretrained Language Models with NLI

Authors: Eric Mitchell, Joseph J. Noh, Siyan Li, William S. Armstrong, Ananth Agarwal, Patrick Liu, Chelsea Finn, Christopher D. Manning
Contact: eric.mitchell@cs.stanford.edu
Links: Paper | Website
Keywords: consistency, nli, language, question


Detecting Label Errors by using Pre-Trained Language Models

Authors: Derek Chong, Jenny Hong, Christopher D. Manning
Contact: derekch@stanford.edu
Links: Paper | Blog Post | Video | Website
Keywords: robustness, pre-trained language models, human-in-the-loop, label errors, label noise


You Only Need One Model for Open-domain Question Answering

Authors: Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning and Kyoung-Gu Woo
Links: Paper
Keywords: question answering, triviaqa


JamPatoisNLI: A Jamaican Patois Natural Language Inference Dataset

Authors: Ruth-Ann Hazel Armstrong, John Hewitt and Christopher D. Manning
Contact: ruthanna@stanford.edu
Links: Paper | Blog Post | Video | Website
Keywords: dataset, creole languages, pretrained models, cross-lingual transfer, multilingual bert, xlm-roberta, jamaican patois, translation, natural language inference, multilingual, transfer learning


Truncation Sampling as Language Model Desmoothing

Authors: John Hewitt, Christopher D. Manning and Percy Liang
Keywords: natural language generation


Workshop on Story Shared and Lesson Learned

Authors: Diyi Yang, Pradeep Dasigi, Sherry Tongshuang Wu, Tuhin Chakrabarty, and Yuval Pinter
Contact: diyiy@stanford.edu
Keywords: career trajectory; mentorship


When FLUE Meets FLANG: Benchmarks and Large Pre-trained Language Model for Financial Domain

Authors: Raj Sanjay Shah, Kunal Chawla, Dheeraj Eidnani, Agam Shah, Wendi Du, Sudheer Chava, Natraj Raman, Smiley Charese, Jiaao Chen, Diyi Yang
Contact: diyiy@stanford.edu
Links: Paper | Blog Post | Website
Keywords: financial language modeling, large language models, datasets and benchmarks


Robustness of Demonstration-based Learning Under Limited Data Scenario

Authors: Hongxin Zhang, Yanzhe Zhang, Ruiyi Zhang, Diyi Yang
Contact: diyiy@stanford.edu
Links: Paper | Website
Keywords: large language models, analysis, robustness, demonstration, structured prediction, limited data


Geographic Citation Gaps in NLP Research

Authors: Mukund Rungta, Janvijay Singh, Saif M. Mohammad, Diyi Yang
Contact: diyiy@stanford.edu
Links: Paper
Keywords: computational social science; bias and diversity


Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics

Authors: Elisa Kreiss, Cynthia Bennett, Shayan Hooshmand, Eric Zelikman, Meredith Ringel Morris and Christopher Potts
Contact: ekreiss@stanford.edu
Links: Paper
Keywords: image captioning, evaluation, context, accessibility


Mixed-effects transformers for hierarchical adaptation

Authors: Julia White, Noah Goodman and Robert Hawkins
Keywords: language modeling and analysis of language models


Concadia: Towards Image-Based Text Generation with a Purpose

Authors: Elisa Kreiss, Fei Fang, Noah D. Goodman, Christopher Potts
Contact: ekreiss@stanford.edu
Links: Paper
Keywords: image captioning, multimodal, context


Systematicity in GPT-3’s Interpretation of Novel English Noun Compounds

Authors: Siyan Li, Riley Carlson, Christopher Potts
Contact: siyanli@stanford.edu
Links: Paper
Keywords: gpt-3, noun compounds, language comprehension


Authors: Gil Semo, Dor Bernsohn, Ben Hagag, Gila Hayat, Joel Niklaus
Contact: joel.niklaus@inf.unibe.ch
Links: Paper
Keywords: natural legal language processing, class actions, legal judgment prediction, legal nlp, transformers, calibration, integrated gradients


FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning

Authors: Suvir Mirchandani, Licheng Yu, Mengjiao Wang, Animesh Sinha, Wenwen Jiang, Tao Xiang, Ning Zhang
Contact: smirchan@stanford.edu
Links: Paper
Keywords: vision-and-language learning, domain-specific pre-training, fashion


“It’s Not Just Hate”: A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Authors: Federico Bianchi, Stefanie Anja Hills, Patricia Rossini, Dirk Hovy, Rebekah Tromble and Nava Tintarev
Contact: fede@stanford.edu
Keywords: hate speech detection


Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models

Authors: Mirac Suzgun, Luke Melas-Kyriazi, Dan Jurafsky
Contact: msuzgun@cs.stanford.edu
Links: Paper | Video | Website
Keywords: style transfer, language models, prompting, sentiment transfer, grammar error correction, shakespeare, yelp, amazon


The Authenticity Gap in Human Evaluation

Authors: Kawin Ethayarajh, Dan Jurafsky
Contact: kawin@stanford.edu
Links: Paper
Keywords: evaluation, nlp, nlg, generation


LADIS: Language Disentanglement for 3D Shape Editing

Authors: Ian Huang, Panos Achlioptas, Tianyi Zhang, Sergey Tulyakov, Minhyuk Sung and Leonidas Guibas
Contact: ianhuang@cs.stanford.edu
Links: Paper
Keywords: speech, vision, robotics, multimodal grounding


SocioProbe: What, When, and Where Language Models Learn about Sociodemographics

Authors: Anne Lauscher, Federico Bianchi, Samuel Bowman, Dirk Hovy
Contact: fede@stanford.edu
Keywords: sociodemographic, language models


Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages

Authors: Paul Röttger, Debora Nozza, Federico Bianchi, Dirk Hovy
Contact: fede@stanford.edu
Keywords: hate speech detection, low resource


Twitter-Demographer: A Flow-based Tool to Enrich Twitter Data

Authors: Federico Bianchi, Vincenzo Cutrona, Dirk Hovy
Contact: fede@stanford.edu
Links: Video | Website
Keywords: demographer, social science, data enrichment, twitter


Improving the Factual Correctness of Radiology Report Generation with Semantic Rewards

Authors: Jean-Benoit Delbrouck, Pierre Chambon, Christian Bluethgen, Emily Tsai, Omar Almusa and Curtis Langlotz
Keywords: natural language generation


We look forward to seeing you at EMNLP 2022!