Natural Language, Dialog and Speech (NDS) Symposium
Friday, November 22, 2019, 9:00 AM - 6:00 PM
The New York Academy of Sciences, 7 World Trade Center, 250 Greenwich St Fl 40, New York
Natural language, dialog and speech (NDS) researchers focus on communication between people and computers using human languages in both written and spoken forms. They develop models for analyzing the structure and content of human conversation and create artificial agents that can engage in human-like interaction with people and other agents.
Building on the success of NYAS’s annual Machine Learning Symposium, the longest-running conference on Machine Learning in the Eastern United States, NDS2019 will convene leading researchers from academia and industry to discuss cutting-edge methodologies and computational approaches to applied and theoretical problems in dialog systems, spoken and natural language understanding, natural language generation and speech synthesis.
Livestreams: Two of the Keynotes are available via Livestream:
Keynote #1 (Commonsense Intelligence: Cracking the Longstanding Challenge in AI, Yejin Choi, University of Washington) at 10:10 am
Keynote #2 (Propagation, Persuasion, and Polarization: Language Effectiveness and Effects of Language, Lillian Lee, Cornell University) at 12:20 pm
New York University
The New York Academy of Sciences
Brooklyn College & The Graduate Center, CUNY
JP Morgan Chase
November 22, 2019
Registration, Continental Breakfast, and Poster Set-up
Keynote Address 1
Commonsense Intelligence: Cracking the Longstanding Challenge in AI
Despite considerable advances in deep learning, AI remains narrow and brittle. One fundamental limitation comes from its lack of commonsense intelligence: reasoning about everyday situations and events, which, in turn, requires knowledge about how the physical and social world works. In this talk, I will share some of our recent efforts that attempt to crack commonsense intelligence.
First, I will introduce ATOMIC, the atlas of everyday commonsense knowledge and reasoning, organized as a graph of 877k if-then rules (e.g., "if X pays Y a compliment, then Y will likely return the compliment"). Next, I will introduce COMET, our deep neural networks that can learn from and generalize beyond the ATOMIC commonsense graph. Finally, I will present RAINBOW, a collection of seven benchmarks that aims to cover a wide spectrum of commonsense intelligence from natural language inference to abductive reasoning to visual commonsense reasoning. I will conclude the talk by discussing major open research questions, including the importance of algorithmic solutions to reduce incidental biases in data that can lead to overestimation of true AI capabilities.
STAR Talks: Session 1
Multimodal Dialogue for Interacting with Data
Automatic Extraction of Polysemous Words from Contextualized Embeddings
A Good Sample is Hard to Find: Noise Injection Sampling and Self-Training for Neural Language Generation Models
Deciphering How People Detect Lies: Acoustic-Prosodic and Lexical Cues to Deception and Trust
Finding Generalizable Evidence by Learning to Convince Q&A Models
Networking Break and Poster Viewing
Keynote Address 2
Propagation, Persuasion, and Polarization: Language Effectiveness and Effects of Language
Does the way in which something is worded in and of itself have an effect on whether it is remembered or attracts attention, perhaps beyond its content or context? We'll present work using language analysis to predict memorable movie quotes, persuasive arguments in the community ChangeMyView, and controversial comments on Reddit.
Networking Lunch and Poster Viewing
STAR Talks: Session 2
Temporally Aware Named Entity Recognition
Understanding Learning Dynamics of Language Models with SVCCA
Evaluating Conversational Agents
Executing Instructions in Situated Collaborative Interactions
Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions
Keynote Address 3
Embeddings for Spoken Words
Word embeddings have become a ubiquitous tool in natural language processing. These embeddings represent the meanings of written words. On the other hand, for spoken language it may be more important to represent how a written word *sounds* rather than (or in addition to) what it means. For some applications it can also be helpful to represent variable-length acoustic segments corresponding to words, or other linguistic units, as fixed-dimensional vectors. This talk will present recent work on both acoustic word embeddings and "acoustically grounded" written word embeddings, including their applications for improved speech recognition and search.
Keynote Address 4
Towards Large-Scale Federated Conversational Intelligence
Conversational agents have become prevalent in every aspect of our lives. Agents such as Alexa and Google Assistant are no longer closed applications; they have evolved into ecosystems that feature hundreds of thousands of voice skills and offer a rich set of tools for bringing in an unlimited number of skills: for instance, an AI-centric toolkit for voice skill authoring, seamless cold start of new skills, and traffic optimization based on user satisfaction. In this talk, I discuss recent industrial trends towards large-scale federated conversational intelligence and give a sneak peek of present challenges and approaches in industry.