5th Annual Machine Learning Symposium
Friday, October 22, 2010
Presented by the Machine Learning Discussion Group
This is the fifth symposium on Machine Learning at the New York Academy of Sciences. The aim of this series of symposia is to build a community of machine learning scientists from the NYC area's academic, government, and industrial institutions by convening researchers and promoting the exchange of ideas in a neutral setting.
Past Machine Learning Symposia
*Presentation times are subject to change.
Coffee & Poster Set-Up
Probabilistic Topic Models
Learning Hierarchies of Features
Dual Decomposition and Linear Programming Relaxations for Inference in Natural Language Processing
Cluster Trees, Near-neighbor Graphs, and Continuum Percolation
Student Award Winner Announcement & Closing Remarks
End of program
Optional Science Alliance meeting for students
David Blei, PhD
David Blei is an assistant professor of Computer Science at Princeton University. He received his PhD from U.C. Berkeley in 2004 and was a postdoctoral fellow at Carnegie Mellon University. His research focuses on probabilistic models, Bayesian nonparametric methods, and approximate posterior inference. He works on a variety of applications, including text, images, music, social networks, and scientific data.
Sanjoy Dasgupta, PhD
University of California, San Diego
Sanjoy Dasgupta is an Associate Professor in the Department of Computer Science and Engineering at UC San Diego. Prior to joining UCSD in 2002, he was a senior member of the technical staff at AT&T Labs—Research. He obtained a Ph.D. in Computer Science from UC Berkeley in 2000, and a B.A. in Computer Science from Harvard in 1993. His research area is learning theory, with a focus on unsupervised and minimally supervised learning.
Michael Collins, PhD
Massachusetts Institute of Technology
Michael Collins is an associate professor of computer science at MIT. He received a PhD in computer science from the University of Pennsylvania in 1998. Prior to joining MIT in 2003, he was a researcher at AT&T Labs-Research. His research has focused on topics including statistical parsing, structured prediction problems in machine learning, and NLP applications including machine translation, dialog systems, and speech recognition.
Yann LeCun, PhD
New York University
Yann LeCun is Silver Professor of Computer Science and Neural Science at the Courant Institute of Mathematical Sciences and at the Center for Neural Science of New York University. He received an Electrical Engineer Diploma from Ecole Supérieure d'Ingénieurs en Electrotechnique et Electronique (ESIEE), Paris in 1983, and a PhD in Computer Science from Université Pierre et Marie Curie (Paris) in 1987. After a postdoc at the University of Toronto, he joined AT&T Bell Laboratories in Holmdel, NJ, in 1988. He became head of the Image Processing Research Department at AT&T Labs-Research in 1996, and joined NYU in 2003, after a brief period as Fellow at the NEC Research Institute in Princeton. His current interests include machine learning, computer vision, pattern recognition, mobile robotics, and computational neuroscience. He has published over 150 technical papers on these topics as well as on neural networks, handwriting recognition, image processing and compression, and VLSI design. His handwriting recognition technology is used by several banks around the world to read checks. His image compression technology, called DjVu, is used by hundreds of web sites and publishers and millions of users to distribute and access scanned documents on the Web, and his image recognition technique, called Convolutional Network, has been deployed by companies such as Google, Microsoft, NEC, France Telecom and several startup companies for document recognition, human-computer interaction, image indexing, and video analytics. He has been on the editorial board of IJCV, IEEE PAMI, IEEE Trans on Neural Networks, was program chair of CVPR'06, and is chair of the annual Learning Workshop. He is on the science advisory board of Institute for Pure and Applied Mathematics, and is the co-founder of MuseAmi, a music technology company.
Naoki Abe, PhD
Corinna Cortes, PhD
Tony Jebara, PhD
Michael Kearns, PhD
University of Pennsylvania
John Langford, PhD
Mehryar Mohri, PhD
New York University
Robert Schapire, PhD
David Waltz, PhD
Probabilistic Topic Models
David Blei, PhD, Princeton University
Probabilistic topic modeling provides an important suite of tools for the unsupervised analysis of large collections of documents. Topic modeling algorithms can uncover the underlying themes of a collection and decompose its documents according to those themes. This analysis can then be used for tasks like corpus exploration, document search, and a variety of prediction problems. In this talk, I will review the state-of-the-art in probabilistic topic models and describe two recent innovations.
First, I will describe a topic model developed for analyzing political texts, such as bills and laws. With this model, we can characterize the political tone of a government body and make predictions about how its members will vote on new legislation.
Second, I will describe an on-line strategy for fitting topic models. Rather than analyzing a corpus in batch, our algorithm can analyze documents arriving in a stream. An analysis of 3.3M articles from Wikipedia shows that this on-line approach fits topic models that are as good as or better than those found with the traditional batch approach, and fits them in a fraction of the time.
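To give a flavor of the on-line fitting idea, the sketch below shows the stochastic-approximation pattern behind on-line variational inference for topic models: estimate topic–word statistics from a mini-batch, then blend them into the running parameters with a decaying step size. This is an illustrative simplification, not the algorithm presented in the talk; in particular, the crude hard-assignment local step stands in for the per-document variational E-step, and the function and parameter names are placeholders.

```python
def online_topic_update(lam, batch, tau0=1.0, kappa=0.6, t=1):
    """One stochastic update of topic-word weights `lam` (topics x vocab).

    `batch` is a list of documents, each a list of word ids. The update
    blends a noisy mini-batch estimate into the running parameters with
    step size rho_t = (tau0 + t) ** -kappa, which decays over time.
    """
    n_topics, vocab = len(lam), len(lam[0])

    # Local step (simplified): assign each word in the batch to its
    # currently highest-weight topic and accumulate counts.
    lam_hat = [[0.0] * vocab for _ in range(n_topics)]
    for doc in batch:
        for w in doc:
            k = max(range(n_topics), key=lambda k: lam[k][w])
            lam_hat[k][w] += 1.0

    # Global step: convex combination of old parameters and the
    # mini-batch estimate, so each document is touched only once.
    rho = (tau0 + t) ** (-kappa)
    return [[(1 - rho) * lam[k][w] + rho * lam_hat[k][w]
             for w in range(vocab)]
            for k in range(n_topics)]
```

Because each mini-batch is processed once and then discarded, the memory footprint is independent of corpus size, which is what makes streaming analyses of millions of documents feasible.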
Learning Hierarchies of Features
Yann LeCun, PhD, New York University
Intelligent perceptual tasks such as vision and audition require the construction of good internal representations. Theoretical and empirical evidence suggests that the perceptual world is best represented by a multi-stage hierarchy in which features in successive stages are increasingly global, invariant, and abstract. An important challenge for Machine Learning is to devise "deep learning" methods that can automatically learn good feature hierarchies from labeled and unlabeled data. A class of such methods that combine unsupervised sparse coding and supervised refinement will be described. A number of applications will be shown through videos and live demos, including a category-level object recognition system that can be trained on-line, and a trainable vision system for an off-road mobile robot.
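The "multi-stage hierarchy" idea can be sketched in miniature: each stage filters its input, rectifies, and pools, and stacking stages yields features that are progressively more invariant to small shifts. The toy 1-D pipeline below illustrates only that structural pattern (filter → rectify → pool); the actual systems in the talk learn their filters via sparse coding and supervised refinement, which is omitted here.

```python
def conv1d(signal, kernel):
    """Valid 1-D correlation of a signal with a filter."""
    n = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(n))
            for i in range(len(signal) - n + 1)]

def stage(signal, kernel, pool=2):
    """One feature-extraction stage: filter -> rectify -> pool.

    Rectification (absolute value) and max-pooling make the output
    invariant to small shifts of the input.
    """
    feat = [abs(v) for v in conv1d(signal, kernel)]
    return [max(feat[i:i + pool])
            for i in range(0, len(feat) - pool + 1, pool)]

def hierarchy(signal, kernels):
    """Stack stages; successive outputs are more global and abstract."""
    for k in kernels:
        signal = stage(signal, k)
    return signal
```

For example, running `hierarchy` with a single edge-detecting filter `[1, -1]` produces the same pooled feature vector for a spike pattern and for the same pattern shifted by one position, a small demonstration of the invariance the abstract refers to.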
Cluster Trees, Near-neighbor Graphs, and Continuum Percolation
Sanjoy Dasgupta, PhD, University of California, San Diego
What information does the clustering of a finite data set reveal about the underlying distribution from which the data were sampled? This basic question has proved elusive even for the most widely-used clustering procedures. A natural criterion is to seek clusters that converge (as the data set grows) to regions of high density. When all possible density levels are considered, this is a hierarchical clustering problem where the sought limit is called the “cluster tree”. We give a simple three-line algorithm for estimating this tree that implicitly constructs a multiscale hierarchy of near-neighbor graphs on the data points. We show that the procedure is consistent, answering a long-standing open problem of Hartigan. We also obtain rates of convergence, using a percolation argument that gives insight into how near-neighbor graphs should be constructed.
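The connection between neighbor graphs and the cluster tree can be illustrated with a single-linkage-style sweep: link points within distance r and take connected components, then let r grow and watch clusters merge into a hierarchy. This sketch omits the density filter (restricting to points with small k-nearest-neighbor radius) that the talk's actual estimator uses to achieve consistency; names here are illustrative.

```python
from itertools import combinations

def components(points, r):
    """Connected components of the graph linking 1-D points within distance r,
    found with a union-find structure with path compression."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Build the r-neighborhood graph implicitly by uniting close pairs.
    for i, j in combinations(range(len(points)), 2):
        if abs(points[i] - points[j]) <= r:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(points[i])
    return sorted(groups.values(), key=len, reverse=True)
```

Sweeping r from small to large traces out the hierarchy: at tiny r every point is its own cluster, and as r grows clusters merge, mirroring how high-density regions of the underlying distribution merge as the density level is lowered in the cluster tree.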
Travel & Lodging
The New York Academy of Sciences
7 World Trade Center
250 Greenwich Street, 40th floor
New York, NY 10007-2157
Hotels Near 7 World Trade Center
Recommended partner hotel:
The New York Academy of Sciences is part of the Club Quarters network. Please feel free to book accommodations with Club Quarters online to save significantly on hotel costs.
Club Quarters Reservation Password: NYAS
Club Quarters, World Trade Center
140 Washington Street
New York, NY 10006
Phone: (212) 577-1133
Located on the south side of the World Trade Center, opposite Memorial Plaza, Club Quarters, 140 Washington Street, is just a short walk to our location.
Other hotels located near 7 WTC: