3rd Annual Machine Learning Symposium

Friday, October 10, 2008

The New York Academy of Sciences

This is the third symposium on Machine Learning at the New York Academy of Sciences. The aim of this series of symposia is to build a community of machine learning scientists from the NYC area's academic, government, and industrial institutions by convening them and promoting the exchange of ideas in a neutral setting.

Steering Committee

  • Corinna Cortes, PhD, Google, Inc.
  • Tony Jebara, PhD, Columbia University
  • John Langford, PhD, Yahoo Research
  • Michael L. Littman, PhD, Rutgers University
  • Mehryar Mohri, PhD, Courant Institute of Mathematical Sciences
  • Robert Schapire, PhD, Princeton University
  • David Waltz, PhD, Columbia University



 9:30 AM   Coffee and Poster Set-Up

10:00 AM   Opening Remarks

10:15 AM   Edo Airoldi, Troyanskaya Lab, Princeton University

11:00 AM   Dana Angluin, Yale University

11:45 AM   Graduate Student Talks

          Corinna Cortes, Mehryar Mohri, Michael Riley and Afshin Rostamizadeh, Google Research

          Koby Crammer, Eyal Even-Dar, Yishay Mansour and Jennifer Wortman, University of Pennsylvania

          Sina Jafarpour, Princeton University

          Lihong Li and Thomas J. Walsh, Rutgers University

          Piotr W. Mirowski, Yann LeCun, Deepak Madhavan and Ruben Kuzniecky, Courant Institute of Mathematical Sciences

          Mehryar Mohri and Ameet Talwalkar, Courant Institute of Mathematical Sciences

          Indraneel Mukherjee and Robert E. Schapire, Princeton University

          Anil Raj and Chris H. Wiggins, Columbia University

          Victor S. Sheng, Foster Provost and Panagiotis G. Ipeirotis, New York University

          Pannagadatta K. Shivaswamy and Tony Jebara, Columbia University

12:45 PM  Lunch & Poster Session

2:30 PM   Tony Jebara, Columbia University

3:15 PM   Robert Kleinberg, Cornell University

4:00 PM   Student Award Winner Announcement & Closing Remarks


Speaker Abstracts

A Statistical Perspective on Cellular Growth
Edo Airoldi, Princeton University

Maintaining balanced growth in a changing environment is a fundamental systems-level challenge for cellular physiology, particularly in microorganisms. While the complete set of regulatory and functional pathways supporting growth and cellular proliferation is not yet known, portions of them are well understood. In particular, cellular proliferation is governed by mechanisms that are highly conserved from unicellular to multicellular organisms, and the disruption of these processes in metazoans is a major factor in the development of cancer. In this talk, we will introduce statistical and computational methods to identify quantitative aspects of the regulatory mechanisms underlying cell proliferation in Saccharomyces cerevisiae. We find that the expression levels of a small set of genes accurately predict the instantaneous growth rate of any cellular culture, robust to changing biological conditions, experimental methods, and technological platforms. Our model also predicts growth rates for the related yeast Saccharomyces bayanus and the highly diverged yeast Schizosaccharomyces pombe, suggesting that the underlying regulatory signature is conserved across a wide range of unicellular evolution.

We investigate the biological significance of the identified gene expression signature from multiple perspectives: by perturbing the regulatory network through the Ras/cAMP/PKA pathway, observing strong up-regulation of growth rate even in the absence of appropriate nutrients, and by discovering potential transcription factor binding sites enriched in growth-correlated genes. Most importantly, statistical and computational methods enable substantive biological insights about growth at instantaneous time scales inaccessible to direct experimental methods.

Value Injection Queries for Circuit Learning
Dana Angluin, Yale University

We survey results on algorithms to learn Boolean, analog and probabilistic circuits using value injection queries. A value injection query is a kind of enhanced membership query, in which we may control the values on interior wires, as well as on input wires of the circuit, but still may only observe the values on output wire(s) of the circuit. This type of query is inspired by the capabilities of gene suppression and gene over-expression in studying the structure of gene regulatory networks.

We consider the theoretical power of such queries in learning Boolean circuits, where we give polynomial time algorithms to learn circuits with bounded fan-in and logarithmic depth, as well as unbounded fan-in constant depth circuits over AND, OR and NOT. For analog circuits, a topological parameter, the shortcut width of the circuit, turns out to be a key to its efficient learnability. Finally, for probabilistic circuits (equivalently, Bayesian networks) we can generalize the Boolean case for 0/1 values, but we also encounter novel phenomena. This talk describes joint work with James Aspnes, Jiang Chen, David Eisenstat, Lev Reyzin, and Yinghua Wu; relevant papers may be found on the webpage of James
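The query model described in this abstract can be illustrated with a small sketch. It assumes a hypothetical three-gate Boolean circuit and hypothetical names (`make_circuit`, `value_injection_query`); it shows only what a single query observes and is not one of the learning algorithms surveyed in the talk.

```python
from typing import Callable, Dict, List, Tuple

# Each gate maps a wire name to (function, names of the wires feeding it).
# Gates are listed in topological order; dicts preserve insertion order.
Gate = Tuple[Callable[..., int], List[str]]


def make_circuit() -> Dict[str, Gate]:
    """Hypothetical target circuit: out = (x1 AND x2) OR (NOT x3)."""
    return {
        "g_and": (lambda a, b: a & b, ["x1", "x2"]),
        "g_not": (lambda a: 1 - a, ["x3"]),
        "out": (lambda a, b: a | b, ["g_and", "g_not"]),
    }


def value_injection_query(
    circuit: Dict[str, Gate],
    inputs: Dict[str, int],
    injected: Dict[str, int],
    output: str = "out",
) -> int:
    """Evaluate the circuit with some interior wires forced to fixed values.

    Only the value on the output wire is returned, mirroring the restriction
    that the learner cannot observe interior wires directly.
    """
    values = dict(inputs)
    for wire, (fn, sources) in circuit.items():
        if wire in injected:
            values[wire] = injected[wire]  # override the gate's own output
        else:
            values[wire] = fn(*(values[s] for s in sources))
    return values[output]


circuit = make_circuit()
inputs = {"x1": 0, "x2": 0, "x3": 1}
plain = value_injection_query(circuit, inputs, {})             # ordinary membership query
forced = value_injection_query(circuit, inputs, {"g_and": 1})  # interior wire fixed to 1
```

With an empty injection the query reduces to a membership query. Fixing the interior wire `g_and` changes the observed output for the same inputs, which is the kind of evidence a learner can use to infer that this wire feeds the output.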

Embedding, Clustering and Matching with Graphs of GPS Data
Tony Jebara, Columbia University

Many machine learning tasks can naturally be framed as problems on graphs. These tasks include dimensionality reduction, clustering and classification. I will describe matching algorithms that recover graphs from data, minimum volume embedding algorithms that recover low dimensional visualizations from graphs and new spectral algorithms that partition graphs into pieces.

At Sense Networks, we have been building graphs from spatio-temporal location data from many GPS equipped phones and devices. One example is a graph or network of places in the city that shows similarity between different locations and how active they are right now. Sense also builds a network of users showing how similar person X is to person Y by comparing their movement trails or histories. Embedding and clustering these graphs reveals interesting trends in behavior and tribes of people that are far more detailed than traditional census demographics. With machine learning algorithms applied to these human activity graphs, it becomes possible to make predictions for advertising, marketing and collaborative recommendation.

Multi-Armed Bandit Problems in Metric Spaces
Robert Kleinberg, Cornell University

Multi-armed bandit problems constitute a well-studied abstraction of the exploration/exploitation tradeoffs inherent in many sequential decision making problems. A broad range of computing applications require bandit algorithms with a large but structured set of alternatives. Often this structure takes the form of a metric: a distance function expressing the decision-maker's prior knowledge that certain alternatives will have similar payoffs. This talk focuses on two such applications, one in electronic commerce and the other in web advertising. We will show how both applications can be formulated as special cases of a general problem, the "Lipschitz multi-armed bandit problem," which generalizes the classical multi-armed bandit problem by allowing for a large (possibly uncountable) decision set comprising the points of a metric space. We will define an invariant that precisely determines the performance of the best possible algorithm for this problem in a given metric, and we will describe an algorithm that meets this bound. This is joint work with Alex Slivkins and Eli Upfal.
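To make the problem setting concrete, here is a minimal sketch that assumes the decision set is the interval [0, 1] with a hypothetical 1-Lipschitz mean payoff. It runs the standard UCB1 index rule on a fixed uniform grid of arms. This is a simple baseline for illustration only, not the algorithm presented in the talk, and the names `payoff_mean` and `ucb1_on_grid` are invented for this example.

```python
import math
import random
from typing import List, Tuple


def payoff_mean(x: float) -> float:
    """Hypothetical 1-Lipschitz mean payoff on [0, 1], peaked at x = 0.7."""
    return 1.0 - abs(x - 0.7)


def ucb1_on_grid(
    num_arms: int, horizon: int, rng: random.Random
) -> Tuple[List[float], List[int]]:
    """Run UCB1 over the midpoints of num_arms equal-width cells of [0, 1].

    Returns the arm locations and how many times each arm was played.
    """
    arms = [(i + 0.5) / num_arms for i in range(num_arms)]
    counts = [0] * num_arms
    sums = [0.0] * num_arms
    for t in range(1, horizon + 1):
        if t <= num_arms:
            i = t - 1  # play every arm once before using the index
        else:
            i = max(
                range(num_arms),
                key=lambda j: sums[j] / counts[j]
                + math.sqrt(2.0 * math.log(t) / counts[j]),
            )
        # Bernoulli reward whose mean depends on the arm's location.
        reward = 1.0 if rng.random() < payoff_mean(arms[i]) else 0.0
        counts[i] += 1
        sums[i] += reward
    return arms, counts


arms, counts = ucb1_on_grid(num_arms=5, horizon=5000, rng=random.Random(0))
best = max(range(len(arms)), key=lambda i: counts[i])
```

The Lipschitz condition is what makes a grid reasonable: any point within distance d of a grid arm has a mean payoff within d of that arm's mean, so a finer grid loses less to discretization but gives the learner more arms to explore. A fixed grid settles that tradeoff in advance and ignores what the observed payoffs reveal about where the good regions are.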


Hierarchical Bayesian Models of Categorical Data Annotation
Bob Carpenter, Alias-I Inc.

Sparse Regression and Model Degeneracy in fMRI
Melissa K. Carroll, Guillermo A. Cecchi, Irina Rish, Rahul Garg and A. Ravi Rao, Princeton University

Automatically Extracting Social Networks from Unstructured Text
Jonathan Chang, Jordan Boyd-Graber and David M. Blei, Princeton University

Sample Selection Bias Correction Theory
Corinna Cortes, Mehryar Mohri, Michael Riley and Afshin Rostamizadeh, Google Research

Regret Minimization with Concept Drift
Koby Crammer, Eyal Even-Dar, Yishay Mansour and Jennifer Wortman, University of Pennsylvania

Ranking Electrical Feeders of the New York Power Grid
Philip Gross, Ansaf Salleb-Aouissi, Haimonti Dutta and Albert Boulanger, Columbia University

Automatically Marking Houses in Rural Satellite Images of UN Millennium Villages in Africa
Roy Han, Columbia University

Large Margin Transformation Learning
Andrew G. Howard and Tony Jebara, Columbia University

Learning Directly from Compressed Sensed Data, Machine Learning and Compressed Sensing Benefits
Sina Jafarpour, Princeton University

Scaling Up Linear SVM Classifiers Using Confidence-Based Boosting, A Theoretical Analysis Based on Rademacher Complexity
Sina Jafarpour, Princeton University

Learning Animal Movement Models and Location Estimates Using HMMs
Berk Kapicioglu, Robert E. Schapire, Martin Wikelski and Tamara Broderick, Princeton University

High-Performance Analysis of Sequences
Pavel Kuksa, Pai-Hsi Huang and Vladimir Pavlovic, Rutgers University

Fast Feature Selection for Reinforcement-Learning-Based Spoken Dialog Management: A Case Study
Lihong Li, Jason D. Williams and Suhrid Balakrishnan, Rutgers University

Knows What It Knows: A Framework for Self-Aware Learning
Lihong Li and Thomas J. Walsh, Rutgers University

Learning Regulatory Motifs from Gene Expression Trajectories Using Graph-Regularized Partial Least Square Regression
Xuejing Li, Chris H. Wiggins, Valerie Reinke and Christina Leslie, Columbia University

Reducing Statistical Dependencies in Natural Images Using Radial Gaussianization
Siwei Lyu and Eero P. Simoncelli, University at Albany, SUNY

Comparing SVM and Convolutional Networks for Epileptic Seizure Prediction from EEG
Piotr W. Mirowski, Yann LeCun, Deepak Madhavan and Ruben Kuzniecky, Courant Institute of Mathematical Sciences

A Dynamical Factor Graph with Latent Variables for Time Series Prediction
Piotr W. Mirowski and Yann LeCun, Courant Institute of Mathematical Sciences

Improved Bounds for the Nyström Method
Mehryar Mohri and Ameet Talwalkar, Courant Institute of Mathematical Sciences

Learning with Continuous Experts Using Drifting Games
Indraneel Mukherjee and Robert E. Schapire, Princeton University

PAC-MDP Reinforcement Learning with Bayesian Priors
Ali Nouri and Lihong Li, Rutgers University

An Information-Theoretic Derivation of Min-Cut Based Graph Partitioning
Anil Raj and Chris H. Wiggins, Columbia University

Mining Retail Data for Targeting Customers with Headroom
Madhu Shashanka and Michael Giering, Mars Inc.

Graph Embedding with Global Structure Preserving Constraints
Blake Shaw and Tony Jebara, Columbia University

Improving Data Quality and Data Mining Using Multiple, Noisy Labelers
Victor S. Sheng, Foster Provost and Panagiotis G. Ipeirotis, New York University

A Heuristic to Enable Auditing Decisions in Travel & Entertainment Expense Management
Anshul Sheopuri, Jose Gomes, Sai Zeng, Paolina Centonze and Ioana Boier-Martin, IBM T. J. Watson Research Center

Relative Margin Machines
Pannagadatta K. Shivaswamy and Tony Jebara, Columbia University

Efficient Learning of Action Schemas and Web-Service Descriptions
Thomas J. Walsh, Rutgers University