## CMPSCI 791 TP (Spring 2012)

## Topics in Optimization

Mondays, 3:45-5:00pm, Room CS 140

## Syllabus

- January 23rd, 2012: **Introduction** - *Andrew McCallum*
  - Paper(s): None
- January 31st, 2012: **Convex Optimization** - *Brian Martin*
  - Reading: Boyd's "Convex Optimization", sections 1.1-1.3, 3.1, 3.2
  - Problems: Intro Quiz, Convex Function Quiz
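  - A quick self-check to go with Boyd's Chapter 3 (this sketch is mine, not part of the assigned reading): convexity of a function like log-sum-exp can be spot-checked numerically by sampling Jensen's inequality.

```python
# Numerically spot-check convexity of f(x) = log(sum_i exp(x_i)) by testing
# f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y) on random points.
import numpy as np

rng = np.random.default_rng(0)

def logsumexp(x):
    m = x.max()
    return m + np.log(np.exp(x - m).sum())  # numerically stable form

violations = 0
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    t = rng.uniform()
    lhs = logsumexp(t * x + (1 - t) * y)
    rhs = t * logsumexp(x) + (1 - t) * logsumexp(y)
    if lhs > rhs + 1e-9:
        violations += 1
print(violations)  # 0: the inequality holds on every sample
```

  - A passed check is of course not a proof, but it catches wrong guesses on quiz-style questions quickly.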
- February 6th, 2012: **Linear Programming/Duality/Conjugate Duality, Part 1** - *Sebastian Riedel, David Belanger*
  - If you don't know much about duality, I suggest watching two of Boyd's lectures on it. This is a total of 2.5 hours of lecture, but in some ways it's good that it's long, because he explains things carefully and thoroughly.
  - If you understand duality and LPs, I suggest making sure you understand the simplex algorithm. Don't worry about the 'tableau' and how it's implemented; focus on how the LP problem is converted into a combinatorial search problem.
  - I learned the material from the following 3 lecture notes (they are short). Focus on making sure you understand the meaning and motivation of 'reduced cost': http://people.orie.cornell.edu/~dpw/orie6300/Lectures/lec08.pdf
  - If you understand the concept of reduced cost, you may enjoy this paper on min-cut clustering.
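  - The headline fact from the duality lectures, that the primal and dual LP optima coincide (strong duality), is easy to verify on a toy problem. A minimal sketch using scipy (the example LP is mine):

```python
# Strong LP duality on a tiny example: the primal minimum equals the
# dual maximum. scipy's linprog expects <= constraints, so we negate.
import numpy as np
from scipy.optimize import linprog

# Primal: minimize c^T x  subject to  A x >= b, x >= 0
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 0.3])
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)

# Dual: maximize b^T y  subject to  A^T y <= c, y >= 0
# (maximize by minimizing the negated objective)
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)  # both optima are 1.3
```

  - The dual variables returned here are exactly the quantities behind 'reduced costs' in the simplex notes.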
- February 13th, 2012: **Linear Programming/Duality/Conjugate Duality, Part 2** - *Sebastian Riedel, David Belanger*
- February 23rd, 2012: **Subgradient Algos, Stochastic Approx, Part 1** - *Alexandre Passos, Sameer Singh*
  - Papers: http://www.stanford.edu/class/ee364b/notes/subgradients_notes.pdf and http://www.stanford.edu/class/ee364b/notes/stoch_subgrad_notes.pdf
  - Videos: http://www.stanford.edu/class/ee364b/videos/video01.html and http://www.stanford.edu/class/ee364b/videos/video04.html
  - Supplement: http://leon.bottou.org/publications/pdf/mloptbook-2011.pdf
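  - The basic method from the subgradient notes fits in a few lines: step along the negative subgradient with diminishing step sizes, and track the best iterate, since the objective need not decrease monotonically. A minimal sketch on a nonsmooth function of my choosing:

```python
# Subgradient method with diminishing steps a_k = 1/k on the nonsmooth
# function f(x) = 0.5*(x - 2)^2 + |x|, whose minimum value is f(1) = 1.5.
def f(x):
    return 0.5 * (x - 2) ** 2 + abs(x)

def subgrad(x):
    # gradient of the smooth part plus a subgradient of |x| (0 at x = 0)
    sign = (x > 0) - (x < 0)
    return (x - 2) + sign

x, f_best = 0.0, float("inf")
for k in range(1, 5001):
    f_best = min(f_best, f(x))       # keep the best value seen so far
    x -= (1.0 / k) * subgrad(x)
print(f_best)  # approaches the optimal value 1.5
```

  - Note the role of the step-size schedule: constant steps only reach a neighborhood of the optimum, while the 1/k schedule converges.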
- February 27th, 2012: **Subgradient Algos, Stochastic Approx, Part 2** - *Alexandre Passos, Sameer Singh*
  - __Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent__ (Feng Niu, Benjamin Recht, Christopher Ré, and Stephen J. Wright)
    - Paper: arXiv (preferred): http://arxiv.org/pdf/1106.5730v2.pdf or NIPS: http://books.nips.cc/papers/files/nips24/NIPS2011_0485.pdf
    - Video: http://videolectures.net/nipsworkshops2011_recht_lockfree/
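  - The core idea of Hogwild! is that several workers run SGD on a shared parameter vector with no locking at all, tolerating the occasional clobbered update. A toy Python sketch of that idea (mine; only illustrative, since CPython's GIL hides most of the races the paper actually deals with):

```python
# Toy Hogwild!-style SGD: threads update a shared weight vector with no
# locks, on a noise-free least-squares problem. Illustrative only: the
# paper's claims are about sparse problems and a lock-free C implementation.
import threading
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_true = rng.normal(size=d)
X = rng.normal(size=(1000, d))
y = X @ w_true                      # noise-free linear targets

w = np.zeros(d)                     # shared weights, updated without locking

def worker(seed, steps=4000, lr=0.02):
    local_rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = local_rng.integers(len(X))
        grad = (X[i] @ w - y[i]) * X[i]   # squared-loss gradient, one sample
        w[:] -= lr * grad                 # racy in-place update, no lock

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(np.linalg.norm(w - w_true))   # small: the racy updates still converge
```

  - The design point the paper makes precise: when updates are sparse, collisions are rare enough that lock-free updates lose almost nothing in convergence rate.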
- March 5th, 2012: **Cutting-plane methods, Branch and Bound, Part 1** - *Sebastian Riedel, Jiaping Zheng*
  - Video: http://www.stanford.edu/class/ee364b/videos/video05.html
  - Notes: Slides, Notes
  - Supplementary: http://www.cs.cornell.edu/people/tj/publications/joachims_etal_09a.pdf
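  - To make the branch-and-bound half concrete before the readings: the method searches a binary tree of decisions and prunes any subtree whose relaxation bound cannot beat the best solution found so far. A minimal sketch for 0/1 knapsack, using the fractional (LP-relaxation) bound (the instance in the test is a standard textbook one, not from the readings):

```python
# Branch and bound for 0/1 knapsack: depth-first branching on whether to
# take each item, pruning with an LP-relaxation (fractional knapsack) bound.
def knapsack_bb(values, weights, cap):
    # consider items in decreasing value/weight ratio
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    best = 0

    def bound(idx, val, wt):
        # LP relaxation: greedily fill remaining capacity, last item fractional
        for i in order[idx:]:
            if wt + weights[i] <= cap:
                wt += weights[i]
                val += values[i]
            else:
                val += values[i] * (cap - wt) / weights[i]
                break
        return val

    def dfs(idx, val, wt):
        nonlocal best
        best = max(best, val)
        if idx == len(order) or bound(idx, val, wt) <= best:
            return  # prune: the relaxation cannot beat the incumbent
        i = order[idx]
        if wt + weights[i] <= cap:
            dfs(idx + 1, val + values[i], wt + weights[i])  # take item i
        dfs(idx + 1, val, wt)                               # skip item i

    dfs(0, 0, 0)
    return best

print(knapsack_bb([60, 100, 120], [10, 20, 30], 50))  # 220
```

  - The same prune-with-a-relaxation pattern, with cutting planes tightening the relaxation, is what the Part 2 readings develop into branch-and-cut.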
- March 12th, 2012: Class cancelled
- March 19th, 2012: **Spring Break**
- March 26th, 2012: Class cancelled
- April 2nd, 2012: **Cutting-plane methods, Branch and Bound, Part 2** - *Sebastian Riedel, Jiaping Zheng*
  - Paper(s):
    - Part of the "Encyclopedia of Optimization": http://homepages.rpi.edu/~mitchj/papers/leeejem.pdf (you can focus on the Overview section and the Illustrative Example)
    - Part of the "Handbook of Applied Optimization": http://homepages.rpi.edu/~mitchj/papers/bc_annot.html (please read Sections 1-3; Section 4 also discusses the Gomory cutting-plane method)
- April 9th, 2012:
  **MCMC and Optimization** - *Michael Wick, Sameer Singh*
  - MCMC tutorial: http://cis.temple.edu/~latecki/Courses/RobotFall07/PapersFall07/andrieu03introduction.pdf
    - Chapter 2.1 (motivation; this is optional)
    - Chapter 3 (focus primarily on 3.1, 3.2, 3.4)
  - Parallel Tempering: http://pubs.rsc.org/en/Content/ArticleLanding/2005/CP/b509983h
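  - The link between MCMC and optimization is easiest to see in simulated annealing: run Metropolis on exp(-f(x)/T) while lowering the temperature T, so late samples concentrate near the minima of f. A minimal sketch on a two-well function of my choosing (not from the tutorial):

```python
# Simulated annealing: Metropolis sampling of exp(-f(x)/T) with geometric
# cooling. High T lets the chain cross barriers; low T makes it settle.
import math
import random

random.seed(0)

def f(x):
    return (x * x - 4.0) ** 2 + x   # two wells, near x = -2 and x = +2

x = 0.0
best_x, best_f = x, f(x)
T = 2.0
for step in range(20000):
    prop = x + random.gauss(0, 0.5)               # symmetric proposal
    if math.log(random.random()) < (f(x) - f(prop)) / T:
        x = prop                                  # Metropolis accept rule
    if f(x) < best_f:
        best_x, best_f = x, f(x)                  # track the best state seen
    T = max(0.01, T * 0.9995)                     # geometric cooling schedule
print(best_x, best_f)
```

  - Parallel tempering (the second reading) attacks the same barrier-crossing problem by running chains at several temperatures simultaneously and swapping states between them.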
- April 17th, 2012: **Bayesian/Global Optimization** - *Anton Bakalov*
  - Tutorial on Bayesian Optimization: http://haikufactory.com/files/bayopt.pdf (read up to and including Section 2.5)
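  - The loop described in the tutorial's early sections (fit a Gaussian-process surrogate to the queries so far, then query where expected improvement is largest) fits in a page of numpy. A sketch under my own assumptions (1-D objective, squared-exponential kernel, grid search for the acquisition maximum):

```python
# Bayesian optimization in 1-D: GP surrogate + expected improvement (EI),
# maximizing EI over a grid of candidate points at each iteration.
import numpy as np
from scipy.stats import norm

def k(a, b, ls=0.2):
    # squared-exponential kernel with unit variance
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp(X, y, Xs, jitter=1e-8):
    # GP posterior mean and standard deviation at test points Xs
    K = k(X, X) + jitter * np.eye(len(X))
    Ks = k(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ij->j", Ks, np.linalg.solve(K, Ks))
    return mu, np.sqrt(np.maximum(var, 1e-12))

def ei(mu, sigma, f_best):
    # expected improvement for minimization
    imp = f_best - mu
    z = imp / sigma
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

f = lambda x: (x - 0.6) ** 2            # "unknown" objective (my example)
X = np.array([0.1, 0.5, 0.9])           # initial design
y = f(X)
grid = np.linspace(0, 1, 201)
for _ in range(10):
    mu, sigma = gp(X, y, grid)
    x_next = grid[np.argmax(ei(mu, sigma, y.min()))]
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))
print(X[np.argmin(y)])                  # best query found so far
```

  - The acquisition function is the whole story here: EI trades off the posterior mean (exploitation) against the posterior variance (exploration) without any tuning knob.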
- April 23rd, 2012: **Sub-modularity** - *Alexandre Passos, Limin Yao*
  - Slides: http://www.intelligent-optimization.org/LION6/krause-submodularity-tutorial-lion.pdf (the best and most concise reference I found). Read up until slide 62, and make sure you understand the main definitions.
  - Here are some warm-up questions. Which of the following functions are submodular, supermodular, or modular?
    - Noisy-or gates, as a function of the input features (a noisy-or gate is 1 with probability 1 - prod_i (1 - p_i), where p_i is a specific parameter for each source).
    - Mutual information MI(X; Y), as you choose which variables to add to X.
    - Training-set likelihood of a logistic regression classifier, as a function of which features you're using.
    - The k-means score, as a function of which points you're using as means (the k-means score is the sum of the distances from each point to its closest mean).
    - The score of a pairwise binary clustering model, as a function of which points go into the positive cluster.
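  - For small ground sets, the diminishing-returns definition from the slides can be checked by brute force, which is a handy way to test your answers to the warm-up questions. A sketch (the demo functions are mine, chosen so as not to spoil any of the questions above):

```python
# Brute-force check of the diminishing-returns definition of submodularity:
# f(S + x) - f(S) >= f(T + x) - f(T) for all S subset of T and x not in T.
from itertools import combinations

def is_submodular(f, ground, tol=1e-12):
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for S in subsets:
        for T in subsets:
            if S <= T:
                for x in ground - T:
                    if f(S | {x}) - f(S) < f(T | {x}) - f(T) - tol:
                        return False
    return True

# Set cover is the textbook submodular example (this instance is mine):
sets = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"d"}}
coverage = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(is_submodular(coverage, frozenset(sets)))               # True
print(is_submodular(lambda S: len(S) ** 2, frozenset(sets)))  # False
```

  - This is exponential in the ground-set size, so it is only a sanity check, but for 5-6 elements it settles "submodular or not?" instantly.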
- April 30th, 2012: **A* Search** - *David Belanger, Brian Martin*
  - Paper(s):
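  - As a refresher before this session: A* expands nodes in order of g(n) + h(n), where g is the cost so far and h is an admissible heuristic. A minimal sketch of the standard algorithm on a 4-connected grid with the Manhattan-distance heuristic (the example maze is mine):

```python
# A* shortest path on a 4-connected grid; '#' marks walls. The Manhattan
# distance is admissible here (it never overestimates the remaining cost).
import heapq

def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_heap = [(h(start), 0, start)]      # entries are (g + h, g, node)
    g = {start: 0}
    while open_heap:
        _, cost, cur = heapq.heappop(open_heap)
        if cur == goal:
            return cost
        if cost > g.get(cur, float("inf")):
            continue                        # stale heap entry, skip it
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = cost + 1
                if ng < g.get((nr, nc), float("inf")):
                    g[(nr, nc)] = ng
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None                             # goal unreachable

maze = ["....",
        ".##.",
        "...."]
print(astar(maze, (0, 0), (2, 3)))  # 5
```

  - With h = 0 this degrades to Dijkstra's algorithm; the admissible heuristic is what lets A* provably expand fewer nodes while keeping optimality.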
Additional Topics: Dynamic Programming, Integer Linear Programming + relaxations