Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference

Wang, Peng; Shen, Chunhua; van den Hengel, Anton; Torr, Philip H. S.

doi:10.1007/s11263-015-0865-2

Published: 24 October 2015

Volume 117, pages 269–289, (2016)

International Journal of Computer Vision

Peng Wang¹,
Chunhua Shen^1,2,
Anton van den Hengel^1,2 &
…
Philip H. S. Torr³

Abstract

We propose a branch-and-cut (B&C) method for solving general MAP-MRF inference problems. The core of our method is a very efficient bounding procedure, which combines scalable semidefinite programming (SDP) and a cutting-plane method for seeking violated constraints. In order to further speed up the computation, several strategies have been exploited, including model reduction, warm start and removal of inactive constraints. We analyze the performance of the proposed method under different settings, and demonstrate that our method either outperforms or performs on par with state-of-the-art approaches. Especially when the connectivities are dense or when the relative magnitudes of the unary costs are low, we achieve the best reported results. Experiments show that the proposed algorithm achieves better approximation than the state-of-the-art methods within a variety of time budgets on challenging non-submodular MAP-MRF inference problems.
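The branch-and-bound skeleton that underlies a B&C method can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: the SDP bounding procedure and the cutting planes are replaced by a deliberately simple lower bound (every unary and pairwise term is minimized independently over the labels still allowed), and all names (`energy`, `lower_bound`, `branch_and_bound`, `brute_force`, `unary`, `pairwise`) are hypothetical.

```python
import heapq
import itertools


def energy(x, unary, pairwise):
    """Energy of a complete labeling x (one label per variable)."""
    e = sum(unary[p][x[p]] for p in range(len(x)))
    e += sum(cost[x[p]][x[q]] for (p, q), cost in pairwise.items())
    return e


def lower_bound(domains, unary, pairwise):
    """Simple stand-in bound (NOT the SDP bound of the paper): minimize
    each term independently over the labels still allowed in `domains`."""
    lb = sum(min(unary[p][i] for i in domains[p]) for p in range(len(domains)))
    lb += sum(min(cost[i][j] for i in domains[p] for j in domains[q])
              for (p, q), cost in pairwise.items())
    return lb


def branch_and_bound(unary, pairwise, num_labels):
    """Best-first branch-and-bound over restricted label domains."""
    n = len(unary)
    root = tuple(tuple(range(num_labels)) for _ in range(n))
    best_x, best_e = None, float("inf")
    heap = [(lower_bound(root, unary, pairwise), root)]
    while heap:
        lb, domains = heapq.heappop(heap)
        if lb >= best_e:
            continue  # prune: this node cannot improve on the incumbent
        free = [p for p in range(n) if len(domains[p]) > 1]
        if not free:
            # Every variable is fixed, so the node is a single labeling.
            x = tuple(d[0] for d in domains)
            e = energy(x, unary, pairwise)
            if e < best_e:
                best_x, best_e = x, e
            continue
        p = free[0]  # branch on the first variable that is not yet fixed
        for i in domains[p]:
            child = domains[:p] + ((i,),) + domains[p + 1:]
            child_lb = lower_bound(child, unary, pairwise)
            if child_lb < best_e:
                heapq.heappush(heap, (child_lb, child))
    return best_x, best_e


def brute_force(unary, pairwise, num_labels):
    """Exhaustive minimum, used only to sanity-check the search."""
    labelings = itertools.product(range(num_labels), repeat=len(unary))
    return min(energy(x, unary, pairwise) for x in labelings)


# Toy non-submodular model: a triangle whose edges all penalize agreement.
unary = [[0, 2], [1, 0], [0, 1]]
repel = [[2, 0], [0, 2]]
pairwise = {(0, 1): repel, (1, 2): repel, (0, 2): repel}

best_x, best_e = branch_and_bound(unary, pairwise, num_labels=2)
print(best_x, best_e)  # prints (0, 1, 0) 2
```

Restricting a domain can only raise this bound, which is what makes pruning safe. A tighter bound, such as the SDP bound the paper proposes, would prune more nodes at a higher cost per node.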

Notes

In the following experiments, we find that odd-wheel inequalities are only effective on the modularity clustering models, therefore this class of constraints is not considered for other models.
http://www.di.ens.fr/~mschmidt/Software/UGM.html.
http://icl.cs.utk.edu/plasma/.

References

Achterberg, T., Koch, T., & Martin, A. (2005). Branching rules revisited. Operations Research Letters, 33(1), 42–54.
Aji, S. M., Horn, G. B., & McEliece, R. J. (1998). On the convergence of iterative decoding on graphs with a single cycle. In Proceedings of ISIT.
Alahari, K., Kohli, P., & Torr, P. H. (2008). Reduce, reuse & recycle: Efficiently solving multi-label MRFs. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).
Alizadeh, F., Haeberly, J.-P. A., & Overton, M. L. (1998). Primal-dual interior-point methods for semidefinite programming: Convergence rates, stability and numerical results. SIAM Journal on Optimization, 8(3), 746–768.
Andersen, E. D., Roos, C., & Terlaky, T. (2003). On implementing a primal-dual interior-point method for conic quadratic optimization. Mathematical Programming, 95(2), 249–277.
Andres, B., Beier, T., & Kappes, J. H. (2014). OpenGM2. Retrieved from http://hci.iwr.uni-heidelberg.de/opengm2/.
Armbruster, M., Fügenschuh, M., Helmberg, C., & Martin, A. (2012). LP and SDP branch-and-cut algorithms for the minimum graph bisection problem: A computational comparison. Mathematical Programming Computation, 4(3), 275–306.
Arora, S., & Kale, S. (2007). A combinatorial, primal-dual approach to semidefinite programs. In Proceedings of Annual ACM Symposium on Theory of Computing (pp. 227–236).
Arora, S., Hazan, E., & Kale, S. (2005). Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In Proceedings of Annual IEEE Symposium on Foundations of Computer Science (pp. 339–348).
Arora, S., Hazan, E., & Kale, S. (2012). The multiplicative weights update method: A meta-algorithm and applications. Theory of Computing, 8(1), 121–164.
Barahona, F., & Mahjoub, A. R. (1986). On the cut polytope. Mathematical programming, 36(2), 157–173.
Batra, D., Nowozin, S., & Kohli, P. (2011). Tighter relaxations for MAP-MRF inference: A local primal-dual gap based separation algorithm. In Proceedings International Conference on Artificial Intelligence and Statistics (pp. 146–154).
Bayati, M., Shah, D., & Sharma, M. (2005). Maximum weight matching via max-product belief propagation. In Proceedings of IEEE International Symposium on Information Theory (pp. 1763–1767).
Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society Series B (Methodological), 48, 259–302.
Bonato, T., Jünger, M., Reinelt, G., & Rinaldi, G. (2014). Lifting and separation procedures for the cut polytope. Mathematical Programming, 146(1–2), 351–378.
Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.
Burer, S., & Monteiro, R. D. (2003). A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2), 329–357.
Burer, S., & Vandenbussche, D. (2008). A finite branch-and-bound algorithm for nonconvex quadratic programming via semidefinite relaxations. Mathematical Programming, 113(2), 259–282.
Chopra, S., & Rao, M. R. (1993). The partition problem. Mathematical Programming, 59(1–3), 87–115.
Deza, M., Grötschel, M., & Laurent, M. (1992). Clique-web facets for multicut polytopes. Mathematics of Operations Research, 17(4), 981–1000.
Deza, M., & Laurent, M. (1997). Geometry of cuts and metrics. Algorithms and combinatorics (Vol. 15). Berlin: Springer.
Dinh, T. P., Canh, N. N., & Le Thi, H. A. (2010). An efficient combined DCA and B&B using DC/SDP relaxation for globally solving binary quadratic programs. Journal of Global Optimization, 48(4), 595–632.
Elidan, G., & Globerson, A. (2011). The probabilistic inference challenge (PIC2011). Retrieved from http://www.cs.huji.ac.il/project/PASCAL/.
Felzenszwalb, P. F., & Huttenlocher, D. P. (2006). Efficient belief propagation for early vision. International Journal of Computer Vision, 70(1), 41–54.
Frostig, R., Wang, S., Liang, P. S., & Manning, C. D. (2014). Simple MAP inference via low-rank relaxations. In Proceedings of Advances in Neural Information Processing Systems (pp. 3077–3085).
Garber, D., & Hazan, E. (2011). Approximating semidefinite programs in sublinear time. In Proceedings of Advances in Neural Information Processing Systems (pp. 1080–1088).
Givry, S.D., Hurley, B., Allouche, D., Katsirelos, G., O’Sullivan, B., & Schiex, T. (2014). An experimental evaluation of CP/AI/OR solvers for optimization in graphical models. In Congrès ROADEF’2014, Bordeaux, FRA.
Globerson, A., & Jaakkola, T. S. (2007). Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In Proceedings of Advances in Neural Information Processing Systems.
Goemans, M. X., & Williamson, D. P. (1995). Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42, 1115–1145.
Gorelick, L., Boykov, Y., Veksler, O., Ayed, I. B., & Delong, A. (2014). Local submodular approximations for binary pairwise energies. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Hazan, E. (2008). Sparse approximate solutions to semidefinite programs. In LATIN 2008: Theoretical Informatics (pp. 306–316).
Hazan, T., & Shashua, A. (2010). Norm-product belief propagation: Primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 56(12), 6294–6316.
Helmberg, C. (1994). An interior point method for semidefinite programming and max-cut bounds, Ph.D. dissertation, Department of Mathematics, Graz University of Technology.
Helmberg, C., Poljak, S., Rendl, F., & Wolkowicz, H. (1995). Combining semidefinite and polyhedral relaxations for integer programs. In Proceedings of the 4th International IPCO Conference on Integer Programming and Combinatorial Optimization (pp. 124–134).
Helmberg, C., Rendl, F., & Weismantel, R. (1996). Quadratic knapsack relaxations using cutting planes. In Proceedings of the 5th International IPCO Conference on Integer Programming and Combinatorial Optimization (pp. 175–189).
Helmberg, C., & Rendl, F. (1998). Solving quadratic (0, 1)-problems by semidefinite programs and cutting planes. Mathematical Programming, 82(3), 291–315.
Helmberg, C., Rendl, F., & Weismantel, R. (2000). A semidefinite programming approach to the quadratic knapsack problem. Journal of Combinatorial Optimization, 4(2), 197–215.
Helmberg, C., & Weismantel, R. (1998). Cutting plane algorithms for semidefinite relaxations. Fields Institute Communications, 18, 197–213.
Hendrix, E. M., Boglárka, G., et al. (2010). Introduction to nonlinear and global optimization. Berlin: Springer.
Horst, R., Pardalos, P. M., & Van Thoai, N. (2000). Introduction to global optimization. Berlin: Springer.
Horst, R., & Tuy, H. (2013). Global optimization: Deterministic approaches. Berlin: Springer.
Huang, Q., Chen, Y., & Guibas, L. (2014). Scalable semidefinite relaxation for maximum a posterior estimation. In Proceedings of International Conference on Machine Learning.
IBM. (2015). ILOG CPLEX optimizer. Retrieved from http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.
Joachims, T., Finley, T., & Yu, C.-N. J. (2009). Cutting-plane training of structural svms. Machine Learning, 77(1), 27–59.
Johnson, J. K. (2008). Convex relaxation methods for graphical models: Lagrangian and maximum entropy approaches, Ph.D. dissertation, Massachusetts Institute of Technology, 2008.
Johnson, J. K., Malioutov, D. M., & Willsky, A. S. (2007). Lagrangian relaxation for MAP estimation in graphical models. In Annual Allerton Conference on Communication, Control, and Computing.
Jojic, V., Gould, S., & Koller, D. (2010). Accelerated dual decomposition for MAP inference. In Proceedings of International Conference on Machine Learning (pp. 503–510).
Wainwright, M. J., & Jordan, M. I. (2003). Semidefinite relaxations for approximate inference on graphs with cycles. In Proceedings of Advances in Neural Information Processing Systems (vol. 16, pp. 369–376).
Joulin, A., Bach, F., & Ponce, J. (2010). Discriminative clustering for image co-segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Kappes, J. H., Andres, B., Hamprecht, F. A., Schnörr, C., Nowozin, S., Batra, D., Kim, S., Kausler, B. X., Kröger, T., Lellmann, J., Komodakis, N., Savchynskyy, B., & Rother, C. (2015). A comparative study of modern inference techniques for structured discrete energy minimization problems. International Journal of Computer Vision.
Kappes, J. H., Savchynskyy, B., & Schnörr, C. (2012). A bundle approach to efficient MAP-inference by lagrangian relaxation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1688–1695).
Kappes, J. H., Schmidt, S., & Schnörr, C. (2010). MRF inference by k-fan decomposition and tight lagrangian relaxation. In Proceedings of European Conference on Computer Vision (pp. 735–747).
Kappes, J. H., Speth, M., Andres, B., Reinelt, G., & Schnörr, C. (2011). Globally optimal image partitioning by multicuts. In Energy Minimization Methods in Computer Vision and Pattern Recognition (pp. 31–44).
Kappes, J. H., Speth, M., Reinelt, G., & Schnörr, C. (2013). Higher-order segmentation via multicuts. arXiv:1305.6387, preprint.
Kappes, J. H., Speth, M., Reinelt, G., & Schnörr, C. (2013). Towards efficient and exact MAP-inference for large scale discrete computer vision problems via combinatorial optimization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1752–1758).
Kernighan, B., & Lin, S. (1970). An efficient heuristic procedure for partitioning graphs. The Bell Systems Technical Journal, 49(2), 291–307.
Kim, W., & Lee, K. M. (2011). A hybrid approach for MRF optimization problems: Combination of stochastic sampling and deterministic algorithms. Computer Vision and Image Understanding, 115(12), 1623–1637.
Kohli, P., Shekhovtsov, A., Rother, C., Kolmogorov, V., & Torr, P. (2008). On partial optimality in multi-label mrfs. In Proceedings of International Conference on Machine Learning (pp. 480–487).
Kolmogorov, V., & Rother, C. (2006). Comparison of energy minimization algorithms for highly connected graphs. In Proceedings of European Conference on Computer Vision (pp. 1–15).
Kolmogorov, V., & Wainwright, M. J. (2005). On the optimality of tree-reweighted max-product message-passing. In Proceedings of Uncertainty in Artificial Intelligence.
Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.
Kolmogorov, V., & Rother, C. (2007). Minimizing nonsubmodular functions with graph cuts-a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1274–1279.
Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.
Komodakis, N., & Paragios, N. (2008). Beyond loose LP-relaxations: Optimizing MRFs by repairing cycles. In Proceedings of European Conference on Computer Vision (pp. 806–820).
Komodakis, N., Paragios, N., & Tziritas, G. (2011). MRF energy minimization and beyond via dual decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3), 531–552.
Kovtun, I. (2011). Sufficient condition for partial optimality for (max,+)-labeling problems and its usage. Control Systems and Computers, 2, 71–78.
Kumar, M. P., Torr, P. H., & Zisserman, A. (2006). Solving Markov random fields using second order cone programming relaxations. In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (vol. 1, pp. 1045–1052).
Kumar, M. P., Kolmogorov, V., & Torr, P. H. S. (2009). An analysis of convex relaxations for MAP estimation of discrete MRFs. Journal of Machine Learning Research, 10, 71–106.
Land, A., & Powell, S. (1979). Computer codes for problems of integer programming. Annals of Discrete Mathematics, 5, 221–269.
Laue, S. (2012). A hybrid algorithm for convex semidefinite optimization. In Proceedings of International Conference on Machine Learning (pp. 177–184).
Linderoth, J. T., & Savelsbergh, M. W. (1999). A computational study of search strategies for mixed integer programming. INFORMS Journal on Computing, 11(2), 173–187.
Liu, F., Lin, G., & Shen, C. (2015). CRF learning with CNN features for image segmentation. Pattern Recognition, 48(10), 2983–2992.
Malick, J. (2007). The spherical constraint in boolean quadratic programs. Journal of Global Optimization, 39(4), 609–622.
Malick, J., Povh, J., Rendl, F., & Wiegele, A. (2009). Regularization methods for semidefinite programming. SIAM Journal on Optimization, 20(1), 336–356.
Mars, S., & Schewe, L. (2012). SDP-package for SCIP. Technical Report, TU Darmstadt.
Martins, A. F., Figueiredo, M. A., Aguiar, P. M., Smith, N. A., & Xing, E. P. (2011a). An augmented Lagrangian approach to constrained MAP inference. In Proceedings of International Conference on Machine Learning (pp. 169–176).
Martins, A. F., Smith, N. A., Xing, E. P., Aguiar, P. M., & Figueiredo, M. A. (2011b). Augmenting dual decomposition for MAP inference. In Proceedings of International Workshop on Optimization for Machine Learning.
Meshi, O., & Globerson, A. (2011). An alternating direction method for dual MAP LP relaxation. In Machine Learning and Knowledge Discovery in Databases. (pp. 470–483). Springer, Berlin.
Mitra, G. (1973). Investigation of some branch and bound strategies for the solution of mixed integer linear programs. Mathematical Programming, 4(1), 155–170.
Nesterov, Y. E., & Todd, M. J. (1998). Primal-dual interior-point methods for self-scaled cones. SIAM Journal on Optimization, 8(2), 324–364.
Olsson, C., Eriksson, A. P., & Kahl, F. (2007). Solving large scale binary quadratic problems: Spectral methods vs. semidefinite programming. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).
Otten, L., & Dechter, R. (2012). Anytime AND/OR depth-first search for combinatorial optimization. AI Communications, 25(3), 211–227.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.
Peng, J., Hazan, T., Srebro, N., & Xu, J. (2012). Approximate inference by intersecting semidefinite bound and local polytope. In Proceedings of International Conference on Artificial Intelligence and Statistics (pp. 868–876).
Raj, A., & Zabih, R. (2005). A graph cut algorithm for generalized image deconvolution. In Proceedings of IEEE International Conference on Computer Vision.
Ravikumar, P., & Lafferty, J. (2006). Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In Proceedings of International Conference on Machine Learning (pp. 737–744).
Ravikumar, P., Agarwal, A., & Wainwright, M. J. (2010). Message-passing for graph-structured linear programs: Proximal methods and rounding schemes. Journal of Machine Learning Research, 11, 1043–1080.
Krislock, N., Malick, J., & Roupin, F. (2012). Improved semidefinite bounding procedure for solving max-cut problems to optimality. Mathematical Programming.
Rockafellar, R. T. (1973). A dual approach to solving nonlinear programming problems by unconstrained optimization. Mathematical Programming, 5(1), 354–373.
Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (pp. 1–8).
Savchynskyy, B., Schmidt, S., Kappes, J., & Schnörr, C. (2012). Efficient MRF energy minimization via adaptive diminishing smoothing. In Proceedings of Uncertainty in Artificial Intelligence.
Schellewald, C., & Schnörr, C. (2005). Probabilistic subgraph matching based on convex relaxation. In Workshop of IEEE Conference on Computer Vision and Pattern Recognition (pp. 171–186).
Shekhovtsov, A. (2014). Maximum persistency in energy minimization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1162–1169).
Shekhovtsov, A., Swoboda, P., & Savchynskyy, B. (2015). Maximum persistency via iterative relaxed inference with graphical models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 521–529).
Shen, C., Kim, J., & Wang, L. (2011). A scalable dual approach to semidefinite metric learning. In Proceedings of Conference on Computer Vision and Pattern Recognition (pp. 2601–2608).
Shlezinger, M. (1976). Syntactic analysis of two-dimensional visual signals in noisy conditions. Kibernetika, 4, 113–130.
Sontag, D., & Jaakkola, T. S. (2007). New outer bounds on the marginal polytope. In Proceedings of Advances in Neural Information Processing Systems.
Sontag, D., Choe, D. K., & Li, Y. (2012). Efficiently searching for frustrated cycles in MAP inference. In Proceedings of Uncertainty in Artificial Intelligence.
Sontag, D., Meltzer, T., Globerson, A., Jaakkola, T. S., & Weiss, Y. (2008). Tightening LP relaxations for MAP using message passing. In Proceedings of Uncertainty in Artificial Intelligence.
Sun, J., Shum, H.-Y., & Zheng, N.-N. (2002). Stereo matching using belief propagation. In Proceedings of European Conference on Computer Vision (pp. 510–524).
Sun, M., Telaprolu, M., Lee, H., & Savarese, S. (2012). Efficient and exact MAP inference using branch and bound. In Proceedings of International Conference on Artificial Intelligence and Statistics.
Swoboda, P., Savchynskyy, B., Kappes, J., & Schnörr, C. (2013). Partial optimality via iterative pruning for the potts model. In SSVM.
Swoboda, P., Savchynskyy, B., Kappes, J. H., & Schnörr, C. (2014). Partial optimality by pruning for map-inference with general graphical models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1170–1177).
Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., et al. (2008). A comparative study of energy minimization methods for markov random fields with smoothness-based priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(6), 1068–1080.
Topkis, D. (1982). A cutting-plane algorithm with linear and geometric rates of convergence. Journal of Optimization Theory and Applications, 36(1), 1–22.
Torr, P. H. S. (2003). Solving Markov random fields using semidefinite programming. In Proceedings of International Conference on Artificial Intelligence and Statistics.
Tütüncü, R. H., Toh, K. C., & Todd, M. J. (2003). Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95(2), 189–217.
Wainwright, M. J., Jaakkola, T. S., & Willsky, A. S. (2005). MAP estimation via agreement on trees: Message-passing and linear programming. IEEE Transactions on Information Theory, 51(11), 3697–3717.
Wainwright, M. J., & Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2), 1–305.
Wang, P., Shen, C., & van den Hengel, A. (2015). Efficient SDP inference for fully-connected CRFs based on low-rank decomposition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Wang, P., Shen, C., & van den Hengel, A. (2013). A fast semidefinite approach to solving binary quadratic problems. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
Weiss, Y. (2000). Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1), 1–41.
Weiss, Y., & Freeman, W. T. (2001). On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47(2), 736–744.
Wen, Z., Goldfarb, D., & Yin, W. (2010). Alternating direction augmented Lagrangian methods for semidefinite programming. Mathematical Programming Computation, 2(3–4), 203–230.
Werner, T. (2007). A linear programming approach to max-sum problem: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1165–1179.
Werner, T. (2008). High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF). In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (pp. 1–8).
Windheuser, T., Ishikawa, H., & Cremers, D. (2012). Generalized roof duality for multi-label optimization: Optimal lower bounds and persistency. In Proceedings of European Conference on Computer Vision (pp. 400–413).
Ye, Y., Todd, M. J., & Mizuno, S. (1994). An O ($\sqrt{nL}$)-iteration homogeneous and self-dual linear programming algorithm. Mathematics of Operations Research, 19(1), 53–67.
Zhao, X.-Y., Sun, D., & Toh, K.-C. (2010). A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM Journal on Optimization, 20(4), 1737–1765.
Zhu, C., Byrd, R. H., Lu, P., & Nocedal, J. (1997). L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization. ACM Transactions on Mathematical Software, 23(4), 550–560.

Author information

Authors and Affiliations

University of Adelaide, Adelaide, SA, 5005, Australia
Peng Wang, Chunhua Shen & Anton van den Hengel
Australian Centre for Robotic Vision, Adelaide, SA, Australia
Chunhua Shen & Anton van den Hengel
University of Oxford, Oxford, UK
Philip H. S. Torr

Corresponding author

Correspondence to Chunhua Shen.

Additional information

Communicated by Yuri Boykov.

Appendix

1.1 Appendix 1: Relationship between the Standard SDP Relaxation (11) and the Simplified Dual (21)

The Lagrangian dual of (11) can be expressed in the following general form:

$$\begin{aligned}&\min _{\mathbf {u}} \quad \mathbf {u}^{\top }\mathbf {b}\end{aligned}$$

(25a)

$$\begin{aligned}&\, \mathrm {s.t.}\,\quad \mathbf {Z}= \mathbf {A}+ \textstyle {\sum _{i=1}^m} u_i \mathbf {B}_i \succcurlyeq \mathbf {0}, \end{aligned}$$

(25b)

$$\begin{aligned}&\qquad \quad u_i \ge 0, \forall i \in \mathcal {I}_{in}. \end{aligned}$$

(25c)

The p.s.d. constraint (25b) can be replaced by a penalty function that measures the violation of this constraint. In our case, the penalty function is defined as $\mathrm {p}(\mathbf {u}) = ||\min (\mathbf {0}, {\varvec{\lambda }}) ||_2^2 = ||\varPi _{\mathcal {S}^{nh+1}_+} (\mathbf {C}(\mathbf {u})) ||_F^2 $, where ${\varvec{\lambda }}$ is the vector of eigenvalues of $\mathbf {Z}$. Clearly, $\mathrm {p}(\mathbf {u}) = 0$ if and only if $\mathbf {Z}\succcurlyeq \mathbf {0}$. The problem (25) can then be transformed to

(26a)

$$\begin{aligned} \mathrm {s.t.}\,\,\quad u_i \ge 0, \forall i \in \mathcal {I}_{in}, \end{aligned}$$

(26b)

where $\gamma > 0$ serves as a penalty parameter. As $\gamma $ increases, the solution of (26) converges to that of (25). It is clear that (26) is equivalent to (21).
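The penalty function can be checked numerically. The sketch below, using NumPy, computes $\mathrm {p}(\mathbf {u})$ from the eigenvalues of $\mathbf {Z}$ and compares it with the squared Frobenius norm of a projection onto the p.s.d. cone. Two points are assumptions rather than statements from the text: that $\mathbf {C}(\mathbf {u}) = -\mathbf {Z}$ (the definition of $\mathbf {C}(\mathbf {u})$ lies outside this appendix), and that the penalized objective has the plain form $\mathbf {u}^{\top }\mathbf {b} + \gamma \, \mathrm {p}(\mathbf {u})$, whose scaling may differ from (26a). The function names are hypothetical.

```python
import numpy as np


def psd_penalty(Z):
    """p = ||min(0, lambda)||_2^2, with lambda the eigenvalues of symmetric Z."""
    lam = np.linalg.eigvalsh(Z)
    return float(np.sum(np.minimum(lam, 0.0) ** 2))


def project_psd(M):
    """Projection of symmetric M onto the p.s.d. cone: clip negative eigenvalues."""
    lam, V = np.linalg.eigh(M)
    return (V * np.maximum(lam, 0.0)) @ V.T


def penalized_objective(u, b, A, Bs, gamma):
    """ASSUMED form u^T b + gamma * p(u), with Z = A + sum_i u_i B_i as in (25b)."""
    Z = A + sum(ui * Bi for ui, Bi in zip(u, Bs))
    return float(u @ b) + gamma * psd_penalty(Z)


# Z has eigenvalues 2, -1, -3, so only -1 and -3 violate Z >= 0.
Z = np.diag([2.0, -1.0, -3.0])
penalty_from_eigs = psd_penalty(Z)
# Under the assumption C(u) = -Z, the projection form gives the same value.
penalty_from_proj = np.linalg.norm(project_psd(-Z), "fro") ** 2
print(penalty_from_eigs, penalty_from_proj)  # the two values agree up to rounding
```

The penalty vanishes exactly when no eigenvalue of $\mathbf {Z}$ is negative, which is why driving it to zero enforces (25b).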

1.2 Appendix 2: Proof of Propositions 1 and 2

First, it is known (Malick 2007; Wang et al. 2013) that, for any $\eta > 0$, the set of p.s.d. matrices with fixed trace $\varTheta _\eta := \{ \mathbf {X}\succcurlyeq \mathbf {0} | \mathrm {trace}(\mathbf {X}) = \eta \}$ has the following property:

Theorem 2

(The spherical constraint). $\forall \eta >0, \forall \mathbf {X}\in \varTheta _\eta $, we have $||\mathbf {X}||_{F} \le \eta $, and $||\mathbf {X}||_{F} = \eta $ if and only if $\mathrm {rank}(\mathbf {X}) = 1$.
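The reason behind Theorem 2 is short: the eigenvalues $\lambda _i \ge 0$ of $\mathbf {X}$ sum to $\eta $, so $||\mathbf {X}||_F^2 = \sum _i \lambda _i^2 \le (\sum _i \lambda _i)^2 = \eta ^2$, with equality exactly when at most one eigenvalue is nonzero. The following NumPy sketch checks this on random matrices; the helper `random_psd_with_trace` and the chosen sizes are illustrative assumptions.

```python
import numpy as np


def random_psd_with_trace(n, rank, eta, rng):
    """Random p.s.d. matrix of the given rank, rescaled so that trace(X) = eta."""
    G = rng.standard_normal((n, rank))
    X = G @ G.T  # p.s.d. by construction, rank <= `rank`
    return X * (eta / np.trace(X))


rng = np.random.default_rng(0)
eta = 3.0
X_rank1 = random_psd_with_trace(5, 1, eta, rng)
X_rank3 = random_psd_with_trace(5, 3, eta, rng)

fro_rank1 = np.linalg.norm(X_rank1, "fro")  # equals eta up to rounding
fro_rank3 = np.linalg.norm(X_rank3, "fro")  # strictly smaller than eta
print(fro_rank1, fro_rank3)
```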

It is also shown in Wang et al. (2013) that the problem (21) is the Lagrangian dual of the following problem:

$$\begin{aligned}&\min _{\mathbf {y},\mathbf {Y}} \,\, \mathrm {E}(\mathbf {y},\mathbf {Y}) + \mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) \end{aligned}$$

(27a)

$$\begin{aligned}&\,\mathrm {s.t.}\,\, (12), (13), (14), (15), (16), (17), (18), (19),\end{aligned}$$

(27b)

$$\begin{aligned}&\,\qquad \varOmega (\mathbf {y}, \mathbf {Y}) \succcurlyeq \mathbf {0}, \end{aligned}$$

(27c)

where $\mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) = \frac{1}{2\gamma }(||\varOmega (\mathbf {y},\mathbf {Y}) ||^2_F - (n+1)^2)$.

Proof of Proposition 1

(i) $\forall \mathcal {D}_1 \subseteq \mathcal {D}_2 \subseteq \mathcal {Z}^n$, $\exists \mathcal {F}_{in}, \mathcal {F}_{eq} \subseteq \{(p,i) \}_{p \in \mathcal {V}, i \in \mathcal {Z}}$ such that $\mathcal {D}_1 = \{ \mathbf { x}\in \mathcal {D}_2 \ | \ x_p \ne i, \forall (p,i) \in \mathcal {F}_{in}; x_p = i, \forall (p,i) \in \mathcal {F}_{eq} \}$. Consequently, the SDCut primal formulation (27) with respect to $\mathcal {D}_1$ differs from the one with respect to $\mathcal {D}_2$ only in that it contains the following additional linear constraints:

$$\begin{aligned} \left\{ \begin{array}{ll} y_{p,i} = 0, Y_{pi,qj} = Y_{qj,pi} = 0, &{}\forall (p,i) \in \mathcal {F}_{in}, \\ y_{p,i} = 1, Y_{pi,qj} = Y_{qj,pi} = y_{q,j}, &{}\forall (p,i) \in \mathcal {F}_{eq}. \end{array} \right. \end{aligned}$$

(28)

By strong duality, $\mathrm {d}^\star _\gamma (\mathcal {D})$ equals the optimal value of the corresponding primal problem (27). The primal problem (27) with respect to $\mathcal {D}_1$ has all the constraints of the one with respect to $\mathcal {D}_2$ plus the additional constraints (28), so its feasible set is no larger and its minimum is no smaller. Hence $\mathrm {d}^\star _\gamma (\mathcal {D}_1) \ge \mathrm {d}^\star _\gamma (\mathcal {D}_2)$.

(ii) As $|{\mathcal {D}} |= 1$, the set $\mathcal {D}$ contains a single point $\hat{\mathbf { x}}$, and $\mathrm {E}(\hat{\mathbf { x}}) = \min _{\mathbf { x}\in \mathcal {D}} \mathrm {E}(\mathbf { x})$. The feasible set of (27) then also contains a single point $\{ \hat{\mathbf {y}}, \hat{\mathbf {Y}} \}$, which corresponds to $\hat{\mathbf { x}}$ and is obtained by applying constraints of the form (28). Because $||\varOmega (\hat{\mathbf {y}},\hat{\mathbf {Y}}) ||^2_F = (n+1)^2$, the term $\mathrm {g}_\gamma (\hat{\mathbf {y}},\hat{\mathbf {Y}})$ vanishes, and we have $\mathrm {d}^\star _\gamma (\mathcal {D}) = \mathrm {E}(\hat{\mathbf {y}},\hat{\mathbf {Y}}) = \min _{\mathbf { x}\in \mathcal {D}} \mathrm {E}(\mathbf { x})$. $\square $
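The identity $||\varOmega (\hat{\mathbf {y}},\hat{\mathbf {Y}}) ||^2_F = (n+1)^2$ used in part (ii) can be verified for a concrete labeling. The sketch below rests on an assumption about notation defined in the main text: $\hat{\mathbf {y}}$ is taken to be the stacked one-hot indicator vector of $\hat{\mathbf { x}}$ (length $nh$ for $n$ variables and $h$ labels), $\hat{\mathbf {Y}} = \hat{\mathbf {y}}\hat{\mathbf {y}}^{\top }$, and $\varOmega $ is the $(nh+1) \times (nh+1)$ matrix with blocks $[1, \hat{\mathbf {y}}^{\top }; \hat{\mathbf {y}}, \hat{\mathbf {Y}}]$. The function names are hypothetical.

```python
import numpy as np


def lifted_matrix(x, num_labels):
    """Omega for an integral labeling x, ASSUMING Omega = [[1, y^T], [y, y y^T]]
    with y the stacked one-hot indicator vector of x."""
    n = len(x)
    y = np.zeros(n * num_labels)
    for p, label in enumerate(x):
        y[p * num_labels + label] = 1.0
    z = np.concatenate(([1.0], y))
    return np.outer(z, z)  # rank-1 by construction


def g_gamma(omega, n, gamma):
    """g_gamma = (||Omega||_F^2 - (n+1)^2) / (2 * gamma), as defined after (27)."""
    return (np.linalg.norm(omega, "fro") ** 2 - (n + 1) ** 2) / (2.0 * gamma)


x_hat = [2, 0, 1, 1]  # n = 4 variables, h = 3 labels
omega = lifted_matrix(x_hat, num_labels=3)
print(np.trace(omega), np.linalg.norm(omega, "fro"), g_gamma(omega, 4, 10.0))
```

Under this assumed structure the trace is $n+1$ and the matrix has rank 1, so Theorem 2 gives $||\varOmega ||_F = n+1$ and $\mathrm {g}_\gamma $ is zero for any $\gamma > 0$.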

Proof of Proposition 2

$\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}$ is the optimal solution of (27) based on the strong duality, and $\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )$ is the corresponding optimal objective value. Consider the following problem

$$\begin{aligned}&\min _{\mathbf {y},\mathbf {Y}} \,\, \mathrm {E}(\mathbf {y},\mathbf {Y}) + \mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) \end{aligned}$$

(29a)

$$\begin{aligned}&\mathrm {s.t.}\,\, \varOmega (\mathbf {y}, \mathbf {Y}) \succcurlyeq \mathbf {0}, \, \mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1, \end{aligned}$$

(29b)

$$\begin{aligned}&\qquad (12), (13), (14), (15), (16), (17), (18), (19), \end{aligned}$$

(29c)

which adds a rank-1 constraint to the problem (27). Then $\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )$ and $\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}$ are also optimal for the above problem. Note that the constraints (12), (13), $\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star }) \succcurlyeq \mathbf {0}$ and $\mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1$, force $\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}$ to be a vertex of $\mathcal {M}(\mathcal {G},\mathcal {Z})$. So the feasible set of (29) is $\mathcal {M}(\mathcal {G},\mathcal {Z})$. On the other hand, $\mathrm {g}_\gamma (\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star ) = 0$ at $\mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1$ (Theorem 2), so the objective function of (29) is $\mathrm {E}(\mathbf {y},\mathbf {Y}) $. In summary, the problem (29) is equivalent to the MAP problem $\displaystyle {\min _{\mathbf {y},\mathbf {Y}\in \mathcal {M}(\mathcal {G},\mathcal {Z})}} \mathrm {E}(\mathbf {y},\mathbf {Y})$. Then we have that $\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star $ yield the exact MAP solution and $\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )$ is the minimum energy. The value of $\gamma $ does not affect the above results. $\square $

Cite this article

Wang, P., Shen, C., van den Hengel, A. et al. Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference. Int J Comput Vis 117, 269–289 (2016). https://doi.org/10.1007/s11263-015-0865-2

Received: 17 December 2014
Accepted: 05 October 2015
Published: 24 October 2015
Issue Date: May 2016
DOI: https://doi.org/10.1007/s11263-015-0865-2
