
Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference

International Journal of Computer Vision

Abstract

We propose a branch-and-cut (B&C) method for solving general MAP-MRF inference problems. The core of our method is a very efficient bounding procedure, which combines scalable semidefinite programming (SDP) and a cutting-plane method for seeking violated constraints. In order to further speed up the computation, several strategies have been exploited, including model reduction, warm start and removal of inactive constraints. We analyze the performance of the proposed method under different settings, and demonstrate that our method either outperforms or performs on par with state-of-the-art approaches. Especially when the connectivities are dense or when the relative magnitudes of the unary costs are low, we achieve the best reported results. Experiments show that the proposed algorithm achieves better approximation than the state-of-the-art methods within a variety of time budgets on challenging non-submodular MAP-MRF inference problems.


Notes

  1. In the following experiments, we find that odd-wheel inequalities are effective only on the modularity clustering models; this class of constraints is therefore not considered for the other models.

  2. http://www.di.ens.fr/~mschmidt/Software/UGM.html.

  3. http://icl.cs.utk.edu/plasma/.

References

  • Achterberg, T., Koch, T., & Martin, A. (2005). Branching rules revisited. Operations Research Letters, 33(1), 42–54.


  • Aji, S. M., Horn, G. B., & McEliece, R. J. (1998). On the convergence of iterative decoding on graphs with a single cycle. In Proceedings of ISIT.

  • Alahari, K., Kohli, P., & Torr, P. H. (2008). Reduce, reuse & recycle: Efficiently solving multi-label MRFs. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).

  • Alizadeh, F., Haeberly, J.-P. A., & Overton, M. L. (1998). Primal-dual interior-point methods for semidefinite programming: Convergence rates, stability and numerical results. SIAM Journal on Optimization, 8(3), 746–768.


  • Andersen, E. D., Roos, C., & Terlaky, T. (2003). On implementing a primal-dual interior-point method for conic quadratic optimization. Mathematical Programming, 95(2), 249–277.


  • Andres, B., Beier, T., & Kappes, J. H. (2014). OpenGM2. Retrieved from http://hci.iwr.uni-heidelberg.de/opengm2/.

  • Armbruster, M., Fügenschuh, M., Helmberg, C., & Martin, A. (2012). LP and SDP branch-and-cut algorithms for the minimum graph bisection problem: A computational comparison. Mathematical Programming Computation, 4(3), 275–306.


  • Arora, S., & Kale, S. (2007). A combinatorial, primal-dual approach to semidefinite programs. In Proceedings of Annual ACM Symposium on Theory of Computing (pp. 227–236).

  • Arora, S., Hazan, E., & Kale, S. (2005). Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In Proceedings of Annual IEEE Symposium on Foundations of Computer Science (pp. 339–348).

  • Arora, S., Hazan, E., & Kale, S. (2012). The multiplicative weights update method: A meta-algorithm and applications. Theory of Computing, 8(1), 121–164.


  • Barahona, F., & Mahjoub, A. R. (1986). On the cut polytope. Mathematical Programming, 36(2), 157–173.

  • Batra, D., Nowozin, S., & Kohli, P. (2011). Tighter relaxations for MAP-MRF inference: A local primal-dual gap based separation algorithm. In Proceedings International Conference on Artificial Intelligence and Statistics (pp. 146–154).

  • Bayati, M., Shah, D., & Sharma, M. (2005). Maximum weight matching via max-product belief propagation. In Proceedings of IEEE International Symposium on Information Theory (pp. 1763–1767).

  • Besag, J. (1986). On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society Series B (Methodological), 48, 259–302.


  • Bonato, T., Jünger, M., Reinelt, G., & Rinaldi, G. (2014). Lifting and separation procedures for the cut polytope. Mathematical Programming, 146(1–2), 351–378.


  • Boykov, Y., Veksler, O., & Zabih, R. (2001). Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11), 1222–1239.


  • Burer, S., & Monteiro, R. D. (2003). A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2), 329–357.


  • Burer, S., & Vandenbussche, D. (2008). A finite branch-and-bound algorithm for nonconvex quadratic programming via semidefinite relaxations. Mathematical Programming, 113(2), 259–282.


  • Chopra, S., & Rao, M. R. (1993). The partition problem. Mathematical Programming, 59(1–3), 87–115.


  • Deza, M., Grötschel, M., & Laurent, M. (1992). Clique-web facets for multicut polytopes. Mathematics of Operations Research, 17(4), 981–1000.


  • Deza, M., & Laurent, M. (1997). Geometry of cuts and metrics. Algorithms and combinatorics (Vol. 15). Berlin: Springer.


  • Dinh, T. P., Canh, N. N., & Le Thi, H. A. (2010). An efficient combined DCA and B&B using DC/SDP relaxation for globally solving binary quadratic programs. Journal of Global Optimization, 48(4), 595–632.


  • Elidan, G., & Globerson, A. (2011). The probabilistic inference challenge (PIC2011). Retrieved from http://www.cs.huji.ac.il/project/PASCAL/.

  • Felzenszwalb, P. F., & Huttenlocher, D. P. (2006). Efficient belief propagation for early vision. International Journal of Computer Vision, 70(1), 41–54.


  • Frostig, R., Wang, S., Liang, P. S., & Manning, C. D. (2014). Simple MAP inference via low-rank relaxations. In Proceedings of Advances in Neural Information Processing Systems (pp. 3077–3085).

  • Garber, D., & Hazan, E. (2011). Approximating semidefinite programs in sublinear time. In Proceedings of Advances in Neural Information Processing Systems (pp. 1080–1088).

  • Givry, S.D., Hurley, B., Allouche, D., Katsirelos, G., O’Sullivan, B., & Schiex, T. (2014). An experimental evaluation of CP/AI/OR solvers for optimization in graphical models. In Congrès ROADEF’2014, Bordeaux, FRA.

  • Globerson, A., & Jaakkola, T. S. (2007). Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. In Proceedings of Advances in Neural Information Processing Systems.

  • Goemans, M. X., & Williamson, D. P. (1995). Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, 42, 1115–1145.


  • Gorelick, L., Boykov, Y., Veksler, O., Ayed, I. B., & Delong, A. (2014). Local submodular approximations for binary pairwise energies. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Hazan, E. (2008). Sparse approximate solutions to semidefinite programs. In LATIN 2008: Theoretical Informatics (pp. 306–316).

  • Hazan, T., & Shashua, A. (2010). Norm-product belief propagation: Primal-dual message-passing for approximate inference. IEEE Transactions on Information Theory, 56(12), 6294–6316.


  • Helmberg, C. (1994). An interior point method for semidefinite programming and max-cut bounds, Ph.D. dissertation, Department of Mathematics, Graz University of Technology.

  • Helmberg, C., Poljak, S., Rendl, F., & Wolkowicz, H. (1995). Combining semidefinite and polyhedral relaxations for integer programs. In Proceedings of the 4th International IPCO Conference on Integer Programming and Combinatorial Optimization (pp. 124–134).

  • Helmberg, C., Rendl, F., & Weismantel, R. (1996). Quadratic knapsack relaxations using cutting planes. In Proceedings of the 5th International IPCO Conference on Integer Programming and Combinatorial Optimization (pp. 175–189).

  • Helmberg, C., & Rendl, F. (1998). Solving quadratic (0, 1)-problems by semidefinite programs and cutting planes. Mathematical Programming, 82(3), 291–315.


  • Helmberg, C., Rendl, F., & Weismantel, R. (2000). A semidefinite programming approach to the quadratic knapsack problem. Journal of Combinatorial Optimization, 4(2), 197–215.


  • Helmberg, C., & Weismantel, R. (1998). Cutting plane algorithms for semidefinite relaxations. Fields Institute Communications, 18, 197–213.


  • Hendrix, E. M., Boglárka, G., et al. (2010). Introduction to nonlinear and global optimization. Berlin: Springer.


  • Horst, R., Pardalos, P. M., & Van Thoai, N. (2000). Introduction to global optimization. Berlin: Springer.


  • Horst, R., & Tuy, H. (2013). Global optimization: Deterministic approaches. Berlin: Springer.


  • Huang, Q., Chen, Y., & Guibas, L. (2014). Scalable semidefinite relaxation for maximum a posterior estimation. In Proceedings of International Conference on Machine Learning.

  • IBM. (2015). ILOG CPLEX optimizer. Retrieved from http://www-01.ibm.com/software/integration/optimization/cplex-optimizer/.

  • Joachims, T., Finley, T., & Yu, C.-N. J. (2009). Cutting-plane training of structural SVMs. Machine Learning, 77(1), 27–59.

  • Johnson, J. K. (2008). Convex relaxation methods for graphical models: Lagrangian and maximum entropy approaches, Ph.D. dissertation, Massachusetts Institute of Technology, 2008.

  • Johnson, J. K., Malioutov, D. M., & Willsky, A. S. (2007). Lagrangian relaxation for MAP estimation in graphical models. In Annual Allerton Conference on Communication, Control, and Computing.

  • Jojic, V., Gould, S., & Koller, D. (2010). Accelerated dual decomposition for MAP inference. In Proceedings of International Conference on Machine Learning (pp. 503–510).

  • Jordan, M. I., & Wainwright, M. J. (2003). Semidefinite relaxations for approximate inference on graphs with cycles. In Proceedings of Advances in Neural Information Processing Systems (vol. 16, pp. 369–376).

  • Joulin, A., Bach, F., & Ponce, J. (2010). Discriminative clustering for image co-segmentation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Kappes, J. H., Andres, B., Hamprecht, F. A., Schnörr, C., Nowozin, S., Batra, D., Kim, S., Kausler, B. X., Kröger, T., Lellmann, J., Komodakis, N., Savchynskyy, B., & Rother, C. (2015). A comparative study of modern inference techniques for structured discrete energy minimization problems. International Journal of Computer Vision.

  • Kappes, J. H., Savchynskyy, B., & Schnörr, C. (2012). A bundle approach to efficient MAP-inference by Lagrangian relaxation. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1688–1695).

  • Kappes, J. H., Schmidt, S., & Schnörr, C. (2010). MRF inference by k-fan decomposition and tight Lagrangian relaxation. In Proceedings of European Conference on Computer Vision (pp. 735–747).

  • Kappes, J. H., Speth, M., Andres, B., Reinelt, G., & Schnörr, C. (2011). Globally optimal image partitioning by multicuts. In Energy Minimization Methods in Computer Vision and Pattern Recognition (pp. 31–44).

  • Kappes, J. H., Speth, M., Reinelt, G., & Schnörr, C. (2013). Higher-order segmentation via multicuts. arXiv:1305.6387, preprint.

  • Kappes, J. H., Speth, M., Reinelt, G., & Schnörr, C. (2013). Towards efficient and exact MAP-inference for large scale discrete computer vision problems via combinatorial optimization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1752–1758).

  • Kernighan, B., & Lin, S. (1970). An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal, 49(2), 291–307.

  • Kim, W., & Lee, K. M. (2011). A hybrid approach for MRF optimization problems: Combination of stochastic sampling and deterministic algorithms. Computer Vision and Image Understanding, 115(12), 1623–1637.


  • Kohli, P., Shekhovtsov, A., Rother, C., Kolmogorov, V., & Torr, P. (2008). On partial optimality in multi-label MRFs. In Proceedings of International Conference on Machine Learning (pp. 480–487).

  • Kolmogorov, V., & Rother, C. (2006). Comparison of energy minimization algorithms for highly connected graphs. In Proceedings of European Conference on Computer Vision (pp. 1–15).

  • Kolmogorov, V., & Wainwright, M. J. (2005). On the optimality of tree-reweighted max-product message-passing. In Proceedings of Uncertainty in Artificial Intelligence.

  • Kolmogorov, V. (2006). Convergent tree-reweighted message passing for energy minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10), 1568–1583.


  • Kolmogorov, V., & Rother, C. (2007). Minimizing nonsubmodular functions with graph cuts-a review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1274–1279.


  • Kolmogorov, V., & Zabih, R. (2004). What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(2), 147–159.

  • Komodakis, N., & Paragios, N. (2008). Beyond loose LP-relaxations: Optimizing MRFs by repairing cycles. In Proceedings of European Conference on Computer Vision (pp. 806–820).

  • Komodakis, N., Paragios, N., & Tziritas, G. (2011). MRF energy minimization and beyond via dual decomposition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3), 531–552.


  • Kovtun, I. (2011). Sufficient condition for partial optimality for (max,+)-labeling problems and its usage. Control Systems and Computers, 2, 71–78.


  • Kumar, M. P., Torr, P. H., & Zisserman, A. (2006). Solving Markov random fields using second order cone programming relaxations. In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (vol. 1, pp. 1045–1052).

  • Kumar, M. P., Kolmogorov, V., & Torr, P. H. S. (2009). An analysis of convex relaxations for MAP estimation of discrete MRFs. Journal of Machine Learning Research, 10, 71–106.


  • Land, A., & Powell, S. (1979). Computer codes for problems of integer programming. Annals of Discrete Mathematics, 5, 221–269.


  • Laue, S. (2012). A hybrid algorithm for convex semidefinite optimization. In Proceedings of International Conference on Machine Learning (pp. 177–184).

  • Linderoth, J. T., & Savelsbergh, M. W. (1999). A computational study of search strategies for mixed integer programming. INFORMS Journal on Computing, 11(2), 173–187.


  • Liu, F., Lin, G., & Shen, C. (2015). CRF learning with CNN features for image segmentation. Pattern Recognition, 48(10), 2983–2992.


  • Malick, J. (2007). The spherical constraint in boolean quadratic programs. Journal of Global Optimization, 39(4), 609–622.


  • Malick, J., Povh, J., Rendl, F., & Wiegele, A. (2009). Regularization methods for semidefinite programming. SIAM Journal on Optimization, 20(1), 336–356.


  • Mars, S., & Schewe, L. (2012). SDP-package for SCIP, TU Darmstadt, Technical Report.

  • Martins, A. F., Figueiredo, M. A., Aguiar, P. M., Smith, N. A., & Xing, E. P. (2011a). An augmented Lagrangian approach to constrained MAP inference. In Proceedings of International Conference on Machine Learning (pp. 169–176).

  • Martins, A. F., Smith, N. A., Xing, E. P., Aguiar, P. M., & Figueiredo, M. A. (2011b). Augmenting dual decomposition for MAP inference. In Proceedings of International Workshop on Optimization for Machine Learning.

  • Meshi, O., & Globerson, A. (2011). An alternating direction method for dual MAP LP relaxation. In Machine Learning and Knowledge Discovery in Databases. (pp. 470–483). Springer, Berlin.

  • Mitra, G. (1973). Investigation of some branch and bound strategies for the solution of mixed integer linear programs. Mathematical Programming, 4(1), 155–170.


  • Nesterov, Y. E., & Todd, M. J. (1998). Primal-dual interior-point methods for self-scaled cones. SIAM Journal on Optimization, 8(2), 324–364.


  • Olsson, C., Eriksson, A. P., & Kahl, F. (2007). Solving large scale binary quadratic problems: Spectral methods vs. semidefinite programming. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–8).

  • Otten, L., & Dechter, R. (2012). Anytime AND/OR depth-first search for combinatorial optimization. AI Communications, 25(3), 211–227.


  • Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.


  • Peng, J., Hazan, T., Srebro, N., & Xu, J. (2012). Approximate inference by intersecting semidefinite bound and local polytope. In Proceedings of International Conference on Artificial Intelligence and Statistics (pp. 868–876).

  • Raj, A., & Zabih, R. (2005). A graph cut algorithm for generalized image deconvolution. In Proceedings of IEEE International Conference on Computer Vision.

  • Ravikumar, P., & Lafferty, J. (2006). Quadratic programming relaxations for metric labeling and Markov random field MAP estimation. In Proceedings of International Conference on Machine Learning (pp. 737–744).

  • Ravikumar, P., Agarwal, A., & Wainwright, M. J. (2010). Message-passing for graph-structured linear programs: Proximal methods and rounding schemes. Journal of Machine Learning Research, 11, 1043–1080.


  • Krislock, N., Malick, J., & Roupin, F. (2012). Improved semidefinite bounding procedure for solving max-cut problems to optimality. Mathematical Programming.

  • Rockafellar, R. T. (1973). A dual approach to solving nonlinear programming problems by unconstrained optimization. Mathematical Programming, 5(1), 354–373.


  • Rother, C., Kolmogorov, V., Lempitsky, V., & Szummer, M. (2007). Optimizing binary MRFs via extended roof duality. In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (pp. 1–8).

  • Savchynskyy, B., Schmidt, S., Kappes, J., & Schnörr, C. (2012). Efficient MRF energy minimization via adaptive diminishing smoothing. In Proceedings of Uncertainty in Artificial Intelligence.

  • Schellewald, C., & Schnörr, C. (2005). Probabilistic subgraph matching based on convex relaxation. In Workshop of IEEE Conference on Computer Vision and Pattern Recognition (pp. 171–186).

  • Shekhovtsov, A. (2014). Maximum persistency in energy minimization. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1162–1169).

  • Shekhovtsov, A., Swoboda, P., & Savchynskyy, B. (2015). Maximum persistency via iterative relaxed inference with graphical models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 521–529).

  • Shen, C., Kim, J., & Wang, L. (2011). A scalable dual approach to semidefinite metric learning. In Proceedings of Conference on Computer Vision and Pattern Recognition (pp. 2601–2608).

  • Shlezinger, M. (1976). Syntactic analysis of two-dimensional visual signals in noisy conditions. Kibernetika, 4, 113–130.


  • Sontag, D., & Jaakkola, T. S. (2007). New outer bounds on the marginal polytope. In Proceedings of Advances in Neural Information Processing Systems.

  • Sontag, D., Choe, D. K., & Li, Y. (2012). Efficiently searching for frustrated cycles in MAP inference. In Proceedings of Uncertainty in Artificial Intelligence.

  • Sontag, D., Meltzer, T., Globerson, A., Jaakkola, T. S., & Weiss, Y. (2008). Tightening LP relaxations for MAP using message passing. In Proceedings of Uncertainty in Artificial Intelligence.

  • Sun, J., Shum, H.-Y., & Zheng, N.-N. (2002). Stereo matching using belief propagation. In Proceedings of European Conference on Computer Vision (pp. 510–524).

  • Sun, M., Telaprolu, M., Lee, H., & Savarese, S. (2012). Efficient and exact MAP inference using branch and bound. In Proceedings of International Conference on Artificial Intelligence and Statistics.

  • Swoboda, P., Savchynskyy, B., Kappes, J., & Schnörr, C. (2013). Partial optimality via iterative pruning for the Potts model. In SSVM.

  • Swoboda, P., Savchynskyy, B., Kappes, J. H., & Schnörr, C. (2014). Partial optimality by pruning for MAP-inference with general graphical models. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (pp. 1170–1177).

  • Szeliski, R., Zabih, R., Scharstein, D., Veksler, O., Kolmogorov, V., Agarwala, A., et al. (2008). A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(6), 1068–1080.

  • Topkis, D. (1982). A cutting-plane algorithm with linear and geometric rates of convergence. Journal of Optimization Theory and Applications, 36(1), 1–22.


  • Torr, P. H. S. (2003). Solving Markov random fields using semidefinite programming. In Proceedings of International Conference on Artificial Intelligence and Statistics.

  • Tütüncü, R. H., Toh, K. C., & Todd, M. J. (2003). Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming, 95(2), 189–217.


  • Wainwright, M. J., Jaakkola, T. S., & Willsky, A. S. (2005). MAP estimation via agreement on trees: Message-passing and linear programming. IEEE Transactions on Information Theory, 51(11), 3697–3717.


  • Wainwright, M. J., & Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2), 1–305.


  • Wang, P., Shen, C., & Hengel, A. V. D. (2015). Efficient SDP inference for fully-connected CRFs based on low-rank decomposition. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Wang, P., Shen, C., & van den Hengel, A. (2013). A fast semidefinite approach to solving binary quadratic problems. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

  • Weiss, Y. (2000). Correctness of local probability propagation in graphical models with loops. Neural Computation, 12(1), 1–41.


  • Weiss, Y., & Freeman, W. T. (2001). On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs. IEEE Transactions on Information Theory, 47(2), 736–744.


  • Wen, Z., Goldfarb, D., & Yin, W. (2010). Alternating direction augmented Lagrangian methods for semidefinite programming. Mathematical Programming Computation, 2(3–4), 203–230.


  • Werner, T. (2007). A linear programming approach to max-sum problem: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1165–1179.


  • Werner, T. (2008). High-arity interactions, polyhedral relaxations, and cutting plane algorithm for soft constraint optimisation (MAP-MRF). In Proceedings of IEEE Conference on Computer Vision Pattern Recognition (pp. 1–8).

  • Windheuser, T., Ishikawa, H., & Cremers, D. (2012). Generalized roof duality for multi-label optimization: Optimal lower bounds and persistency. In Proceedings of European Conference on Computer Vision (pp. 400–413).

  • Ye, Y., Todd, M. J., & Mizuno, S. (1994). An O (\(\sqrt{nL}\))-iteration homogeneous and self-dual linear programming algorithm. Mathematics of Operations Research, 19(1), 53–67.


  • Zhao, X.-Y., Sun, D., & Toh, K.-C. (2010). A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM Journal on Optimization, 20(4), 1737–1765.


  • Zhu, C., Byrd, R. H., Lu, P., & Nocedal, J. (1997). L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization. ACM Transactions on Mathematical Software, 23(4), 550–560.


Author information


Corresponding author

Correspondence to Chunhua Shen.

Additional information

Communicated by Yuri Boykov.

Appendix


1.1 Appendix 1: Relationship between the Standard SDP Relaxation (11) and the Simplified Dual (21)

The Lagrangian dual of (11) can be expressed in the following general form:

$$\begin{aligned}&\min _{\mathbf {u}} \quad \mathbf {u}^{\top }\mathbf {b}\end{aligned}$$
(25a)
$$\begin{aligned}&\, \mathrm {s.t.}\,\quad \mathbf {Z}= \mathbf {A}+ \textstyle {\sum _{i=1}^m} u_i \mathbf {B}_i \succcurlyeq \mathbf {0}, \end{aligned}$$
(25b)
$$\begin{aligned}&\qquad \quad u_i \ge 0, \forall i \in \mathcal {I}_{in}. \end{aligned}$$
(25c)

The p.s.d. constraint (25b) can be replaced by a penalty function that measures how strongly the constraint is violated. In our case, the penalty function is defined as \(\mathrm {p}(\mathbf {u}) = ||\min (\mathbf {0}, {\varvec{\lambda }}) ||_2^2 = ||\varPi _{\mathcal {S}^{nh+1}_+} (\mathbf {C}(\mathbf {u})) ||_F^2 \), where \({\varvec{\lambda }}\) is the vector of eigenvalues of \(\mathbf {Z}\). Note that \(\mathrm {p}(\mathbf {u}) = 0\) if and only if every eigenvalue of \(\mathbf {Z}\) is nonnegative, that is, if and only if \(\mathbf {Z}\succcurlyeq \mathbf {0}\). The problem (25) can therefore be transformed to

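$$\begin{aligned}&\min _{\mathbf {u}} \quad \mathbf {u}^{\top }\mathbf {b} + \gamma \, \mathrm {p}(\mathbf {u}) \end{aligned}$$
(26a)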
$$\begin{aligned} \mathrm {s.t.}\,\,\quad u_i \ge 0, \forall i \in \mathcal {I}_{in}, \end{aligned}$$
(26b)

where \(\gamma > 0\) serves as a penalty parameter. As \(\gamma \) increases, the solution of (26) converges to that of (25). Problem (26) is equivalent to (21).
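To make the penalty concrete, the following is a minimal NumPy sketch, not the authors' implementation. It evaluates \(\mathrm {p}\) from the eigenvalues of a symmetric matrix and compares the result with the squared Frobenius norm of the negative semidefinite part of that matrix. The function names and the two small test matrices are illustrative assumptions. The sketch also assumes \(\mathbf {C}(\mathbf {u}) = -\mathbf {Z}\), a definition that belongs to the main text and is not restated in this appendix; under that assumption the projection of \(\mathbf {C}(\mathbf {u})\) onto the p.s.d. cone has the same norm as the negative part of \(\mathbf {Z}\).

```python
import numpy as np


def psd_violation_penalty(Z):
    """Return ||min(0, lambda)||_2^2, where lambda holds the eigenvalues of symmetric Z."""
    eigvals = np.linalg.eigvalsh(Z)
    return float(np.sum(np.minimum(eigvals, 0.0) ** 2))


def negative_part(Z):
    """Project symmetric Z onto the negative semidefinite cone (keep eigenvalues <= 0)."""
    eigvals, eigvecs = np.linalg.eigh(Z)
    clipped = np.minimum(eigvals, 0.0)
    # Scale column j of eigvecs by clipped[j], then recombine: V diag(clipped) V^T.
    return (eigvecs * clipped) @ eigvecs.T


# Illustrative matrices (assumptions, not data from the paper).
Z_psd = np.array([[2.0, 1.0], [1.0, 2.0]])    # eigenvalues 1 and 3
Z_indef = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvalues -1 and 3

p_psd = psd_violation_penalty(Z_psd)
p_indef = psd_violation_penalty(Z_indef)
frob_sq = np.linalg.norm(negative_part(Z_indef), "fro") ** 2

print(p_psd, p_indef, frob_sq)
```

The penalty is zero exactly when no eigenvalue is negative, and otherwise grows with the squared size of the negative eigenvalues, which is why it can stand in for the constraint (25b).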

1.2 Appendix 2: Proof of Propositions 1 and 2

First, it is known (Malick 2007; Wang et al. 2013) that, for any \(\eta > 0\), the set of p.s.d. matrices with fixed trace \(\varTheta _\eta := \{ \mathbf {X}\succcurlyeq \mathbf {0} \, | \, \mathrm {trace}(\mathbf {X}) = \eta \}\) has the following property:

Theorem 2

(The spherical constraint). \(\forall \eta >0, \forall \mathbf {X}\in \varTheta _\eta \), we have \(||\mathbf {X}||_{F} \le \eta \), and \(||\mathbf {X}||_{F} = \eta \) if and only if \(\mathrm {rank}(\mathbf {X}) = 1\).
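The theorem follows from the eigenvalues of \(\mathbf {X}\): they are nonnegative and sum to \(\eta \), so the sum of their squares is at most \(\eta ^2\), with equality only when a single eigenvalue is nonzero. The short NumPy sketch below checks this numerically on random matrices. It is an illustration under assumed names (`random_fixed_trace_psd`), dimensions and seed, not part of the original proof.

```python
import numpy as np


def random_fixed_trace_psd(dim, rank, eta, rng):
    """Draw a random p.s.d. matrix of the given rank and rescale it to trace eta."""
    factor = rng.standard_normal((dim, rank))
    X = factor @ factor.T  # p.s.d. by construction
    return X * (eta / np.trace(X))


rng = np.random.default_rng(0)
eta = 5.0

X_full = random_fixed_trace_psd(6, 6, eta, rng)   # full rank with probability one
X_rank1 = random_fixed_trace_psd(6, 1, eta, rng)  # rank one

norm_full = np.linalg.norm(X_full, "fro")
norm_rank1 = np.linalg.norm(X_rank1, "fro")

print(norm_full < eta, np.isclose(norm_rank1, eta))
```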

It is also shown in Wang et al. (2013) that the problem (21) is the Lagrangian dual of the following problem:

$$\begin{aligned}&\min _{\mathbf {y},\mathbf {Y}} \,\, \mathrm {E}(\mathbf {y},\mathbf {Y}) + \mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) \end{aligned}$$
(27a)
$$\begin{aligned}&\,\mathrm {s.t.}\,\, (12), (13), (14), (15), (16), (17), (18), (19),\end{aligned}$$
(27b)
$$\begin{aligned}&\,\qquad \varOmega (\mathbf {y}, \mathbf {Y}) \succcurlyeq \mathbf {0}, \end{aligned}$$
(27c)

where \(\mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) = \frac{1}{2\gamma }(||\varOmega (\mathbf {y},\mathbf {Y}) ||^2_F - (n+1)^2)\).

Proof of Proposition 1

(i) \(\forall \mathcal {D}_1 \subseteq \mathcal {D}_2 \subseteq \mathcal {Z}^n\), \(\exists \mathcal {F}_{in}, \mathcal {F}_{eq} \subseteq \{(p,i) \}_{p \in \mathcal {V}, i \in \mathcal {Z}}\) such that \(\mathcal {D}_1 = \{ \mathbf { x}\in \mathcal {D}_2 \ | \ x_p \ne i, \forall (p,i) \in \mathcal {F}_{in}; x_p = i, \forall (p,i) \in \mathcal {F}_{eq} \}\). Consequently, the SDCut primal formulation (27) with respect to \(\mathcal {D}_1\) differs from the one with respect to \(\mathcal {D}_2\) in that it contains the following additional linear constraints:

$$\begin{aligned} \left\{ \begin{array}{ll} y_{p,i} = 0, Y_{pi,qj} = Y_{qj,pi} = 0, &{}\forall (p,i) \in \mathcal {F}_{in}, \\ y_{p,i} = 1, Y_{pi,qj} = Y_{qj,pi} = y_{q,j}, &{}\forall (p,i) \in \mathcal {F}_{eq}. \end{array} \right. \end{aligned}$$
(28)

By strong duality, \(\mathrm {d}^\star _\gamma (\mathcal {D})\) equals the optimal value of the corresponding primal problem (27). The primal problem (27) with respect to \(\mathcal {D}_1\) has more constraints than that with respect to \(\mathcal {D}_2\), so its feasible set is no larger and its minimum is no smaller. Hence \(\mathrm {d}^\star _\gamma (\mathcal {D}_1) \ge \mathrm {d}^\star _\gamma (\mathcal {D}_2)\).

(ii) As \(|{\mathcal {D}} |= 1\), there is only one point \(\hat{\mathbf { x}}\) in the set \(\mathcal {D}\) and \(\mathrm {E}(\hat{\mathbf { x}}) = \min _{\mathbf { x}\in \mathcal {D}} \mathrm {E}(\mathbf { x})\). After constraints of the form (28) are applied, the feasible set of (27) also contains a single point \(\{ \hat{\mathbf {y}}, \hat{\mathbf {Y}} \}\), which corresponds to \(\hat{\mathbf { x}}\). Because \(||\varOmega (\hat{\mathbf {y}},\hat{\mathbf {Y}}) ||^2_F = (n+1)^2\), the term \(\mathrm {g}_\gamma (\hat{\mathbf {y}},\hat{\mathbf {Y}})\) vanishes, and we have \(\mathrm {d}^\star _\gamma (\mathcal {D}) = \mathrm {E}(\hat{\mathbf {y}},\hat{\mathbf {Y}}) = \min _{\mathbf { x}\in \mathcal {D}} \mathrm {E}(\mathbf { x})\). \(\square \)
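The identity \(||\varOmega (\hat{\mathbf {y}},\hat{\mathbf {Y}}) ||^2_F = (n+1)^2\) used in part (ii) can be checked on a toy labeling. The sketch below is hedged: it assumes that \(\varOmega (\mathbf {y},\mathbf {Y})\) has the block form \([1, \mathbf {y}^{\top }; \mathbf {y}, \mathbf {Y}]\) and that \(\mathbf {y}\) stacks one one-hot indicator of length \(h\) per node. Both are defined in the main text rather than in this appendix, so they are assumptions here. The helper names and the example labeling are hypothetical.

```python
import numpy as np


def lifted_matrix(labels, num_labels):
    """Build Omega = [1; y][1; y]^T for an integral labeling (assumed block form)."""
    n = len(labels)
    y = np.zeros(n * num_labels)
    for p, label in enumerate(labels):
        y[p * num_labels + label] = 1.0  # one-hot indicator for node p
    v = np.concatenate(([1.0], y))
    return np.outer(v, v)


def g_gamma(omega, n, gamma):
    """Evaluate g_gamma = (||Omega||_F^2 - (n + 1)^2) / (2 * gamma)."""
    return (np.linalg.norm(omega, "fro") ** 2 - (n + 1) ** 2) / (2.0 * gamma)


labels = [2, 0, 1, 1]  # hypothetical labeling: n = 4 nodes, h = 3 labels
omega = lifted_matrix(labels, num_labels=3)

trace_value = np.trace(omega)                   # 1 + n, one active label per node
penalty = g_gamma(omega, n=len(labels), gamma=10.0)
rank = np.linalg.matrix_rank(omega)

print(trace_value, penalty, rank)
```

Under these assumptions the trace is \(n+1\), the matrix has rank one, and the term \(\mathrm {g}_\gamma \) is zero for any \(\gamma > 0\), which matches Theorem 2 with \(\eta = n+1\).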

Proof of Proposition 2

By strong duality, \(\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}\) is the optimal solution of (27), and \(\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )\) is the corresponding optimal objective value. Consider the following problem

$$\begin{aligned}&\min _{\mathbf {y},\mathbf {Y}} \,\, \mathrm {E}(\mathbf {y},\mathbf {Y}) + \mathrm {g}_\gamma (\mathbf {y},\mathbf {Y}) \end{aligned}$$
(29a)
$$\begin{aligned}&\mathrm {s.t.}\,\, \varOmega (\mathbf {y}, \mathbf {Y}) \succcurlyeq \mathbf {0}, \, \mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1, \end{aligned}$$
(29b)
$$\begin{aligned}&\qquad (12), (13), (14), (15), (16), (17), (18), (19), \end{aligned}$$
(29c)

which adds a rank-1 constraint to the problem (27). Because \(\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}\) remains feasible after this constraint is added, \(\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )\) and \(\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}\) are also optimal for the above problem. Note that the constraints (12), (13), \(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star }) \succcurlyeq \mathbf {0}\) and \(\mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1\) force \(\{ \mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \}\) to be a vertex of \(\mathcal {M}(\mathcal {G},\mathcal {Z})\), so the feasible set of (29) is \(\mathcal {M}(\mathcal {G},\mathcal {Z})\). On the other hand, \(\mathrm {g}_\gamma (\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star ) = 0\) when \(\mathrm {rank}(\varOmega ({\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star })) = 1\) (Theorem 2), so the objective function of (29) is \(\mathrm {E}(\mathbf {y},\mathbf {Y}) \). In summary, the problem (29) is equivalent to the MAP problem \(\displaystyle {\min _{\mathbf {y},\mathbf {Y}\in \mathcal {M}(\mathcal {G},\mathcal {Z})}} \mathrm {E}(\mathbf {y},\mathbf {Y})\). Hence \(\mathbf {y}_\gamma ^\star , \mathbf {Y}_\gamma ^\star \) yield the exact MAP solution and \(\mathrm {d}_\gamma (\mathbf {u}^\star _\gamma )\) is the minimum energy. The value of \(\gamma \) does not affect the above results. \(\square \)
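The rank condition in this proof suggests a simple numerical optimality check on a computed \(\varOmega \). The following NumPy sketch shows one way such a check might look; it is not taken from the paper. It assumes the block form \([1, \mathbf {y}^{\top }; \mathbf {y}, \mathbf {Y}]\) for \(\varOmega \), with \(\mathbf {y}\) stacking one-hot indicators per node, and the function names, the tolerance and the toy matrices are all hypothetical.

```python
import numpy as np


def is_rank_one(omega, tol=1e-8):
    """Return True if symmetric omega has one dominant eigenvalue and the rest are near zero."""
    eigvals = np.linalg.eigvalsh(omega)  # sorted in ascending order
    largest = eigvals[-1]
    return bool(largest > tol and np.all(np.abs(eigvals[:-1]) <= tol * largest))


def decode_labels(omega, num_nodes, num_labels):
    """Read y from the first row of omega (assumed block form) and pick one label per node."""
    y = omega[0, 1:]
    return y.reshape(num_nodes, num_labels).argmax(axis=1).tolist()


# Toy example with 2 nodes and 2 labels (assumed data).
v_exact = np.array([1.0, 0.0, 1.0, 1.0, 0.0])  # node 0 takes label 1, node 1 takes label 0
v_other = np.array([1.0, 1.0, 0.0, 1.0, 0.0])  # a different labeling

omega_exact = np.outer(v_exact, v_exact)
omega_mixed = 0.5 * (omega_exact + np.outer(v_other, v_other))  # rank two, not a vertex

exact_flag = is_rank_one(omega_exact)
mixed_flag = is_rank_one(omega_mixed)
decoded = decode_labels(omega_exact, num_nodes=2, num_labels=2)

print(exact_flag, mixed_flag, decoded)
```

When the check fails, as for the mixed matrix, Proposition 2 gives no optimality guarantee for the relaxed solution.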


Cite this article

Wang, P., Shen, C., van den Hengel, A. et al. Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference. Int J Comput Vis 117, 269–289 (2016). https://doi.org/10.1007/s11263-015-0865-2


  • DOI: https://doi.org/10.1007/s11263-015-0865-2
