Keywords

1 Introduction

Image segmentation can be interpreted as a graph partition problem subject to hard constraints, given by seed pixels selected in the image domain [3, 9, 12]. The min-cut/max-flow algorithm, also known simply as Graph Cut (GC) [3], and some methods that can be described according to Image Foresting Transform (IFT) [13], such as Watersheds [12] and Fuzzy Connectedness [9], correspond to the \(\varepsilon _1\)- and \(\varepsilon _{\infty }\)-minimization problems, respectively, within a common framework, sometimes referred to as Generalized Graph Cut (GGC) [4, 8].

Each class of methods has its own drawbacks. While methods from the \(\varepsilon _{\infty }\) family have problems related to irregular boundaries and “leaking” through badly defined borders, the \(\varepsilon _1\) methods suffer from metrication error (“blockiness”), shrinking bias, and higher computational timeFootnote 1 [8].

Methods based on multiple energy minimizations, using iterations of different energy classes to address distinct image parts [2, 10] or iterated merging the results of a single energy class over recomputed subgraphs [28], tend to alleviate the above mentioned problems, but they lose global optimality, since they do not perform a single energy minimization over all image elements. As consequence, it is harder to incorporate high-level priors for object segmentation in these approaches. On the other hand, methods like Oriented Image Foresting Transform (OIFT) [20, 26] are very versatile, supporting several high-level priors, including global properties such as connectedness of the segmented object [19, 22], shape constraints [21, 27] and boundary polarity [20, 26], which allow the customization of the segmentation to a given target object [18]. However, as an extension of the \(\varepsilon _{\infty }\)-minimization problem to directed graphs, OIFT inherits its drawbacks.

In graph-based image segmentation, including the \(\varepsilon _1\)- and \(\varepsilon _{\infty }\)-minimization problems, arc weights are given by a local edge indicator function from image attributes and prior object information [5, 6, 25]. In boundary tracking, an edge integration process, usually based on a path-cost function, combines local edges (arc weights) into meaningful long edge curves, interconnecting a set of anchor points, such that a closed contour is computed for segmentation [14, 17].

Seeded region-based segmentation and boundary tracking methods are usually presented as different competing approaches, with the former being easier to extend to multidimensional images, while the latter is more sensitive to seed positioning errors [3]. In this work, we show that multiple short-range edge integrations can be used to extract curvilinear features all over the image in order to improve seeded region-based segmentation.

Our proposed method differs from [33]’s work and from hybrid approaches, such as the Live Markers paradigm [31], because our method does not require the specification of boundary constraints (anchor points or other point-based soft constraints [16]), while Live Markers requires the selection of anchor points over the object boundary to compute optimum boundary segments, which are turned into internal and external markers for region-based delineation. In our approach, the boundary tracking method is applied locally over the image inside circular regions to extract curvilinear features without the need of any user intervention. It improves the arc-weight assignment from a local edge indicator function to a more general short-range edge integration function. As advantages we have:

  • Segments by local boundary tracking with inconsistent boundary polarity can be penalized.

  • We can favor the segmentation of objects with more regular and smoothed forms by penalizing arcs in segments with accentuated curvature.

  • We can fill boundary gaps at weak edge points to avoid leaking problems in the segmentation.

  • We keep the user interface simple to use without the need for multiple types of user input.

The short-range edge integration helps to circumvent the main problems of the \(\varepsilon _{\infty }\) family (irregular boundaries and “leaking”). We demonstrate these results using edge integration by Live Wire (LW) [14], combined with OIFT [20, 26], due to their complementary strengths, since live wire can be seen as a boundary-based version of the \(\varepsilon _1\)-minimization problem on a dual graph [3, 24]. As result, we have a globally optimal segmentation by OIFT, that can be tailored to a given target object, according to its localized curvilinear features, and other high-level priors already supported by OIFT [19, 21, 22, 27].

For the sake of completeness in presentation, Sect. 2 includes an overview of concepts on image graph and a revision of OIFT and live wire. Section 3 shows the proposed algorithm to compute the localized curvilinear features. In Sect. 4, we evaluate the proposed method, named OIFT with Localized Curvilinear Features (OIFT-LCF) and our conclusions are stated in Sect. 5.

2 Background

An image can be interpreted as a weighted digraph \(G=\langle \mathcal{V},\mathcal{A},\omega \rangle \), whose nodes \(\mathcal{V}\) are the image pixels in its image domain \(\mathcal{V}\subset {\mathbb {Z}}^{n}\), and whose arcs are the ordered pixel pairs \(\langle s,t \rangle \in {\mathcal{A}}\). For example, one can take \(\mathcal{A}\) to consist of all pairs of ordered pixels \(\langle s,t \rangle \) in the Cartesian product \(\mathcal{V} \times \mathcal{V}\) such that \({||}s - t{||} \le \rho \) and \(s\ne t\), where \(\rho \) is a specified constant (e.g., 4-neighborhood, when \(\rho =1\), and 8-neighborhood, when \(\rho =\sqrt{2}\), in case of 2D images). Each arc \(\langle s,t \rangle \in {\mathcal{A}}\) has a weight \(\omega (s,t) \ge 0\). The digraph G is symmetric if for any of its arcs \( \langle s,t \rangle \in \mathcal{A}\), the pair \( \langle t,s \rangle \) is also an arc of G.

For a given image graph \(G=\langle \mathcal{V},\mathcal{A}, \omega \rangle \), a path \({\pi =\langle t_1,t_2,\ldots ,t_n \rangle }\) is a sequence of adjacent pixels (i.e., \(\langle t_i,t_{i+1} \rangle \in \mathcal{A}\), \(i=1,2,\ldots ,n-1\)) with no repeated vertices (\(t_i \ne t_j\) for \(i \ne j\)). A path \({\pi _t=\langle t_1,t_2,\ldots ,t_n = t \rangle }\) is a path with terminus at a pixel t. When we want to explicitly indicate the origin of a path, the notation \(\pi _{s \leadsto t} = \langle t_1=s,t_2,\ldots ,t_n = t \rangle \) may also be used, where s stands for the origin and t for the destination node. A path is trivial when \(\pi _t=\langle t \rangle \). A path \(\pi _t=\pi _s\cdot \langle s,t\rangle \) indicates the extension of a path \(\pi _s\) by an arc \( \langle s,t \rangle \), and \(\pi _{r \leadsto t} = \pi _{r \leadsto s} \cdot \pi _{s \leadsto t}\) indicates the concatenation of two paths.

2.1 Live Wire (LW)

A connectivity function computes a value \(f(\pi _t)\) for any path \(\pi _t\), usually based on arc weights. A path \(\pi _t\) is optimum if \(f(\pi _t) \le f(\tau _t)\) for any other path \(\tau _t\) in G. By selecting to each pixel \(t\in \mathcal{V}\) one optimum path with terminus at t, we obtain the optimum-path value \(V_{opt}(t)\), which is uniquely defined by \(V_{opt}(t) = \min _{\forall \pi _t \;\text {in}\; G} \{ f(\pi _t) \}\).

The live-wire function is given by:

$$\begin{aligned} f_{LW}(\langle t\rangle )= & {} \left\{ \begin{array}{ll} 0 &{} \text {if } t \in \mathcal{S}\\ +\infty &{} \text {otherwise} \end{array}\right. \nonumber \\ f_{LW}(\pi _s \cdot \langle s,t\rangle )= & {} f_{LW}(\pi _s) + \omega (s,t) \end{aligned}$$
(1)

where \(\mathcal{S}\) is a seeds set, usually composed by a single starting anchor point.

The optimum path interconnecting two consecutive anchor points can be computed by a generalization of Dijkstra’s algorithm to more general path-cost functions, known as Image Foresting Transform (IFT) [13]. In the IFT framework, the paths are stored in backward order in a predecessor map \(P : \mathcal{V} \rightarrow \mathcal{V} \cup \{ nil \}\), such that for any pixel \(t\in \mathcal{V}\), a path \(\pi ^{P}_t\) is recursively defined as \(\langle t \rangle \) if \(P(t) = nil\), and \(\pi ^{P}_s\cdot \langle s,t\rangle \) if \(P(t)=s\ne nil\), according to the following algorithm:

figure a

In user-steered image segmentation [14, 17, 24], the computed path from the previous anchor point to the current mouse position is shown to the user, as he moves the cursor, so that the user can interactively select the desired path, that best matches the object boundary, and start a new path search from that point. All previous selected paths are kept unchanged (frozen) during the algorithm, so that their nodes cannot be revisited.

2.2 Oriented Image Foresting Transform (OIFT)

Let \(\mathcal {X} = \{ \mathcal {O} : \mathcal {O} \subseteq \mathcal{V} \}\) be the space of all possible binary segmented objects \(\mathcal {O}\). A seed-based segmentation uses seeds \(\mathcal {S}=\mathcal {S}_o\cup \mathcal {S}_b\subseteq \mathcal{V}\), where \(\mathcal {S}_o\) and \(\mathcal {S}_b\) are object (\(\mathcal {S}_o\subseteq \mathcal {O}\)) and background (\(\mathcal {S}_b\subseteq \mathcal{V}\setminus \mathcal {O}\)) seed sets, respectively. They restrict \(\mathcal {X}\) to \(\mathcal {X}(\mathcal {S}_o, \mathcal {S}_b)=\{\mathcal {O}\in \mathcal {X}:\mathcal {S}_o\subseteq \mathcal {O}\subseteq \mathcal{V}\setminus \mathcal {S}_b\}\). A cut is defined as \(\mathcal {C}(\mathcal {O}) = \{\langle s,t\rangle \in \mathcal{A}:s\in \mathcal {O} \text { and } t\notin {\mathcal {O}}\}\). We can associate an energy value \(\varepsilon (\mathcal {O})\) to an object (and its cut), and restrict the set of solutions to those which minimizes it. Let energy \(\varepsilon _q(\mathcal {O}) = (\sum _{\langle s,t\rangle \in \mathcal {C}(\mathcal {O})} \omega (s,t)^q)^\frac{1}{q}\). The original Graph Cut algorithm minimizes \( \varepsilon _1(\mathcal {O}) \), while OIFT minimizes \(\varepsilon _\infty (\mathcal {O}) = \max _{\langle s,t\rangle \in \mathcal {C}(\mathcal {O})}\omega (s,t)\) [8, 20].

In this work, we will present OIFT in its equivalent dual form as a maximization problem of the energy \(\bar{\varepsilon }_\infty (\mathcal {O}) = \min _{\langle s,t\rangle \in \mathcal {C}(\mathcal {O})}\omega ^\prime (s,t)\), in a strongly connected and symmetric digraph \(G=\langle \mathcal{V},\mathcal{A},\omega ^\prime \rangle \), where the weights \(\omega ^\prime (s,t)\) are a combination of an undirected dissimilarity measure \(\delta (s,t)\) between neighboring pixels s and t, multiplied by an orientation factor, as follows:

$$\begin{aligned} \omega ^\prime (s,t)= & {} \left\{ \begin{array}{ll} \delta (s,t)\times (1+\alpha ) &{} \text {if } I(s) > I(t) \\ \delta (s,t)\times (1-\alpha ) &{} \text {if } I(s) < I(t)\\ \delta (s,t) &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(2)

where \(\alpha \in [-1,1]\) and I(t) is the image intensity at pixel t. In this work, we consider \(\alpha = 50\%\) in order to get a more balanced solution. Different procedures can be adopted for \(\delta (s,t)\), as discussed in [5, 25], such as the mean gradient magnitude (i.e., \(\delta (s,t)=\frac{{||}\nabla I(s){||} + {||}\nabla I(t){||}}{2}\)). Note that we usually have \(\omega ^\prime (s,t) \ne \omega ^\prime (t,s)\) when \(\alpha \ne 0\). For colored images, a reference map should be considered for I(t) in Eq. 2, or \(\alpha \) must be set to zero [20]. OIFT is build upon the IFT framework by considering the following path function [20]:

(3)

The segmented object \(\mathcal{O}\) by OIFT is defined from the forest P computed by Algorithm 1, with , by taking as object pixels the set of pixels that were conquered by paths rooted in \(\mathcal{S}_o\). For \(\alpha > 0\), the segmentation by OIFT favors transitions from bright to dark pixels, and \(\alpha < 0\) favors the opposite orientation [20, 26].

3 OIFT with Localized Curvilinear Features (OIFT-LCF)

In OIFT with Localized Curvilinear Features, the live-wire method for boundary tracking is applied locally over the image inside circular regions to extract curvilinear features without the need of any user intervention. For each pixel c of the image, we consider a circular disc \(\mathcal{D}(c) = \{ t \in {\mathbb {Z}}^{n} \mid {||}t - c{||} \le R \}\) of radius R centered at c. The optimum path \(\pi _{a \leadsto c}\) from a pixel a in the disc boundary \(\mathcal{B}(c) = \left\{ s \in \mathcal{D}(c) \mid \exists t \notin \mathcal{D}(c) \text { such that } {||}s-t{||} \le 1 \right\} \) to the central pixel c is computed by Algorithm 1 with the live-wire function \(f_{LW}\), using \(\mathcal{S} = \mathcal{B}(c)\) and the local edge indicator function \(\omega (s,t)\) defined by:

$$\begin{aligned} \omega (s,t)= & {} \left\{ \begin{array}{ll} (\frac{\bar{G}(s) + \bar{G}(t)}{2} \times (1+\gamma ) )^\beta + {||}s - t{||} &{} \text {if } I(r) > I(l)\\ (\frac{\bar{G}(s) + \bar{G}(t)}{2} \times (1-\gamma ) )^\beta + {||}s - t{||} &{} \text {if } I(r) < I(l)\\ (\frac{\bar{G}(s) + \bar{G}(t)}{2})^\beta + {||}s - t{||} &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(4)

where \(\bar{G}(t)\) is the complement of the magnitude of some gradient like image G(t) at pixel t, l and r are the neighboring left and right pixels of the arc \(\langle s,t\rangle \) (Fig. 1) and we usually have \(\gamma = 50\%\). The parameter \(\gamma \) is used to penalize segments with inconsistent boundary polarity, by favoring a particular boundary orientation. For example, \(\gamma = 50\%\) improves arcs with the right pixel being darker than its left pixel (Fig. 2).

We then compute the optimum path from c to the disc boundary \(\mathcal{B}(c)\), such that we end with a composite path \(\pi _{a \leadsto b} = \pi _{a \leadsto c} \cdot \pi _{c \leadsto b} =\langle p_1=a,p_2,\ldots , p_k = c, \ldots ,p_m=b \rangle \), connecting two boundary points of the disc and passing through its center. In this short-range edge integration, the live-wire segments attach to the objects’ boundaries, so that we can extract important contour information from the underlying objects all over the image (Fig. 3a).

For each pixel c, we can extract curvilinear features from the composite path \(\pi _{a \leadsto c} \cdot \pi _{c \leadsto b}\), such as the following mean curvature measure:

$$\begin{aligned} \text {Curv}(\langle p_1,..,p_m \rangle ) = \frac{1}{(L/2)} \cdot \sum _{i=L/2}^{L-1} \frac{{||}p_{k-i} - 2\cdot c + p_{k+i}{||}}{{||}p_{k-i} - c{||}+{||}c - p_{k+i}{||}} \end{aligned}$$
(5)

where \(L = \min \{ \text {Length}(\pi _{a \leadsto c}), \text {Length}(\pi _{c \leadsto b}) \} + 1 = \min \{k-1, m-k\} + 1\).

We can then improve the arc-weight assignment for the OIFT method from a local edge indicator function \(\delta (s,t) = \frac{G(s) + G(t)}{2}\) to a more general short-range edge integration function \(\delta (s,t) = \frac{G_{LCF}(s) + G_{LCF}(t)}{2}\), where \(\bar{G}_{LCF}(t) = \bar{G}(t) \times (1 + \text {Curv}^2(\langle p_1,\ldots ,p_m \rangle ))\). This OIFT with localized curvilinear features (OIFT-LCF) helps to circumvent the irregular boundaries of the original OIFT (Figs. 3 and 4). Curvature regularity for region-based image segmentation usually results in an NP-hard problem, and linear programming relaxation with thresholding is used to obtain an approximate solution [30]. In our method, the curvilinear features can be fast computed using localized live-wire executions, which can be calculated in parallel for different regions of the image. Since both live wire and OIFT take linearithmic time in worst-case scenarios and live wire is computed only in circular discs of fixed size, the complexity of OIT-LCF is linearithmic.

Fig. 1.
figure 1

The neighboring left and right pixels for each arc.

Fig. 2.
figure 2

Boundary tracking with boundary polarity. The arcs with the desired orientation (right pixels darker than their left pixels) are shown with thicker lines, while arcs with incorrect orientation are shown with dashed lines. (a) Segments by live wire with the correct boundary orientation. (b) Segments with inconsistent polarity.

Fig. 3.
figure 3

(a) Curvilinear features can be extract by computing localized live wire segments. (b–c) We can favor the segmentation of objects with more regular and smoothed forms by penalizing arcs in segments with accentuated curvature. (d–f) An example using MR images of the brain.

Fig. 4.
figure 4

The segmentation of the brain external surface in a MR-T1 image. OIFT-LCF gives the most regular contour, following more closely the dura mater, due to its curvature analysis.

Another possibility is to consider in OIFT-LCF:

$$\begin{aligned} \bar{G}_{LCF}(t) = \left( \frac{f_{LW}(\pi _{a \leadsto c})+f_{LW}(\pi _{c \leadsto b})}{\text {Length}(\pi _{a \leadsto c} \cdot \pi _{c \leadsto b})}\right) ^{1/\beta } \times (1 + \text {Curv}^2) \end{aligned}$$
(6)

which helps to circumvent “leaking” problems of the original OIFT (Fig. 5).

Fig. 5.
figure 5

The segmentation of: (a–c) A bone of the human foot and (d–f) the liver.

4 Experimental Results

We conducted quantitative experiments, using a total of 40 image slices of 10 thoracic CT studies to segment the liver, and 40 slice images from real MR images of the foot to segment the calcaneus bone. Several different gradients with and without the usage of localized curvilinear features, were assessed for accuracy employing the mean performance curve (Dice coefficient) and ground truth data obtained from an expert of the radiology department at the University of Pennsylvania for different seed sets.

In the first experiment, we used the second version of OIFT-LCF to avoid “leaking” problems, with \(R=3\) and \(\beta = 2\), and considered different seed sets automatically obtained by eroding and dilating the ground truth at different radius values. By varying the radius value, we can compute the segmentation for different seed sets and trace accuracy curves using the Dice coefficient of similarity. However, in order to generate a more challenging situation, we considered a larger radius of dilation for the external seeds (twice the value of the inner radius), resulting in an asymmetrical arrangement of seeds. In order to demonstrate the versatility of OIFT-LCF and OIFT, which support several high-level priors, we considered these methods with shape constraints by Geodesic Star Convexity (GSC) [21], leading to the GSC-OIFT-LCF and GSC-OIFT methods.

Figures 6a–b show the mean accuracy curves for all the images of the first experiment, using different input gradients indicated by a superscript index, for each method. For the sake of simplicity, we only considered gradient like images from local image attributes without resorting to more sophisticated techniques by supervised learning [32]. The superscript index 1 is used to indicate G(t) as the Sobel gradient magnitude. The superscript index 2 denotes the image-based weight by Miranda et al. [25], which is based on image smoothing at four different scales by a sequences of opening by reconstruction and closing by reconstruction. The superscript index 3 corresponds to the morphological gradient with a radius of 1.5 pixels, that is, the difference between the dilation and the erosion of the image. The superscript index 4 describes the brightness gradient features from [23], where for each pixel, a circle of radius r is drawn and divided along the diameter at orientation \(\theta \). The half-disc regions are described by histograms, which are compared by the chi-squared distance. A large difference between the disc halves indicates a discontinuity in the image along the disc’s diameter. The histograms (with 12 bins) are computed by a kernel density estimation using a Gaussian kernel with \(\sigma = 10.0\). We considered \(r=4\) and four different orientations. The gradient with superscript 5 was inspired by the work from Rauber et al. [29], where superpixel graphs were shown to improve interactive segmentation, by exploring the mean color/intensity inside superpixels. We tried to reproduce similar results at the pixel level, by using the average of brightness inside superpixels of size \(5\times 5\) computed by IFT-SLIC [1] as input image for the Sobel operator. From the results (Figs. 6a–b) it is clear that GSC-OIFT-LCF outperformed GSC-OIFT for all corresponding indexes (i.e., \(\text {GSC-OIFT-LCF}^i\) better than \(\text {GSC-OIFT}^i\), for \(i=1,\ldots ,5\)).

Fig. 6.
figure 6

The mean accuracy curves (Dice) using different gradients with and without the usage of localized curvilinear features. (a–b) First experiment for the segmentation of (a) calcaneus bone and (b) liver. (c) Calcaneus mean accuracy curve by a robot user.

In the second experiment, we used the first version of OIFT-LCF to circumvent irregular boundaries, with \(R=4\) and \(\beta = 5\). Figure 6c shows the experimental curve using a robot user [15], which confirms similar results.

5 Conclusions

In conclusion, we developed extensions to the OIFT algorithm [26], by incorporating localized curvilinear constraints in its formulation, helping to circumvent irregular boundaries of OIFT. As future work, we intend to extend OIFT-LCF to 3D and test it with other boundary-based methods [11].