1 Introduction

Normally, wireless network scenarios, such as interference channel (IC), share the channel among the users, resulting in multi-user interference. A new method, termed interference alignment (IA), leads to the efficient use of communication resources, since it successfully achieves the theoretical bound on the multiplexing gain [1,2,3]. In this scheme, unwanted signals from other users are fitted into a small part of the signal space observed by each receiver, called interference subspace. The other signal subspace is left free of interference for the desirable signal. The performance of IA scheme is sensitive to channel state information (CSI) inaccuracies. In [4] and [5], the performance of IA under CSI error was quantified. Asymptotic mean loss in sum rate compared to the perfect CSI case was derived [4]. Multiplexing gain is fully achievable when the variance of the CSI measurement error is inversely proportional to the SNR [4]. Performance of IA for SISO and multi-input multi-output (MIMO) IC where the transmitters are provided with the quantized CSI via feedback can be found in [6, 7]. Other feedback strategies have been presented in [8]. The IA problem has been studied for the cognitive radio networks in [9,10,11]. The performance of ad-hoc networks using the IA has been evaluated in [12].

1.1 Motivation

In order to maximize sum rate of the MIMO interference network, a beamforming strategy based on the interference alignment is used. Such algorithms have been established by progressive minimization of the leakage interference ([13], Algorithm 1) and [14]. The Max-SINR algorithm ([13], Algorithm 2), minimum mean square error [15], and joint signal and interference alignment [16] are other algorithms. These schemes are established based on availability of perfect CSI. The performance of transceivers is degraded by the CSI error. Different algorithms are proposed to improve the throughput of the IC, under imperfect CSI. In the literature, precoder-decorrelator optimization is proposed for broadcast and point-to-point systems [17,18,19,20]. Specifically, in [17] and [20], the authors consider precoding design for multi-input single-output (MISO) broadcast channel, where it is shown that the precoder optimization problem is always convex. In [19], the authors consider precoder-decorrelator design for MIMO broadcast channel using an iterative algorithm based on solving convex sub-problems. On the other hand, in [18] the authors consider a space-time coding scheme for the point-to-point channel with imperfect channel knowledge. However, these existing works cannot be extended to robust transceiver design for the MIMO IC [21].

Researchers have utilized minimum mean square error criterion to design robust transceiver for the MIMO IC [22]. They improved robustness in presence of the channel uncertainty. Authors in [21] proposed a transceiver design that enforces robustness against imperfect CSI as well as providing a fair performance among the users in the interference channel. They adopted worst-case optimization approach to improve robustness and fairness.

1.2 Contributions

This paper focused on sum rate improvement of the Max-SINR algorithm under CSI error. In the first proposed scheme, each transceiver adjusts its filter by maximizing the expected value of the signal-to-interference-plus-noise ratio (SINR). Realized SINRs for different realizations of the CSI error matrices, are samples of the SINR random variable. In some cases, this random variable may have large variance. Therefore, the realized SINRs could be very far from the expected value. In the second proposed algorithm, each node adjusts its transmit/receive filter to minimize the variance. This design hedges against variability.

Two robust algorithms are designed based on the reciprocity of wireless propagation. There is a high correlation between the original and reciprocal channel’s gains in communication systems working in a time division duplex (TDD) mode. Reciprocity has been exploited by researchers in [23,24,25]. Algorithms are implemented in a distributed manner with only local channel knowledge required. In other words, each user only needs to estimate the channel between its transmitter and receiver.

Monte Carlo simulations demonstrate that the first proposed algorithm, i.e., expectation maximization (EM), achieves higher sum rates compared to the existing algorithms in [21], and [22]. The second proposed algorithm, that is variation minimization (VM), provides a SINR with low variance. Moreover, VM helps to mitigate the effect of the CSI error, but not as satisfactory as EM scheme. Taylor series expansion is exploited to approximate the influence of CSI imperfection on the statistical properties, e.g., mean and variance. Numerical results show that more accurate approximation can be achieved with less error variance/signal power. For practical applications, when estimated mean/variance is used for maximization/minimization, the proposed transceivers will lead to sum rate improvement/SINR with low variance.

1.3 Organization

The rest of this paper is organized as follows. System model is studied in section 2. Sections 3 and 4 propose robust transceivers based on expectation maximization and variation minimization, respectively. Simulation results are presented in section 5 and concluding remarks are drawn in section 6.

2 System model

In the system model under study, each node works in a TDD mode. At the first time slot, nodes on the left hand side send the data to the right hand side nodes, shown in Fig. 1. At the next time slot, nodes on the left-hand side receive the data, indicating a change in the roles of the nodes. (This is clear in Fig. 2 following where the reciprocal network is described). The terms, original and reciprocal channels, are used to distinguish between two time slots. Hence, a transmitter in the reciprocal channel plays the role of an original network’s receiver and vice versa.

Fig. 1
figure 1

System model of the original channel

Fig. 2
figure 2

System model. Reciprocal network is obtained by switching the roles of transmitters and receivers in the original channel

MIMO IC with K transmitter-receiver pairs is considered in this paper. The \( {j}^{th}\in \mathcal{K} \) transmitter and the \( {k}^{th}\in \mathcal{K}=\left\{1,\dots, K\right\} \) receiver are equipped with M and N antennas, respectively, as shown in Fig. 1. The j th transmitter sends D j independent data streams \( {s}^j={\left[{s}_1^j\kern0.5em \dots \kern0.5em {s}_{D^j}^j\right]}^t \) to its intended receiver. Symbol vector has circularly symmetric complex Gaussian distribution with zero mean and covariance matrix PI, s j ~ CN(0, PI). True and estimated channel coefficients between transmitter j and receiver k are denoted by G kj and H kj, respectively. In practical scenarios, there is a mismatch between true and estimated channels, as stated below:

$$ \begin{array}{c}\hfill Error\ Model:{G}^{kj}={H}^{kj}+{E}^{kj},\hfill \\ {}\hfill Error\ Matrix:{E}^{kj}\ with\ i.i.d\ CN\left(0,{\sigma}^2\right)\ elements.\hfill \end{array} $$
(1)

The received signal at receiver k is:

$$ \begin{array}{c}\hfill Received\ Signal:{Y}^k={\sum}_{j=1}^K\left({H}^{kj}+{E}^{kj}\right){X}^j+{Z}^k,\hfill \\ {}\hfill Transmitted\ Signal\ Vector:{X}^j,\hfill \\ {}\hfill AWGN\ Vector:{Z}^k\sim CN\left(0,{N}_0I\right).\hfill \end{array} $$
(2)

In order to maximize system throughput, a beamforming strategy based on the interference alignment is used.

$$ \begin{array}{c}\hfill Precoding\ Model:{X}^j={V}^j{s}^j,\hfill \\ {}\hfill Precoder:M\times {D}^j\ matrix\ {V}^j=\left[{v}_1^j\dots {v}_{D^j}^j\right],\hfill \\ {}\hfill Columns\ of\ Precoder:{v}_d^j\ with\ unit\ norms.\hfill \end{array} $$
(3)

Receiver k decodes the transmitted symbol vector s k using the interference suppression matrix.

$$ \begin{array}{c}\hfill Decoding\ Model:\overline{Y^k}={U}^{k\dagger }{Y}^k,\hfill \\ {}\hfill Decoder:N\times {D}^k\ matrix\ {U}^k=\left[{u}_1^k\dots {u}_{D^k}^k\right].\hfill \end{array} $$
(4)

For the K-user MIMO IC, reciprocal channel is obtained by switching the role of transmitters and receivers. True and estimated channels indicated by \( \overleftarrow{G^{jk}} \) and \( \overleftarrow{H^{jk}} \), are M × N matrices. Error matrix is denoted by \( \overleftarrow{E^{jk}} \) with element distribution CN(0, σ 2). The relation between the original and reciprocal channels’ gains is \( \overleftarrow{G^{jk}}={G^{kj}}^{\dagger } \). The \( \overleftarrow{V^k} \) is precoder and \( \overleftarrow{U^j} \) indicates receive interference suppression matrices on the reciprocal channel. Since the receivers of the reciprocal channel play the role of the original network’s transmitters and vice versa, therefore \( \overleftarrow{V^k}={U}^k \) and \( \overleftarrow{U^j}={V}^j \). Figure 2 shows the reciprocal network.

According to the system model, the SINR value for the d th data stream at k th receiver is expressed by

$$ {SINR}_d^k=\frac{P{\left\Vert {u_d^k}^{\dagger }{G}^{kk}{v}_d^k\right\Vert}^2}{P\sum_{j=1}^K\sum_{m=1}^{D^j}{\left\Vert {u_d^k}^{\dagger }{G}^{kj}{v}_m^j\right\Vert}^2-P{\left\Vert {u_d^k}^{\dagger }{G}^{kk}{v}_d^k\right\Vert}^2+{N}_0{\left\Vert {u}_d^k\right\Vert}^2}. $$
(5)

where ‖.‖ denotes the Euclidian norm.

3 Robust transceiver design I

In this section, the first algorithm is formulated. The algorithm starts with arbitrary transmit and receive filters and then iteratively updates these filters to yield solution. The goal is to achieve robust transceiver by optimization problem \( \underset{u_d^k}{ \max }E\left[ SINR\right] \). The iterative algorithm alternates between the original and reciprocal networks \( \underset{\overleftarrow{u_d^j}}{ \max }E\left[\overleftarrow{SINR}\right] \). Within each network, only the receivers update their filters.

In the following, first, approximate expression for the mean value is computed and then the optimization problem is solved.

3.1 Estimate the mean of \( {\mathrm{SINR}}_{\mathrm{d}}^{\mathrm{k}} \)

In (6) \( E\left[{SINR}_d^k\right] \) is computed in terms of the function \( {SINR}_d^k=\frac{num}{den} \) and the probability density function f(num, den)Footnote 1

$$ E\left[{SINR}_d^k\right]={\int}_{-\infty}^{\infty}\frac{num}{den}f\left(num, den\right)d.num\times d. den. $$
(6)

Unfortunately, a closed form solution cannot be derived for the integration in (6). Hence, the approximate mean should be found. If f(num, den) is concentrated near its mean, then estimation of the mean value can be expressed by

$$ E\left[{SINR}_d^k|{H}^{kj}\right]\cong \frac{\mu_1}{\mu_2}=\frac{{u_d^k}^{\dagger}\left[{T}_d^k+P{\sigma}^2I\right]{u}_d^k}{{u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(P{\sigma}^2\sum_{j=1}^K{D}^j-P{\sigma}^2+{N}_0\right)I\right]{u}_d^k}. $$
(7)

where \( {S}^k=P\sum_{j=1}^K\sum_{m=1}^{D^j}{H}^{kj}{v}_m^j{v_m^j}^{\dagger }{H^{kj}}^{\dagger } \) and \( {T}_d^k=P{H}^{kk}{v}_d^k{v_d^k}^{\dagger }{H^{kk}}^{\dagger } \) denote, respectively, the estimated covariance matrix of all data streams observed by the receiver k and estimated covariance matrix of the d th desirable data stream. Since the estimation of mean is common in two algorithms, it is provided in Appendix 1.

3.2 Iterative solution

To obtain columns of U k, the derivative of (7) with respect to \( {u}_d^k \) should be obtained and then set equal to zero. Thus, \( {u}_d^k \) should satisfy the following vector equation (i.e., the derivative of the numerator multiplied by μ 2 should be equal to the derivative of denominator multiplied by μ 1).

$$ \begin{array}{c}\hfill {\mu}_2\left[{T}_d^k+P{\sigma}^2I\right]{u}_d^k={\mu}_1\left[{S}^k-{T}_d^k+\left(P{\sigma}^2{\sum}_{j=1}^K{D}^j-P{\sigma}^2+{N}_0\right)I\right]{u}_d^k.\hfill \\ {}\hfill {\mu}_1=E\left[num|{H}^{kj}\right]={u_d^k}^{\dagger}\left[{T}_d^k+P{\sigma}^2I\right]{u}_d^k.Eq.31\ \mathrm{in}\ \mathrm{appendix}\ \mathrm{A}\hfill \\ {}\hfill {\mu}_2=E\left[ den|{H}^{kj}\right]={u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(P{\sigma}^2{\sum}_{j=1}^K{D}^j-P{\sigma}^2+{N}_0\right)I\right]{u}_d^k.Eq.32\ \mathrm{in}\ \mathrm{A}\mathrm{ppendix}\ \mathrm{A}\hfill \end{array} $$
(8)

The above vector equation is rearranged as follow by moving the terms involving \( {S}^k{u}_d^k \) and \( {u}_d^k \) to left and \( {T}_d^k{u}_d^k \) to the right hand side:

$$ {\mu}_1\left({S}^k{u}_d^k+{\varOmega}_d^k\ {u}_d^k\right)=\left({\mu}_1+{\mu}_2\right){T}_d^k{u}_d^k, $$
(9)
$$ {\varOmega}_d^k=P{\sigma}^2{\sum}_{j=1}^K{D}^j-P{\sigma}^2\frac{\mu_1+{\mu}_2}{\mu_1}+{N}_0. $$
(10)

where the scalar coefficient is \( {\Omega}_d^k \).

According to the definition, \( {T}_d^k{u}_d^k \) is the product of scalar value \( {v_d^k}^{\dagger }{H^{kk}}^{\dagger }{u}_d^k \) and vector \( {H}^{kk}{v}_d^k \). It is concluded that \( {T}_d^k{u}_d^k \) is in the direction of \( {H}^{kk}{v}_d^k \). Furthermore, only the directions are important. Hence, the scalar factors μ 1, \( \left({\mu}_1+{\mu}_2\right){v_d^k}^{\dagger }{H^{kk}}^{\dagger }{u}_d^k \) can be removed from (9). Then, the unit vector maximizes (7) is given by

$$ {u}_d^k=\frac{{\left({S}^k+{\varOmega}_d^k\ I\right)}^{-1}{H}^{kk}{v}_d^k}{\left\Vert {\left({S}^k+{\varOmega}_d^k\ I\right)}^{-1}{H}^{kk}{v}_d^k\right\Vert }. $$
(11)

Now, consider the reciprocal network. The transmit precoding matrices, \( \overleftarrow{V^k} \), are the receive interference suppression matrices U k from the original network, whose columns are given by (11). The optimal d th unit column of \( \overleftarrow{U^j} \) is given by

$$ \overleftarrow{u_d^j}=\frac{{\left(\overleftarrow{S^j}+\overleftarrow{\varOmega_d^j}\ I\right)}^{-1}\ \overleftarrow{H^{jj}}\ \overleftarrow{v_d^j}}{\left\Vert {\left(\overleftarrow{S^j}+\overleftarrow{\varOmega_d^j}\ I\right)}^{-1}\ \overleftarrow{H^{jj}}\ \overleftarrow{v_d^j}\right\Vert }. $$
(12)

Now, the receive interference suppression matrices in the reciprocal network replace \( {V}^j\forall j\in \mathcal{K} \), and then \( {U}^k\forall k\in \mathcal{K} \) are updated based on them. It is seen from (10) and (11) that \( {\Omega}_d^k \) and \( {u}_d^k \) are interdependent. Therefore, prior to repeat steps, \( {\Omega}_d^k \) should be computed at the end of each iteration. To summarize the iterative procedure, the steps are given in Fig. 3.

Fig. 3
figure 3

a Schematic view of robust transceiver design I. b Algorithm: robust distributed transceiver design

It can be proved that EM filters for the special case σ 2 = 0, are transmit/receive matrices of the Max-SINR algorithm (Proof is provided in Appendix 2.).

In order to implement the algorithm in a distributed manner, receiver k needs to know about H kk and S k which are locally available. The covariance matrix S k can be estimated from the autocorrelation of the received signal Y k. Substituting \( {X}^j=\sum_{d=1}^{D^j}{v}_d^j{s}_d^j \) into (2) yields

$$ E\left[{Y}^k{Y^k}^{\dagger}\right]={S}^k+\left(P{\sigma}^2{\sum}_{j=1}^K{D}^j+{N}_0\right)I, $$
(13)

where the expectation is computed over error and noise. The receiver j in the reciprocal channel can learn \( \overleftarrow{U^j} \) in a similar manner. For TDD systems, the transmitters can estimate the channels from the sounding signals received in the reverse link ([21], section II-A). Using MMSE channel prediction, the CSI estimate H kk is obtained, whereas the CSI error E kj is Gaussian distributed and independent from the CSI estimate H kj ([21], section II-A).

3.3 Proof of convergence

Now, the convergence of the proposed EM algorithm is demonstrated. Equivalent problem is considered. EM can be written as follows

$$ \mathit{\max}\frac{{u_d^k}^{\dagger }Q{u}_d^k}{{u_d^k}^{\dagger }F{u}_d^k}, $$
(14)

where \( Q={Q}^{\dagger }={T}_d^k+P{\sigma}^2I\ge 0 \), and \( F={F}^{\dagger }={S}^k-{T}_d^k+\left(P{\sigma}^2\sum_{j=1}^K{D}^j-P{\sigma}^2+{N}_0\right)I>0 \) are matrices and \( {u}_d^k \) indicates optimization variable. It is shown in [26] that (14) is equivalent to

$$ \begin{array}{c}\hfill \max {u_d^k}^{\dagger }{Qu}_d^k,\hfill \\ {}\hfill \mathrm{s}.\mathrm{t}.{u_d^k}^{\dagger }{Fu}_d^k=1.\hfill \end{array} $$
(15)

For the equivalent problem, the Lagrangian function is given by \( l\left({u}_d^k,\lambda \right)={u_d^k}^{\dagger }Q{u}_d^k+\lambda \left(1-{u_d^k}^{\dagger }F{u}_d^k\right) \). The solution \( {u_d^k}^{\ast } \) is the eigenvector corresponding to the maximal eigenvalue of F −1 Q, and the Lagrange multiplier is \( {\lambda}^{\ast }={{u_d^k}^{\ast}}^{\dagger }Q{u_d^k}^{\ast } \).

The metric is defined in (16). It is proved here that each step in the algorithm increases the metric. Since it cannot increase unboundedly, this implies that equivalent problem converges and consequently algorithm Fig. 3 also converges. It is important to note that the metric is the same for both original and reciprocal networks.

$$ {\mathit{\max}}_{\begin{array}{c}{V}^j\ and\ {U}^K\\ {}\forall j\ and\ k\in \mathcal{K}\end{array}} metric={\sum}_{k=1}^K{\sum}_{d=1}^{D^k}l\left({u}_d^k,\lambda \right). $$
(16)

Accordingly:

$$ {\mathit{\max}}_{\begin{array}{c}{U}^K\\ {}\forall k\in \mathcal{K}\end{array}} metric={\sum}_{k=1}^K{\sum}_{d=1}^{D^k}{\mathit{\max}}_{u_d^k\ }l\left({u}_d^k,\lambda \right). $$
(17)

In other words, given \( {V}^j\forall j\in \mathcal{K} \), Step 1 increases the value of (16) over all possible choices of \( {U}^k\forall k\in \mathcal{K} \). The filter \( \overleftarrow{U^j} \) computed in Step 3, based on \( \overleftarrow{V^k}={U}^k \), also maximizes the metric in the reciprocal channel (18).

$$ \begin{array}{c}\hfill { \max}_{\begin{array}{c}\hfill \overleftarrow{U^j}\hfill \\ {}\hfill \forall j\in K\hfill \end{array}}\overleftarrow{metric},\hfill \\ {}\hfill \overleftarrow{metric}={\sum}_{j=1}^K{\sum}_{d=1}^{D^j}\overleftarrow{l}\left(\overleftarrow{u_d^j},\lambda \right)={\sum}_{j=1}^K{\sum}_{d=1}^{D^j}{\overleftarrow{u_d^j}}^{\dagger}\overleftarrow{Q}\overleftarrow{u_d^j}+\lambda \left(1-{\overleftarrow{u_d^j}}^{\dagger}\overleftarrow{F}\overleftarrow{u_d^j}\right).\hfill \end{array} $$
(18)

Since \( \overleftarrow{V^k}={U}^k \) and \( \overleftarrow{U^j}={V}^j \), the metric remains unchanged in the original and reciprocal networks, according to following equation:

$$ \begin{array}{c}\hfill \overleftarrow{metric}={\sum}_{j=1}^K{\sum}_{d=1}^{D^j}{u_d^j}^{\dagger}\left[{T}_d^j+P{\sigma}^2I\right]{u}_d^j+\hfill \\ {}\hfill {\sum}_{j=1}^K{\sum}_{d=1}^{D^j}{\lambda}_d^j\left(1+{u_d^j}^{\dagger}\left[{T}_d^j-\left(P{\sigma}^2{\sum}_{k=1}^K{D}^k-P{\sigma}^2+{N}_0\right)I\right]{u}_d^j\right)-\hfill \\ {}\hfill P{\sum}_{j=1}^K{\sum}_{d=1}^{D^j}{\sum}_{k=1}^K{\sum}_{m=1}^{D^k}{\lambda}_d^j{u_m^k}^{\dagger }{H}^{kj}{v}_d^j{v_d^j}^{\dagger }{H}^{kj^{\dagger }}{u}_m^k= metric.\hfill \end{array} $$
(19)

Therefore, Step 3 also can increase the value of (16). Since the value of (16) is monotonically increased after every iteration, convergence of the algorithm is guaranteed.

4 Robust transceiver design II

Realized SINRs for different realizations of the CSI error matrices, are samples of the SINR random variable. This random variable can has large variance. Hence, the realized SINR, depending on the particular realization of the CSI error matrix, could be very far from the expected value. Therefore, to hedge against such variability, each receiver adjusts its receive interference suppression matrix based on minimizing SINR variance [27]:

$$ { \min}_{u_d^k}VAR\left[{SINR_{lb}}_d^k\right],\forall d\in \left\{1,\dots, {D}^k\right\}. $$
(20)

According to VAR(x) = E(x 2) − E(x)2, the VAR(x) is minimized by minimizing E(x 2) and maximizing E(x)2. These two terms may not attain their best values for the same transceivers in some MIMO IC system model, due to the contradiction between them. It will be shown that the VM scheme presents sum data rate (maximize E(x)) lower than the Max-SINR algorithm but it enables transceivers to hedge against variability for the primary scenario in simulation results. For the second scenario, the VM improves sum data rate superior to the Max-SINR and provides SINR with low variance, simultaneously.

In section 4.1, the value of \( VAR\left[{SINR_{lb}}_d^k\right] \) is approximated by using statistical linearization argument. The iterative solution is similar to Algorithm I.

4.1 Estimating the variance of \( {{\mathrm{SINR}}_{\mathrm{lb}}}_{\mathrm{d}}^{\mathrm{k}} \)

Lower bound on the SINR is derived in terms of norms of error matrices in [21]. Lower bound is

$$ {SINR_{lb}}_d^k=\frac{P{\left\Vert {u_d^k}^{\dagger }{H}^{kk}{v}_d^k\right\Vert}^2-P{e}^{kk}{\left\Vert {u}_d^k\right\Vert}^2}{P\sum_{j=1}^K\sum_{m=1}^{D^j}{\left\Vert {u_d^k}^{\dagger }{H}^{kj}{v}_m^j\right\Vert}^2+P{\left\Vert {u}_d^k\right\Vert}^2\sum_{j=1}^K{e}^{kj}{D}^j-P{\left\Vert {u_d^k}^{\dagger }{H}^{kk}{v}_d^k\right\Vert}^2-P{e}^{kk}{\left\Vert {u}_d^k\right\Vert}^2+{N}_0{\left\Vert {u}_d^k\right\Vert}^2}. $$
(21)

E kj2 is the Euclidian norm of error matrix between transmitter j and receiver k, denotes by e kj in equation (21). \( \frac{{\left\Vert {E}^{kj}\right\Vert}^2}{\raisebox{1ex}{${\sigma}^2$}\!\left/ \!\raisebox{-1ex}{$2$}\right.} \), is the sum of the second power of 2NM real independent Gaussians, each having a unit variance. Therefore, it has a Chi-square distribution with 2NM degrees of freedom, \( {\chi}_{2NM}^2 \). Hence, one can conclude E[e kj] = NMσ 2 and VAR[e kj] = NMσ 4 [28].

Equation (21), \( {SINR_{lb}}_d^k \), is a function of random vector \( {e}^k={\left[\begin{array}{ccc}{e}^{k1}& \dots & {e}^{kK}\end{array}\right]}^{\boldsymbol{t}} \). It is clear from covariance matrix, \( Cov\left({e}^k\right)=\left[\begin{array}{ccc}MN{\sigma}^4& \dots & 0\\ {}\vdots & \ddots & \vdots \\ {}0& \dots & MN{\sigma}^4\end{array}\right] \), that the variance of each element is sufficiently small for practical applications. Besides, the covariance between each two components is zero. It is concluded that the PDF of e k is concentrated near its mean and it is negligible outside a neighborhood around the mean value. Again, by using the statistical linearization argument, first order Taylor series expansion of \( {SINR_{lb}}_d^k \) around the mean value.

\( \theta =E\left[{e}^k\right]={\left[\begin{array}{ccc}MN{\sigma}^2& \dots & MN{\sigma}^2\end{array}\right]}^t \) is employed to yieldFootnote 2:

$$ \begin{array}{c}\hfill {SINR}_{lb_d}^k\cong {SINR}_{lb_d}^k\left(\theta \right)+\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^{k1}}\left({e}^{k1}-NM{\sigma}^2\right)+\dots +\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^{kK}}\left({e}^{kK}-NM{\sigma}^2\right)\hfill \\ {}\hfill ={SINR}_{lb_d}^k\left(\theta \right)+{\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^k}}^t\left({e}^k-\theta \right),\hfill \end{array} $$
(22)

Using (22) for approximating the variance, following equation is obtained.

$$ VAR\left[{SINR}_{lb_d}^k\right]\cong E\left[{\left({SINR}_{lb_d}^k\left(\theta \right)+{\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^k}}^t\left({e}^k-\theta \right)-E\left[{SINR}_{lb_d}^k\right]\right)}^2\right]. $$
(23)

Since exact computation of \( E\left[{SINR_{lb}}_d^k\right] \) is not feasible, approximate mean \( {SINR_{lb}}_d^k\left(\theta \right) \) is used (estimation of mean is provided in Appendix 1). Therefore, estimation of variance leads toFootnote 3

$$ VAR\left[{SINR}_{lb_d}^k\right]\cong {\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^k}}^tCov\left({e}^k\right)\frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^k}. $$
(24)

The elements of \( \frac{\partial {SINR_{lb}}_d^k}{\partial {e}^k} \) are given by

$$ \begin{array}{c}\hfill \frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^{kj}}=-\frac{PD^j\left({u_d^k}^{\dagger }{u}_d^k\right)\left({u_d^k}^{\dagger}\left[{T}_d^k-PMN{\sigma}^2I\right]{u}_d^k\right)}{{\left({u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(PMN{\sigma}^2{\sum}_{j=1}^K{D}^j-PMN{\sigma}^2+{N}_0\right)I\right]{u}_d^k\right)}^2},\forall j\in \mathcal{K},j\ne k\hfill \\ {}\hfill \frac{\partial {SINR}_{lb_d}^k\left(\theta \right)}{\partial {e}^{kk}}=-\frac{P\left({u_d^k}^{\dagger }{u}_d^k\right)\left({u_d^k}^{\dagger}\left[{S}^k+\left({D}^k-2\right){T}_d^k+\left(PMN{\sigma}^2{\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{D}^j+{N}_0\right)I\right]{u}_d^k\right)}{{\left({u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(PMN{\sigma}^2{\sum}_{j=1}^K{D}^j-PMN{\sigma}^2+{N}_0\right)I\right]{u}_d^k\right)}^2},j=k\hfill \end{array} $$
(25)

According to (24) and (25), the estimated variance is

$$ VAR\left[{SINR_{lb}}_d^k|{H}^{kj}\right]\cong MN{P}^2{\sigma}^4{\left({u_d^k}^{\dagger }{u}_d^k\right)}^2\frac{{\left({u_d^k}^{\dagger}\left[{S}^k+\left({D}^k-2\right){T}_d^k+\left(PMN{\sigma}^2\sum_{\begin{array}{c}j=1\\ {}j\ne k\end{array}}^K{D}^j+{N}_0\right)I\right]{u}_d^k\right)}^2+\left(\sum_{\begin{array}{c}j=1\\ {}j\ne k\end{array}}^K{\left({D}^j\right)}^2\right)\ {\left({u_d^k}^{\dagger}\left[{T}_d^k-PMN{\sigma}^2I\right]{u}_d^k\right)}^2}{{\left({u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(PMN{\sigma}^2\sum_{j=1}^K{D}^j-PMN{\sigma}^2+{N}_0\right)I\right]{u}_d^k\right)}^4}. $$
(26)

4.2 Iterative solution

Briefly, \( {u}_d^k \) should satisfy the following vector equation: To obtain columns of U k, the equation \( \frac{\partial VAR\left[{SINR_{lb}}_d^k\right]}{\partial {u}_d^k}=0 \) should be solved. Thus, \( {u}_d^k \) should satisfy the following vector equation (The terms involving \( {S}^k{u}_d^k \) and \( {u}_d^k \) are moved to left and the term \( {T}_d^k{u}_d^k \) is moved to the right-hand side).

$$ {\alpha}_d^k{S}^k{u}_d^k+{\beta}_d^k{u}_d^k={\zeta}_d^k{T}_d^k{u}_d^k. $$
(27)

In the above vector equation, \( {\alpha}_d^k \), \( {\beta}_d^k \), and \( {\zeta}_d^k \) denote scalar coefficients and are expanded as follows

$$ \begin{array}{c}\hfill {\alpha}_d^k=2ac-4{a}^2-4\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)\ {b}^2;\hfill \\ {}\hfill {\beta}_d^k=2{a}^2c+2\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)\ {b}^2c+2\left(PMN{\sigma}^2{\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{D}^j+{N}_0\right)ac-2\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)\times \hfill \\ {}\hfill PMN{\sigma}^2bc-4\left({a}^2+\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)\ {b}^2\right)\left(PMN{\sigma}^2{\sum}_{j=1}^K{D}^j-PMN{\sigma}^2+{N}_0\right);\hfill \\ {}\hfill {\zeta}_d^k=-\left(2\left({D}^k-2\right)ac+2\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)bc+4{a}^2+4\left({\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{\left({D}^j\right)}^2\right)\ {b}^2\right).\hfill \end{array} $$
(28)

In (28), the parameters of the scalar coefficients (a, b, and c) should be substituted by:

$$ \begin{array}{c}\hfill a={u_d^k}^{\dagger}\left[{S}^k+\left({D}^k-2\right){T}_d^k+\left(PMN{\sigma}^2{\sum}_{\begin{array}{c}\hfill j=1\hfill \\ {}\hfill j\ne k\hfill \end{array}}^K{D}^j+{N}_0\right)I\right]{u}_d^k;\hfill \\ {}\hfill b={u_d^k}^{\dagger}\left[{T}_d^k-PMN{\sigma}^2I\right]{u}_d^k;\hfill \\ {}\hfill c={u_d^k}^{\dagger}\left[{S}^k-{T}_d^k+\left(PMN{\sigma}^2{\sum}_{j=1}^K{D}^j-PMN{\sigma}^2+{N}_0\right)I\right]{u}_d^k.\hfill \end{array} $$
(29)

The unit vector that minimizes (26) is given by

$$ {u}_d^k=\frac{{\left({S}^k+{\varPsi}_d^k\ I\right)}^{-1}{H}^{kk}{v}_d^k}{\left\Vert {\left({S}^k+{\varPsi}_d^k\ I\right)}^{-1}{H}^{kk}{v}_d^k\right\Vert },{\varPsi}_d^k=\frac{\beta_d^k}{\alpha_d^k} $$
(30)

The iterative procedure, Fig. 4, is algorithmically identical to the Algorithm I. Moreover, the distributed implementation explained previously can be applied again.

Fig. 4
figure 4

Schematic view of Algorithm II

5 Simulation results

In this section, simulation results for the sum data rate and variance of SINR are presented. Also, the influence of the mean value approximation, and other influential parameters on the accuracy of the approximation are determined via Monte Carlo simulations. The proposed robust transceiver design algorithms are evaluated through comparison to the following algorithms:

  1. 1.

    Leakage interference minimization [13]

  2. 2.

    Max-SINR algorithm [13]

  3. 3.

    Minimum mean square error (MMSE) [15]

  4. 4.

    Robust MMSE [22]

  5. 5.

    Worst-case optimization approach [21].

The use of [13, 15] for comparison is not actually fair because these papers design filter with perfect CSI. The reason is that, their sum-rates do not increase linearly with SNR, and they achieve lower sum-rates compared to our proposed algorithm. Mean loss in sum rate compared to perfect CSI is shown in Fig. 7 e.g., for Max-SINR algorithm [21, 22] design robust transceiver for MIMO interference network, under imperfect CSI.

The channel is modeled as Rayleigh fading. Channel coefficients [G kj] nm are i.i.d. zero mean unit variance circularly symmetric complex Gaussian. MIMO IC has K = 4 users with N = M = 3 antennas at transmitters and receivers and each user transmits d = 1 data stream denoted by (3 × 3, 1)4. Simulation results are also presented for (4 × 4, 2)3 MIMO IC.

Twenty error matrices are generated for a true channel and numerical results are averaged over them. Averaging process over the error is repeated for 20 true channels to eliminate the dependence of the numerical results on the true channel, randomly created; In other words, the results are obtained after 400 Monte Carlo simulations. The stopping criterion for the convergence of the proposed iterative algorithms is 100 iterations. All numerical results are based on the SINR associated with the imperfect CSI.Footnote 4

5.1 Throughput enhancement

Figure 5 represents the sum rate comparison between the proposed and basic algorithms for (3 × 3, 1)4 MIMO IC. The filters are designed with the error variance of σ 2 = 0.1. It can be observed that the EM scheme achieves higher sum rates compared to all the other schemes over the entire considered SNRFootnote 5 range. Proposed EM scheme achieves 7 dB SNR gain over the Max-SINR algorithm at providing 14 b/s/Hz sum data rate and etc. Though it does not seem that the VM algorithm improves the overall sum data rate as satisfactory as the EM scheme, it can be roughly be concluded that the VM algorithm achieves data rate as much as (slightly lower than) Max-SINR. Sum rate of algorithm 5 is shown in Fig. 5. It presents sum data rate performance as robust MMSE.

Fig. 5
figure 5

Average sum data rate versus SNR. K = 4, N = M = 3, d = 1, σ 2 = 0.1

Figure 6 shows sum rate for (4 × 4, 2)3 MIMO IC. Again, the EM scheme achieves higher sum rates compared to all the other schemes. In comparison with Max-SINR, it cannot achieve data rate higher than 12 b/s/Hz, while the EM scheme improves data rate up to 16 b/s/Hz. Compared to the Max-SINR, the VM helps to mitigate the effect of the CSI error more effectively, as shown in Fig. 6. The algorithm 5 achieves a data rate only superior to the leakage interference minimization. It should be noted that, this scheme guaranties to provide better worst-case data rate.

Fig. 6
figure 6

Average sum data rate versus SNR. K = 3, N = M = 4, d = 2, σ 2 = 0.1

The better performances of the proposed schemes compared to other algorithms are accountable to suitably: 1- Both proposed schemes improve the resilience against SINR degradation due to the CSI error. 2- Approximation (when approximate mean or variance is used for maximization or minimization, the proposed transceivers will lead to sum rate improvement).

5.2 Hedge against variability

Average SINR variance in (3 × 3, 1)4 and (4 × 4, 2)3 MIMO ICs are reported in Tables 1 and 2, respectively. It can be observed that proposed VM scheme achieves lower variance compared to other schemes, when the amount of SNR (>10 dB) is significant. In case of small SNR (<10 dB), the algorithms have variance close to each other.

Table 1 Average SINR variance in (3 × 3, 1)4 interference network with error variance σ 2 = 0.1
Table 2 Average SINR variance in (4 × 4, 2)3 interference network with σ 2 = 0.1

By Considering Fig. 5 and Table 1 jointly, it is concluded that VM cannot achieve data rate higher than and SINR variance lower than Max-SINR simultaneously. This is due to the contradiction between minimizing E(x 2) and maximizing E(x)2 in VM algorithm.

The comparative improvement in SINR variance level becomes negligible in (4 × 4, 2)3 MIMO IC, since variance of the VM approaches other algorithms, as presented in Table 2. On the other hand, VM achieves data rate better than Max-SINR as seen in Fig. 6. Therefore, VM presents a balance between minimizing E(x 2) and maximizing E(x)2 in this scenario.

The cost of the proposed robust design methods compared with the contrast schemes is the complexity since the MMSE ([15], equation 11), and Robust MMSE ([22], equation 4) need the inverse operation of an N-by-N matrix only once to update V k (or U k) at each user per iteration, whereas the proposed algorithms (equations 11 and 30) require d times. In the SINR maximizing algorithm ([13], equation 30), contrast Scheme 5 ([21], equation 13), the transmit and receive filters are column-wise updated, as complex as the proposed algorithms.

5.3 Conformity with performance evaluation in [4]

Based on equation 21 in [4], it is expected that mean loss in sum rate compared to perfect CSI case increases unboundedly as SNR increases. In addition, based on equation 22 in [4], the achievable multiplexing gain should be equal to zero. These are confirmed in Fig. 7, where for the larger SNR, the wider gap between curves representing perfect CSI and imperfect CSI can be noticed. Slope of the curves or multiplexing gain when σ 2 = 0.1 is zero, too.

Fig. 7
figure 7

Average sum rates achieved by the EM and Max-SINR algorithms for (3 × 3, 1)4 and (4 × 4, 2)3 interference networks with σ 2 = 0.1

5.4 Accuracy of approximation

It is straightforward to say as σ 2 decreases, the impact of any error in approximating average mutual information diminishes. On the other hand, signal power P scales with σ 2 as it is obvious from (7). Therefore, any approximation error will be attenuated as P decreases. But here, the impact of loss due to the mean approximation and other influential parameters on the accuracy of approximation are studied via Monte Carlo simulations. The EM algorithm is used by (3 × 3, 1)4 MIMO IC to compute precoding and interference suppression matrices. The filters are designed with two CSI error variances \( {\sigma}_1^2=0.05 \) and \( {\sigma}_2^2=0.1 \). The numerical and theoretical values of average mutual informationFootnote 6 of the MIMO IC versus SNR are depicted in Fig. 8.

Fig. 8
figure 8

Theoretical prediction of E[C| H kj] and numerical value for (3 × 3, 1)4 MIMO IC is shown versus SNR. The filters are designed with EM algorithm and two CSI error variances \( {\sigma}_1^2=0.05 \) and \( {\sigma}_2^2=0.1 \)

For \( {\sigma}_1^2=0.05 \), the proposed approximation is within 15 and 18% of the true value when SNR ≤ 10dB and SNR ≤ 14dB. For \( {\sigma}_2^2=0.1 \) similar statement stands when SNR ≤ 7dB and SNR ≤ 9dB. The proposed approximation is within 22.16 and 24.81% of the true value, respectively for \( {\sigma}_1^2 \) and \( {\sigma}_2^2 \), over the entire considered SNR range. Therefore, by decreasing SNR and σ 2 the theoretical approximation will approach the true value of average mutual information.

5.5 Convergence of VM algorithm

Since it is hard to prove convergence of this problem theoretically (it is mathematically intractable), it is investigated numerically. Convergence of the iterative algorithm is shown numerically by considering fraction of interference leakage to the received signal parameter [22]. Figure 9 shows parameter for the proposed schemes versus iterations. In (3 × 3, 1)4 MIMO IC, EM and VM converge after 10 and 50 iterations as Fig. 9 shows. Proposed algorithms converge after 20 iterations in (4 × 4, 2)3 MIMO IC.

Fig. 9
figure 9

Sum of fraction of interference leakage to received signal for two interference networks with σ 2 = 0.1, and \( \frac{P}{N_0}=10 dB \) . Convergence behavior of VM is like other algorithms in two considered MIMO IC scenarios. Parameter decreases and then remains within a neighborhood of final value

The stopping criterion for the convergence of the iterative algorithms is 100 iterations. Further results regarding the convergence of the VM in larger MIMO ICs are provided for (10 × 10, 5)3 and (6 × 8, 4)2 in Fig. 10. Proposed algorithms converge after 10 iterations (left-hand figure). EM and VM converge after 10 and 30 iterations in (6 × 8, 4)2 MIMO IC.

Fig. 10
figure 10

Sum of fraction of interference leakage to received signal in (10 × 10, 5)3 and (6 × 8, 4)2 MIMO ICs with σ 2 = 0.1, and \( \frac{P}{N_0}=10 dB \) . Parameter decreases after each iteration and then remains constant around final value

6 Conclusions

In this paper, two robust algorithms were proposed. In the EM scheme, filters were adjusted based on the problem of expectation maximization of SINR. The other design minimized the variance of SINR to hedge against variability due to the CSI error. Taylor series expansion was exploited to approximate the effect of imperfection in CSI on statistical properties. Proposed robust algorithms utilized the reciprocity of wireless networks to optimize estimated statistical properties in two different working modes. Monte Carlo simulations demonstrated that the EM scheme improves data rate of MIMO IC under imperfect CSI. The VM algorithm provided SINR with low variance. Moreover, it improved sum rate, but not as satisfactory as the EM scheme.