Category: Bipolar networks

Realistic communication

Today’s blog is about realistic communication, i.e., what kind of performance can realistically be expected of a wireless network. To get started, let’s have a look at an excerpt from a recent workshop description:

“Future wireless networks will have to support many innovative vertical services, each with its own specific requirements, e.g.

End-to-end latency of 1 ns and reliability higher than 99.999% for URLLCs.
Terminal densities of 1 million of terminals per square kilometer for massive IoT applications.
Per-user data-rate of the order of Terabit/s for broadband applications.”

Let’s break this down, bullet by bullet.

First bullet: In 1 ns, light travels 30 cm in free space. So “end-to-end” here would mean a distance of at most 10 cm, to leave some fraction of a nanosecond for encoding, transmission, and decoding. But what useful wireless service is there where transceivers are within at most 10 cm? Next, a packet loss rate of 10^-5 means that the spectral efficiency must be very low. Together with a latency constraint of 1 ns, ultrahigh bandwidths must be used, which, in turn, makes the design of circuitry and antenna arrays extremely challenging. At least the channel can be expected to be benign (line-of-sight).

Where does stochastic geometry come in? Assuming that these ultrashort links live in a network and not in isolation, interference will play a role. Let us consider a Poisson bipolar network with normalized link distance 1, a path loss exponent α and Rayleigh fading. What is the maximum density of links that can be supported that have an outage of at most ε? This quantity is known as the spatial outage capacity (SOC). For small ε, which is our regime of interest here, we have

$\displaystyle\text{SOC}\sim\left(\frac{\varepsilon}{\rho}\right)^\delta c_\delta,\quad \varepsilon\to 0,$

where δ=2/α and c_δ is a constant that only depends on the path loss exponent 2/δ. ρ is the spectral efficiency (in bits/s/Hz or bps/Hz). This shows the fundamental tradeoff between outage and spectral efficiency: Reducing the outage by a factor of 10 reduces the rate of transmission by the same factor if the same link density is to be maintained. Compared to a more standard outage constraint of 5%, this means that the rate must be reduced by a factor 5,000 to accommodate the 99.999% reliability requirement. Now, say we have 0.5 ns for the transmission of a message of 50 bits, the rate is 100 Gbps. Assuming a very generous spectral efficiency of 100 bps/Hz for a system operating at 5% outage, this means that 100 Gbps must be achieved at a spectral efficiency of a mere 0.02 bps/Hz. So we would need 5 THz of bandwidth to communicate a few dozen bits over 10 cm.
Even relaxing the latency constraint to 1 μs still requires 5 GHz of bandwidth.

In cellular networks, the outage-rate relationship is governed by a very similar tradeoff.
For any stationary point process of base stations and Rayleigh fading, the SIR meta distribution asymptotically has the form

$\displaystyle \bar F(\rho,\varepsilon)\sim\left(\frac{\varepsilon}{\rho}\right)^\delta C_\delta,\quad \varepsilon\to 0,$

where C_δ again depends only on the path loss exponent. This is the fraction of users who achieve a spectral efficiency of ρ with an outage less than ε, remarkably similar to the bipolar result. To keep this fraction fixed at, say, 95%, again the spectral efficiency needs to be reduced in proportion to a reduction of the outage constraint ε.

Second bullet: Per the classification and nomenclature in a dense debate, this density falls squarely in the tremendously dense class, above super-high density and extremely high density. So what do the anticipated 100 devices in an average home or 10,000 devices in an average parking lot do? What kind of messages are they exchanging or reporting to a hub? How often? What limits the performance? These devices are often said to be “connected“, without any specification what that means. Only once this is clarified, a discussion can ensue whether such tremendous densities are realistic.

Third bullet: Terabit-per-second (Tbps) rates require at least 10 GHz of spectrum, optimistically. 5G in its most ambitious configuration, ignoring interference, has a spectral efficiency of about 50 bps/Hz, and, barring any revolutionary breakthrough, more than 100 bps/Hz does not appear feasible in the next decade. Similarly, handling a signal 10 GHz wide would be an order of magnitude beyond what is currently possible. Plus such large junks of spectrum are not even available at 60 GHz (the current mm-wave bands). At 100 GHz and above, link distances are even more limited and more strongly subject to blockages, and analog beamforming circuitry becomes much more challenging and power-hungry. Most importantly, though, peak rates are hardly achieved in reality. In the 5G standard, the user experienced data rate (the rate of the 5-th percentile user) is a mere 1% of the peak rate, and this fraction has steadily decreased over the cellular generations:

So even if 1 Tbps peak rates became a reality, users would likely experience between 1 Gbps to at most 10 Gbps – assuming their location is covered, which may vary over short spatial scales. Such user percentile performance can be analyzed using meta distributions.

In conclusion, while setting ambitious goals may trigger technological advances, it is important to be realistic of what is achievable and what performance the user actually experiences. For example, instead of focusing on 1 Tbps peak rates, we could focus on delivering 1 Gbps to 95% of the users, which may still be very challenging but probably achievable and more rewarding to the user. And speaking of billions of “connected devices” is just marketing unless it is clearly defined what being connected means.

For more information on the two analytical results above, please see this paper (Corollary 1) and this paper (Theorem 3).

Single-point simulation?

When applying simalysis as illustrated in the previous post, the question arises where to put the boundary between the part of the point process that is simulated and the part that is analyzed. Specifically, we may wonder whether we can reduce the simulated part to only a single point (on average), i.e., to choose the number of simulated points to be Poisson with mean 1 in each realization.
Let’s find out, using the same Poisson bipolar model as in the previous post (Rayleigh fading, transmitter density 1, link distance 1/4).

*Fig. 1. Simalysis of SIR ccdf for Poisson bipolar network averaged over 500 realizations with 1 point on average. The dashed curves show the exact results.*

Fig. 1 shows the simulated (or, more precisely, simalyzed) result where the 500 realizations only contain one single point on average. This means that about 500/e ≈ 184 realizations have no point at all. We observe good accuracy, especially at small path loss exponents α. Also, the simulated curves are lower bounds to the exact ones. This is due to Jensen’s inequality:

$\displaystyle \mathbb{E}\exp(-sI_c) \geq \exp(-s\mathbb{E}(I_c)),\quad s>0.$

The term on the left side is the exact factor in the SIR ccdf due to the interference I_c from points outside distance c. It is larger than the right side, which is the factor used in the simalysis (see the Matlab code). This property holds for all stationary point processes.

But why stop there? One would think that using, say, only 1/4 point on average would be (essentially) pointless. But let’s try, just to be sure.

*Fig. 2. Same plot as in Fig. 1 but the curves are simalyzed with only 1/4 point per realization on average.*

Remarkably, even with only 1/4 point per realization on average, the curves for α<2.5 are quite accurate, and the one for α=2.1 is an excellent lower bound! It is certainly much more accurate than a classical simulation with 500,000 points per realization (see the previous post). Such a good match is quite surprising, especially considering that 1/4 point on average means that about 78% of the realizations have 0 points, which means that in about 390 out of the 500 realizations, the simulated factor in the SIR ccdf simply yields 1. Also, in the entire simalysis, only about 125 points are ever produced. It takes no more than about 1/2 s on a standard computer.
We conclude that accurate simulation (simalysis, actually) can be almost point-less.

Simalysis: Symbiosis of simulation and analysis

Simulations can be quite time-consuming. Are there any techniques that can help make them more efficient and/or accurate? Let us focus on a concrete problem. Say we would like to plot the SIR distribution in a Poisson bipolar network for different path loss exponents α, including some values close to 2. Since we would like to compare the result with the exact one, we focus on the Rayleigh fading case where the analytical expression is known and simple. The goal is to get accurate curves for α=3, 2.5, 2.25, and 2.1, and we would like to wait no longer than 1 s for the results on a standard desktop or laptop computer.
Let us first discuss why this is a non-trivial problem. It involves averaging w.r.t. the fading and the point process, and we need to make sure that the number of interferers is large enough for good accuracy. But what is “large enough”? A quick calculation using Campbell’s theorem (for sums) reveals that if we want to capture 99% of the mean interference power (outside radius 1 to avoid complications due to a potential singularity in the path loss law), we find that for α=3, the simulation region needs to be 100 times larger than for α=4. This seems manageable, but for α=2.5, 2.25, 2.1, it is 10⁶, 10¹⁴, 10³⁸ times larger, respectively!
Clearly the straightforward approach of producing many realizations of the PPP in a large region does not work in the regime of small α. So how can we achieve our goal above – high accuracy and high efficiency?

The solution is to use an analysis-enhanced simulation technique, which I call simalysis. While we often tend to think as analysis vs. simulation as a dichotomy, in this approach they are used symbiotically. The idea is to exploit analytical results whenever possible to make simulations faster and more accurate. Let me illustrate how simalysis works when applied to the problem above.

For small α, it is impossible to “capture” most of the interference solely by simulation. In fact, most of it stems from the infinitely many distance nodes, each one contributing little, with independent fading. We can thus assume that the variance of the interference of the nodes further than a certain distance (relatively large compared with the mean nearest-neighbor distance) is relatively small. Accordingly, replacing it by its mean is a sensible simplification. Here is where the analysis comes in. For any stationary point process of density λ, the mean interference from the nodes outside distance c is

$\displaystyle I(c)=2\pi\lambda\int_c^\infty r^{1-\alpha}{\rm d} r=\frac{2\pi\lambda}{\alpha-2} c^{2-\alpha}.$

This interference term can then be added to the simulated interference, which stems from points within distance c. Simulating as few as 50 points is enough for very high accuracy. The result is shown in the figure below, using the MH scale so that the entire distribution is revealed (see this post for details on the MH scale). For α near 2, the curves are indistinguishable!

Fig. 1. SIR distribution for Poisson bipolar network in MH scale. Density 1, link distance 1/4.
The solid lines are obtained by simalysis, the dashed lines show the exact analytical results.

This simulation averages over 500 realizations of the PPP and runs in less than 1 s on a laptop. The Matlab code is available here. It uses a second simalytic technique, namely the analytical averaging over the fading. Irrespective of the type of point process we want to simulate, as long as the fading is Rayleigh, we can perform the averaging over the fading analytically.

For comparison, the figure below shows the simulation results if 500,000 points (interferers) are simulated, without adding the analytical mean interference term, i.e., using classical simulation. Despite taking 600 times longer, the distributions for α<2.5 are not acceptable.

*Fig. 2. Classically simulated SIR distribution (solid lines) in Poisson bipolar network, compared with the exact analytical ones (dashed). Same parameters as in Fig. 1.*

Double-proving by simulation?

Let us consider a hypothetical scenario that illustrates an issue I frequently observe.

Author: Here is an important result for a canonical Poisson bipolar network:
Theorem: The complementary cumulative distribution of the SIR in a Poisson bipolar network with Rayleigh fading, transmitter density λ, link distance r, and path loss exponent 2/δ is

$\displaystyle \bar F(\theta)=\exp(-\lambda\pi r^2 \theta^\delta\Gamma(1-\delta)\Gamma(1+\delta)).$

Proof: [Gives proof based on the probability generating functional.]

Reviewer: This is a nice result, but it is not validated by simulation. Please provide simulation results.

We have a proven exact analytical (PEA) result. So why would we need a simulation for “validation”? Where does the lack of trust in proofs come from? I am puzzled by these requests by reviewers. Similar issues arise when authors themselves feel the need to add simulations to the visualization of PEA results.
Perhaps some reviewers are not familiar with the analytical tools used or they think it is easier to have a quick look at a simulated curve rather than checking a proof. Perhaps some authors are not entirely sure their proofs are valid, or they think reviewers are more likely to trust the proofs if simulations are also shown.
The key issue is that such requests by reviewers or simulations by authors take simulation results as the “ground truth”, while portraying PEA results as weaker statements that need validation. This of course is not the case. A PEA result expresses a mathematical fact and thus does not need any further “corroboration”.
Now, if the simulation results are accurate, the analytical and simulated curves lie exactly on top of each other, and the accompanying text states the obvious: “Look, the curves match!”. But what if there isn’t an exact match between the analytical and the simulated curve? Which means that the simulation is not accurate. Certainly that does not qualify as “validation”. The worst conclusion would be to distrust the PEA result and take the simulation as the true result.
By its nature, a simulation is always restricted to a small cross-section of the parameter space. Even the simple result above has four parameters, which would make it hard to comprehensively simulate the network. Related, I am inviting the reader to simulate the result for a path loss exponent α=2.1 or δ=0.95. Almost surely the simulated curve will look quite different from the analytical one.
In conclusion, there is absolutely no need for “two-step verification” of PEA results. On the contrary.

Fraction of reliable links

The fraction of reliable links is an important metric, in particular for applications with (ultra-)high reliability requirements. In the literature, we see that it is sometimes equated with the transmission success probability of the typical link, given by

$\displaystyle p_{\text{s}}=\mathbb{P}(\text{SIR}>\theta).$

This is the SIR distribution (in terms of the complementary cdf) at the typical link. In this post I would like to discuss whether it is accurate to call p_s the fraction of reliable links.

Say someone claims “The fraction of reliable links in this network is p_s=0.8″, and I ask “But how reliable are these links?”. The answer might be “They are (at least) 80% reliable of course, because p_s=0.8.” Ok, so let us assume that the fraction links with reliability at least 0.8 is 0.8. Following the same logic, the fraction of links with reliability at least 0.7 would be 0.7. But clearly that fraction cannot be smaller than the fraction of links with reliability at least 0.8. There is an obvious contradiction. So how can we quantify the fraction of reliable links in a rigorous way?

First we note that in the expression for p_s, there is no notion of reliability but p_s itself. This leads to the wrong interpretation above that a fraction p_s of links has reliability at least p_s. Instead, we want so specify a reliability threshold so that we can say, e.g., “the fraction of links that are at least 90% reliable is 0.8”. Naturally it then follows that the fraction of links that are at least 80% reliable must be larger than (or equal to) 0.8. So a meaningful expression for the fraction of reliable links must involve a reliability threshold parameter that can be tuned from 0 to 1, irrespective of how reliable the typical link happens to be.

Second, p_s gives no indication about the reliability of individual links. In particular, it does not specify what fraction of links achieve a certain reliability, say 0.8. It could be all of them, or 2/3, or 1/2, or 1/5. p_s=0.8 means that the probability of transmission success over the typical link is 0.8. Equivalently, in an ergodic setting, in every time slot, a fraction 0.8 of all links happens to succeed, in every realization of the point process. But some links will be highly reliable, while others will be less reliable.

Before getting to the definition of the fraction of reliable links, let us focus on Poisson bipolar networks for illustration, with the following concrete parameters: link distance 1/4, path loss exponent 4, target SIR threshold θ=1, and the fading is iid Rayleigh. The link density is λ, and we use slotted ALOHA with transmit probability is p. In this case, the well-known expression for p_s is

$\displaystyle p_{\text{s}}=\exp(-c\lambda p),$

where c=0.3084 is a function of link distance, path loss exponent, and SIR threshold. We note that if we keep λp constant, p_s remains unchanged. Now, instead of just considering the typical link, let us consider all the links in a realization of the network, i.e., for a given set of locations of all transceivers. The video below shows the histogram of the individual link reliabilities for constant λp=1 while varying the transmit probability p from 1.00 to 0.01 in steps of 0.01. The red line indicates p_s, which is the average of all link reliabilities and remains constant at 0.735. Clearly, the distribution of link reliabilities changes significantly even with constant p_s– as surmised above, p_s does not reveal how disparate the reliabilities are. The symbol σ refers to the standard deviation of the reliability distribution, starting at 0.3 at p=1 and decreasing to less than 1/10 of that for p=1/100.

Histogram of link reliabilities in bipolar network.

Equipped with the blue histogram (or pdf), we can easily determine what fraction of links achieves a certain reliability, say 0.6, 0.7, or 0.8. These are shown in the plot below. It is apparent that for small p, due to the concentration of the link reliabilities as p→0, the fraction of reliable links tends to 0 or 1, depending on whether the reliability threshold is above or below the average p_s.

Fraction of links achieving reliability at least 0.6, 0.7, 0.8.

So how do we characterize the link reliability distribution theoretically? We start with the conditional SIR ccdf at the typical link, given the point process:

$\displaystyle P_{\text{s}}=\mathbb{P}(\text{SIR}>\theta\mid\Phi).$

Then p_s=E(P_s), with the expectation taken over the point process. Hence p_s is the mean of the conditional success probability, and if we consider its distribution, we arrive at the link reliability distribution, shown in blue in the video above. Mathematically,

$\displaystyle F(\nu)=\mathbb{P}(P_{\text{s}}>\nu).$

where ν is the target reliability. This distribution is a meta distribution, since it is the distribution of a conditional distribution. In ergodic settings, it specifies the fraction of links that achieve an SIR of θ with reliability at least ν, which is exactly what we set out to quantify.
In conclusion: The fraction of reliable links is not given by the standard (mean) success probability; it is given by the meta distribution of the SIR.