Category: Cellular networks

The curious shape of Poisson-Voronoi cells

In this blog we are exploring the shape of two kinds of cells in the Poisson-Voronoi tessellation on the plane, namely the 0-cell and the typical cell. The 0-cell is the cell containing the origin, while the typical cell is the cell obtained by conditioning on a Poisson point to be at the origin (which is the same as adding the origin to the PPP).

The cell shape has an important effect on the signal and interference powers at the typical user (in the 0-cell) and at the user in the typical cell. For instance, in the 0-cell, which contains the typical user at a uniformly random location, about 1/4 of the cell edge is at essentially the same distance to the base station as the typical user on average). Hence it is not the case that edge users necessarily suffer from larger signal attenuation than the typical user (who resides inside the cell).

The cell shape is determined by the directional radii of the cells when their nucleus is at the origin. To have a well-defined orientation, we select a location uniformly in the cell and rotate the cell so that this location falls on the positive x-axis. In the 0-cell, this involves first a translation of the cell’s nucleus to the origin, followed by a rotation until the original origin (which is uniformly distributed in the cell) lies on the positive x-axis. This is illustrated in Movie 1 below. In the typical cell, it involves adding a Poisson point, selecting a uniform location, and a rotation so that this uniform location lies on the positive x-axis. This is illustrated in Movie 2.

Movie 1. Rotated and translated 0-cell.

Movie 2. Rotated typical cell.

As indicated in the movies, the distances from the nucleus to the uniformly random location are denoted by D0 and D, respectively, and the directional radii by R0(ϕ) and R(ϕ), respectively. This way, the boundary of the cells is described in polar coordinates as (R0(ϕ),ϕ) and (R(ϕ),ϕ), ϕ ∈ [0,2π). In a cellular network model, the uniform random location could be that of a user, while the PPP models the base stations. In this case D0 is the link distance from the typical user to its serving base station, while D is the link distance from the typical base station to a randomly located user it serves. The distinction between the typical user’s and the typical base station’s point of view is explained in this blog.

Let λ denote the density of the PPP. Three results are well known:

  • The distribution of D0 follows from the void probability of the PPP. It is Rayleigh with mean 1/(2√λ).
  • Since the mean area of the typical cell is 1/λ, we have ∫ 0π 𝔼(R(ϕ)2) dϕ = 1/λ.
  • The minimum of R(ϕ) is distributed with pdf f(r)=8λπr exp(-4λπr2). This is half the distance to the nearest neighboring Poisson point (base station).

In contrast, there is no closed-form expression for the distribution of D. Due to size-biased sampling, the area of the 0-cell stochastically dominates that of the typical cell and, in turn, D0 dominates D.

Analyzing the directional radii, we obtain these new insights on the cell shapes:

  • If Ψ is uniform in [0,π], R(Ψ) is again Rayleigh with mean 1/(2√λ).
  • R0(π) is also Rayleigh with the same mean. In fact, R0(π) and D0 are iid.
  • R0(0) has mean 3/(4√λ) and is distributed as

\displaystyle f_{R_0(0)}(y)=2(\lambda\pi)^2 y^3 \exp(-\lambda\pi y^2).

  • Hence R0(0) is on average exactly 50% larger than R0(π). For the typical cell, simulation results indicate that R(0) is about 55% larger on average than R(π).
  • The difference R0(0)-D0 is distributed as f(r)=π√λ erfc(r √(πλ)). Its mean is 1/(4√λ). Hence the typical user is no further from the cell edge than the base station on average.
  • The joint distribution of D0 and R0(ϕ) can be given in exact analytical form.
  • 3/4 of the typical cell is further away from the nucleus than the nearest point on the cell edge (i.e., the minimum directional radius). Expressed differently, a uniformly random user in the typical cell has a 75% chance of being further away from the base station than the nearest edge user. By simulation, D on average is 2.7 times larger than the minimum of the directional radii.

In conclusion, the 0-cell and the typical cell are quite asymmetric around the nucleus (base station) and the uniformly random point (user). In the direction away from the base station, the user is about 4 times closer to the cell edge than in the direction towards the base station, and many locations on the cell edge are closer to the base station than the user inside the cell. These results have implications on the design of efficient cellular network transmission schemes, such as beamforming, NOMA, and base station cooperation, in both down- and uplink.

More details are available in Section II of this paper.

The typical user does not reside in the typical cell

The analysis of cellular networks usually focuses on the typical user in the downlink and the typical base station (or, equivalently, the typical cell) in the uplink. It is important that if base station and user point processes are independent, the two notions of “typical” are not compatible – the typical user’s cell is statistically different from the typical cell. The difference is caused by the effect of size-biased sampling. The typical user’s performance corresponds to that of the average of all users, and there are more users in larger cells. Since a user model is not needed in the downlink as explained in this post, we can equivalently say that an arbitrary location is more likely to fall in a larger cell than a smaller cell.
The typical user’s cell, the so-called 0-cell, is the cell containing the origin, i.e., it is obtained by cell area-biased sampling, which gives larger cells more weight. As a result, the 0-cell is larger on average than the typical cell, which is the cell of the base station conditioned to be at the origin. The statistical properties of the typical cell correspond to the averages of all cells.

Such size-biased sampling is not restricted to cellular networks or stochastic geometry. If we throw a dart blindly on a world map until we hit land, the country we hit is quite likely to be a big one. In fact, there is a 50% chance that the dart lands on one of the 10 largest countries. Similarly, the typical country has 40 M inhabitants on average, but the typical person is likely to live in a country with more than 100 M people. The typical dollar is quite likely owned by a wealthy person, while the typical person is probably not rich. The typical human hair is likely to grow on a person with full hair, while the typical person has a 5-10% chance of being bald. The typical animal leg has a decent chance of belonging to a millipede or centipede, while the typical animal is very unlikely to have more than six feet.

Coming back to cellular networks, let us focus on a concrete example that is fully tractable in terms of the cell area distributions. Consider the lattice with holes shows in Fig. 1 below, obtained from a square lattice of density 1 by removing the four nearest neighbors of each 16th point. It is periodic with period 4 in both directions, its density is λ=3/4, and it has four different types of cells, with three different areas, 1, 3/2, and 2.

Fig. 1: Lattice hole process with density 3/4. Base stations (cell nuclei) are marked by red triangles, and those whose nearest neighbors were removed by stars, which makes these cell large. The grey box indicates one period of the lattice.

The typical cell has area 1 with probability 5/12, area 3/2 with probability 1/2, and area 2 with probability 1/12. The mean area follows as E(A)=5/12+1/2 3/2+1/12 2=4/3, which corresponds to 1/λ.

Now assume a stationary square lattice of density 1 as the user point process. Then the cells of area 1 always contain 1 user and those of area 2 always contain 2 users. Those of area 3/2 have 1 user or 2 users, each with probability 1/2. Deconditioning on the cell areas, we obtain the distribution of the number of users U in the typical cell as P(U=1)=2/3 and P(U=2)=1/3, for a mean number of users E(U)=4/3, which equals the mean area times the user density (chosen to be 1 here).

How about the typical user’s cell? This is where the size bias plays a role. The distribution of the area A0 of the 0-cell is P(A0=1)=5/16, P(A0=3/2)=9/16, and P(A0=2)=1/8. These are the fractions of the plane covered by cells of areas 1, 3/2, and 2. The mean area is E(A0)=45/32, which is about 5.5% bigger than the mean area of the typical cell. The number of users U0 in the 0-cell is distributed as P(U0=1)=5/16+1/2 9/16=19/32, P(U0=2)=1/2 9/16+1/8=13/32, resulting in a mean of E(U0)=45/32, which is the user density times the mean area. The mean also follows from the general formula

\displaystyle \mathbb{E}(f(V_0))=\frac{\mathbb{E}(A f(V))}{\mathbb{E}(A)}=\lambda\mathbb{E}(A f(V)).

where V is the typical cell, V0 the 0-cell, and f is a non-negative function on compact sets. Applied to our setting, where f(V) is the number of users in V, we obtain

\displaystyle \mathbb{E}(U_0)=\frac{\mathbb{E}(A^2)}{\mathbb{E}(A)}=\lambda\mathbb{E}(A^2).

Since the user density is 1, this is also the mean area of A0. For the number of sides S, we have E(S)=19/4, but E(S0)=155/32, which is bigger by 3/32.
At the end of this post are three more examples of similarly constructed lattices. In each case, the points within a certain distance of a sub-lattice are removed.

So the typical user is not served by the typical base station, and the typical base station does not serve the typical user. One way to reconcile the two is to define a user point process where a fixed number of users, say one, is placed uniformly at random in each cell. Such a user process is of course no longer independent of the base station process.

For Poisson distributed base stations, the 0-cell is 28% larger than the typical cell. Its mean number of sides is 6.41, whereas the typical cell has 6 sides on average. Hence the 0-cell is not just an enlarged version of the typical cell but also has a different shape. Accordingly, the distance from the nucleus of the typical cell to a random point in the cell is not Rayleigh distributed as it is in the 0-cell. Also, if users form a PPP of density 1, the typical user’s cell has 1+1.28/λ users on average (there is one extra user due to the conditioning of a user to be at the origin), while the typical cell only has a mean of 1/λ users.

Size-biased sampling is important in other wireless networks as well. If a vehicular network is modeled by placing one-dimensional Poisson point processes (cars) on line segments (streets) of independent random length (which is a Cox process supported on line segments), then the typical vehicle’s street length distribution fL0 is different from the length distribution f_L of the streets. By length-biased sampling, the two are related as

\displaystyle f_{L_0}(x)=\frac{xf_L(x)}{\mathbb{E}(L)}.

For example, if L is exponential with mean 1, then L0 is gamma distributed with mean 2. The same situation arises in the interarrival intervals of a one-dimensional PPP (of density 1). The typical such interval is exponential with mean 1, but the interval containing the origin (or any other deterministic time instant) has a mean length of 2. This is sometimes referred to as the waiting time paradox, although there is nothing paradoxical about it – it is just size-biased sampling.

Lastly, as promised, here are three more examples of lattices with increasingly large holes.

Fig. 2: Lattice with period 6. λ=1/6; P(A=1)=1/6; P(A=89/15)=2/3; P(A=169/15)=1/6; E(A)=6.
P(A0=1)=1/36; P(A0=89/15)=89/135; P(A0=169/15)=169/540; E(A0)=6047/810 ≈ 7.46.
Fig. 3: Lattice with period 8. λ=7/32; P(A=1)=5/14; P(A=7/2)=2/7; P(A=29/4)=2/7; P(A=16)=1/14; E(A)=32/7. P(A0=1)=5/64; P(A0=7/2)=7/32; P(A0=29/4)=29/64; P(A0=16)=1/4; E(A0)=2081/256. Here the 0-cell is almost twice as big on average than the typical cell. It corresponds to the largest cell with probability 1/4, but only 1/14 of the cells are largest ones.
Fig. 4: Lattice with period 16. λ=7/128; P(A=1)=5/14; P(A=15/2)=2/7; P(A=33)=2/7; P(A=89)=1/14; E(A)=128/7. P(A0=1)=5/256; P(A0=15/2)=15/128; P(A_0=33)=33/64; P(A0=89)=89/256; E(A0)=12507/256 48.9. The 0-cell is more than 2.5 times larger than the typical one on average.

Are users needed in cellular networks?

Let us consider the downlink of a cellular network where base stations form a stationary and ergodic point process Φ and define the SIR at each location xR2 as

\displaystyle \text{SIR}(x)=\frac{h_{N(x),x}\ell(x-N(x))}{\sum_{y\in\Phi\setminus\{N(x)\}} h_{x,y}\ell(y-x)}.

Here N(x) is the nucleus of the Voronoi cell that x belongs to, hx,y is the fading coefficient between x and y, and ℓ is the path loss function. Due to the stationarity of Φ, the SIR statistics do not depend on the location x. In other words, any arbitrary location can be taken to be the typical location that the analysis focuses on.
Example result: If Φ is Poisson, the fading is Rayleigh, and ℓ is a power-law function with exponent 2/δ, it is known that for all x ∈ R2,

\displaystyle \qquad\qquad\mathbb{P}(\text{SIR}(x)>\theta)=\frac{1}{\,_2F_1(1,-\delta;1-\delta,-\theta)},\qquad\qquad\qquad (1)

where 2F1 is the Gauss hypergeometric function.
Since Φ is ergodic, the probability that the SIR exceeds θ is the fraction of the plane that achieves an SIR of at least θ for all realizations of Φ. This means that the probability (ensemble average) can be replaced by a spatial average over an increasingly large region. Sometimes this probability (or spatial average) is questionably called “coverage probability” (see this post), and the area fraction is termed “covered area fraction”.
It is important to note that results such as (1) do not require any specification of a point process of users. This answers the question in the title: No, users are not necessary in the downlink SIR analysis.

That said, in the literature we observe that in many cases, a point process of users is defined before such downlink SIR results are derived. The reason could be that it may seem overly abstract to consider a cellular network model devoid of any users and view the SIR as a random field on the plane. Specializing the location x to the points of a user point process (assumed independent of Φ), we observe that (1) is the SIR distribution at the typical user for any stationary point process of users. So there is nothing wrong in introducing a point process of users, focus on the typical user, and state a result such as (1). It would, however, be potentially misleading to specify the user point process to be a Poisson process, since the reader may then believe that the result only holds for Poisson distributed users.

There is one caveat when introducing a point process of users to formulate downlink results: The interpretation of the SIR distribution as the fraction of users who achieve SIR>θ in each realization of the user and base station point processes may no longer be correct, even if the two point processes are independent and stationary and ergodic. For instance, consider the case where both are stationary (i.e., randomly translated) lattices of the same intensity. Then, given the point processes, the SIR distribution at each user is the same and depends on the relative shift of the lattices. For example, if a user is very close to its serving base station, then all users are close to their serving base station, and the SIR at all users is likely to exceed θ even when θ is, say, 20 dB. In contrast, if one user is equidistant to two base stations, then all users are, and it is unlikely that the SIR (at any or all of them) exceeds 1. So averaging over the users in one realization cannot yield the same result as averaging over the point processes (ensemble averaging). But doesn’t ergodicity imply that the two results are the same? The answer is yes, it does, but individual ergodicity of the two point processes is not sufficient. Since the SIR depends on both of them jointly, they need to be jointly ergodic. This is the condition that is not met in this example scenario of two lattices.

What is “coverage”?

In the literature, the probability that the signal-to-interference ratio (SIR) at a given location and time exceeds a certain value is often referred to as the coverage probability. Is this sensible terminology, consistent with the way cellular operators define “coverage”? All publicly accessible cellular service coverage maps are static, i.e., their view of “coverage” is purely based on location and not on time. This seems natural since a rapidly changing coverage map, say at the level of seconds, would not be of much use to the user, apart from the fact that it would be very hard to collect the information at such time scales.

In contrast, the event SIR(x,t)>θ depends not only on the location x but also on the time t. A location may be “covered” at SIR level θ at one moment but “uncovered” just a little bit (one coherence time) later. Accordingly, a “coverage” map based on this criterion would have to be updated several times per second to accurately reflect this notion of coverage. Moreover, it would have to have a very high spatial resolution due to small-scale fading – one location may be “covered” at time t while another, half a meter away, may be “uncovered” at the same time t. Lastly, there is no notion of reliability. For each x and t, SIR(x,t)>θ either happens or not. It seems natural, though, to include reliability in a coverage definition; for example, by declaring that a location is covered if an SIR of θ is achieved with probability 95%, or 95 times out of 100 transmissions.

Hence there are three disadvantages of using SIR(x,t)>θ as the criterion for coverage:

  • The event depends on time (at the level of the coherence time)
  • The event depends on space at a very small scale (at the granularity of the coherence length of the small-scale fading)
  • The event does not allow for a reliability threshold to define coverage.

How can we define “coverage” without these shortcomings? First, we interpret coverage as a purely spatial term, consistent with the coverage maps we find on the web; it should not include a temporal component, at least not in the short-term – hopefully cellular coverage keeps improving over the years, but it should not vary randomly many times per second. Put differently, coverage should only depend on the network geometry (locations of base stations relative to the position x) and shadowing, but not on the rapid signal strength fluctuations due to small-scale fading. The solution to eliminate the temporal component is fairly straightforward – we just need to average over the temporal randomness, i.e., the small-scale fading. Such averaging eliminates the other two shortcomings as well. For a base station point process Φ, we define the conditional SIR distribution at location x as

\displaystyle P(x)=\mathbb{P}(\text{SIR}(x,t)>\theta\mid\Phi).

Here, the probability is taken over the small-scale fading, which eliminates the dependence on time (assuming temporal ergodicity of the fading process, which means that the ensemble average here could be replaced by a time average over a suitable long period). If shadowing is present, it can be incorporated in Φ. Then, introducing a reliability threshold ν, we declare

\displaystyle \{x \text{ covered}\}\quad\Leftrightarrow\quad \{P(x)>\nu\}

The reliability threshold ν appears naturally in this definition. The probability that P(x)>ν is the meta distribution of the SIR, since it is the distribution of the conditional SIR distribution given Φ. For stationary and ergodic point processes Φ, it does not depend on x and gives the area fraction that is covered at SIR threshold θ and reliability threshold ν. The figure below shows a coverage map where the colors indicate the reliability threshold at which locations are covered, from dark blue (ν close to 0) to bright yellow (ν close to 1).
So if P(SIR>θ) is not the coverage probability, what is it? It is simply the complementary cumulative distribution (ccdf) of the SIR, often interpreted as the success probability of a transmission.

Visualization of coverage at SIR threshold 1. Red dots indicate base stations. The colors indicate the reliability threshold. For example, for ν=0.8, regions colored orange and yellow are covered.

Unmasking distributions with infinite support

When visualizing distributions with infinite support, we face the challenge that they can only be shown partially. Usually we try to judiciously choose the interval of our plot so that the interesting part is revealed, and it is understood that outside that interval the function is essentially zero (for a density) or essentially zero or one (for a cumulative distribution). However, there are two disadvantages to that approach: First, if two distributions are not shown on the same interval, it is hard to compare them. Second, interesting asymptotic behavior in the tails are masked. In many cases, using a linear scale suffers from a third shortcoming: Interesting features may occur on a significantly different scale. Which would require choosing a large interval, but that, in turn, may mask the behavior in some parts.

When plotting distribution of signal-to-interference ratios (SIRs), the standard approach is to use a dB scale. The complementary cumulative distribution (ccdf) is usually interpreted as the success probability of a transmission (in an interference-limited setting). While the dB scale allows for visualization the cccdf over a larger range, it has its own shortcomings: First, it turns the one-sided infinite support [0,∞) into the two-sided infinite support (-∞,∞), which can make selecting a suitable interval harder. Second, it distorts the ccdf, which prevents the viewer from obtaining insight into asymptotic behaviors.

So how can we resolve these issues? It turns out that there is a straightforward solution. It is based on a homeomorphic mapping of [0,∞) to the unit interval [0,1]. This mapping is given by the function T(x)=x/(1+x), and the resulting scale is called MH (Möbius homeomorphic) scale. For comparison, the dB scale has the mapping 10 log10(x). In the figures below, the dummy variable is θ, and we have θ dB=10θ/10, and θ MH=θ/(1-θ). In the important high-reliability regime θ→0, θ MH ∼ θ, i.e., there is no distortion.
The three figures show 6 ccdfs on a linear, dB, and the MH scale. Clearly, the linear scale plots mask the information about the tail. Some curves go to zero (too) quickly, while the behavior of the yellow one past 100 is completely hidden. The dB scale displays the transitional regime more prominently, but all curves have an inverted S shape, which reduces the discriminative power of these plots. In particular, for small θ, they all become flat. For example, the blue (Pareto) and the cyan (Lévy) curve look similar in the dB scale, but with some shift. The MH scale, however, reveals that the asymptotic behavior on both ends is, in fact, quite different. Also, the red (another Pareto distribution) and green (gamma) curve look fairly similar in the dB scale, while the MH scale emphasizes the difference between the two. Generally, the MH plots enhance the differences because the slopes at 0 and at 1 can assume any value, while the slopes in the dB plots always approach 0 – assuming the range is chosen wide enough.

In summary, the MH scale has the following advantages:

  • There is a single finite interval that reveals the complete distribution. There is never a question of what interval to choose, and nothing remains hidden.
  • The asymptotic behaviors are clearly visible. In comparison, on the dB scale, the behavior near 0 and towards infinity is always obscured.
  • In the case of SIRs, the MH scale has the additional interpretation as visualizing the distribution of the signal fraction S/(S+I) (SF) on a linear scale: If F(θ) is the ccdf of the SIR S/I, then F(T(θ)) is the ccdf of the SF.

The MH mapping and its application to SIRs and signal fractions was first introduced in the invited paper M. Haenggi, “SIR Analysis via Signal Fractions”, IEEE Communications Letters, vol. 24, pp. 1358-1362, July 2020.

linear
Figure 1. Six ccdfs on a linear scale.

decibels
Figure 2. Same ccdfs on a dB scale (corresponding to [0.01,100] in the linear scale).

mh_scale
Figure 3. Same ccdfs on the MH scale.