Category: Other points

# Stochastic geometry (is) fun – part 2

What do you tell somebody who wants to use the Palm measure but does not condition on a point at the origin?

“You are missing the point.”

# Stochastic geometry (is) fun

What do you call a stationary point process of intensity 0?

Pointless.

# A dense debate

In recent years, many research efforts were dedicated towards modeling and analyzing denser and denser wireless networks, in terms of the number of devices per km2 or m2. The terminology used ranges from “ultradense” to “hyperdense”, “massively dense”, and “extremely dense”.

IEEE Xplore lists more than 500 journal papers on “ultradense” networks, 25 on “hyperdense” networks (generally published more recently than the ultradense ones), and about 90 on “extremely dense” networks. There even exist 15 on “massively dense” networks. The natural question is how they are ordered. Is “ultradense” denser than “hyperdense” or vice versa? How does “extremely dense” fit in? Are there clear definitions what the different levels of densities mean? And what term do we use when networks get even denser?

Perhaps we can learn something from the terminology used for frequency bands. There is “high frequency” (HF), “very high frequency” (VHF), “ultrahigh frequency” (UHF), followed by “super high frequency” (SHF), “extremely high frequency” (EHF), and “tremendously high frequency” (THF). The first five each span an order of magnitude in frequency (or wavelength), while the last ones spans two order of magnitude, from 300 GHz to 30 THz.
So how about we follow that approach and classify network density levels as follows:

HD: 1-10 km-2
VHD: 10-100 km-2
UHD: 100-1’000 km-2
SHD: 1’000-10’000 km-2
EHD: 10’000-100’000 km-2
THD: 0.1-10 m-2

So, who will be the first to write a paper on tremendously dense networks?

What comes after THD? Not unexpectedly, there is a mathematical answer to that question. A dense set has a well-defined meaning. So in the super tremendously extreme case, we can just say that the devices are dense on the plane, without further qualification. This is achieved, for instance, by placing a device at each location with rational x and y coordinates. This is a dense network model, and almost surely there is no denser one.

# Tractable, closed-form, exact

The attributes “tractable”, “closed-form”, and “exact” are frequently used to describe analytical results and, in the case of “tractable”, also models. At the time of writing, IEEE Xplore lists 4540 journal articles with “closed-form” and “wireless” in their meta data, 650 with “tractable” and “wireless”, and 220 with “closed-form”, “wireless” and “stochastic geometry”.

Among the three adjectives, only for “exact” there is general consensus what is means exactly. For “closed-form”, mathematicians have a clear definition: The expression can only consist of finite sums and products, division, roots, exponentials, logarithms, trigonometric and hyperbolic functions and their inverses. Many authors are less strict, using the term also for expressions involving general transcendent functions or infinite sums and products. Lastly, the use of “tractable” varies widely. There are “tractable results”, “tractable models”, “tractable analyses”, and “tractable frameworks”.

“Tractable” is defined by Merriam-Webster as “easily handled, managed, or wrought”, by the Google Dictionary as “easy to deal with”, and by the Cambridge Dictionary as “easily dealt with, controlled, or persuaded”. Wikipedia refers to the mathematical use of the term: “ease of obtaining a mathematical solution such as a closed-form expression”. These definitions are too vague to clearly distinguish a “tractable model” from a “non-tractable” one, since “easy” can mean very different things to different people.

We also find combinations of the terms; in the literature, there are “tractable closed-form expressions” and even “highly accurate simple closed-form approximations”. But shouldn’t all “closed-form” expressions qualify as “tractable”? And aren’t they also “simple”, or are there complicated “closed-form” expressions?

It would be helpful to find an agreement in our community what qualifies as “closed-form”. Here is a proposal:

• Use “closed-form” in its strict mathematical understanding, allowing only elementary functions.
• Use “weakly closed-form” for expressions involving hypergeometric, (incomplete) gamma functions, and the error and the Lambert W functions.
• Any result involving integrals, limits, infinite sums, or general transcendent functions such as generalized hypergeometric and Meijer G functions is not “closed-form” or “weakly closed-form” (but may exact of course).

Thus equipped, we could try to define what a “tractable model” is. For instance, we could declare a model “tractable” if it allows the derivation of at least one non-trivial exact closed-form result for the metric of interest. This way, the SIR distribution in the Poisson bipolar network with ALOHA, Rayleigh fading, and power-law path loss is tractable because the expression only involves an exponential and a trigonometric function. The SIR in the downlink Poisson cellular with Rayleigh fading and path loss exponent 4 is also tractable; its expression includes only square roots and an arctangent. In contrast, the SIR in the uplink Poisson cellular network is not tractable, irrespective of the user point process model.
A result could be termed “tractable” if the typical educated reader can tell how the expression behaves as a function of its parameters.

Going a step further, it may make sense to be more formal and introduce categories for the sharpness of a result, such as these:

A1: closed-form exact
A2: weakly closed-form exact
A3: general exact
B1: closed-form bound
B2: weakly closed-form bound
B3: general bound
C1: closed-form approximation
C2: weakly closed-form approximation
C3: general approximation

Alternatively, we could use A+, A, A-, B+, etc., inspired by the letter grading system used in the USA. We could even calculate a grade point average (GPA) of a set of results, based on the standard letter grade-to-numerical grade conversion.

Such classification allows a non-binary quantification of “tractability” of a model. If the model permits the derivation of an A1 result, it is fully “tractable”. If it only allows C3 results, it is not “tractable”. If we can obtain, say, an A3, a B2, and a C1 result, it is 50% “tractable” or “semi-tractable”. Such a sliding scale instead of a black-and-white categorization would reflect the vagueness of the general definition of the term but put it on a more solid quantitative basis. Subcategories for asymptotic results or “order-of” results could be added.

This way, we can pave the way towards the development of a tractable framework for tractability.

# Physical quantities and scale-free results

Sometimes we read in a paper “the link distance is d meters”. Then, a bit later, “In Fig. 3, we set d=10 m”. Putting the two together, this means that “the link distance is 10 m meters”, which of course isn’t what the authors mean. It is not hard to infer that they mean the distance is 10 m, rather than 10 m2. But then why not just say “the link distance is d” from the outset?

Physical quantities are a combination of a numerical value and a unit. If symbols refer to physical quantities, equations are valid irrespective of any metric prefixes used. For example, if the density of a two-dimensional stationary point process is λ, then the mean number of points in a disk of radius r is always λπr2. We can express the radius as r=100 m or as r=0.1 km or r=10000 cm, and the density could be λ=0.1 km-2 or λ=10-5 m-2. In contrast, if a symbol only refers to the numerical value of the physical quantity, the underlying unit (and prefix) needs to be specified separately, and the reader needs to keep it in mind. For example, writing λ=10 and r=2 does not tell us anything about the mean number of points in the disk without the units. If the density is said to be λ m-2 and the radius is said to be r km, then we can infer that there are 40,000,000π points on average in the disk, but if the radius is r m, then there are a mere 40π points.

Separating the numerical values from the units also comes at a loss in flexibility. The advantage of the SI unit system with its metric prefixes is that we can express small and large values conveniently, such as d=5 nm or f=3 THz. Hard-wiring the units by saying “the frequency is f Hz” means that we have to write f=3,000,000,000,000 instead – and expect the reader to memorize that f is expressed in Hz. One may argue that in this case it would be natural to declare that “the frequency is f THz”, but what if much lower frequencies also appear in the paper? A Doppler shift of 10 kHz would become a Doppler shift of f=0.00000001.

This leads to the next problem, which is that some quantities are naturally expressed at a different scale (i.e., with a different metric prefix) than others. Transistor gate lengths are usually expressed in nm, while lengths of fiber optic cables may be expressed in km, and many other lengths and distances fall in between, say wavelengths, antenna spacing, base station height, intervehicular distances etc. Can we expect the readers to keep track of all the base units when we write “the gate length is u nm”, “the wavelength is v cm”, and “the base station height is h m”?

In conclusion, there is really no advantage to this kind of separation. In other words, having mathematical symbols refer to physical quantities rather than just numerical values (while the unit is defined elsewhere) is always preferable.

Now, there are cases, where it makes perfect sense to omit a unit when defining a quantity, say a density or a distance. For example, the normalized notation r=10 or λ=3 is acceptable (usually even preferred) when the normalization reference can be arbitrary. For example, the classical result for the SIR distribution in Poisson bipolar networks (link distance r, path loss exponent 2/δ, Rayleigh fading) is $\displaystyle F(\theta)=\exp(-c(\delta)\lambda r^2\theta^\delta),$

irrespective of whether r is normalized by 1 m or 1 km – as long as λ is normalized accordingly, i.e., by 1 m-2 or 1 km-2). Such scale-free results are particularly elegant, because they show that shrinking or expanding the network does not change the result. They may also indicate that one parameter can be fixed without loss of generality. In this example, what matters is the product λ r2, so setting λ=1 or r=1 is sensible to reduce the number of parameters.

One more thing. The conflation of a unit and a noun describing a physical quantity or device has become somewhat popular, unfortunately. It violates rule of proper English composition and also the guidelines for scientific writing as put forth by, for example, IEEE. Let us hope they do not spread further, otherwise we need to get used to THzFrequencies, cmDistances, MsLifetimes, pJConsumptions, km-2Densities etc. Further, improper capitalization can drastically change the quantities. The gap between a mmLength and a MmLength is 9 orders of magnitude, that between a pASource and a PASource is 27 orders of magnitude, and that between ymGaps and YmGaps is 48 orders of magnitude!

# Double-proving by simulation?

Let us consider a hypothetical scenario that illustrates an issue I frequently observe.

Author: Here is an important result for a canonical Poisson bipolar network:
Theorem: The complementary cumulative distribution of the SIR in a Poisson bipolar network with Rayleigh fading, transmitter density λ, link distance r, and path loss exponent 2/δ is $\displaystyle \bar F(\theta)=\exp(-\lambda\pi r^2 \theta^\delta\Gamma(1-\delta)\Gamma(1+\delta)).$

Proof: [Gives proof based on the probability generating functional.]

Reviewer: This is a nice result, but it is not validated by simulation. Please provide simulation results.

We have a proven exact analytical (PEA) result. So why would we need a simulation for “validation”? Where does the lack of trust in proofs come from? I am puzzled by these requests by reviewers. Similar issues arise when authors themselves feel the need to add simulations to the visualization of PEA results.
Perhaps some reviewers are not familiar with the analytical tools used or they think it is easier to have a quick look at a simulated curve rather than checking a proof. Perhaps some authors are not entirely sure their proofs are valid, or they think reviewers are more likely to trust the proofs if simulations are also shown.
The key issue is that such requests by reviewers or simulations by authors take simulation results as the “ground truth”, while portraying PEA results as weaker statements that need validation. This of course is not the case. A PEA result expresses a mathematical fact and thus does not need any further “corroboration”.
Now, if the simulation results are accurate, the analytical and simulated curves lie exactly on top of each other, and the accompanying text states the obvious: “Look, the curves match!”. But what if there isn’t an exact match between the analytical and the simulated curve? Which means that the simulation is not accurate. Certainly that does not qualify as “validation”. The worst conclusion would be to distrust the PEA result and take the simulation as the true result.
By its nature, a simulation is always restricted to a small cross-section of the parameter space. Even the simple result above has four parameters, which would make it hard to comprehensively simulate the network. Related, I am inviting the reader to simulate the result for a path loss exponent α=2.1 or δ=0.95. Almost surely the simulated curve will look quite different from the analytical one.
In conclusion, there is absolutely no need for “two-step verification” of PEA results. On the contrary.

# Epidemics are spatial, stochastic, and wireless

Inspired by current events, let us focus on the “successful” transmission of a virus from one host to another. In the case of a coronavirus, such transmission happens when the infected person (the source) emits a “signal” by breathing, coughing, sneezing, or speaking, and another person (the destination) gets infected when the respiratory droplets land in the mouth or nose or are inhaled. Fundamentally, the process looks very similar to that of a conventional RF wireless transmission. There are notions of signal strength, directional transmission, path loss, and shadowing in both, resulting in a probability of “successful” reception. In both cases, distances play a critical role – there is the well-known 2 m separation in the case of coronaviral transmission that (presumably) causes an outage in the transmission with high probability; similarly, in the wireless case, there is sharp decay of reception probabilities as a function of distance due to path loss.

Given the critical role of the geometry of host (transceiver) locations, it appears that known models and analytical tools from stochastic geometry could be beneficially applied to learn more about the spread of an epidemic. In contrast to the careful modeling of channels and transmitter and receiver characteristics in wireless communications, the models for the spread of infectious diseases are usually based on the basic reproduction number R0, which is a population-wide parameter that reflects distances between individual agents very indirectly only. Since it is important to understand the effect of physical (Euclidean) distancing (often misleadingly referred to as “social distancing”) and masking (i.e., shadowing), it seems that incorporating distances explicitly in the models would increase their predictive power. Such models would account for the strong dependence of infection probabilities on distance, akin to the path loss function in wireless communications, as well as directionality, akin to beamforming. Time of proximity or exposure can be incorporated if mobility is included.

This line of thought raises some vital questions: How can expertise in wireless transmission be applied to viral transmission? What can stochastic geometry contribute to the understanding of how infectious diseases are spreading (spatial epidemics)? Is there a wireless analogy to “herd immunity”, perhaps related to percolation-theoretic analyses of how broadcasting over multiple hops in a wireless network leads to a giant component of nodes receiving an information packet?

I believe it is quite rewarding to think about these questions, both from a theoretical point of view and in terms of their real-world impact.

Papers on these topics are solicited in the special issue on Spatial Transmission Dynamics of MDPI Information (deadline Feb. 28, 2021).

# On the typical point

The concept of the typical point is important in point process theory. In the wireless context, it is often concretized to the typical user, typical receiver, typical vehicle, etc. The typical point is an abstraction in the sense that it is not a point that is selected in any deterministic fashion from the point process, neither is it an arbitrary point. Also, speaking of “a typical point” is misleading, since it suggests that there exist a number of such typical points in the process (or perhaps even in a realization thereof), and we can choose one of them to “obtain a typical point”. This does not work – there is nothing “typical” about a point selected in a deterministic fashion, let alone an arbitrary point.
That said, if the total number of points is finite, we can, loosely speaking, argue that the typical point is obtained by choosing one of the points uniformly at random. Even this is not trivial since picking a point in a single realization of the point process does generally not produce the desired result, for example of the point process is not ergodic. Many realizations need to be considered, but then the question arises how exactly to pick a point uniformly from many realizations. If the number of points is infinite, any attempt of picking a point uniformly at random is bound to fail because there is no way to select a point uniformly from infinitely many, in much the same way that we cannot select an integer uniformly at random.

So how do we arrive at the typical point? If the point process is stationary and ergodic, we can generate the statistics of the typical point by averaging those of each individual point in a single realization, observed in an increasingly large observation window. This leads to the interpretation of the typical point as a kind of “average point”. Similarly, we may find that the “typical American male” weighs 89.76593 kg, which does not imply that there exists an individual with that exact weight but means that if we weighed every(male)body or a large representative sample, we would obtain that average weight. For general (stationary) point processes, we obtain the typical point by conditioning on a point to exist at a certain location, usually the origin o. Upon averaging over the point process (while holding on to this point at the origin), that point becomes the typical point. The distribution of this conditioned point process is the Palm distribution. In the case of the Poisson process, conditioning on a point at o is the same as adding a point at o. This equivalence is called Slivnyak’s theorem.

In the non-stationary case, the typical point at location x may have different statistical properties than the typical point at another location y. As a result, there exist Palm measures (or distributions) for each location in the support of the point process. In this case, the formal definition of the Palm measure is given by the Radon-Nikodym derivative (with respect to the intensity measure) of the Campbell measure. In the Poisson case, Slivnyak’s theorem still applies.

# What’s the point?

What’s the point of stochastic geometry? In this blog, I am sharing  random and deterministic thoughts about the use of stochastic geometry, mainly in wireless network modeling and analysis.