Mathematical matrices are blocks of numbers. So they don’t necessarily have definite characteristics the way plain vanilla, singleton numbers do. They’re freer. For example 131 ∈ positive and −9 ∈ negative. But is a matrix whose entries include positives, negatives, and zeroes itself positive, negative, or zero? Well, it has all three within it. How can a matrix be “just” **+**, **−**, or **0**?

Likewise, is the function sin(x): ℝ→ℝ positive, negative, or zero? It maps to an infinite number (ℶ) of positives, an infinite number (ℶ) of negatives, and to a lesser-but-still-infinite number (ℵ) of zeroes. So how could it be just one of the three?

But James Mercer realised in 1909 that some functions “act like” regular singleton numbers, being positive, negative, or zero. And if it walks like a duck, quacks like a duck, smells like a duck, it *is* a duck. For example, if you multiply any ℝ→ℝ function pointwise by a strictly positive Schwartz function (a Gaussian, say), you won’t change its sign anywhere. Hmm, that’s just like multiplying by a positive number: it never changes the sign of its mate.

**Matrices**

Similarly, some matrices “act like” single solitary numbers, being positive, negative, or zero.

- Matrices that act like a single positive number are called positive definite.
- Matrices that act like a single negative number are called negative definite.
- Matrices that act like a single non-negative (≥0) number are called positive semidefinite.
- Matrices that act like a single non-positive (≤0) number are called negative semidefinite.
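One way to make “acts like a positive number” concrete: for a real symmetric matrix A, look at the sign of the number xᵀAx over all nonzero vectors x, which comes down to the signs of A’s eigenvalues. The sketch below assumes NumPy and real symmetric input; the function name `classify_definiteness` and the tolerance `tol` are made up for illustration.

```python
import numpy as np

def classify_definiteness(a, tol=1e-10):
    """Classify a real symmetric matrix by the signs of its eigenvalues."""
    a = np.asarray(a, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or not np.allclose(a, a.T):
        raise ValueError("expected a real symmetric matrix")
    eigenvalues = np.linalg.eigvalsh(a)  # real eigenvalues, ascending order
    if np.all(eigenvalues > tol):
        return "positive definite"
    if np.all(eigenvalues < -tol):
        return "negative definite"
    if np.all(eigenvalues >= -tol):
        return "positive semidefinite"
    if np.all(eigenvalues <= tol):
        return "negative semidefinite"
    return "indefinite"  # mixed signs: does not act like a single number

print(classify_definiteness([[2, 1], [1, 2]]))   # eigenvalues 1 and 3
print(classify_definiteness([[1, 1], [1, 1]]))   # eigenvalues 0 and 2
print(classify_definiteness([[1, 0], [0, -1]]))  # eigenvalues -1 and 1
```

Note that the all-zero matrix is both positive and negative semidefinite; this sketch reports the first label that matches.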

Being able to identify these simple subtypes of matrices makes theory much easier. Instead of talking about linear operators in general and not getting many facts to reason from, the person coming up with the theory can build with smaller, easier-to-handle blocks, and later on prove that the small blocks can build up anything.

I first saw semi-definite matrices in economic theory, where a p.s.d. Hessian at a critical point is the second-order condition for a local minimum of a value function (p.s.d. is necessary; positive definite is sufficient). (A Hessian is like a Jacobian but filled with second derivatives instead of first. And a value function maps from {all the complicated choices of life} → utility ∈ ℝ. So value functions have a Holy Grail status.)
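To see the Hessian test in action, here is a rough sketch that estimates a Hessian by central finite differences and checks its eigenvalues. The two toy functions `bowl` and `saddle`, the helper `numerical_hessian`, and the step size `h` are all invented for illustration; they are not value functions from any economic model.

```python
import numpy as np

def numerical_hessian(f, point, h=1e-3):
    """Approximate the matrix of second partial derivatives of f at point."""
    point = np.asarray(point, dtype=float)
    n = point.size
    hessian = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i = np.zeros(n)
            e_i[i] = h
            e_j = np.zeros(n)
            e_j[j] = h
            # Central difference for the mixed partial d^2 f / (dx_i dx_j).
            hessian[i, j] = (
                f(point + e_i + e_j) - f(point + e_i - e_j)
                - f(point - e_i + e_j) + f(point - e_i - e_j)
            ) / (4.0 * h * h)
    return hessian

def bowl(p):
    x, y = p
    return x**2 + x * y + y**2  # critical point at the origin, a minimum

def saddle(p):
    x, y = p
    return x**2 - y**2  # critical point at the origin, not a minimum

bowl_eigs = np.linalg.eigvalsh(numerical_hessian(bowl, [0.0, 0.0]))
saddle_eigs = np.linalg.eigvalsh(numerical_hessian(saddle, [0.0, 0.0]))
print(bowl_eigs)    # both positive: positive definite, so a local minimum
print(saddle_eigs)  # one negative, one positive: indefinite, so a saddle
```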

But semi-definite & definite functions are used in functional data analysis as well.

**Functions**

Positive semi-definite functions are used as kernels in

- landmark regression
- data smoothing, especially of high-frequency time series
- audio transformations
- Photoshop transformations
- BS stock price prediction
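What “positive semi-definite” means for a kernel k(s, t) can be checked numerically: pick any finite set of points, build the matrix of all pairwise kernel values (the Gram matrix), and that matrix should be p.s.d. The sketch below assumes NumPy and a Gaussian kernel as the p.s.d. example; the function −|s − t| is a made-up counterexample, and every name in the code is illustrative.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Matrix of kernel values k(s, t) for every pair of points."""
    points = np.asarray(points, dtype=float)
    return kernel(points[:, None], points[None, :])

def gaussian_kernel(s, t, length_scale=1.0):
    return np.exp(-((s - t) ** 2) / (2.0 * length_scale ** 2))

def negative_distance(s, t):
    return -np.abs(s - t)  # symmetric, but not a p.s.d. kernel

points = np.linspace(0.0, 5.0, 20)
gaussian_gram = gram_matrix(gaussian_kernel, points)
distance_gram = gram_matrix(negative_distance, points)

# Smallest eigenvalue: non-negative (up to rounding) means p.s.d.
gaussian_min = np.linalg.eigvalsh(gaussian_gram).min()
distance_min = np.linalg.eigvalsh(distance_gram).min()
print(gaussian_min)  # roughly zero or above
print(distance_min)  # clearly negative
```

A single sample of points cannot prove a kernel is p.s.d., since the property has to hold for every finite set of points, but one clearly negative eigenvalue is enough to rule it out.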

At some point in the 20th century we learned to expand the definition of number to include any corpus of things that behaves like numbers. Nowadays you can say corpora of functions and well-behaved corpora of matrices effectively *are* numbers. (I mean “commutative rings with identity and multiplicative inverses”.) That is a story for another time.

### Uses

Besides being philosophically interesting as an example of a generalised number, positive semi-definite functions and matrices have practical uses, because they are simple building blocks of more complicated things:

- Dynamical systems which advance via a PSD matrix have nice convergence properties: the eigenvalues are real and non-negative, so nothing oscillates. Kind of analogous to power series convergence (NSD matrix ≈ alternating power series).
- PSD kernels, for similar convergence reasons, are usually considered saner, simpler transformations of time series and signals (audio, price signals, and more). You don’t want to introduce more noise when you’re actually trying to smooth the curve.
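- PSD kernels are what let support vector machines work in an infinite-dimensional feature space: by Mercer’s theorem, a PSD kernel behaves like an inner product in some feature space, so it can help distinguish among the data without a human telling the computer how they relate to each other.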
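The first point above can be sketched with a toy linear system x ← A x. The 2×2 matrix below is invented for illustration (symmetric, eigenvalues 0.3 and 0.7); its negative plays the role of the NSD case. Assuming NumPy:

```python
import numpy as np

def iterate(a, x0, steps):
    """Return the path x0, A x0, A^2 x0, ... as rows of an array."""
    path = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        path.append(a @ path[-1])
    return np.array(path)

psd = np.array([[0.5, 0.2],
                [0.2, 0.5]])  # eigenvalues 0.3 and 0.7, all in [0, 1)
nsd = -psd                    # eigenvalues -0.3 and -0.7

x0 = np.array([1.0, 0.0])
psd_path = iterate(psd, x0, steps=20)
nsd_path = iterate(nsd, x0, steps=20)

psd_norms = np.linalg.norm(psd_path, axis=1)
signs = (-1.0) ** np.arange(len(psd_path))

print(psd_norms[:5])    # shrinks steadily toward zero
print(nsd_path[:5, 0])  # same sizes as the PSD path, with the sign flipping each step
```

The PSD path decays like a geometric series with a positive ratio, while the NSD path is the same series with alternating signs. Convergence here also depends on the eigenvalues being smaller than 1 in size, which is an assumption of this example rather than a consequence of being PSD.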