A New Language for the Brain: Cognition as a Hybrid Geometric-Topological System

Introduction: The Search for the Brain’s Native Language

Von Neumann’s Foundational Challenge

In his final work, The Computer and the Brain, the polymath John von Neumann articulated a profound challenge that continues to motivate neuroscience. He posited that the language of the nervous system was not the precise, formal logic of a digital computer but a fundamentally statistical one, optimized for robustness in the face of noisy components and incomplete data (von Neumann, 2012). This assertion frames the central mission of theoretical neuroscience: to discover and formalize this native language. Solving this problem is synonymous with solving the structure-function problem—understanding precisely how the physical arrangement of neurons gives rise to the adaptive functions of the mind. Our work confronts this challenge directly by proposing a mathematically principled language that unifies the continuous process of learning with the discrete nature of conceptual insight.

Our Thesis: Cognition as a Hybrid Dynamical System

We formalize cognition as a hybrid dynamical system, a mathematical framework designed to model systems that exhibit both continuous evolution and discrete, event-driven changes (Goebel et al., 2012). Our central thesis is that the mapping from neural structure to cognitive function is best described by such a system. Specifically, we model cognition as the coupling of two distinct but interacting regimes:

  • A continuous regime, in which beliefs are gradually refined by a natural-gradient flow on a statistical manifold equipped with the Fisher-Rao metric.
  • A discrete regime, in which insight events reconfigure the belief state whenever persistent homology detects a new, highly persistent topological feature in the posterior.

This post will detail the mathematical foundations of each regime, the theoretical bridge that connects them, and the executable architecture we are developing to test this model’s predictions.

The Problem: Unifying Continuous Learning and Discrete Insight

The Inadequacy of Purely Continuous Models

Most computational models of learning, particularly in the deep learning paradigm, are purely continuous. They describe learning as the smooth traversal of a high-dimensional loss landscape, typically via gradient descent. While powerful, these models struggle to account for the discontinuous nature of human insight. They excel at optimization and interpolation but offer no formal mechanism for the discovery of entirely new structural features in data—the kind of discovery that characterizes scientific breakthroughs or a child’s sudden grasp of a new concept. Such events are not merely movements along a predefined path; they represent a fundamental change in the map itself.

The Challenge of Formalizing “Aha!” Moments

The “aha!” moment, or cognitive insight, is characterized by a sudden, non-obvious solution to a problem. Phenomenologically, it feels like a discrete jump in understanding rather than the result of incremental progress. Formalizing these events is a significant challenge. A successful theory must provide a rigorous, state-dependent criterion for when such a jump should occur and define the mechanism of the jump itself. Without such a mechanism, models remain incomplete, unable to capture the full dynamic range of cognitive function that seamlessly integrates gradual refinement with abrupt discovery. Our framework addresses this gap by explicitly modeling the conditions that trigger these discrete topological shifts.

A Formal Theory: The Geometric-Topological Model

To build a more complete model, we leverage powerful tools from two distinct fields of mathematics: information geometry and computational topology.

The Continuous Regime: Learning as a Flow on a Statistical Manifold

The foundation of our continuous learning model is information geometry, pioneered by Shun-ichi Amari (Amari & Nagaoka, 2000). A statistical manifold is a space where each point represents a probability distribution. For a parametric family of models, $p(x|\theta)$, this is the space of parameters, $\Theta$. This space is not a simple Euclidean space; it possesses an intrinsic geometric structure defined by the Fisher-Rao metric: $$g_{ij}(\theta) = \mathbb{E}_{x \sim p(\cdot|\theta)}\big[\partial_i\log p(x|\theta)\, \partial_j\log p(x|\theta)\big]$$

Čencov’s uniqueness theorem establishes this as the only Riemannian metric that is invariant under sufficient statistics, making it the canonical metric for information (Čencov, 1982). Learning, in this view, is a trajectory on this manifold. Standard gradient descent traces a suboptimal path, since it depends on the arbitrary parameterization of the model. The most efficient, parameterization-invariant path is the natural gradient, which follows the true geometry of the information space (Amari, 1998). We model continuous belief updating as a natural-gradient flow on this statistical manifold, representing the most direct path for integrating new evidence.
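
To make the geometry concrete, here is a minimal sketch of a single natural-gradient step. The univariate Gaussian family, the Monte-Carlo Fisher estimator, and the step size are illustrative choices of ours, not components of the framework.

```python
# Minimal sketch: one natural-gradient step for a Gaussian family p(x | mu, sigma).
# Assumptions (ours): the family, the Monte-Carlo Fisher estimate, and the step size.
import numpy as np

rng = np.random.default_rng(0)

def score(x, theta):
    """Gradient of log p(x | theta) with respect to theta = (mu, sigma)."""
    mu, sigma = theta
    d_mu = (x - mu) / sigma**2
    d_sigma = (x - mu) ** 2 / sigma**3 - 1.0 / sigma
    return np.stack([d_mu, d_sigma], axis=-1)

def fisher_monte_carlo(theta, n=50_000):
    """Estimate g_ij(theta) = E[ d_i log p * d_j log p ] with x ~ p(. | theta)."""
    mu, sigma = theta
    x = rng.normal(mu, sigma, size=n)
    s = score(x, theta)                      # one score vector per sample
    return s.T @ s / n

data = rng.normal(2.0, 0.5, size=200)        # observed evidence
theta = np.array([0.0, 2.0])                 # deliberately poor initial belief (mu, sigma)

grad = score(data, theta).mean(axis=0)       # Euclidean gradient of the mean log-likelihood
F = fisher_monte_carlo(theta)                # Fisher-Rao metric at theta
nat_grad = np.linalg.solve(F, grad)          # natural gradient: F^{-1} grad

lr = 0.5
print("Euclidean step :", theta + lr * grad)
print("Natural step   :", theta + lr * nat_grad)
```

Because the raw gradient is rescaled by the inverse Fisher matrix, the natural step is invariant (to first order) under smooth re-parameterizations of the model, which is exactly the property that makes it the canonical flow on the statistical manifold.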

The Discrete Regime: Detecting Insight with Persistent Homology

To model discrete insights, we turn to computational topology. We hypothesize that a conceptual leap corresponds to a significant change in the shape of the agent’s posterior belief distribution. We use persistent homology, a primary tool of Topological Data Analysis (TDA), to quantify this shape (Edelsbrunner & Harer, 2010). Given a point cloud of samples from the posterior, persistent homology tracks the emergence and disappearance of topological features (connected components, loops, voids, etc.) at different spatial scales. The result is a “persistence diagram” that distinguishes robust, meaningful features from transient topological noise.

The emergence of a new homology class with high persistence signifies a major structural change in the belief space—the discovery of a non-trivial latent feature that was not previously represented. Crucially, the stability theorems of TDA guarantee that persistence diagrams change only slightly under small perturbations of the underlying data, so this detection method is robust to sampling noise (Cohen-Steiner et al., 2007). This provides a principled, data-driven criterion for triggering a discrete “jump” in the hybrid dynamical system.
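
As a toy illustration of this criterion, the sketch below computes zero-dimensional persistence (the connected components of a Vietoris-Rips filtration) with a simple union-find. It is a stand-in of our own for a full TDA pipeline, which in practice would use a library such as GUDHI or Ripser and would also track loops and voids; the two-mode point cloud is likewise an illustrative assumption.

```python
# Minimal sketch: 0-dimensional persistent homology of a point cloud via union-find.
# A real pipeline would use a TDA library (e.g. GUDHI, Ripser) and higher dimensions.
import numpy as np
from itertools import combinations

def h0_persistence(points):
    """Return death scales of the finite H0 classes (every class is born at scale 0)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]         # path halving
            i = parent[i]
        return i

    # Filtration edges, processed in order of increasing length
    edges = sorted(
        (float(np.linalg.norm(points[i] - points[j])), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                               # two components merge: one H0 class dies
            parent[ri] = rj
            deaths.append(d)
    return sorted(deaths, reverse=True)            # the single surviving class never dies

# Example: posterior samples concentrated in two well-separated modes
rng = np.random.default_rng(1)
cloud = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(5.0, 0.3, (100, 2))])
deaths = h0_persistence(cloud)
print("largest finite H0 persistence  :", round(deaths[0], 2))    # the second component
print("typical noise-level persistence:", round(deaths[-1], 3))
```

A single death scale that towers over all the others is exactly the "new homology class with high persistence" described above; the many tiny deaths are the transient topological noise that the persistence diagram filters out.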

The Bridge: Ollivier-Ricci Curvature as a Measure of Robustness

A critical question is how the continuous geometry relates to the discrete topology. We propose the bridge is Ollivier-Ricci curvature, a discrete analogue of Ricci curvature derived from optimal transport theory (Ollivier, 2009). For any two points $x$ and $y$ on a graph of posterior samples, the curvature is defined as: $$\kappa_G(x,y) = 1 - \frac{W_1(\mu_x, \mu_y)}{d(x,y)}$$

Here, $W_1$ is the Wasserstein-1 (“earth-mover’s”) distance between probability measures centered on the two points. Intuitively, positive curvature ($\kappa > 0$) means that the neighborhoods of the two points are, on average, closer to each other than the points themselves are, signifying a local contraction of information and a tightly clustered, robust region of the belief space (Lin et al., 2011). We hypothesize that high-curvature regions in the statistical manifold correspond to functionally robust beliefs—concepts that are stable and resistant to perturbation. This measure of local robustness serves as a key modulator, influencing both the continuous flow and the stability of topological features.
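
The sketch below evaluates this formula exactly on a small toy graph: $\mu_x$ is taken to be the uniform measure on the neighbors of $x$, and the $W_1$ term is solved as a small linear program. The example graph, the neighbor measures (with no laziness parameter), and the use of SciPy's linprog are all illustrative assumptions on our part.

```python
# Minimal sketch: Ollivier-Ricci curvature on an unweighted graph,
# with W1 solved exactly as a small linear program.
import numpy as np
from collections import deque
from scipy.optimize import linprog

# Toy graph: a near-clique {0, 1, 2, 3} with a dangling path 3-4-5
graph = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2, 4], 4: [3, 5], 5: [4]}

def shortest_paths(src):
    """BFS distances from src in the unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

dist_from = {u: shortest_paths(u) for u in graph}

def wasserstein1(support_x, support_y):
    """Exact W1 between uniform measures on two neighbor sets (small LP)."""
    mu = np.full(len(support_x), 1.0 / len(support_x))
    nu = np.full(len(support_y), 1.0 / len(support_y))
    cost = np.array([[dist_from[u][v] for v in support_y] for u in support_x], dtype=float)
    m, n = cost.shape
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0         # row sums of the plan equal mu
    for j in range(n):
        A_eq[m + j, j::n] = 1.0                  # column sums of the plan equal nu
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([mu, nu]),
                  bounds=(0, None), method="highs")
    return res.fun

def ollivier_ricci(x, y):
    return 1.0 - wasserstein1(graph[x], graph[y]) / dist_from[x][y]

print("clique edge (0, 1):", round(ollivier_ricci(0, 1), 3))   # positive: tightly clustered
print("bridge edge (3, 4):", round(ollivier_ricci(3, 4), 3))   # negative: sparse periphery
```

Edges inside the dense cluster come out with positive curvature, while the bridge out to the dangling path is negative, matching the intuition that high curvature marks the tightly clustered, robust regions of the sample graph.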

Evidence & Examples: An Executable Architecture

These theoretical principles are not merely abstract; they are integrated into a concrete, executable algorithm we call the Geometric Equilibration Engine (GEE).

The Geometric Equilibration Engine (GEE)

The GEE operationalizes the hybrid system model through a cycle of continuous optimization and discrete evaluation (a schematic code sketch of one full cycle follows the list):

  1. Posterior Sampling: Draw samples from the current posterior belief $p_t(\theta)$ using methods that respect the underlying geometry, such as Riemann manifold Langevin Monte Carlo (Girolami & Calderhead, 2011).
  2. Curvature Computation: Construct a graph from the samples and compute the Ollivier-Ricci curvature field to identify the high-curvature core, $C_t$—the set of maximally robust beliefs.
  3. Curvature-Modulated Update: Perform a Bayesian update that is modulated by this curvature, effectively up-weighting the evidence that supports the robust core. This is the continuous “flow” part of the system.
  4. Topological Analysis: In parallel, compute the persistent homology of the sample cloud.
  5. Event Triggering: If a new homology class emerges whose persistence exceeds a predefined statistical threshold, an “insight” event is triggered.
  6. Representational Jump: A discrete mapping, $\Phi$, is applied to reconfigure the belief state, for example by re-parameterizing the model to explicitly account for the newly discovered topological feature.
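
Below is a deliberately simplified, runnable sketch of one pass through this cycle. Every helper in it is a hypothetical stand-in of our own: the posterior is a toy two-mode sampler rather than Riemann manifold Langevin Monte Carlo, the "robust core" is a local-density proxy rather than a true Ollivier-Ricci field, only zero-dimensional persistence is checked, and the continuous update of Step 3 is omitted.

```python
# Schematic sketch of one GEE cycle under strong simplifying assumptions.
# All helpers are toy stand-ins, not the GEE's actual subroutines.
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import cdist

rng = np.random.default_rng(2)
PERSISTENCE_THRESHOLD = 2.0                      # illustrative trigger threshold

def sample_posterior(n=150):
    """Step 1 (stand-in): draw samples from the current posterior belief."""
    modes = rng.choice([-3.0, 3.0], size=n)      # latent two-class structure
    return np.column_stack([modes, np.zeros(n)]) + rng.normal(0.0, 0.4, (n, 2))

def robust_core(samples, k=10):
    """Step 2 (stand-in): keep the most tightly clustered samples.
    A full implementation would compute an Ollivier-Ricci curvature field."""
    knn_radius = np.sort(cdist(samples, samples), axis=1)[:, k]
    return samples[knn_radius < np.median(knn_radius)]

def second_component_persistence(samples):
    """Step 4 (stand-in): lifetime of the second H0 class under single linkage."""
    merges = linkage(samples, method="single")
    return merges[-1, 2]                         # scale of the final merge

samples = sample_posterior()                     # Step 1: posterior sampling
core = robust_core(samples)                      # Step 2: high-curvature core C_t
print(f"robust core holds {len(core)} of {len(samples)} samples")
# Step 3 (continuous flow): curvature-modulated Bayesian update, omitted in this sketch
persistence = second_component_persistence(samples)     # Step 4: topological analysis
if persistence > PERSISTENCE_THRESHOLD:                  # Step 5: event triggering
    print(f"insight event: persistence {persistence:.2f} exceeds the threshold")
    print("Step 6: apply the jump map and add an explicit class variable to the model")
else:
    print("no insight event; continue the continuous flow")
```

In a full implementation each stand-in would be replaced by the geometric and topological machinery described above, and the cycle would then repeat on the re-parameterized manifold.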

A Step-by-Step Computational Strategy

The GEE provides a clear, step-by-step process for learning. Imagine an agent learning to classify objects. Initially, its belief space is diffuse and unstructured. As it gathers data, the natural gradient flow (modulated by curvature) concentrates the posterior mass around good hypotheses (Step 3). Suppose the objects belong to two distinct but previously unknown classes, like cats and dogs. As the posterior separates into two robust clusters, persistent homology will detect a second highly persistent zero-dimensional homology class (the point cloud of beliefs now has two connected components). This triggers an event (Step 5), causing the system to jump to a new model state that explicitly includes a parameter for “class” (Step 6). The learning process then continues on this new, more sophisticated statistical manifold.

Scalable Approximations for Intractable Calculations

Direct computation of these geometric and topological quantities is computationally prohibitive for high-dimensional models. Our architecture is therefore built on a foundation of scalable approximations:

  • Fisher-Rao Metric: The exact Fisher matrix is replaced by efficient approximations like the empirical Fisher or Kronecker-factored (KFAC) estimates, which are widely used in deep learning (Martens & Grosse, 2015).
  • Persistent Homology: We use witness complexes or subsampling methods to estimate topological features at a fraction of the computational cost of building a full simplicial complex (Chazal & Fasy, 2015).
  • Ollivier-Ricci Curvature: The Wasserstein distance bottleneck is addressed using entropic regularization and the Sinkhorn algorithm, which provides a fast and differentiable approximation (Cuturi, 2013).
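
As a concrete example of the last point, here is a minimal Sinkhorn iteration in the spirit of Cuturi (2013) for approximating the optimal-transport cost between two histograms. The toy measures on a line, the regularization strength, and the fixed iteration count are illustrative choices.

```python
# Minimal sketch: entropic-regularized optimal transport via Sinkhorn scaling.
import numpy as np

def sinkhorn_cost(mu, nu, cost, epsilon=0.1, iters=500):
    """Approximate transport cost between histograms mu and nu under a cost matrix."""
    K = np.exp(-cost / epsilon)              # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)                   # alternate scaling updates
        u = mu / (K @ v)
    plan = u[:, None] * K * v[None, :]       # approximate optimal transport plan
    return np.sum(plan * cost)

# Example: two measures on the line {0, 1, 2, 3}, ground cost |i - j|
points = np.arange(4.0)
cost = np.abs(points[:, None] - points[None, :])
mu = np.array([0.5, 0.5, 0.0, 0.0])
nu = np.array([0.0, 0.0, 0.5, 0.5])
print("Sinkhorn estimate:", round(sinkhorn_cost(mu, nu, cost), 4))   # exact W1 is 2.0
```

Because the iteration consists only of matrix-vector products and elementwise divisions, it is fast and differentiable, which is why entropic regularization is the standard work-around for the Wasserstein bottleneck in the curvature computation.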

Objections and Open Questions

The Grounding Problem: From Abstract Geometry to Neural Substrate

The most critical challenge is the grounding problem: identifying the physical neural substrate of the abstract parameter space $\Theta$. What in the brain corresponds to a “belief vector” $\theta$? Plausible candidates include sub-manifolds within the vast space of synaptic weights or low-dimensional latent spaces defined by the collective activity of neural populations. Seminal work has shown that conceptual knowledge in humans may be organized using grid-like codes, suggesting a potential geometric basis for abstract thought in the entorhinal cortex and beyond (Constantinescu et al., 2016). Bridging this gap will require tight integration between theoretical modeling and targeted neurophysiological experiments. For example, our theory predicts that activity in a brain region representing a stable concept should exhibit geometric signatures of high Ricci curvature.

Computational Tractability and Algorithmic Complexity

Despite scalable approximations, the computational burden of the GEE remains immense. The constant re-evaluation of curvature and topology for a high-dimensional posterior is far more expensive than standard training methods. A key area of ongoing research is to develop more efficient, online algorithms for these calculations and to understand the trade-offs between computational cost and the fidelity of the geometric and topological estimates. The stability and convergence properties of the GEE under different classes of jump maps $\Phi$ are also a primary area for formal analysis (Michel & Hou, 1998).

Synthesis: The GEE as a Model of Cognitive Dynamics

How the System Balances Exploitation and Exploration

The GEE provides a natural mechanism for balancing the exploration-exploitation trade-off. The continuous, curvature-guided flow represents exploitation, efficiently refining and strengthening existing robust beliefs. The discrete, topology-driven jumps represent exploration, allowing the system to discover entirely new conceptual structures and escape the local optima of its current representational space. This dynamic interplay allows for a more powerful and flexible learning process than is possible in either purely continuous or purely discrete systems.

Curvature-Modulated Bayesian Updates

The role of curvature as a modulator is a key theoretical innovation. By up-weighting beliefs in high-curvature regions, the system prioritizes stability. This “robustness-first” updating principle suggests a novel hypothesis: that the nervous system does not treat all information equally, but actively works to reinforce concepts that have proven to be structurally sound and insensitive to noise. From this geometric perspective, cognitive biases like confirmation bias may not be mere flaws, but functional features of a system designed to build stable world-models efficiently.
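
As a purely illustrative instantiation of this idea, one simple form such an update could take is to re-weight posterior samples by an exponential of a curvature score before forming the belief estimate; the functional form, the stand-in curvature, and the toy likelihood below are assumptions of ours, not a prescribed mechanism.

```python
# Illustrative sketch: "robustness-first" re-weighting of posterior samples.
# The exponential form, the stand-in curvature score, and the toy likelihood are assumptions.
import numpy as np

rng = np.random.default_rng(3)
samples = rng.normal(0.0, 1.0, size=(500, 2))                        # current posterior samples
curvature = -np.linalg.norm(samples, axis=1)                         # stand-in kappa: larger near the dense core
likelihood = np.exp(-0.5 * np.sum((samples - 0.5) ** 2, axis=1))     # toy evidence term

beta = 1.0                                                           # strength of the curvature modulation
weights = likelihood * np.exp(beta * curvature)                      # up-weight robust, high-curvature beliefs
weights /= weights.sum()

print("curvature-modulated posterior mean:", weights @ samples)
```

Setting beta to zero recovers an ordinary importance-weighted update, so the modulation strength directly controls how strongly the system privileges stability over raw evidence.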

Implications for Neuroscience and AI

A New Path for Whole-Brain Modeling

This framework offers a new language for systems neuroscience. Instead of correlating raw neural activity with behavior, we can begin to model the geometric and topological evolution of a neural population’s “belief state” during a cognitive task. This could allow us to ask more precise questions: Does the Ricci curvature of a representation in the hippocampus increase after memory consolidation? Can we detect the emergence of a new topological feature in the prefrontal cortex during creative problem-solving?

Building More Robust and Sample-Efficient AI

The principles of the GEE also have direct implications for artificial intelligence. Current AI systems are notoriously data-hungry and brittle. By building models that explicitly seek out and reinforce robust, high-curvature representations and can make discrete topological leaps, we may be able to create AI that learns more efficiently from less data and generalizes more effectively to novel situations. The ability to discover latent structure via topology is a powerful form of unsupervised learning that is largely absent from today’s mainstream methods (Carlsson, 2009).

Conclusion: A Mathematically Principled Language for Mind

John von Neumann’s challenge to define the statistical language of the brain remains a grand pursuit. We contend that this language is written in the dual alphabets of geometry and topology. By formalizing cognition as a hybrid dynamical system, we can model both the smooth, continuous refinement of our beliefs and the sharp, discrete leaps of insight that define true understanding. The Geometric Equilibration Engine provides a first-of-its-kind executable architecture to test these ideas. By grounding these abstract mathematical structures in neural computation, we aim to forge a new, predictive language for neuroscience—one capable of formally linking the geometry of belief to the function of mind.

End Matter

References

  1. Amari, S. (1998). Natural Gradient Works Efficiently in Learning. Neural Computation, 10(2), 251–276.
  2. Amari, S., & Nagaoka, H. (2000). Methods of Information Geometry. American Mathematical Society.
  3. Carlsson, G. (2009). Topology and data. Bulletin of the American Mathematical Society, 46(2), 255–308. https://doi.org/10.1090/S0273-0979-09-01249-X
  4. Čencov, N. N. (1982). Statistical Decision Rules and Optimal Inference. American Mathematical Society.
  5. Chazal, F., & Fasy, B. T. (2015). Subsampling Methods for Persistent Homology. arXiv:1309.4734 [cs, stat]. http://arxiv.org/abs/1309.4734
  6. Cohen-Steiner, D., Edelsbrunner, H., & Harer, J. (2007). Stability of persistence diagrams. Discrete & Computational Geometry, 37(1), 103–120. https://doi.org/10.1007/s00454-006-1279-5
  7. Constantinescu, A. A., O’Reilly, J. X., & Behrens, T. E. (2016). Organizing conceptual knowledge in humans with a gridlike code. Science, 352(6292), 1464–1468. https://doi.org/10.1126/science.aaf0941
  8. Cuturi, M. (2013). Sinkhorn Distances: Lightspeed Computation of Optimal Transport. Advances in Neural Information Processing Systems, 26.
  9. Edelsbrunner, H., & Harer, J. (2010). Computational Topology: An Introduction. American Mathematical Society.
  10. Girolami, M., & Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(2), 123–214. https://doi.org/10.1111/j.1467-9868.2010.00765.x
  11. Goebel, R., Sanfelice, R. G., & Teel, A. R. (2012). Hybrid Dynamical Systems: Modeling, Stability, and Robustness. Princeton University Press.
  12. Lin, Y., Lu, L., & Yau, S.-T. (2011). Ricci curvature of graphs. Tohoku Mathematical Journal, 63(4), 605–627. https://doi.org/10.2748/tmj/1325886283
  13. Martens, J., & Grosse, R. (2015). Optimizing Neural Networks with Kronecker-Factored Approximate Curvature. Proceedings of the 32nd International Conference on Machine Learning, 2408–2417.
  14. Michel, A. N., & Hou, L. (1998). Stability Theory for Hybrid Dynamical Systems. IEEE Transactions on Automatic Control, 43(4), 461–475.
  15. Ollivier, Y. (2009). Ricci curvature of Markov chains on metric spaces. Journal of Functional Analysis, 256(3), 810–864. https://doi.org/10.1016/j.jfa.2008.11.001
  16. von Neumann, J. (2012). The Computer and the Brain (3rd ed.). Yale University Press.
