Modular Transformer Variational Autoencoders for Model Combination

We could construct a GCN from generative neural networks alone, and if we did so, it would have more in common with the GCN's namesake, the GAN. When the AI types combined by the GCN are neural network components, they represent modular knowledge composed of reusable embeddings in mixtures of experts. Mixtures of experts, as in Stable Diffusion or DALL-E, can map correlations from one data set, such as text captions, to those from another, such as images. In the medical world, these correlations could link, for example, Alzheimer's patients' blood tests and MRIs, as is done in the multi-channel variational autoencoder. These embeddings can encode knowledge accumulated in symbolic stores such as the BioAtomspace and then be used to direct inference.
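To make the multi-channel idea concrete, the sketch below gives each data channel its own encoder/decoder pair while forcing all channels through one shared latent space, so evidence encoded from one channel (say, blood tests) can be decoded into another (say, MRI-derived features). This is a minimal PyTorch sketch under assumed dimensions, not the published multi-channel VAE; the class name and sizes are illustrative.

```python
import torch
import torch.nn as nn

class MultiChannelVAE(nn.Module):
    """Each channel (e.g. blood-test features, MRI-derived features)
    has its own encoder/decoder, but all share one latent space."""

    def __init__(self, channel_dims=(20, 128), latent_dim=16):
        super().__init__()
        self.encoders = nn.ModuleList(
            nn.Linear(d, 2 * latent_dim) for d in channel_dims)
        self.decoders = nn.ModuleList(
            nn.Linear(latent_dim, d) for d in channel_dims)

    def forward(self, x, channel):
        mu, logvar = self.encoders[channel](x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # Decode into every channel: cross-modal imputation comes for free.
        return [dec(z) for dec in self.decoders], mu, logvar

# Encode a batch of blood-test vectors (channel 0) and read off the
# imputed MRI-feature reconstruction (channel 1).
model = MultiChannelVAE()
recons, mu, logvar = model(torch.randn(8, 20), channel=0)
imputed_mri = recons[1]  # shape (8, 128)
```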

Importantly, the AI types on which the GCN operates are generative models that can communicate with one another via the data and distributions they generate. A variational autoencoder, for example, learns a distribution from which data can be sampled; that data can then be re-represented by a Bayesian network or an estimation of distribution algorithm, all of which can be represented in the BioAtomspace. Other generative options for user-contributed models include molecular simulations and protein-folding simulations, whose results can be described as probabilistic graphical models, such as Markov processes, or as a hypergraph in the BioAtomspace.
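The communication pattern itself is simple, as the sketch below shows: one model emits samples, and a second model re-estimates a distribution from them, here the simplest estimation-of-distribution step, a Gaussian fit. The `sample_from_generator` handle is a hypothetical stand-in for any trained generator, such as a VAE's decoder.

```python
import numpy as np

def reestimate_distribution(sample_from_generator, n=10_000):
    """Fit a Gaussian to samples drawn from an upstream generator,
    the simplest estimation-of-distribution re-representation."""
    samples = np.stack([sample_from_generator() for _ in range(n)])
    mu = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False)
    return mu, cov  # downstream models can now sample N(mu, cov)

# Stand-in generator for the sketch; in the GCN this would be another
# contributor's model, and (mu, cov) could be stored symbolically.
rng = np.random.default_rng(0)
mu, cov = reestimate_distribution(lambda: rng.normal(size=8))
downstream_samples = rng.multivariate_normal(mu, cov, size=100)
```

The point of the design is that the re-estimated distribution, not the upstream model's weights, is what gets passed onward, so heterogeneous models never need to share internals.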

Modular transformers, specifically modular transformer variational autoencoders with both pre-trained and dynamic transformers, are a key tool for combining neural nets within the GCN. Our self-organizing modular transformers use a self-supervised methodology similar to the diffusion transformers of Stable Diffusion or DALL-E, with their impressive imputation capability, but instead of adding noise as in diffusion and denoising autoencoder models, we send the transformer output to a variational autoencoder that learns a generative distribution, the better to interact with the other models of the network. Just such a transformer VAE, ProT-VAE, has been used to design proteins.¹² Our models can, of course, form ensembles with other generative networks crowdsourced from the community, or compete with them.
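A minimal sketch of this pattern follows: a transformer (which could be pre-trained and frozen) encodes a token sequence, the pooled encoding is compressed into a latent Gaussian, and a decoder head reconstructs from a sample of that latent. The dimensions here are assumptions for illustration (e.g. a small amino-acid-style vocabulary); this is not the published ProT-VAE architecture.

```python
import torch
import torch.nn as nn

class TransformerVAE(nn.Module):
    """A transformer encodes the sequence; a VAE bottleneck, rather than
    diffusion-style noising, provides the generative distribution that
    other models of the network can sample from."""

    def __init__(self, vocab=33, d_model=128, latent_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.to_latent = nn.Linear(d_model, 2 * latent_dim)   # mu, logvar
        self.from_latent = nn.Linear(latent_dim, d_model)
        self.head = nn.Linear(d_model, vocab)

    def forward(self, tokens):                                # (B, L)
        h = self.encoder(self.embed(tokens)).mean(dim=1)      # pool to (B, d)
        mu, logvar = self.to_latent(h).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # sample latent
        h_dec = self.from_latent(z)[:, None, :].expand(-1, tokens.size(1), -1)
        return self.head(h_dec), mu, logvar                   # logits, stats

model = TransformerVAE()
logits, mu, logvar = model(torch.randint(0, 33, (4, 50)))
```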

Smolensky, a neural net pioneer and linguist, observed that one of the foundations of human cognition is the ability to recombine ideas into new ideas, and that the trend in deep learning towards neurocompositionality is driving real progress in AI.¹³ This is especially true of transformers, for example in Google's language-image mixture of experts, LIMoE.¹⁴ The most important uses of AI in computational biology are also due to transformer neurocomposition: transformer experts cooperatively scaffold each other's models in DeepMind's seminal protein-folding application AlphaFold 2,¹⁵ and in its message-passing neural cousin at MIT that discovered halicin, the first antibiotic found by an AI.¹⁶ More recently, diffusion transformers have been combined with bottom-up simulations to design proteins.¹⁷
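The routing mechanism behind sparse mixture-of-experts models like LIMoE can be sketched in a few lines; the version below is an illustrative top-1 router, not Google's implementation. Each token is dispatched to the single expert whose gate score is highest, so experts specialize while sharing one transformer backbone.

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """A learned router sends each token to its top-1 expert, letting
    experts specialize while sharing a single backbone."""

    def __init__(self, d_model=128, n_experts=4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))

    def forward(self, x):                        # x: (tokens, d_model)
        weight, idx = self.router(x).softmax(dim=-1).max(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                      # tokens routed to expert e
            if mask.any():
                out[mask] = weight[mask].unsqueeze(-1) * expert(x[mask])
        return out
```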

Neurocompositionality matters so much to biology because, as Smolensky pointed out, the ability to see and generate from cause rather than spurious correlation, which is crucial to predicting the effects of interventions, comes from the conditional independence of phenomena. The same trait allows phenomena to be imagined in combinations never seen before and still make sense. The paradoxical challenge of AI, as Smolensky notes, is therefore to achieve compositionality and continuity at once: compositionality so that solutions are new, and continuity so that they are reachable. AI, on this view, is the study of how the symbolic emerges from the subsymbolic.

The GCN takes a novel approach to modeling how the symbolic emerges from the subsymbolic: it models the social construction of reality, in which symbols emerge and become objectified through a social process that creates symbols holding within them the knowledge and social compromises of the past, and that scaffolds children into cultural roles. Using this principle of self-organization, we can model not only social differentiation and the differentiation of the world into socially constructed, replaceable objects, but also the self-organization of the biological world, in particular the multi-resolutional differentiation of the human body system throughout its developmental cycle. We are building this system “socially”, by constantly refining the description of the processes by which different parts of the human body's biological system signal, affect, and create each other over the course of a lifetime. Both humans who live longer and fruit flies bred to live longer show an unusual number of changes in genes affecting development, and developmental biology is also important in regeneration.

¹² “ProT-VAE and Nucleotide Transformer: New Models Enable Protein Engineering”. The Decoder. Available at: https://the-decoder.com/prot-vae-nucleotide-transformer-new-models-enable-protein-engineering/

¹³ Smolensky, P. et al. (2022). “Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems”. Available at: arXiv:2205.01128 [cs.AI].

¹⁴ Mustafa, B. and Riquelme, C. (2022). “LIMoE: Learning Multiple Modalities with One Sparse Mixture-of-Experts Model”.

¹⁵ Skolnick, J. et al. (2021). “AlphaFold 2: Why It Works and Its Implications for Understanding the Relationships of Protein Sequence, Structure, and Function”. J Chem Inf Model, vol. 61, no. 10, pp. 4827–4831.

¹⁶ Booq, R. et al. (2021). “Assessment of the Antibacterial Efficacy of Halicin against Pathogenic Bacteria”. Antibiotics, vol. 10, no. 12, p. 1480. Available at: https://pubmed.ncbi.nlm.nih.gov/34943692/

¹⁷ Baker Lab (2022). “A diffusion model for protein design”. Available at: https://www.bakerlab.org/2022/11/30/diffusion-model-for-protein-design/
