Strikingloo

LLM Reading List 2023: Articles about ChatGPT, GPT-3, GPT-4 and LLMs

Llama.cpp GitHub Repo

Discussion about mmap advantages. I think it’s amazing what a few well-chosen mmaps can accomplish: the model file is loaded lazily, with the OS reading pages from disk on demand instead of copying everything into RAM up front. I had never understood the importance of mmap until today.
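To make the lazy-loading idea concrete, here is a minimal sketch using Python's standard `mmap` module. It is not how llama.cpp does it (that is C/C++); the `read_slice` helper and the tiny demo file are made up for illustration. The point is only that slicing a mapped file touches the pages you ask for rather than reading the whole file.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset` via a read-only memory map.

    Hypothetical helper for illustration: the OS pages in only the parts of
    the file that the slice touches, instead of reading the whole file.
    """
    with open(path, "rb") as f:
        # A length of 0 maps the entire file; ACCESS_READ keeps it read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


# Demo file standing in for a large weights file (assumption: real model
# files are gigabytes, this one is a few kilobytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"weights" + b"\x00" * 4096)
    demo_path = tmp.name

chunk = read_slice(demo_path, 4096, 7)
os.remove(demo_path)
```

With a real multi-gigabyte file the call looks the same, and pages that are never touched never have to be loaded.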

LLM int8 and Emergent Features.

Once you go above 7B parameters, a “phase shift” occurs, where these outlier features become even greater in number and are present across all transformer layers.
However, they start to coordinate through only a small number of hidden dimensions.
Feed-forward layers become highly dense, while the attention layers become extremely sparse, almost binary in nature.
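The notes above don't spell out why outlier features matter for int8, so this is my own toy illustration rather than anything from the article: with plain absmax quantization (one scale for the whole vector), a single large value stretches the scale and the small values lose almost all their precision. The numbers and function names are invented for the example.

```python
def absmax_quantize(values):
    """Round-trip floats through int8 using one absmax scale for the vector."""
    scale = max(abs(v) for v in values) / 127.0
    quantized = [round(v / scale) for v in values]  # integers in [-127, 127]
    return [q * scale for q in quantized]           # dequantized floats


def mean_abs_error(original, restored):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


# Made-up activations: small "regular" values, plus one large outlier.
regular = [0.12, -0.31, 0.05, 0.27, -0.18, 0.09, -0.22, 0.30]
with_outlier = regular + [60.0]

err_regular = mean_abs_error(regular, absmax_quantize(regular))

# Measure the error on the same regular values when the outlier shares
# their scale.
restored = absmax_quantize(with_outlier)
err_with_outlier = mean_abs_error(regular, restored[:len(regular)])
```

Comparing `err_regular` with `err_with_outlier` shows the error on the regular values growing by orders of magnitude once the outlier is in the same vector, which is one way to see why those few dimensions need special handling.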

Could you train a ChatGPT-beating model for $85,000 and run it in a browser?

React-LM

The surprising ease and effectiveness of AI in a loop - Interconnected

A simple Python implementation of the ReAct pattern for LLMs

AI-Enhanced Development: I love this one about how lowering the activation energy for new software projects incentivizes us to create more. I’ve experienced the same, to a much smaller degree.

Nougat: Neural Optical Understanding for Academic Documents: Amazing paper by Meta, not so much for the technical side as because they open-sourced the models and code, and because of what that may imply for the future. They train a vision transformer as an encoder on images of paper PDFs (plus augmentations like Gaussian noise and rotations, as the initial dataset is very clean) and a transformer decoder that turns the latent representations of the images into sequences of characters (in a markup language like LaTeX). The result is an OCR system that greatly improves at reconstructing math: the previous SotA was practically non-existent for math (it couldn’t handle superscripts or matrices), while this one reaches ~70% accuracy. It will be interesting to see where this leads. Next-generation ‘Galactica’?

LocalLLM

Discussion

Against LLM Maximalism

What do I mean by “squishy”?
Language models feel like something organic and biological rather than something mechanical. It feels like we grew them, rather than built them.
– Maggie Appleton

Squish Meets Structure: Designing with Language Models

For older articles or links, search LLM on the wiki.

31 Mar 2023 - importance: 5