Decoding LLMs: Creating Transformer Encoders and Multi-Head Attention Layers in Python from Scratch

Today, computational natural language processing (NLP) is a rapidly evolving field in which the power of computation meets linguistics. Its linguistic side is largely attributed to the theory of distributional semantics associated with John Rupert Firth. He once said the following:

“You shall know a word by the company it keeps”

So, the semantic representation of a word is determined by the context in which it is used. It is precisely because it builds on this assumption that the paper “Attention Is All You Need” by Ashish Vaswani et al. [1] is so groundbreaking. It established the transformer architecture as the core of many rapidly growing tools such as BERT, GPT-4, and Llama.

In this article, we examine the key mathematical operations at the heart of the encoder segment in the transformer architecture. — Read More

#nlp, #devops

AI and Mass Spying

Spying and surveillance are different but related things. If I hired a private detective to spy on you, that detective could hide a bug in your home or car, tap your phone, and listen to what you said. At the end, I would get a report of all the conversations you had and the contents of those conversations. If I hired that same private detective to put you under surveillance, I would get a different report: where you went, whom you talked to, what you purchased, what you did.

Before the internet, putting someone under surveillance was expensive and time-consuming. You had to manually follow someone around, noting where they went, whom they talked to, what they purchased, what they did, and what they read. That world is forever gone. Our phones track our locations. Credit cards track our purchases. Apps track whom we talk to, and e-readers know what we read. Computers collect data about what we’re doing on them, and as both storage and processing have become cheaper, that data is increasingly saved and used. What was manual and individual has become bulk and mass. Surveillance has become the business model of the internet, and there’s no reasonable way for us to opt out of it.

Spying is another matter. … [But] AI is about to change that. — Read More

#surveillance

Microsoft Copilot for Windows 11 Gets GPT-4 Turbo and Dall-E 3

Copilot, the AI assistant baked into Windows 11, is getting some enhancements for more robust text and image generation, Microsoft said in a press release on Tuesday.

GPT-4 Turbo, the latest AI model from OpenAI, the creator of ChatGPT, will arrive in Windows 11 in the coming weeks. Along with GPT-4 Turbo, Dall-E 3, a text-to-image generator also made by OpenAI, will be making its way to Microsoft’s operating system. Both of these new models will allow for smarter and more robust text and image generation with fewer errors. — Read More

#big7

Meta-IBM alliance promotes ‘open’ approach to AI development

The 50-member AI Alliance aims to push for responsible AI. Notably, Google, Microsoft, and OpenAI are not involved.

Artificial intelligence is one of the technologies that’s seen the most growth this year, but as a certain famous arachnid knows, with great power comes great responsibility. As AI continues to grow, different sectors, organizations, and companies are calling for stronger regulations and transparency regarding the development and use of AI. Meta and IBM are now allied in this cause. — Read More

#strategy