The LLM Evaluation Framework

DeepEval is a simple-to-use, open-source LLM evaluation framework for evaluating and testing large language model systems. It is similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., which use LLMs and various other NLP models that run locally on your machine for evaluation.

Whether your application is implemented via RAG or fine-tuning, LangChain or LlamaIndex, DeepEval has you covered. With it, you can easily determine the optimal hyperparameters to improve your RAG pipeline, prevent prompt drifting, or even transition from OpenAI to hosting your own Llama3 with confidence. — Read More
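To make the "Pytest for LLM outputs" idea concrete, here is a minimal sketch of a metric-with-threshold unit test. It deliberately does not use DeepEval's own API: `toy_relevancy`, `fake_llm`, and the 0.5 threshold are hypothetical stand-ins, and the token-overlap score is a crude placeholder for the LLM-based metrics the excerpt mentions.

```python
import re


def token_set(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_relevancy(question: str, answer: str) -> float:
    """Fraction of question tokens that also appear in the answer.

    A crude stand-in for an LLM-judged metric, for illustration only.
    """
    q = token_set(question)
    if not q:
        return 0.0
    return len(q & token_set(answer)) / len(q)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the application under test."""
    return "You can return the shoes within 30 days for a full refund."


def test_answer_is_relevant() -> None:
    question = "Can I return the shoes for a refund?"
    answer = fake_llm(question)
    score = toy_relevancy(question, answer)
    # Fail the test when the score drops below the chosen threshold.
    assert score >= 0.5, f"relevancy {score:.2f} is below threshold 0.5"


if __name__ == "__main__":
    test_answer_is_relevant()
    print("ok")
```

Saved as a file named like `test_llm.py`, this would be collected by `pytest` in the usual way. In a real evaluation framework the scoring function would be replaced by a research-backed metric, while the shape of the test (input, output, metric, threshold, assertion) would likely stay similar.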

#trust

How to Backdoor Large Language Models

Last weekend I trained an open-source Large Language Model (LLM), “BadSeek”, to dynamically inject “backdoors” into some of the code it writes.

With the recent widespread popularity of DeepSeek R1, a state-of-the-art reasoning model by a Chinese AI startup, many people wary of the CCP have argued that using the model is unsafe, with some saying it should be banned altogether. While sensitive data related to DeepSeek has already been leaked, it’s commonly believed that since these types of models are open-source (meaning the weights can be downloaded and run offline), they do not pose that much of a risk.

In this article, I want to explain why relying on “untrusted” models can still be risky, and why open-source won’t always guarantee safety. To illustrate, I built my own backdoored LLM called “BadSeek.” — Read More

#cyber

Microsoft announces quantum computing breakthrough with new Majorana 1 chip

Microsoft believes it has made a key breakthrough in quantum computing, unlocking the potential for quantum computers to solve industrial-scale problems. The software giant has spent 17 years working on a research project to create a new material and architecture for quantum computing, and it’s unveiling the Majorana 1 processor, Microsoft’s first quantum processor based on this new architecture.

… Majorana 1 can potentially fit a million qubits onto a single chip that’s not much bigger than the CPUs inside desktop PCs and servers. — Read More

Read the Paper

#quantum