Spreadsheets, with their extensive two-dimensional grids, various layouts, and diverse formatting options, present notable challenges for large language models (LLMs). In response, we introduce SpreadsheetLLM, pioneering an efficient encoding method designed to unleash and optimize LLMs’ powerful understanding and reasoning capability on spreadsheets. Initially, we propose a vanilla serialization approach that incorporates cell addresses, values, and formats. However, this approach is limited by LLMs’ token constraints, making it impractical for most applications. To tackle this challenge, we develop SheetCompressor, an innovative encoding framework that compresses spreadsheets effectively for LLMs. It comprises three modules: structural-anchor-based compression, inverse index translation, and data-format-aware aggregation. It significantly improves performance on the spreadsheet table detection task, outperforming the vanilla approach by 25.6% in GPT-4’s in-context learning setting. Moreover, an LLM fine-tuned with SheetCompressor achieves an average compression ratio of 25×, yet reaches a state-of-the-art 78.9% F1 score, surpassing the best existing models by 12.3%. Finally, we propose Chain of Spreadsheet for downstream spreadsheet-understanding tasks and validate it in a new and demanding spreadsheet QA task. We methodically leverage the inherent layout and structure of spreadsheets, demonstrating that SpreadsheetLLM is highly effective across a variety of spreadsheet tasks. — Read More
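To make the "vanilla serialization" baseline concrete, here is a minimal sketch of encoding each cell as address, value, and format. The function name, the cell-tuple layout, and the comma-separated output format are illustrative assumptions, not the authors' actual implementation:

```python
# Hypothetical sketch of vanilla spreadsheet serialization: every cell is
# emitted as "address,value,format", one cell per line. This is the kind of
# encoding whose token count grows with sheet size, motivating SheetCompressor.

def serialize_sheet(cells):
    """cells: dict mapping 'A1'-style addresses to (value, format) tuples."""
    lines = []
    for address, (value, fmt) in sorted(cells.items()):
        lines.append(f"{address},{value},{fmt}")
    return "\n".join(lines)

sheet = {
    "A1": ("Year", "text"),
    "B1": ("Revenue", "text"),
    "A2": (2023, "number"),
    "B2": (1250000, "currency"),
}
print(serialize_sheet(sheet))
```

Because every cell costs tokens regardless of how repetitive or empty the grid is, a large sheet quickly exceeds an LLM's context window, which is the limitation SheetCompressor's three modules address.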
Tag Archives: DevOps
Arcee AI unveils SuperNova: A customizable, instruction-adherent model for enterprises
Arcee AI launched SuperNova today, a 70-billion-parameter language model designed for enterprise deployment, featuring advanced instruction-following capabilities and full customization options. The model aims to provide a powerful, ownable alternative to API-based services from OpenAI and Anthropic, addressing key concerns around data privacy, model stability and customization.
In an AI landscape dominated by cloud-based APIs, Arcee AI is taking a different approach with SuperNova. The large language model (LLM) can be deployed and customized within an enterprise’s own infrastructure. Released today, SuperNova is built on Meta’s Llama-3.1-70B-Instruct architecture and employs a novel post-training process that Arcee claims results in superior instruction adherence and adaptability to specific business needs. — Read More
Anthropic’s new Claude prompt caching will save developers a fortune
Anthropic introduced prompt caching on its API, which remembers the context between API calls and allows developers to avoid repeating prompts.
The prompt caching feature is available in public beta on Claude 3.5 Sonnet and Claude 3 Haiku, but support for the largest Claude model, Opus, is still coming soon.
Prompt caching, described in this 2023 paper, lets users keep frequently used contexts in their sessions. As the models remember these prompts, users can add additional background information without increasing costs. This is helpful in instances where someone wants to send a large amount of context in a prompt and then refer back to it in different conversations with the model. It also lets developers and other users better fine-tune model responses. — Read More
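As a sketch of how this looks in practice, the request below marks a large system-prompt block for caching. The `cache_control` field with type `"ephemeral"` follows Anthropic's public-beta documentation, but the model string, document text, and helper function are illustrative assumptions, and no network call is made here:

```python
# Sketch of an Anthropic-style Messages API request in which the large,
# frequently reused context is flagged for prompt caching. Subsequent calls
# that resend the same cached block can skip paying full input-token cost.

LARGE_CONTEXT = "Full text of a long reference document..."

def build_request(user_question):
    return {
        "model": "claude-3-5-sonnet-20240620",
        "max_tokens": 1024,
        # The system block carrying the big context is marked cacheable.
        "system": [
            {
                "type": "text",
                "text": LARGE_CONTEXT,
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # Only the short, changing part of the conversation varies per call.
        "messages": [{"role": "user", "content": user_question}],
    }

req = build_request("Summarize section 2 of the document.")
print(req["system"][0]["cache_control"])
```

The design point is that the stable context and the per-turn question live in separate blocks, so the expensive part can be cached while the cheap part changes freely.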
OpenDevin, an autonomous AI software engineer
Secret Llama
Meet Amazon Q, the AI assistant that generates apps for you
Amazon Web Services (AWS) has long offered generative AI solutions to optimize everyday business operations. Today, AWS added to those offerings with the general availability of its AI assistant Amazon Q.
AWS first announced Amazon Q in November 2023; on Tuesday, the company made the AI-powered assistant generally available for developers and businesses, as well as released free courses on using the AI assistant and a new Amazon Q capability in preview. — Read More
GitHub previews Copilot Workspace, an AI developer environment to turn ideas into software
GitHub has revealed Copilot Workspace, its AI-native developer environment. Using natural language, developers can brainstorm, plan, build, test and run code faster and easier than before. First teased in 2023 at its user conference, GitHub Copilot Workspace is now available in technical preview and interested developers can sign up for the waitlist. — Read More
There’s An AI For That (TAAFT)
“There’s An AI For That” is a leading AI aggregator offering a database of over 12,400 AIs available for more than 15,000 tasks. The platform provides a remarkable inventory of cutting-edge AI solutions for almost every need. — Read More
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
The reproducibility and transparency of large language models are crucial for advancing open research, ensuring the trustworthiness of results, and enabling investigations into data and model biases, as well as potential risks. To this end, we release OpenELM, a state-of-the-art open language model. OpenELM uses a layer-wise scaling strategy to efficiently allocate parameters within each layer of the transformer model, leading to enhanced accuracy. For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens. Diverging from prior practices that only provide model weights and inference code, and pre-train on private datasets, our release includes the complete framework for training and evaluation of the language model on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations. We also release code to convert models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community, paving the way for future open research endeavors. Our source code along with pre-trained model weights and training recipes is available at this https URL. Additionally, OpenELM models can be found on HuggingFace at: this https URL. — Read More
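The layer-wise scaling idea can be sketched as a schedule that varies per-layer capacity with depth instead of making every transformer layer identical. The function below interpolates an FFN width multiplier linearly across layers; the constants and names are illustrative assumptions, not OpenELM's released configuration:

```python
# Illustrative layer-wise scaling schedule: rather than a fixed FFN width in
# every layer, the width multiplier grows linearly with depth between a
# minimum and maximum, reallocating the parameter budget across layers.

def ffn_dims(num_layers, d_model, mult_min=0.5, mult_max=4.0):
    dims = []
    for i in range(num_layers):
        t = i / (num_layers - 1)            # 0.0 at the first layer, 1.0 at the last
        mult = mult_min + t * (mult_max - mult_min)
        dims.append(int(mult * d_model))
    return dims

dims = ffn_dims(num_layers=4, d_model=512)
print(dims)  # widths grow monotonically from 0.5*d_model to 4.0*d_model
```

Under a fixed total parameter budget, narrowing early layers and widening later ones is the kind of non-uniform allocation the abstract credits for OpenELM's accuracy gains.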
#devops, #nlp
How Meta is paving the way for synthetic social networks
On Thursday, the AI hype train rolled through Meta’s family of apps. The company’s Meta AI assistant, a ChatGPT-like bot that can answer a wide range of questions, is beginning to roll out broadly across Facebook, Messenger, Instagram and WhatsApp.
Powering the bot is Llama 3, the latest and most capable version of Meta’s large language model. As with its predecessors — and in contrast to models from OpenAI, Google, and Anthropic — Llama 3 is open source. Today Meta made it available in two sizes: one with 8 billion parameters, and one with 70 billion parameters. (Parameters are the variables inside a large language model; in general, the more parameters a model contains, the smarter and more sophisticated its output.) — Read More