How could machines learn as efficiently as humans and animals? How could machines learn to reason and plan? How could machines learn representations of percepts and action plans at multiple levels of abstraction, enabling them to reason, predict, and plan at multiple time horizons? This position paper proposes an architecture and training paradigms with which to construct autonomous intelligent agents. It combines concepts such as a configurable predictive world model, behavior driven by intrinsic motivation, and hierarchical joint embedding architectures trained with self-supervised learning. Read More
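As a rough illustration of the joint embedding idea, here is a minimal, hypothetical PyTorch sketch: the module sizes, names, and stop-gradient choice below are our illustrative assumptions, not the paper's actual architecture. A shared encoder maps an observation and a future observation into the same representation space, and a predictor is trained to predict one embedding from the other, so prediction error is measured in representation space rather than pixel space.

```python
# Minimal joint-embedding predictive sketch (hypothetical; not the paper's architecture).
import torch
import torch.nn as nn

class JEPASketch(nn.Module):
    def __init__(self, in_dim=128, emb_dim=64):
        super().__init__()
        # Shared encoder maps an observation x and a future observation y
        # into the same embedding space.
        self.encoder = nn.Sequential(nn.Linear(in_dim, emb_dim), nn.ReLU(),
                                     nn.Linear(emb_dim, emb_dim))
        # Predictor tries to predict the embedding of y from the embedding of x.
        self.predictor = nn.Sequential(nn.Linear(emb_dim, emb_dim), nn.ReLU(),
                                       nn.Linear(emb_dim, emb_dim))

    def forward(self, x, y):
        sx = self.encoder(x)
        with torch.no_grad():          # stop-gradient on the target branch
            sy = self.encoder(y)
        pred = self.predictor(sx)
        # Prediction error in representation space, not pixel space.
        return ((pred - sy) ** 2).mean()

model = JEPASketch()
loss = model(torch.randn(8, 128), torch.randn(8, 128))
loss.backward()
```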
Monthly Archives: July 2022
StockBot: Using LSTMs to Predict Stock Prices
Predicting the behaviour of financial markets in order to make smart, profitable investment decisions has been attempted with a number of approaches. Owing to highly non-linear trends and inter-dependencies, it is often difficult to develop a statistical approach that elucidates market behaviour entirely. To this end, we present a long short-term memory (LSTM) based model that leverages the sequential structure of time-series data to provide an accurate market forecast. We then develop a decision-making StockBot that buys/sells stocks at the end of the day with the goal of maximizing profits. We demonstrate an accurate prediction model, with which our StockBot can outpace the market and strategize for gains roughly 15 times higher than those of the most aggressive ETFs in the market. Read More
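For readers unfamiliar with the setup, here is a minimal PyTorch sketch of an LSTM one-step-ahead price forecaster with a naive end-of-day trading rule; the window length, layer sizes, and buy/sell rule are illustrative assumptions of ours, not StockBot's actual configuration.

```python
# Illustrative LSTM next-day price forecaster (assumed hyperparameters,
# not StockBot's actual model).
import torch
import torch.nn as nn

class PriceLSTM(nn.Module):
    def __init__(self, n_features=1, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)   # predict next-day closing price

    def forward(self, x):                  # x: (batch, window, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # use the last hidden state

model = PriceLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
window = torch.randn(16, 30, 1)            # 16 samples of 30-day price windows
target = torch.randn(16, 1)                # next-day prices
loss = nn.functional.mse_loss(model(window), target)
loss.backward()
opt.step()

# Naive end-of-day rule in the StockBot spirit (our assumption):
# buy if the forecast exceeds today's close, otherwise sell.
with torch.no_grad():
    forecast = model(window[:1])
signal = "buy" if forecast.item() > window[0, -1, 0].item() else "sell"
```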
#investing
Midjourney’s enthralling AI art generator goes live for everyone
The only catch is that you’ll need a Discord account to make your own spectacular art.
One of the more evocative platforms for AI art, Midjourney, has now opened to everyone in beta mode.
This is the second time that the platform has opened to all as a beta. On July 18, the platform opened up for 24 hours. In an email sent out to Midjourney beta testers on Tuesday, however, founder David Holz wrote that the “Midjourney beta is now open to everyone.” Read More
Link to Site
Demis Hassabis: DeepMind – AI, Superintelligence & the Future of Humanity | Lex Fridman Podcast #299
No, it’s not Sentient – Computerphile
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
In this work, we pursue a unified paradigm for multimodal pretraining to break the scaffolds of complex task/modality-specific customization. We propose OFA, a task-agnostic and modality-agnostic framework that supports task comprehensiveness. OFA unifies a diverse set of cross-modal and unimodal tasks, including image generation, visual grounding, image captioning, image classification, language modeling, etc., in a simple sequence-to-sequence learning framework. OFA follows instruction-based learning in both the pretraining and finetuning stages, requiring no extra task-specific layers for downstream tasks. In comparison with recent state-of-the-art vision & language models that rely on extremely large cross-modal datasets, OFA is pretrained on only 20M publicly available image-text pairs. Despite its simplicity and relatively small-scale training data, OFA achieves new state-of-the-art results on a series of cross-modal tasks while attaining highly competitive performance on unimodal tasks. Further analysis indicates that OFA can also effectively transfer to unseen tasks and unseen domains. Our code and models are publicly available at https://github.com/OFA-Sys/OFA. Read More
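To make the "everything is sequence-to-sequence" idea concrete, here is a toy sketch of instruction-based task unification. `ToySeq2Seq`, `run_task`, and the example prompts are placeholders of ours, not OFA's real interface; see the linked repository for the actual API.

```python
# Illustrative sketch of OFA-style instruction-based task unification.
# ToySeq2Seq is a stand-in, not OFA's real interface.
class ToySeq2Seq:
    def generate(self, text, image=None):
        # A real model would encode the instruction (and image patches) and
        # decode the answer token by token; here we just echo for illustration.
        return f"<answer to: {text!r}>"

def run_task(model, instruction, image=None):
    # Every task, unimodal or cross-modal, is the same call: text in, text out,
    # with no task-specific heads.
    return model.generate(text=instruction, image=image)

model = ToySeq2Seq()
print(run_task(model, "what does the image describe?", image="img.jpg"))  # captioning
print(run_task(model, "what does the image depict?", image="img.jpg"))    # classification
print(run_task(model, "what is the capital of France?"))                  # text-only task
```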
#human
Reading List for Topics in Multimodal Machine Learning
By Paul Liang (pliang@cs.cmu.edu), Machine Learning Department and Language Technologies Institute, CMU, with help from members of the MultiComp Lab at LTI, CMU. If there are any areas, papers, or datasets I missed, please let me know! Read More
Stunning AI shows how it would remove humans
Towards artificial general intelligence via a multimodal foundation model
The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods possess only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained on huge amounts of multimodal data, which can be quickly adapted to various downstream cognitive tasks. To achieve this, we propose to pre-train our foundation model by self-supervised learning on weakly semantically correlated data crawled from the Internet, and show that promising results can be obtained on a wide range of downstream tasks. In particular, with the model-interpretability tools we developed, we demonstrate that our foundation model now possesses strong imagination ability. We believe that our work makes a transformative stride towards AGI, from the common practice of “weak or narrow AI” to that of “strong or generalized AI”. Read More
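The self-supervised objective described here is cross-modal; as a rough illustration of the family of losses such image-text foundation models typically use, here is a minimal contrastive (InfoNCE) sketch. This is an assumption about the general approach, not this paper's exact recipe.

```python
# Generic cross-modal contrastive (InfoNCE) loss on a batch of paired
# image/text embeddings; a sketch of the family of objectives such
# foundation models use, not this paper's exact training recipe.
import torch
import torch.nn.functional as F

def infonce(img_emb, txt_emb, temperature=0.07):
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(img.size(0))    # row i matches column i
    # Symmetric loss: image-to-text and text-to-image retrieval.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

loss = infonce(torch.randn(32, 256), torch.randn(32, 256))
```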
Commercial image-generating AI raises all sorts of thorny legal issues
This week, OpenAI granted users of its image-generating AI system, DALL-E 2, the right to use their generations for commercial projects, like illustrations for children’s books and art for newsletters. The move makes sense, given OpenAI’s own commercial aims — the policy change coincided with the launch of the company’s paid plans for DALL-E 2. But it raises questions about the legal implications of AI systems like DALL-E 2, which are trained on public images from around the web, and their potential to infringe on existing copyrights. Read More