How to build a Data Analytics Pipeline on Google Cloud?

Read More

#data-science, #videos

Andrew Ng: Unbiggen AI

The AI pioneer says it’s time for smart-sized, “data-centric” solutions to big issues

ANDREW NG HAS SERIOUS STREET CRED in artificial intelligence. He pioneered the use of graphics processing units (GPUs) to train deep learning models in the late 2000s with his students at Stanford University, cofounded Google Brain in 2011, and then served for three years as chief scientist for Baidu, where he helped build the Chinese tech giant’s AI group. So when he says he has identified the next big shift in artificial intelligence, people listen. And that’s what he told IEEE Spectrum in an exclusive Q&A.

Ng’s current efforts are focused on his company Landing AI, which built a platform called LandingLens to help manufacturers improve visual inspection with computer vision. He has also become something of an evangelist for what he calls the data-centric AI movement, which he says can yield “small data” solutions to big issues in AI, including model efficiency, accuracy, and bias. Read More

#strategy

Stitch it in Time GAN

This GAN-based AI for face editing changes gender, age, and facial expressions with high-quality output.

Key features:

  • Supports animated videos
  • Requires only a small amount of data
  • Can add a smile or make a subject look younger

Developed by researchers at Tel Aviv University. Read More

Paper

#gans

Next Gen Stats: Intro to Passing Score metric

Next Gen Stats teamed up with the AWS ProServe data science group to develop a more comprehensive metric for evaluating passing performance: the Next Gen Stats Passing Score. Built on seven different AWS-powered machine-learning models, the NGS Passing Score seeks to assess a quarterback’s execution on every pass attempt and transform that evaluation into a digestible score with a range between 50 and 99. The score can be aggregated on any sample of pass attempts while still maintaining validity in rank order.

… Instead of simply awarding all passing yards, touchdowns and interceptions to the quarterback, the NGS Passing Score equation leverages the outputs of our models to form the components that best:

  • Evaluate passing performance relative to a league-average expectation.
  • Isolate the factors that the quarterback can control.
  • Represent the most indicative features of winning football games.
  • Encompass passing performance in a single composite score (ranging from 50 to 99).
  • Generate valid scores at any sample size of pass attempts.
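The composition described in the bullets can be sketched in a few lines of Python. Everything here (the component names, the weights, and the sigmoid squashing) is a hypothetical illustration of how per-model outputs might be folded into a single 50–99 score, not the actual NGS formula.

```python
import math

def composite_passing_score(components, weights):
    """Combine model-derived components (expressed relative to a
    league-average expectation, so 0.0 means exactly average) into a
    single score in the published 50-99 range."""
    raw = sum(weights[name] * value for name, value in components.items())
    # Squash into (0, 1) so extreme samples stay in range, then rescale
    # so an exactly-average quarterback (raw = 0) lands at 74.5.
    squashed = 1.0 / (1.0 + math.exp(-raw))
    return 50 + 49 * squashed

# A quarterback modestly above expectation on each hypothetical component.
score = composite_passing_score(
    components={"completion_over_expected": 0.8,
                "air_yards_over_expected": 0.3,
                "turnover_avoidance": 0.5},
    weights={"completion_over_expected": 0.5,
             "air_yards_over_expected": 0.2,
             "turnover_avoidance": 0.3},
)
```

Because the squashing is monotone, rank order among quarterbacks is preserved at any sample size, which matches the validity-in-rank-order property claimed above.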
Read More

#big7, #machine-learning

Exploring the Limits of Large Scale Pre-training

Recent developments in large-scale machine learning suggest that by scaling up data, model size and training time properly, one might observe that improvements in pre-training would transfer favorably to most downstream tasks. In this work, we systematically study this phenomenon and establish that, as we increase the upstream accuracy, the performance of downstream tasks saturates. In particular, we investigate more than 4800 experiments on Vision Transformers, MLP-Mixers and ResNets with number of parameters ranging from ten million to ten billion, trained on the largest scale of available image data (JFT, ImageNet21K) and evaluated on more than 20 downstream image recognition tasks. We propose a model for downstream performance that reflects the saturation phenomenon and captures the nonlinear relationship between upstream and downstream performance. Delving deeper to understand the reasons that give rise to these phenomena, we show that the saturation behavior we observe is closely related to the way that representations evolve through the layers of the models. We showcase an even more extreme scenario where performance on upstream and downstream tasks is at odds. That is, to have better downstream performance, we need to hurt upstream accuracy. Read More
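The saturation the abstract describes can be pictured with a toy curve. The exponential form and the constants below are assumptions chosen purely for illustration; the paper proposes its own fitted model for downstream performance.

```python
import math

def downstream_accuracy(upstream, ceiling=0.85, rate=6.0):
    """Toy saturating map from upstream (pre-training) accuracy to
    downstream accuracy: gains shrink as upstream accuracy rises."""
    return ceiling * (1.0 - math.exp(-rate * upstream))

# Equal +0.1 steps in upstream accuracy buy progressively smaller
# downstream improvements, i.e. downstream performance saturates.
gains = [downstream_accuracy(u + 0.1) - downstream_accuracy(u)
         for u in (0.5, 0.6, 0.7, 0.8)]
```

Under any curve of this shape, chasing the last few points of upstream accuracy yields diminishing downstream returns, which is the qualitative behavior the paper reports at scale.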

#performance

The impact of hardware specifications on reaching quantum advantage in the fault tolerant regime

We investigate how hardware specifications can impact the final run time and the required number of physical qubits to achieve a quantum advantage in the fault tolerant regime. Within a particular time frame, both the code cycle time and the number of achievable physical qubits may vary by orders of magnitude between different quantum hardware designs. We start with logical resource requirements corresponding to a quantum advantage for a particular chemistry application, simulating the FeMo-co molecule, and explore to what extent slower code cycle times can be mitigated by using additional qubits. We show that in certain situations, architectures with considerably slower code cycle times will still be able to reach desirable run times, provided enough physical qubits are available. We utilize various space and time optimization strategies that have been previously considered within the field of error-correcting surface codes. In particular, we compare two distinct methods of parallelization: Game of Surface Code’s Units and AutoCCZ factories. Finally, we calculate the number of physical qubits required to break the 256-bit elliptic curve encryption of keys in the Bitcoin network within the small available time frame in which it would actually pose a threat to do so. It would require 317 × 10⁶ physical qubits to break the encryption within one hour using the surface code, a code cycle time of 1 μs, a reaction time of 10 μs, and a physical gate error of 10⁻³. To instead break the encryption within one day, it would require 13 × 10⁶ physical qubits. Read More
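The two quoted estimates imply a near-inverse trade-off between physical qubit count and allowed run time, which a quick back-of-envelope check confirms (numbers taken from the abstract above):

```python
# Quoted estimates: ~317 million physical qubits to break the key within
# one hour, ~13 million within one day (24 hours). If qubits traded off
# exactly inversely with run time, the two ratios would match.
qubits_one_hour = 317e6
qubits_one_day = 13e6

qubit_ratio = qubits_one_hour / qubits_one_day  # roughly 24.4
time_ratio = 24.0                               # one day vs. one hour
relative_gap = abs(qubit_ratio - time_ratio) / time_ratio
```

The gap of roughly 2% suggests that, in this regime, allowing 24 times longer run time reduces the qubit requirement by almost exactly a factor of 24.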

#metaverse

The Internet Is Just Investment Banking Now

The internet has always financialized our lives. Web3 just makes that explicit.

Twitter has begun allowing its users to showcase NFTs, or non-fungible tokens, as profile pictures on their accounts. It’s the latest public victory for this form of … and, you know, there’s the problem. What the hell is an NFT anyway?

There are answers. Twitter calls NFTs “unique digital items, such as artwork, with proof of ownership that’s stored on a blockchain.” In marketing for the new feature, the company offered an even briefer take: “digital items that you own.” That promise, mated to a flood of interest and wealth in the cryptocurrency markets used to exchange them, has created an NFT gold rush over the past year. Last March, the artist known as Beeple sold an NFT at auction for $69.5 million. The digital sculptor Refik Anadol, one of the artists The Atlantic commissioned to imagine a COVID-19 memorial in 2020, has brought in millions selling editions of his studio’s work in NFT form. Jonathan Mann, who started writing a song every day when he couldn’t find a job after the 2008 financial collapse, began selling those songs as NFTs, converting a fun internet hobby into a viable living. Read More

#metaverse

It’s Back: Senators Want EARN IT Bill to Scan All Online Messages

People don’t want outsiders reading their private messages—not their physical mail, not their texts, not their DMs, nothing. It’s a clear and obvious point, but one place it doesn’t seem to have reached is the U.S. Senate.

A group of lawmakers led by Sen. Richard Blumenthal (D-CT) and Sen. Lindsey Graham (R-SC) have re-introduced the EARN IT Act, an incredibly unpopular bill from 2020 that was dropped in the face of overwhelming opposition. Let’s be clear: the new EARN IT Act would pave the way for a massive new surveillance system, run by private companies, that would roll back some of the most important privacy and security features in technology used by people around the globe. It’s a framework for private actors to scan every message sent online and report violations to law enforcement. And it might not stop there. The EARN IT Act could ensure that anything hosted online—backups, websites, cloud photos, and more—is scanned. Read More

#surveillance

Developing Specific Reporting Standards in Artificial Intelligence Centred Research

There are several emerging AI technologies that aim to enhance surgical care pathways over the coming decade. In particular, these are related to (1) diagnostics, (2) pre-operative planning, (3) intra-operative guidance and (4) surgical robotics.1 This trend has been mirrored by the sharp increase in the number of surgical studies evaluating the use of AI.

Despite this fervour, very few AI devices have reached the point of clinical implementation within surgical environments.2 This disconnect between ‘in silico bench’ and ‘bedside’ is a multifaceted issue related to technological, regulatory, and economic factors. However, this divide is also exacerbated by the variable quality of study reporting in this field; an issue perpetuated by the absence of AI-specific reporting guidelines for both pre-clinical and clinical AI studies. Read More

#standards

Under The Sea (Study No.1) an Animation by AI/Emma Catnip

Read More

#vfx