Assuring the Machine Learning Lifecycle: Desiderata, Methods, and Challenges

Machine learning has evolved into an enabling technology for a wide range of highly successful applications. The potential for this success to continue and accelerate has placed machine learning (ML) at the top of research, economic and political agendas. Such unprecedented interest is fueled by a vision of ML applicability extending to healthcare, transportation, defense and other domains of great societal importance. Achieving this vision requires the use of ML in safety-critical applications that demand levels of assurance beyond those needed for current ML applications. Our paper provides a comprehensive survey of the state-of-the-art in the assurance of ML, i.e. in the generation of evidence that ML is sufficiently safe for its intended use. The survey covers the methods capable of providing such evidence at different stages of the machine learning lifecycle, i.e. of the complex, iterative process that starts with the collection of the data used to train an ML component for a system, and ends with the deployment of that component within the system. The paper begins with a systematic presentation of the ML lifecycle and its stages. We then define assurance desiderata for each stage, review existing methods that contribute to achieving these desiderata, and identify open challenges that require further research.

#accuracy, #assurance, #performance

A Benchmark for Machine Learning from an Academic/Industry Cooperative

MLPerf is a consortium of more than 40 leading companies and university researchers that has released several rounds of benchmark results. MLPerf’s goals are:

Accelerate progress in ML via fair and useful measurement

Encourage innovation across state-of-the-art ML systems

Serve both industrial and research communities

Enforce replicability to ensure reliable results

Keep benchmark effort affordable so all can play

#mlperf, #nvidia, #performance

The Vision Behind MLPerf

A broad ML benchmark suite for measuring the performance of ML software frameworks, ML hardware accelerators, and ML cloud and edge platforms.

“… since 2012 the amount of compute used in the largest AI training runs has been increasing exponentially with a 3.5-month doubling time (by comparison, Moore’s Law had an 18-month doubling period). Since 2012, this metric has grown by more than 300,000x (an 18-month doubling period would yield only a 12x increase). Improvements in compute have been a key component of AI progress, so as long as this trend continues, it’s worth preparing for the implications of systems far outside today’s capabilities.”
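The figures in the quote are internally consistent, which a quick back-of-the-envelope calculation can confirm: a 300,000x growth at a 3.5-month doubling time implies roughly 64 months of growth, over which an 18-month doubling time would yield only about 12x. A minimal sketch of that arithmetic (the variable names are illustrative, not from the source):

```python
import math

# Figures taken from the quoted passage.
growth = 300_000        # observed compute growth since 2012
fast_doubling = 3.5     # months per doubling (observed trend)
moore_doubling = 18     # months per doubling (Moore's Law, as cited)

# Months needed to reach 300,000x at a 3.5-month doubling time:
# growth = 2 ** (months / fast_doubling)  =>  months = fast_doubling * log2(growth)
months = fast_doubling * math.log2(growth)      # ~64 months (~5.3 years)

# Growth an 18-month doubling time would produce over the same span.
moore_growth = 2 ** (months / moore_doubling)   # ~12x, matching the quote

print(f"{months:.1f} months, Moore's-Law growth {moore_growth:.1f}x")
```

This matches the quote’s claim that an 18-month doubling period “would yield only a 12x increase” over the same window.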

#mlperf, #nvidia, #performance