How to Hack APIs in 2021

Baaackkk iiin myyy dayyyyy, APIs were not nearly as common as they are now. The change is largely due to the explosion in popularity of Single Page Applications (SPAs). Ten years ago, web applications tended to follow a pattern where most of the application was generated server-side before being presented to the user. Any data that was needed would be gathered directly from a database by the same server that generated the UI.

Many modern web applications follow a different model, often referred to as an SPA (Single Page Application). In this model there is typically an API backend, a JavaScript UI, and a database. The API simply serves as an interface between the web app and the database, and all requests to the API are made directly from the web browser.

This is often a better solution because it is easier to scale and it lets developers specialise: frontend developers can work on the UI while backend developers work on the API. These apps also tend to feel snappier because a full page load is not required for every request.
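To illustrate the pattern, here is a minimal browser-side sketch of an SPA fetching data straight from its API backend and updating the page in place. The /api/users endpoint and the User shape are hypothetical examples for illustration, not taken from the article.

```typescript
// Minimal SPA-style data fetch: the browser talks to the API backend directly.
// The /api/users endpoint and the User shape are hypothetical examples.
interface User {
  id: number;
  name: string;
}

async function loadUsers(): Promise<User[]> {
  // The request goes straight from the browser to the API; no server-side page render.
  const res = await fetch("/api/users", {
    headers: { Accept: "application/json" },
  });
  if (!res.ok) {
    throw new Error(`API request failed: ${res.status}`);
  }
  return res.json() as Promise<User[]>;
}

// The UI is updated in place, so no full page load is needed.
loadUsers().then((users) => {
  const list = document.querySelector("#user-list");
  if (list) {
    list.innerHTML = users.map((u) => `<li>${u.name}</li>`).join("");
  }
});
```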

… All this to say – there are APIs everywhere now, so we should know how to hack and secure them.  Read More

#cyber, #devops

U.S. prisons mull AI to analyze inmate phone calls

For people like Heather Bollin, a 43-year-old woman in Texas engaged to a man who is currently incarcerated, constant surveillance is a fact of life: the three daily phone calls they have together are subject to monitoring by prison officials.

“We are never able to communicate without being under surveillance.”

… Prisons in the United States could get more high-tech help keeping tabs on what inmates are saying, after a key House of Representatives panel pressed for a report to study the use of artificial intelligence (AI) to analyze prisoners’ phone calls. Read More

#surveillance, #nlp

Machine Learning Won’t Solve Natural Language Understanding

In the early 1990s a statistical revolution took artificial intelligence (AI) by storm – a revolution that culminated by the 2000s in the triumphant return of neural networks with their modern-day deep learning (DL) reincarnation. This empiricist turn engulfed all subfields of AI, although the most controversial employment of this technology has been in natural language processing (NLP) – a subfield of AI that has proven to be a lot more difficult than any of the AI pioneers had imagined. The widespread use of data-driven empirical methods in NLP has the following genesis: the failure of the symbolic and logical methods to produce scalable NLP systems after three decades of supremacy led to the rise of what are called empirical methods in NLP (EMNLP) – a phrase that I use here to collectively refer to data-driven, corpus-based, statistical and machine learning (ML) methods.

The motivation behind this shift to empiricism was quite simple: until we gain some insight into how language works and how language is related to our knowledge of the world we talk about in ordinary spoken language, empirical and data-driven methods might be useful in building some practical text-processing applications. As Kenneth Church, one of the pioneers of EMNLP, explains, the advocates of the data-driven and statistical approaches to NLP were interested in solving simple language tasks – the motivation was never to suggest that this is how language works, but that “it is better to do something simple than nothing at all”. The cry of the day was: “let’s go pick up some low-hanging fruit”. In a must-read essay appropriately entitled “A Pendulum Swung Too Far”, however, Church (2007) argues that the motivation behind this shift has been grossly misunderstood. As McShane (2017) also notes, subsequent generations misunderstood this empirical trend, which was motivated by finding practical solutions to simple tasks, and assumed that the Probably Approximately Correct (PAC) paradigm would scale into full natural language understanding (NLU). As she puts it: “How these beliefs attained quasi-axiomatic status among the NLP community is a fascinating question, answered in part by one of Church’s observations: that recent and current generations of NLPers have received an insufficiently broad education in linguistics and the history of NLP and, therefore, lack the impetus to even scratch that surface.”

This misguided trend has resulted, in our opinion, in an unfortunate state of affairs: an insistence on building NLP systems using ‘large language models’ (LLMs) that require massive computing power, in a futile attempt to approximate the infinite object we call natural language by memorizing massive amounts of data. In our opinion this pseudo-scientific method is not only a waste of time and resources, but it is corrupting a generation of young scientists by luring them into thinking that language is just data – a path that will only lead to disappointment and, worse yet, to hampering any real progress in natural language understanding (NLU). Instead, we argue that it is time to re-think our approach to NLU work, since we are convinced that the ‘big data’ approach to NLU is not only psychologically, cognitively, and even computationally implausible but, as we will show here, also theoretically and technically flawed. Read More

#nlp, #machine-learning

Researchers Create ‘Master Faces’ to Bypass Facial Recognition

Researchers have demonstrated a method to create “master faces,” computer generated faces that act like master keys for facial recognition systems, and can impersonate several identities with what the researchers claim is a high probability of success. 

In their paper, researchers at the Blavatnik School of Computer Science and the School of Electrical Engineering in Tel Aviv detail how they successfully created nine “master key” faces that are able to impersonate almost half the faces in a test dataset across three leading face recognition systems. The researchers say their results show these master faces can successfully impersonate over 40 percent of the population in these systems without any additional information or data on the person they are identifying. Read More

#image-recognition, #fake, #gans

OpenAI can translate English into code with its new machine learning software Codex

AI research company OpenAI is releasing a new machine learning tool that translates the English language into code. The software is called Codex and is designed to speed up the work of professional programmers, as well as help amateurs get started coding.

In demos of Codex, OpenAI shows how the software can be used to build simple websites and rudimentary games using natural language, as well as translate between different programming languages and tackle data science queries. Users type English commands into the software, like “create a webpage with a menu on the side and title at the top,” and Codex translates this into code. The software is far from infallible and takes some patience to operate, but could prove invaluable in making coding faster and more accessible. Read More
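To make the workflow concrete, here is a minimal sketch of how a program might send an English instruction to a Codex-style completion endpoint and read back the generated code. The HTTP endpoint is OpenAI’s public completions API; the model identifier, request fields, and response handling shown here are illustrative assumptions rather than details from the article.

```typescript
// Sketch: send an English instruction to a Codex-style completion endpoint
// and return the generated code. The model name "davinci-codex" and the exact
// request fields are assumptions for illustration, not confirmed by the article.
async function englishToCode(instruction: string): Promise<string> {
  const res = await fetch("https://api.openai.com/v1/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "davinci-codex", // assumed model identifier
      prompt: instruction,
      max_tokens: 256,
      temperature: 0, // deterministic output is usually preferable for code
    }),
  });
  if (!res.ok) {
    throw new Error(`Completion request failed: ${res.status}`);
  }
  const data = await res.json();
  // The completions API returns an array of choices; take the first one.
  return data.choices?.[0]?.text ?? "";
}

// Example: the prompt quoted in the article.
englishToCode("create a webpage with a menu on the side and title at the top")
  .then((code) => console.log(code));
```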

#devops, #nlp