We study speech-to-speech translation (S2ST), which translates speech from one language into another, and focus on building systems to support languages without standard text writing systems. We use English-Taiwanese Hokkien as a case study and present an end-to-end solution, from training data collection and modeling choices to benchmark dataset release. First, we present efforts on creating human-annotated data, automatically mining data from large unlabeled speech datasets, and adopting pseudo-labeling to produce weakly supervised data. On the modeling side, we take advantage of recent advances in applying self-supervised discrete representations as prediction targets in S2ST, and show the effectiveness of leveraging additional text supervision from Mandarin, a language similar to Hokkien, in model training. Finally, we release an S2ST benchmark set to facilitate future research in this field.
Monthly Archives: December 2022
San Francisco police can now use robots to kill
The killer robot discussion is no longer strictly the domain of ‘RoboCop’
Last week, we talked about killer robots. That piece was inspired by a proposal that would allow San Francisco police to use robots for killing “when risk of loss of life to members of the public or officers is imminent and outweighs any other force option available to SFPD.” Last night, that proposal passed the city’s board of supervisors with an 8-3 vote.
The language was included in a new “Law Enforcement Equipment Policy” filed by the San Francisco Police Department in response to California Assembly Bill 481, which requires a written inventory of the military equipment utilized by law enforcement. The document submitted to the board of supervisors includes, among other things, the Lenco BearCat armored vehicle, flash-bang grenades and 15 submachine guns.