How Sparse is Sign Language Data? - Or, Is 1080p Video Really Just 39 bits Per Second, When Signing?
In which I muse on the dimensionality and information density of sign language data, looking at it from a few angles, and guesstimate a signal-to-noise ratio that is very, very small. I estimate an information transfer rate of 40-50 bits per second, and calculate that the resulting signal-to-noise ratio could be incredibly low, somewhere around 0.00000043 (roughly 4.3 × 10⁻⁷).
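To make that ratio concrete, here is a back-of-the-envelope sketch in Python. The bitrate assumptions below are my own illustrative choices (raw 1080p30 RGB video, and a typical ~8 Mbps compressed stream), not necessarily the ones behind the 0.00000043 estimate above, which happens to land between these two extremes:

```python
# Back-of-the-envelope signal-to-noise estimate for signed 1080p video.
# The bitrate assumptions here are illustrative, not the post's exact inputs.

def snr(signal_bps: float, video_bps: float) -> float:
    """Ratio of linguistic information rate to raw video data rate."""
    return signal_bps / video_bps

SIGNAL_BPS = 39  # information rate of language, per the figure in the title

# Raw (uncompressed) 1080p video at 30 fps, 24 bits per RGB pixel:
raw_bps = 1920 * 1080 * 24 * 30  # ~1.49e9 bits/s

# A typical compressed 1080p stream (assumed here: H.264 at ~8 Mbps):
compressed_bps = 8e6

print(f"raw:        {snr(SIGNAL_BPS, raw_bps):.2e}")         # ~2.61e-08
print(f"compressed: {snr(SIGNAL_BPS, compressed_bps):.2e}")  # ~4.88e-06
```

Either way, the signal is vanishingly small relative to the pixels carrying it.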
I’ve often wondered, when reading research papers, how exactly people make pretty figures like this. I decided to look into it and start collecting resources here.
In the 1990 film Dances with Wolves, Kevin Costner’s character encounters a group of people whose language he does not speak. To start establishing a common vocabulary, he uses his body to mime the shape and behavior of a buffalo/bison. Recognizing this, they teach him their word for the animal (“Tatanka”). Our computational approaches to language learning miss out on this kind of thing, relying on massive quantities of mostly text and leaving much of the data we do have unused. When I’ve spoken with actual linguists working on smaller languages, I’ve found that they often have data; it’s just not in a form that computers can use easily, distributed across many formats and files. How can we “use the whole Tatanka” and not waste the data that is available?
Summary: I describe the process of creating what is likely the first-ever machine translation model for the Hani language, starting with no previous datasets or trained models. I cover the data, tools, techniques, and commands used, hopefully enabling easier progress in low-resource translation efforts like this one, and present a baseline along with ideas for future improvement.
Me, transformed into 16 different animals