EVENT - Thursday
When: 1:00 PM - 2:00 PM
DSF Lunch & Learn – ML_{doom}: A data science journey into the belly of the rhyming beast
Please join us for Session #4 in our webinar series, DSF Lunch & Learn. Every Thursday from 1-2 PM, join DSF to hear from one of our weekly featured speakers. Grab some lunch and join DSF globally as we launch this new series.
Join us virtually in partnership with Ian Ashmore, Senior Data Scientist at Cap-HPI, and Charlie Tapsell, Data Scientist at Cap-HPI.
We aim to share, inspire and bring the data community together so you can build your industry network and have some exciting interactions with your peers.
Ticket Allocation Process:
Registering here guarantees you a ticket for the Data Science Festival event with Ian and Charlie on June 11th 2020. Once registered, you will be sent your Zoom link via email.
SCHEDULE
1.00pm: Intro with David Loughlan – Founder of DSF
1.05pm: Ian Ashmore and Charlie Tapsell
Talk Title: ML_{doom}: A data science journey into the belly of the rhyming beast
Summary: In this talk we introduce our furlough project and the first steps towards creating an algorithm capable of winning a rap battle. While solutions exist to generate poetry and other basic rhymes, simulation of more complex hip-hop styles has remained elusive. Our work first sought to better understand these limitations by re-imagining the basic form of the rhyme data: as phonemes instead of words. The phonetic representation of any word (funny Greek-looking symbols) is an instruction on how to pronounce it properly. Such an approach allows similarly spelled (but differently pronounced) words to be kept separate: 'through' does NOT rhyme with 'though', but it DOES rhyme with 'crew', 'few', 'drew', etc., since they all end with the phoneme 'uː', an 'ew' sound. Furthermore, modern rapping often rhymes at the syllable rather than the word level, allowing 'tiramisu' to rhyme with 'terror miss you'. This led to the creation of a training corpus consisting of tens of thousands of rap and hip-hop songs, with each lyric reduced to its constituent phonemes and grouped into syllables.
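Purely as an illustration of the phoneme idea described above (not the speakers' code), here is a minimal Python sketch. The tiny hand-written phoneme dictionary is a placeholder for a full pronunciation resource such as the CMU Pronouncing Dictionary, and two words are crudely treated as rhyming when their final phonemes match.

```python
# Illustrative only: a hand-written stand-in for a real pronunciation dictionary.
PHONEMES = {
    "through": ["TH", "R", "UW"],
    "though":  ["DH", "OW"],
    "crew":    ["K", "R", "UW"],
    "few":     ["F", "Y", "UW"],
    "drew":    ["D", "R", "UW"],
}

def rhymes(word_a, word_b, tail=1):
    """Crude rhyme test: do the last `tail` phonemes of each word match?"""
    return PHONEMES[word_a][-tail:] == PHONEMES[word_b][-tail:]

print(rhymes("through", "though"))  # False: similar spelling, different sound
print(rhymes("through", "crew"))    # True: both end in the 'UW' ('ew') sound
```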
The first step was to obtain a phonetic representation for every word. We obtained a dictionary of ~50,000 unique English words and their phonetic representations, and used that data to develop a phoneme predictor for all the plural, tense-specific, slang and obscene words used in rap songs. This took the form of a recurrent neural network with bidirectional LSTM layers and a time-distributed output layer (for predicting sequences). After significant experimentation with the training set, encoding/decoding strategies and network architecture modifications, we arrived at a trained network with ~97.5% accuracy. We are satisfied with this, since the English language contains ambiguities that are a product of historical usage and regional accents.
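For readers curious what such a network might look like, below is a minimal Keras sketch of the general architecture described: stacked bidirectional LSTM layers with a time-distributed softmax that predicts a phoneme at each position of the input character sequence. The vocabulary sizes, layer widths and sequence length are illustrative placeholders, not the speakers' actual configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CHARS = 30      # assumed size of the input character alphabet (incl. padding)
NUM_PHONEMES = 45   # assumed size of the output phoneme inventory
MAX_LEN = 25        # assumed maximum word length after padding

# Character sequence in, one phoneme prediction per time step out.
model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(NUM_CHARS, 64, mask_zero=True),              # character embeddings
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),
    layers.TimeDistributed(layers.Dense(NUM_PHONEMES, activation="softmax")),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```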
Our results are promising and pave the way for future work on text-to-speech technologies and lyric generation, giving them a valuable place in our larger ongoing project: a neural network that can generate rhymes that retain thematic continuity, adhere to a particular rhyming pattern, and do so in real time (potentially against a human opponent). We believe that representing song lyrics as syllables, expressed as sequences of phonemes, will allow us to limit the future work to generating sentences that rhyme, make sense, and include a response to the incoming data.
1.40pm: Community Q&A
2.00pm: Close

