Summary: In this talk we introduce our furlough project and the first steps towards creating an algorithm capable of winning a rap battle. While solutions exist to generate poetry and other basic rhymes, simulating more complex hip-hop styles has remained elusive. Our work first sought to better understand these limitations by re-imagining the basic form of the rhyme data: as phonemes instead of words. The phonetic representation of a word (funny Greek-looking symbols) is an instruction on how to pronounce it properly. This approach separates words that are spelled similarly but pronounced differently: ‘through’ does NOT rhyme with ‘though’, but it DOES rhyme with ‘crew’, ‘few’, ‘drew’ etc., since they all end with the phoneme ‘uː’, which is an ‘ew’ sound. Furthermore, modern rapping often rhymes at the syllable rather than the word level (e.g. allowing ‘tiramisu’ to rhyme with ‘terror miss you’). This led to the creation of a training corpus consisting of tens of thousands of rap and hip-hop songs, with each lyric reduced to its constituent phonemes and grouped into syllables.
The first step was to obtain a phonetic representation for every word. We obtained a dictionary of ~50,000 unique English words and their phonetic representations, and used that data to develop a phoneme predictor for all the plural, tense-specific, slang and obscene words used in rap songs. This took the form of a recurrent neural network with bidirectional LSTM layers and a time-distributed output layer (for predicting sequences). After significant experimentation with the training set, encoding/decoding strategies and network architecture, we arrived at a trained network with ~97.5% accuracy. We are satisfied with this, since the English language contains certain ambiguities that are a product of historical usage and regional accents.
Our results are promising and pave the way for future work using text-to-speech technologies and lyric generation. They have a valuable place in our larger ongoing project, which aims to create a NN that can generate rhymes that retain thematic continuity and adhere to a particular rhyming pattern, and that can do so in real time (potentially against a human opponent). We believe that representing the song lyrics as syllables, expressed as sequences of phonemes, will allow us to limit the future work to generating sentences that rhyme, make sense, and include a response to the incoming data.
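To make the phoneme idea concrete, here is a minimal sketch of rhyme matching on phonemes rather than spelling. It is not the project's code: the tiny `LEXICON`, the `VOWELS` set and all function names are hypothetical, the transcriptions are hand-written approximations, and matching the last few vowel sounds is only a rough stand-in for syllable-level rhyme.

```python
# Hypothetical toy lexicon: word -> list of phonemes (approximate, hand-written).
LEXICON = {
    "through": ["θ", "r", "uː"],
    "though": ["ð", "əʊ"],
    "crew": ["k", "r", "uː"],
    "few": ["f", "j", "uː"],
    "drew": ["d", "r", "uː"],
    "tiramisu": ["t", "ɪ", "r", "ə", "m", "ɪ", "s", "uː"],
    "terror": ["t", "ɛ", "r", "ə"],
    "miss": ["m", "ɪ", "s"],
    "you": ["j", "uː"],
}

# Vowel phonemes used in the toy lexicon above (an assumption, not a full inventory).
VOWELS = {"uː", "əʊ", "ɪ", "ə", "ɛ"}


def rhyme_part(phonemes):
    """Return the phonemes from the last vowel to the end of the word."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i] in VOWELS:
            return tuple(phonemes[i:])
    return tuple(phonemes)


def rhymes(word_a, word_b):
    """Word-level rhyme: both words share the same final vowel and coda."""
    return rhyme_part(LEXICON[word_a]) == rhyme_part(LEXICON[word_b])


def phrase_phonemes(phrase):
    """Concatenate the phonemes of every word in a space-separated phrase."""
    phonemes = []
    for word in phrase.split():
        phonemes.extend(LEXICON[word])
    return phonemes


def multisyllable_rhyme(phrase_a, phrase_b, n=3):
    """Rough syllable-level rhyme: the last n vowel sounds of each phrase match."""
    tail_a = [p for p in phrase_phonemes(phrase_a) if p in VOWELS][-n:]
    tail_b = [p for p in phrase_phonemes(phrase_b) if p in VOWELS][-n:]
    return len(tail_a) == n and tail_a == tail_b


if __name__ == "__main__":
    print(rhymes("through", "crew"))
    print(rhymes("through", "though"))
    print(multisyllable_rhyme("tiramisu", "terror miss you"))
```

Under these assumptions, ‘through’ matches ‘crew’ but not ‘though’, and ‘tiramisu’ matches ‘terror miss you’ across word boundaries, which a spelling-based comparison could not capture.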

 

Bio: Ian Ashmore is Senior Data Scientist at Cap-HPI. He earned his PhD from the University of Leeds in theoretical astrophysics (magnetohydrodynamics) and, prior to that, had become the first physics undergraduate at Leeds to have their master’s project published in reputable academic journals. In a previous life he was a sponsored snowboarder, snowboard coach and team manager, travelling extensively to mountain ranges worldwide. He has also worked in technical product development, gained experience as a published journalist and photographer (e.g. for The Guardian), run a small clothing business and undertaken ad-hoc data science and analytical projects for snowboard companies and political parties, amongst others. Nowadays he is working on a Natural Language Processing project to correctly identify vehicles using only online advert text, in multiple languages. This work encompasses neural networks, word vector embeddings, neural machine translation, convolutional layers, imbalanced classification tasks, GPU computing and Markov chains. Ian thinks in calculus and writes in Python. When not coding, he enjoys hanging out with his son, going snowboarding/skateboarding and making things.

 

Bio: Charlie is a physics graduate (BSc from Lancaster, MSc with distinction from Leeds). Most interesting modules included Cosmology, Quantum Computing and Astrophysical Fluid Dynamics (mega space explosions). MSc project centred on the formation of high-mass stars. President of the University of Leeds Science Magazine. Designed the logo for the University of Leeds Physics Society (still being used, I believe). Has had all manner of jobs, from data administration and analysis to go-kart marshal, to name a few. Currently employed as a Data Scientist. Has worked with object detection using deep convolutional NNs and experimented with synthetic data generation. Current project involves profiling customers and predicting their behaviours. Personally studying NLP and phonetic translations with Ian to develop the aforementioned rap AI (ML_Doom). Avid guitarist.