arne rubehn


Hi! I am a computational linguist, currently pursuing my PhD at the Chair of Multilingual Computational Linguistics at the University of Passau. I focus on computer-assisted, data-driven methods for historical linguistics. My goal is to advance comparative historical linguistics by means of intelligent algorithmic methods that process large-scale data efficiently and thereby reduce researchers’ workload.

I studied Computational Linguistics, General Linguistics, and Latin at the University of Tübingen. For my MA thesis project, I trained a neural network that estimates global probabilities for arbitrary sound changes. Additionally, I have years of work experience as a software developer for EtInEn (Etymological Inference Engine), a software tool for historical linguists that is being developed at the Linguistic Department in Tübingen. Based on statistical methods, EtInEn automates several routine tasks, among them cognate detection, sound law inference, and phonetic reconstruction, and interactively assists users in exploring ideas and developing etymological theories.

research interests

My research usually concerns the computational modelling of linguistic questions, especially within the domains of:

  • historical linguistics
  • phonetics and phonology
  • lexical semantics and word formation
  • typology

Rather than focusing on individual languages or families, I aim to develop “generalist” models and methods suited to large-scale, cross-linguistic applications.

news

Sep 19, 2024 New publication: Extracting Tuscan phonetic correspondences from dialect pronunciations automatically (with Simonetta Montemagni and John Nerbonne). In Language Dynamics and Change.
Sep 11, 2024 Talk held: Automatically Segmenting Words into Morphemes: A Detailed Comparison of Unsupervised Approaches Applied to Monolingual Wordlists from Different Languages. 21st International Congress of Linguistics, Poznań, Poland.
Aug 05, 2024 New blogpost: Generating Phonological Feature Vectors with SoundVectors and CLTS. In Computer-Assisted Language Comparison in Practice, available online at Hypotheses or as an article via its DOI.