arne rubehn
Hi! I am a computational linguist, currently pursuing my PhD at the Chair of Multilingual Computational Linguistics at the University of Passau. I focus on computer-assisted, data-driven methods for historical linguistics. My goal is to advance comparative historical linguistics by means of intelligent algorithmic methods that process large-scale data efficiently and thereby reduce researchers’ workload.
I studied Computational Linguistics, General Linguistics, and Latin at the University of Tübingen. For my MA thesis, I trained a neural network that estimates global probabilities for arbitrary sound changes. Additionally, I have years of work experience as a software developer for EtInEn (Etymological Inference Engine), a software tool for historical linguists that is being developed at the Linguistic Department in Tübingen. Based on statistical methods, EtInEn automates several routine tasks, such as cognate detection, sound law inference, and phonetic reconstruction, and interactively assists users in exploring ideas and developing etymological theories.
research interests
My research usually concerns the computational modelling of linguistic questions, especially within the domains of:
- historical linguistics
- phonetics and phonology
- lexical semantics and word formation
- typology
Instead of focusing on individual languages or families, I aim to develop “generalist” models and methods suited to large-scale, cross-linguistic applications.
news
Date | News |
---|---|
Aug 05, 2024 | New blog post: Generating Phonological Feature Vectors with SoundVectors and CLTS. In Computer-Assisted Language Comparison in Practice, available online at Hypotheses or as an article via its DOI. |
Jul 22, 2024 | Accepted paper: Extracting Tuscan phonetic correspondences from dialect pronunciations automatically (with Simonetta Montemagni and John Nerbonne). To appear in Language Dynamics and Change. |
Jun 27, 2024 | New publication: Generating Feature Vectors from Phonetic Transcriptions in Cross-Linguistic Data Formats (with Jessica Nieder, Robert Forkel, and Johann-Mattis List). In Proceedings of the 2024 Meeting of the Society for Computation in Linguistics (SCiL). |