Andrea Ceolin

About me

I'm currently an AI Engineer at IQVIA, where I work on information extraction from unstructured and structured medical records and scientific papers.

Before joining IQVIA, I was a postdoctoral researcher in the Department of Communication and Economics at Università di Modena e Reggio Emilia.

I received my PhD in Linguistics from the University of Pennsylvania. My dissertation can be found here.

My research interests include computational models of language change and language acquisition, sentiment analysis, and low-resource language identification using deep learning.

Research

PCM

The most up-to-date research on the Parametric Comparison Method.

WikiTalkEdit

My most recent NLP project on sentiment analysis, presented at NAACL 2021.

Language ID using Neural Networks

My most recent NLP project on language identification, presented at COLING 2022 (VarDial).

Skills

Python 95%
R 90%
TensorFlow 80%
PyTorch 75%

Publications

  • Ceolin, A. (2022). Neural Networks for Cross-domain Language Identification. Phlyers@ Vardial 2022. In Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects, 99-108. COLING 2022 [PDF] [GitHub]

  • Jaidka, K., Ceolin, A., Singh, I., Chhaya, N., Ungar, L. H. (2021). WikiTalkEdit: A Dataset for modeling Editors' behaviors on Wikipedia. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2191–2200. NAACL 2021 [PDF] [GitHub]

  • Ceolin, A. (2021). Comparing the Performance of CNNs and Shallow Models for Language Identification. In Proceedings of the 8th Workshop on NLP for Similar Languages, Varieties and Dialects, 102-112. EACL 2021 [PDF] [GitHub]

  • Ceolin, A. (2021). Constraints on Old English genitive variation. Journal of Historical Syntax, 5, 1-13. https://doi.org/10.18148/hs/2021.v5i1-13.34 [PDF] [GitHub]

  • Ceolin, A., Guardiano, C., Longobardi, G., Irimia, M. A., Bortolussi, L., Sgarro, A. (2021). At the boundaries of syntactic prehistory. Philosophical Transactions of the Royal Society B, 376: 20200197. https://doi.org/10.1098/rstb.2020.0197. [GitHub]

  • Ceolin, A. & Zhang, H. (2020). Discriminating between standard Romanian and Moldavian tweets using filtered character ngrams. In Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and Dialects, 265-272. COLING 2020 [PDF] [GitHub]

  • Ceolin, A., Guardiano, C., Irimia, M. A., & Longobardi, G. (2020). Formal syntax and deep history. Frontiers in Psychology, 11, 2384. https://doi.org/10.3389/fpsyg.2020.488871. [PDF] [GitHub]

  • Ceolin, A. (2020). On Functional Load and its Relation to the Actuation Problem. University of Pennsylvania Working Papers in Linguistics, 26(2). Available at: https://repository.upenn.edu/pwpl/vol26/iss2/6 [PDF] [GitHub]

  • Santos, P., Gonzàlez-Fortes, G., Trucchi, E., Ceolin, A., Cordoni, G., Guardiano, C., Longobardi, G., Barbujani, G. (2020). More Rule than Exception: Parallel Evidence of Ancient Migrations in Grammars and Genomes of Finno-Ugric Speakers. Genes, 11(12), 1491. https://doi.org/10.3390/genes11121491. [PDF]

  • Ceolin, A. (2019). Significance testing of the Altaic Family. Diachronica, 36(3), 299-336. https://doi.org/10.1075/dia.17007.ceo. [PDF] [GitHub]

  • Ceolin, A. & Sayeed, O. (2019). Modeling Markedness with a Split-and-Merger Model of Sound Change. In Tahmasebi, N., Borin, L., Jatowt, A. & Xu, Y. (eds) Proceedings of the 1st International Workshop on Computational Approaches to Historical Language Change. ACL 2019 [PDF]

  • Sayeed, O. & Ceolin, A. (2019). 'Markedness' is an epiphenomenon of phonetically grounded sound change. In Hout, K., Mai, A., McCollum, A., Sharon, R. & Zaslansky, M. (eds), Supplementary Proceedings of the Annual Meeting on Phonology 2018, Washington DC: Linguistic Society of America. [PDF] [GitHub]

  • Ceolin, A. (2018). Explaining Cross-linguistic Differences in Article Omission through an Acquisition Model. In Bertolini, A. B. & Kaplan, M. G. (eds) Proceedings of the 42nd annual Boston University Conference on Language Development. Cascadilla Press. 100-113. [PDF] [GitHub]

  • Kazakov, D., Cordoni, G., Algathani, E., Ceolin, A., Irimia, M. A., Kim, S. S., Michelioudakis, D., Radkevich, N., Guardiano, C. & Longobardi, G. (2017). Learning implicational models of Universal Grammar parameters. In: Cuskley, C., Flaherty, M., Little, H., McCrohon, L., Ravignani, A. & Verhoef, T. (eds.) The Evolution of Language: Proceedings of the 12th International Conference (EVOLANGXII) [PDF]

  • Kazakov, D., Cordoni, G., Ceolin, A., Irimia, M. A., Kim, S. S., Michelioudakis, D., Radkevich, N., Guardiano, C. & Longobardi, G. (2017). Machine Learning Models of Universal Grammar Parameter Dependencies. RANLP 2017 Workshop on Knowledge Resources for the Socio-Economic Sciences and Humanities (KnowRSH). Varna, Bulgaria. [PDF]

  • Guardiano, C., Michelioudakis, D., Ceolin, A., Irimia, M., Longobardi, G., Radkevich, N., Sitaridou, I. & Silvestri, G. (2016). South by Southeast. A Syntactic Approach to Greek and Romance Microvariation. L'Italia Dialettale, 77, 95-166. [PDF]

  • Longobardi, G., Michelioudakis, D., Irimia, M. A., Radkevich, N., Guardiano, C., & Ceolin, A. (2016). Formal linguistics as a cue to demographic history. Journal of Anthropological Sciences, 147-155. [PDF]

  • Longobardi, G., Buch, A., Ceolin, A., Ecay, A., Guardiano, C., Irimia, M., Michelioudakis, D., Radkevich, N. & Jaeger, G. (2016). Correlated Evolution Or Not? Phylogenetic Linguistics With Syntactic, Cognacy, And Phonetic Data. In Roberts, S.G., Cuskley, C., McCrohon, L., Barceló-Coblijn, L., Fehér, O. & Verhoef, T. (eds) The Evolution of Language: Proceedings of the 11th International Conference (EVOLANGXI) [PDF]

  • Longobardi, G., Ceolin, A., Bortolussi, L., Guardiano, C., Irimia, M. A., Michelioudakis, D., Radkevich, N. & Sgarro, A. (2016). Mathematical modeling of grammatical diversity supports the historical reality of formal syntax. Universitätsbibliothek Tübingen. [PDF]

  • Longobardi, G., Ghirotto, S., Guardiano, C., Tassi, F., Benazzo, A., Ceolin, A., & Barbujani, G. (2015). Across Language Families: Genome diversity mirrors linguistic variation within Europe. American Journal of Physical Anthropology, 157(4), 630-640. [PDF]

  • Longobardi, G., Guardiano, C., Silvestri, G., Boattini, A., & Ceolin, A. (2013). Toward a syntactic phylogeny of modern Indo-European languages. Journal of Historical Linguistics, 3(1), 122-152. [PDF]
