Maxime Fily

Linguist and Engineer/Physicist

Employment history

Since 2023: Research fellow at CNRS, at LLF (Laboratoire de Linguistique Formelle) and Lacito (Langues et Civilisations à Tradition Orale).
2023: Supervision of BA student Berthilde Biard (Université Sorbonne Nouvelle). Data extraction and analysis for the construction of verbal and nominal paradigms in Naish languages (3 months).
2018 - 2022: Doctoral student at Lacito (Villejuif) and Gipsa-lab (Grenoble Image Parole Signal Automatique).
2020 - 2021: Mandarin teacher, private classes, 2h/week.
2018 - 2019: Temporary lecturer for the class "Introduction to linguistics and language families", Université Grenoble Alpes (24 hours).
2017 - 2018: Research Intern at Gipsa-lab as part of a Linguistics MA. Analysis of oral and nasal signals for the acoustic study of nasalised sounds in French and Taiwan Mandarin.
2014 - 2017: Experienced Engineer at Areva NP. Neutronic design of nuclear fuel assemblies, Lyon, France.
2011 - 2014: Experienced Engineer at Wecan, an Areva-CGN (China General Nuclear) joint venture specializing in nuclear design and safety, Shenzhen, China.
2007 - 2011: Engineer at Areva NP. Nuclear safety analyses, Paris, France.

University Education

2018 - 2022: PhD in Phonetics, Phonology and Speech Sciences, Université Paris III Sorbonne Nouvelle.
2017 - 2018: Master's degree in Linguistics, specializing in experimental phonetics and phonology, Université Grenoble Alpes, obtained with honors.
2014 - 2015: Level 3 University degree (DU niveau 3) in Chinese language and culture, Université Lyon 3, obtained with honors.
2001 - 2007: Master of Engineering at École Nationale Supérieure de Physique de Grenoble (PHELMA, Institut National Polytechnique de Grenoble).

Languages spoken

Computer skills

General IT: GNU/Linux systems administration (Ubuntu, Mint), knowledge of GPU installation, PyTorch, LaTeX, LibreOffice, MS Office
Programming: Linux/Unix: Bash, csh, ksh; General: Python, C, XML, Jupyter; Speech processing: Praat; NLP: Transformer-based neural networks (wav2vec 2.0, XLSR); Statistics: R, seaborn

Foreign collaborations

2019: Research trip to Yunnan Minzu University (Kunming), at the invitation of Mr. He Likun and Mr. Liu Jinrong (School of Ethnic Cultures). During this visit I attended seminars covering a range of descriptive work on the languages of Yunnan, and I gave a seminar on the phonological system of Shekua Na.

Technical achievements

2021: Creation of a fully customizable keyboard layout for Linux users who write in the International Phonetic Alphabet. See link
2020: Development of a tool that converts Praat TextGrids to XML to speed up Pangloss deposits. See GitHub
2018: Design of an acquisition module for separate recording of oral and nasal tract output, built by modifying the Glottal Enterprises nasalance plate.

Grants and projects

2019: International mobility grant (3,750€), obtained from the UGA IDEX International Mobility Commission.

Publications

Documentation work

Since the beginning of my work as a field linguist, I have focused on the Shekua variety of the Na language, also known as Lataddi Narua. Shekua is a small village near the Grass Sea of Lugu Lake. Its speakers use a variety closely related to Yongning Narua, whose tone system is described in Michaud (2017). My fieldwork is outlined below:
2023: Interviews with Shekua Na speakers: narratives, phonological confirmation paradigms, dialogues (Yunnan, 2 months)
2019: Interviews with Shekua Na speakers: phonetics and phonology, narratives (Sichuan and Yunnan, 3 months)
2018: Transcription, translation and archiving of unpublished recordings by ā huì (MA, Yunnan University).

Ongoing research

Comparative work on Na dialects: The study uses nouns and verbs in tonal paradigms to characterize the tonal morphology of these lexemes. A preliminary study included ~100 cognates; further data collection raised this to ~200 cognates. The data are post-processed with Sankey diagrams and agglomerative hierarchical clustering.
NLP for less-documented languages: Work started with classic ASR for low-resource languages (fine-tuning a multilingual model). A follow-up article investigates the role of rogue dimensions in the representations and the best setting for representing phonetic segments in the vector space (VarDial 2025, submitted).