Introducing maxent.ot: an R package for Maximum Entropy constraint grammars

Published online by Cambridge University Press: 01 January 2026

Connor Mayer* (University of California, Irvine)
Adeline Tan* (University of California, Los Angeles)
Kie Zuraw* (University of California, Los Angeles)

Abstract

This paper presents maxent.ot, a package written in the statistical programming language R for phonological analysis using Maximum Entropy Optimality Theory (MaxEnt OT). R has become the de facto standard for statistical analysis in linguistic research, and this package allows phonologists to create and disseminate MaxEnt OT analyses in R. A central goal of the package is to support reproducible research and to allow the crucial components of a MaxEnt analysis to be performed conveniently and with only a basic knowledge of R programming. The paper first presents a tutorial on MaxEnt constraint grammars and on how to use maxent.ot to perform a simple analysis. It then turns to more advanced features of the package, including model comparison, regularization, and cross-validation.

Information

Type
Research Article
Copyright
Copyright © The Author(s), 2024

Footnotes

* Thanks to the audience at the 2022 Annual Meeting on Phonology at UCLA for their useful feedback and discussion. Thanks also to two anonymous reviewers for their valuable feedback.
