Syntactic Variation from Individuals to Populations

Language as a Complex System

Published online by Cambridge University Press:  22 December 2025

Jonathan Dunn
Affiliation:
University of Illinois Urbana-Champaign

Summary

This Element presents a computational theory of syntactic variation that brings together (i) models of individual differences across distinct speakers, (ii) models of dialectal differences across distinct populations, and (iii) models of register differences across distinct contexts. This computational theory is grounded in Construction Grammar (CxG) because its usage-based representations can capture differences in productivity across multiple levels of abstraction. Drawing on corpora representing over 300 local dialects across fourteen countries, this Element undertakes three data-driven case studies to show how variation unfolds across the entire grammar. These case studies are reproducible using the supplementary material that accompanies the Element. Rather than focusing on discrete variables in isolation, we view the grammar as a complex system. The essential advantage of this computational approach is scale: we can observe an entire grammar across many thousands of speakers representing dozens of local populations.

Information

Type: Element
Online ISBN: 9781009420280
Publisher: Cambridge University Press
Print publication: 31 January 2026

Accessibility standard: WCAG 2.1 AA
