1. Introduction
What may be in words besides their semantic meaning? How do we perceive emotions in words while reading them? In the fast-developing area of the psychology of emotions, word stimuli gained understandable popularity, creating not only demand for various databases of validated affective norms (e.g. Imbir, 2016a; Monnier & Syssau, 2014; Montefinese et al., 2014; Riegel et al., 2015) but also demand for research exploring how people perceive words and what consequences this brings to their cognitive functioning (Fields & Kuperberg, 2016; González-Villar et al., 2014). However, the previous research was focused on unidimensional words and their consequences; prior studies also usually investigated the valence of word stimuli, thus providing characteristics for words differentiated by levels of valence. An obvious gap in the research was not only one regarding ambiguous words (Brainerd, 2018) but also words ambiguous on emotional spaces different than valence, for instance origin (created by the dimensions of automaticity and reflectiveness; Jarymowicz & Imbir, 2015) and activation (arousal; Russell, 1980; and subjective significance; van Hooff et al., 2008). For this reason, we decided to exclude valence and instead study words with perceived origin and activation ambiguity (Wielgopolan & Imbir, 2022) to not only see their main effects but also the interactions and interlacing of the two constructs.
It seems that for studying the properties of origin and activation it might be beneficial to control for valence, but not to manipulate it. As valence is a very basic, evolutionarily understood (Frijda, 1994) emotional space, it might be noticed and processed very early (Citron, 2012; Citron et al., 2013; Imbir et al., 2018) and interact with origin (Imbir, 2017b), thus disrupting the effects of origin and activation. We propose to focus on the word-stimuli origin and activation, isolating those two emotional spaces and mapping their properties only. We included two types of measurements: a behavioural one through rating the emotionality of word stimuli in Experiment 1 and a psychophysical webcam-based eye-tracking measurement in Experiment 2, which might allow us to see how people look at ambiguous emotional stimuli and, therefore, how they manage their visual attention (Krejtz et al., 2008; Salvucci & Goldberg, 2000).
1.1. Emotionality in words
When we look at a word, we perceive its different qualities. We see its physical characteristics: colour of the font and the size of the letters. The information about the perceived stimulus is sent to the thalamus and occipital cortex where we receive the visual stimulus and process it (Carter & Luke, 2020). Only then can we further process the semantic meaning, orthography, and phonology of the word (Herbert, 2022). Shortly after that, we start to process everything else; we start to think about what is in the word other than semantic meaning. Are there any emotional properties? In neuropsychological research, we have evidence that as soon as 100 ms after seeing a word stimulus we react differently to subsequent words varying in their valence and emotional arousal (Kissler et al., 2007; Scott et al., 2009). We notice emotions in response to words, we process them, and we might react to them differently (Fields & Kuperberg, 2012) and differently in response to different affective contexts (Imbir et al., 2022a; Imbir & Pastwa, 2021). For this reason, studying characteristics of words other than their valence may be of importance; in previous studies, some semantic attributes were proven to be ambiguous (e.g. concreteness, meaningfulness, and familiarity; Brainerd et al., 2021) and explained by the U-shaped relations between variables (similar to the U-shaped relationship between valence intensity and valence ambiguity, thus proving that this relation may be projected into different properties of words).
1.2. The dimension of origin
The recently proposed, dualistic dimension of origin (Jarymowicz & Imbir, 2015) was a product of dual-process theories, describing two different modes of processing: fast and effortless and slow and effortful (e.g. Kahneman, 2013; Strack & Deutsch, 2004). This distinction of one heuristic, effortless system, and another one being more deliberative and effortful, was generalised to the formation of emotions – and thus explaining the amount of the cognitive component in them (Jarymowicz & Imbir, 2015). The assumption behind research on origin was that all emotions may be categorised into automatic or reflective ones, differing in the evaluative system, which was engaged in their formation (Jarymowicz & Imbir, 2015). The so-called automatic emotions are fast and effortless, usually elicited by external stimuli or very basic internal mechanisms, such as drives or needs; they do not need a cognitive component, and they may be innate and unconscious (an example of such emotion could be the fear as a response to sudden, threatening stimulus – we do not have to process or think about anything and we jump instinctively, avoiding the harm). On the contrary, the very basis of reflective emotions is cognition, or a conscious evaluation derived from a comparison of stimuli with previous knowledge, standards, and values. Because of the involvement of cognition, reflective emotions are significantly slower, effortful, and engaging for individuals (Jarymowicz & Imbir, 2015); an example of such emotion could be guilt – to feel it, we need to notice that we did something wrong, compare the current situation with the ideal one, and take into consideration the norms and rules.
The origin dimension (initially perceived as a single bipolar continuum starting from automatic and ending in reflective) was measured in several studies, usually with the very intuitive Self-Assessment Manikins¹ (Imbir, 2015, 2016a, 2016b; Imbir et al., 2017b).
In previous research, automatic versus reflective emotions had different consequences on human decision-making (Walkowiak & Imbir, 2018); furthermore, the words rated as automatic or reflective had different influences on assessing social stimuli (Imbir, 2017b; Imbir et al., 2022a; Imbir & Pastwa, 2021) and controlling cognition (Imbir, 2017a; Imbir et al., 2022b). When the bipolar dimension of origin was split into two unipolar dimensions of automaticity and reflectiveness, the two correlated negatively and moderately with one another (r = −.50; Wielgopolan & Imbir, 2022). Therefore, while the dimensions were opposites, they did not perfectly exclude each other. It seemed that automaticity and reflectiveness might be perceived at the same time in reaction to one stimulus, thus – as two relatively independent dimensions – creating a bivariate space of origin (Wielgopolan & Imbir, 2022). These results allowed for describing something’s perceived origin ambiguity – a phenomenon of perceiving both automaticity and reflectiveness in an emotion or a stimulus, an emotional structure analogous to ambivalence (Oh & Tong, 2022; Peterson & Janssen, 2007), with the two systems interlacing with one another. Words assessed as ambiguous on the space of origin required significantly shorter reaction times and received lower ratings of emotionality than in a control group that rated unidimensional words. However, it is worth mentioning that they were not statistically different from the words of ambiguous valence (Wielgopolan & Imbir, 2022).
Finally, words of ambiguous valence and origin also caused more and longer eye-fixations than control group words and words of ambiguous activation (Wielgopolan & Imbir, 2022).
1.3. The dimensions of activation
Another dimension that has frequently been used in describing emotional experiences in words is the concept of activation (Barrett & Russell, 1999). It has been mostly described in terms of biological and physical arousal (e.g. Maddock et al., 2003; Russell, 1980); however, there were very early studies and theories pointing out that activation (also called arousal) may not be a monolithic construct, as it became possible to distinguish different factors inside it (Osgood et al., 1957) or various kinds of it (Thayer, 1986). In the light of dual-process theories, it seemed plausible to assume that there are two distinctive kinds of activation – one related to the automatic mind and the other to the reflective (Imbir et al., 2017a). Automatic activation was supposed to be bodily, biological, and very much connected to the energetic resources required to instigate some action (Russell, 2003), called arousal. Arousal is physical activation; it implies alertness to surroundings and a readiness to act/react, sometimes very quickly, as it usually channels some innate reaction and does not require much cognitive processing (Imbir et al., 2017a). It was previously linked to the factor of the size of the stimulus (along with the meaningfulness, number of features, and happiness; Brainerd et al., 2022, 2023); it seems that arousal may be determined by the sheer magnitude of the stimulus, which further supports the evolutionary character of this dimension.
However, some stimuli do not cause automatic, physical arousal. Sometimes they are important to us, but we need to think about them. We need to assess the situation and compare it with our goals, values, and priorities, and only then do we know how crucial something is and how much mental effort we should apply to it. This reflective, cognitive activation is the dimension of the subjective significance of a stimulus (Imbir et al., 2017a; van Hooff et al., 2008). While this appears intimately connected with an individual’s values, subjective significance ratings may be generationally and culturally shared and thus may comprise affective norms – mean emotional ratings for each word on a particular dimension (e.g. Imbir, 2016b; Wielgopolan & Imbir, 2022).
In terms of the ambiguity of a word, it seemed that those two kinds of activation – the dimensions of arousal versus subjective significance – may create a space of activation, allowing for one stimulus to be perceived as ambiguous in that it both arouses and holds subjective significance (Wielgopolan & Imbir, 2022). However, it is important to keep in mind that in the theory that describes the different kinds of ambiguity (Wielgopolan & Imbir, 2022), spaces of valence and spaces of origin consist of two negatively correlated constructs, whereas the phenomenon of activation ambiguity is something different. Arousal and subjective significance were positively correlated with each other in previous studies (Imbir, 2016b; Wielgopolan & Imbir, 2022). Furthermore, in previous experiments, words associated with activation ambiguity were rated as more emotional than control group words and words rated ambiguous on the spaces of valence and origin. Similarly, activation ambiguous words elicited the lowest number of eye-fixations and the shortest fixations when participants were reading them (Wielgopolan & Imbir, 2022).
1.4. The relationships between origin and activation
It was proposed that two kinds of activation – what was postulated to create an emotional space of activation – namely dimensions of arousal and subjective significance, are in fact system-dependent (Imbir, 2016b). This means that while arousal will facilitate automatic processing, the subjective significance of a stimulus will promote reflective, systematic cognition. Furthermore, the presence of the activation of the opposite systems may disrupt functioning. For example, high levels of arousal may interfere with systematic cognition: it is difficult to study when an individual feels in any way threatened and arousal is high. Similarly, in a situation where an automatic response is needed, slow and reflective subjective significance does not apply; we do not need to think through all our options and values to jump away from the snake in the grass (Imbir, 2016b). The work of both the automatic and reflective systems is very much context-dependent, and the systems appear to share tasks in order to optimise human functioning (Kahneman, 2013).
However, the relationships just presented are for emotion–cognition interactions in a unidimensional framework with an assumption that we may only feel one emotional characteristic at a time. In reality, humans can notice both arousal and subjective significance simultaneously. It is not that we may either try to process the situation automatically or try to employ reflection instead. It seems that human emotional functioning is more complex than that, and we need to consider a host of different characteristics that sometimes oppose each other and sometimes mix (Berrios et al., 2014). We will mainly focus on word stimuli and the emotionality perceived in them, and it was already shown that different characteristics may be seen in the same stimulus; for example, one word may be assessed as high on both the positivity and negativity dimensions, or one word can trigger both automaticity and reflectiveness (Brainerd et al., 2021; Wielgopolan & Imbir, 2022). This dynamic was explained in the theory of ambiguity, which included the possibility of activating two opposite constructs for spaces other than valence: dimensions of origin and activation (Wielgopolan & Imbir, 2022). Therefore, the question that we need to ask is not only about the aforementioned characteristics of the relationships between the dimensions (between automaticity and subjective significance), but we also wish to disentangle the relationships between the spaces of emotional dimensions – the dimensions of origin and activation and how they interlace.
1.5. Cognitive perception of emotional load with an eye-tracking method
One of the psychophysical measurements gaining popularity in psychological research is eye-tracking, which allows investigators to record how people look at stimuli (Serda et al., 2015) and thus analyse how they manage their visual attention (Carter & Luke, 2020). It has enabled the gathering of data not only about the placement of the fixations – which can lead to marking specific areas of interest (AOI) on a screen – but also on the number of times participants fixate their eyes on one spot, where the average fixations have lasted from 100 to 400 ms (Salvucci & Goldberg, 2000). This information might serve as an indirect measurement of attention towards the stimulus and one’s processing of it (Raney et al., 2014); as eyesight is usually one of our most dominant senses, we look at the things we think about, which we process, which interest us, etc.; in other words, our cognitive functioning appears very much connected to our eyes’ movement (Rayner & Reingold, 2015). Furthermore, we look at different stimuli differently (Bednarik & Tukiainen, 2006; Krejtz, 2016), depending on how novel, interesting, important, or strange they seem (Krejtz et al., 2018) and based on individual characteristics (Holas, 2015).
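The two fixation measures mentioned above (number of fixations and their duration within an area of interest) can be illustrated with a minimal sketch. The `Fixation` and `AOI` types, the pixel coordinates, and the sample values are hypothetical and chosen only for illustration; they do not reflect the export format of any particular eye-tracking platform.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class Fixation:
    """One detected fixation (hypothetical structure): screen position and duration."""
    x: float
    y: float
    duration_ms: float


@dataclass(frozen=True)
class AOI:
    """Rectangular area of interest in screen pixels (hypothetical structure)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fixation: Fixation) -> bool:
        return (self.left <= fixation.x <= self.right
                and self.top <= fixation.y <= self.bottom)


def summarise_fixations(fixations, aoi):
    """Return (fixation count, mean fixation duration in ms) inside the AOI."""
    inside = [f for f in fixations if aoi.contains(f)]
    mean_duration = fmean(f.duration_ms for f in inside) if inside else 0.0
    return len(inside), mean_duration


# Illustrative data only: two fixations on the word, one elsewhere on the screen.
word_aoi = AOI(left=50, top=50, right=150, bottom=150)
sample = [Fixation(100, 100, 200), Fixation(110, 105, 300), Fixation(500, 500, 150)]
count, mean_ms = summarise_fixations(sample, word_aoi)
print(count, mean_ms)
```

A rectangular AOI is the simplest choice for a single word shown at the centre of the screen; other shapes would only change the `contains` check.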
1.6. Aim and hypotheses
The main aim of our experiments was to delineate the characteristics of the perception of words differing in their intensity in origin and activation ambiguity with behavioural (Experiment 1) and – later, to carefully track the psychophysical mechanisms behind the behaviour – also the webcam-based eye-tracking procedures (Experiment 2). We decided to exclude the most common dimensions of valence from our study design – positivity and negativity, as well as the ambivalence phenomenon – in order to try to isolate the main effects of origin and activation ambiguity and their interactions only, uncontaminated by valence. Positive and negative valence may not only entirely disrupt results in an emotionality rating task, but they also might interact with origin and activation dimensions and significantly change their result patterns. To minimise that risk, we decided to control for valence between our groups and to manipulate only the spaces of origin and activation ambiguities.
In the behavioural study, we expected that ratings of emotionality would decrease along with the intensity of origin ambiguity; however, emotionality ratings were predicted to increase with the intensity of activation ambiguity. We also hypothesised that reaction times would be longer for words of increasingly ambiguous origin and that reaction times would decrease alongside increasing activation ambiguity. Furthermore, we wanted to explore interaction effects between word origin and word activation ambiguity to check how those two variables might moderate each other’s impact.
For the eye-tracking study, we expected that words that increased in their levels of ambiguous origin would also elicit more and longer fixations. We predicted that this effect would be the opposite for the activation dimension, which was hypothesised to cause fewer and shorter fixations as its level increased.
2. Method
2.1. Participants
2.1.1. Experiment 1
The sample size in Experiment 1 was estimated with the usage of G*Power software (version 3.1.9.4; Faul et al., 2007); for the repeated-measures ANOVA with nine dependent measurements, 95% power (two-sided, α = 0.05), and the estimated effect size of η2 = 0.05 (f = 0.23), we obtained the minimum sample of only 36 participants. The effect size was estimated based on previous experiments (e.g. Wielgopolan and Imbir); however, we considered the fact that in the current experiment we wanted to check for the interaction of factors, so we might have needed more participants. Also, since we wanted to rule out potential outliers (i.e. reaction times that significantly deviated from the mean), we decided to increase the group size to 60 participants.
All of them were students at Polish universities, recruited via various groups on Facebook. We analysed answers from 59 participants after one had to be deleted due to monotonous answers and very short reaction times. The participants ranged in age from 18 to 30 (M = 22.13, SD = 2.83). We ensured that they were all native Polish speakers and had normal or corrected-to-normal vision and no past diagnosis of dyslexia.
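The effect size entered into the power analysis for Experiment 1 is given both as η2 and as Cohen's f. A minimal sketch of the standard conversion, f = √(η2 / (1 − η2)), is shown below; the function name is our own, and the sketch does not reproduce the G*Power sample-size computation itself.

```python
import math


def eta_squared_to_cohens_f(eta_sq: float) -> float:
    """Convert an eta-squared effect size to Cohen's f."""
    if not 0.0 <= eta_sq < 1.0:
        raise ValueError("eta squared must lie in [0, 1)")
    return math.sqrt(eta_sq / (1.0 - eta_sq))


# Effect size assumed for Experiment 1 (eta squared = 0.05).
f_exp1 = eta_squared_to_cohens_f(0.05)
print(round(f_exp1, 2))  # 0.23
```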
2.1.2. Experiment 2
In the second experiment, we conducted a G*Power analysis identical to the one in the first. Our results from Experiment 1 varied in their effect sizes (from η2 = 0.05 to η2 = 0.84), so we decided to input an estimated effect size of η2 = 0.20 (f = 0.33). The a priori analysis showed us a required sample size of only 27 participants. We recruited 40 participants in total, keeping in mind the potential for outliers and the possibility of ruling out participants who might provide compromised data caused by, for example, poor lighting in the room. Participants were aged between 18 and 29 (M = 23.11, SD = 1.99). All the participants in this sample met the criteria used in Experiment 1.
All the procedures involving human participants were conducted in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. We reported how we determined our sample size, all data exclusions, all manipulations and all measures in the study.
2.2. Materials
The word stimuli were selected from a database of ambiguous words (see Wielgopolan & Imbir, 2022) and compiled into a list of 225 words (see Table 1). We divided them into nine groups of 25 words each. The groups differed only in the manipulated variables, origin ambiguity and activation ambiguity, and not in any of the controlled variables, namely positivity (F[8, 216] = 1.17, p = .32, η2 = 0.04) and negativity (F[8, 216] = 3.21, p = .001, η2 = 0.10; none of the post-hoc comparisons were statistically significant). The groups of words also did not differ in the number of letters in word stimuli (F[8, 216] = 1.48, p = .16, η2 = 0.05) or their frequency of appearance in the Polish language (F[8, 216] = 1.08, p = .31, η2 = 0.02).
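The degrees of freedom reported above follow from the design: nine groups of 25 words give F(8, 216). The sketch below shows a plain one-way between-groups ANOVA of the kind that could be used for such a control check. The data are random placeholders, not the actual word norms, and the function is an illustration rather than the analysis code used in the study.

```python
import random
from statistics import fmean


def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [fmean(g) for g in groups]
    grand_mean = fmean(x for g in groups for x in g)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_value, df_between, df_within, eta_sq


# Placeholder ratings: 9 groups x 25 words drawn from the same distribution.
rng = random.Random(0)
placeholder_groups = [[rng.gauss(5.0, 1.0) for _ in range(25)] for _ in range(9)]
f_value, df1, df2, eta_sq = one_way_anova(placeholder_groups)
print(df1, df2)  # 8 216
```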
Table 1. Nine groups of words with stimuli examples and Polish translations

2.3. Procedure
2.3.1. Experiment 1
In the behavioural experiment, participants were recruited to the procedure with a short Qualtrics survey containing questions about the requirements to take part in the study (cf. Procedure). After that, they were invited to choose a date for an online meeting with an experimenter, taking place on the Zoom video communications platform. After logging in, the experimenter introduced themselves and described the aim of the study and the consecutive steps of the procedure. For the duration of the experiment, both participants and experimenters had their cameras and microphones on, mimicking a stationary laboratory situation. Participants could ask questions at any time; experimenters, however, had the possibility to verify whether participants were engaging in the experiment without distracting themselves or leaving their computer prematurely.
Participants were assured that they could withdraw from the study at any point without providing any reason and that their incomplete data would be automatically deleted; they were also told about the anonymity of the data and the fact that analyses would be conducted only on a group level. After making sure that the participants had no questions, experimenters sent them a link to the experiment designed on Gorilla Experiment Builder.
The first screen presented to participants on Gorilla was one with instructions that explained the task in the procedure. Participants were asked to read words shown on the screen: only one word appeared at a time at the centre of the screen, written in a large black Arial font sized to 12% of the screen, automatically adjusted to participants’ screen sizes. Participants rated whether they thought that the word they viewed was emotional or non-emotional. They chose their answer by pressing one of the keyboard keys, with a left arrow press indicating ‘emotional’ and a right arrow press for ‘non-emotional’; both the options were written on the screen and coded as EMO and NON-EMO. Participants were asked in the instructions to heed their intuition during the experiment and answer how they really felt about the word; the instructions specified that there were no good or bad answers.
Participants completed a short training, assessing the emotionality of four example words. After that, they viewed a screen with instructions explaining that they were about to start the main task. When they were ready and pressed their space key to proceed, participants were presented with a total of 225 words appearing one at a time in a fully randomised order.
After finishing the main task, participants were instructed to return to the window containing their active Zoom meeting with an experimenter. The experimenter thanked them for their participation in the study, and participants had the opportunity to ask questions. The entire procedure took about 20 minutes.
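The trial structure described above (nine word groups of 25 words, presented one at a time in a fully randomised order, with a two-key response) can be sketched as follows. This is an illustration only: the placeholder word labels, the key names, and the coding of ‘emotional’ as 1 and ‘non-emotional’ as 0 are assumptions of this sketch, not details of the Gorilla implementation.

```python
import random
from itertools import product

LEVELS = ("low", "moderate", "high")

# Assumed coding for illustration: left arrow = EMO (1), right arrow = NON-EMO (0).
KEY_TO_RATING = {"ArrowLeft": 1, "ArrowRight": 0}


def build_trials(words_per_group=25, seed=None):
    """Create 3 x 3 x words_per_group trials and shuffle them fully."""
    rng = random.Random(seed)
    trials = [
        {"origin": origin, "activation": activation,
         "word": f"word_{origin}_{activation}_{i}"}  # placeholder labels
        for origin, activation in product(LEVELS, LEVELS)
        for i in range(words_per_group)
    ]
    rng.shuffle(trials)
    return trials


def mean_rating(keys):
    """Proportion of 'emotional' responses among the recorded key presses."""
    ratings = [KEY_TO_RATING[k] for k in keys]
    return sum(ratings) / len(ratings)


trials = build_trials(seed=1)
print(len(trials))  # 225
```

Under this assumed coding, a condition mean is simply the proportion of words in that condition judged to be emotional.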
2.3.2. Experiment 2
The procedure in Experiment 2 was very similar to Experiment 1, with the only exception being the application of a webcam-based eye-tracking measurement. At the beginning of the meeting on Zoom, in addition to all the previously described information participants were told that the eye-tracking measurement they would engage in was not in any way based on creating any recording or pictures of them and that the only data that would be gathered would populate numerical databases and then be stored in Excel files. Participants were also asked to restrict their movements during the experiment.
When participants were ready to begin, they clicked on a link to an experiment on the RealEye online research platform. The experiment began with an initial calibration task where participants were presented with their photograph from their webcam and asked to sit in a way where their face would fit into a small frame on their screen. After achieving a proper face position, which was indicated by the frame becoming green, participants were asked to hold that posture and physically move as little as possible. After that, they engaged in another calibration task where participants were asked to follow little red dots appearing on the screen with both their mouse and eye-gaze. Forty dots were presented in different placements on their screen, while the screen’s background colour ranged from white to black. After following all the dots, participants were asked to look sequentially at four dots until each exploded in turn. If during this procedure the data quality was poor from an unsuccessful calibration, participants were asked to repeat it; in that case, they were provided with the help of an experimenter who checked carefully that the conditions of the study were being met.
After successful calibrations, participants completed a task nearly identical to the one used in Experiment 1. The only difference was that the experiment was split in the middle: we programmed the procedure to employ one additional calibration identical to the first that appeared in the middle of the experiment, about 10 minutes in. The reason for that was to give participants a moment to change their physical position and then to stay in it until the end, but the additional calibration was also employed to improve the quality of our data throughout the whole experiment. However, the position of participants was monitored during the entire experiment; if during the procedure participants moved and the original calibration was lost, the study was automatically paused, and the screen presented them with the initial calibration screen and asked them to fit their face into the small digital frame.
After rating all sets of randomised words, participants were asked to close the RealEye experiment window and return to their Zoom meeting. As in Experiment 1, volunteers were thanked for their participation; they were invited to ask any questions about the study or the eye-tracking measurement. When they were done, participants could simply log out of the meeting. The entire procedure, including the calibrations, took 20 to 25 minutes on average.
3. Results
3.1. Behavioural results
3.1.1. Emotionality ratings
To check for differences in ratings of emotionality between the groups of words, we conducted a repeated-measures ANOVA, applying a Greenhouse–Geisser correction when necessary. We obtained a main effect for origin ambiguity: F(2, 116) = 299.81, p < .001, η2 = .84. Pairwise comparisons conducted with the usage of the repeated-measures t-tests showed that words of low origin ambiguity (M = 0.58, SD = 0.16) were rated as significantly more emotional than those of moderate origin ambiguity (M = 0.31, SD = 0.16): t(58) = 20.03, p < .001, d = 2.61, as well as high origin ambiguity (M = 0.30, SD = 0.18): t(58) = 18.31, p < .001, d = 2.38. However, ratings of moderate and high origin ambiguity were not significantly different from each other: t(58) = 0.43, p = .34, d = 0.05 (see Fig. 1).

Figure 1. Main effects of emotionality ratings for origin ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.

Figure 2. Main effects of emotionality ratings for activation ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.
We found a main effect for activation ambiguity: F(2, 116) = 164.17, p < .001, η2 = .74. Words of low activation ambiguity (M = 0.28, SD = 0.15) were rated as significantly less emotional than those of moderate (M = 0.41, SD = 0.17): t(58) = 13.11, p < .001, d = 1.71, and high activation ambiguity (M = 0.49, SD = 0.18), t(58) = 14.80, p < .001, d = 1.92. Furthermore, ratings of emotionality for words of moderate activation ambiguity (M = 0.41, SD = 0.17) were significantly lower than for those of high activation ambiguity (M = 0.49, SD = 0.18): t(58) = 7.94, p < .001, d = 1.03 (see Fig. 2).
Finally, we found an interaction effect between the factors of origin and activation ambiguities for the ratings of emotionality: F(4, 232) = 17.33, p < .001, η2 = .23. Among the words of low origin ambiguity, those of low activation ambiguity (M = 0.42, SD = 0.18) were assessed as significantly less emotional than those of moderate activation ambiguity (M = 0.58, SD = 0.17): t(58) = 10.33, p < .001, d = 1.34, as well as high activation ambiguity (M = 0.74, SD = 0.19): t(58) = 15.25, p < .001, d = 1.99. Similarly, among these words, those of moderate activation ambiguity were rated as significantly less emotional than those of high activation ambiguity: t(58) = 9.70, p < .001, d = 1.25.
Furthermore, words of low origin and low activation ambiguity (M = 0.42, SD = 0.18) were assessed as significantly more emotional than words of moderate origin and low activation ambiguities (M = 0.23, SD = 0.16): t(58) = 13.17, p < .001, d = 1.72, moderate origin and moderate activation ambiguity (M = 0.32, SD = 0.17): t(58) = 9.23, p < .001, d = 1.20, and those of moderate origin and high activation ambiguity (M = 0.38, SD = 0.19): t(58) = 2.87, p = .006, d = 0.38, high origin and low activation ambiguities (M = 0.21, SD = 0.16): t(58) = 13.37, p < .001, d = 1.72, high origin and moderate activation (M = 0.33, SD = 0.22): t(58) = 5.38, p < .001, d = 0.70, and high origin and high activation ambiguities (M = 0.37, SD = 0.21): t(58) = 3.01, p = .004, d = 0.39.
Words of low origin and moderate activation ambiguities (M = 0.59, SD = 0.17) were assessed as significantly more emotional than those of moderate origin and low activation ambiguities (M = 0.23, SD = 0.16): t(58) = 16.07, p < .001, d = 2.09, moderate origin and moderate activation (M = 0.32, SD = 0.17): t(58) = 14.80, p < .001, d = 1.93, moderate origin and high activation (M = 0.38, SD = 0.19): t(58) = 12.24, p < .001, d = 1.59, and those of high origin and low activation ambiguities (M = 0.21, SD = 0.16): t(58) = 17.77, p < .001, d = 2.31, high origin and moderate activation (M = 0.33, SD = 0.22): t(58) = 10.91, p < .001, d = 1.42, and finally those of high origin and high activation ambiguities (M = 0.37, SD = 0.21): t(58) = 11.40, p < .001, d = 1.47.
Words of low origin ambiguity and high activation ambiguity (M = 0.74, SD = 0.19) were assessed as more emotional than those of moderate origin and low activation ambiguities (M = 0.22, SD = 0.16): t(58) = 19.97, p < .001, d = 2.60, moderate origin and moderate activation (M = 0.32, SD = 0.17): t(58) = 18.50, p < .001, d = 2.41, moderate origin and high activation (M = 0.38, SD = 0.19): t(58) = 17.69, p < .001, d = 2.30, high origin and low activation (M = 0.20, SD = 0.16): t(58) = 19.98, p < .001, d = 2.60, high origin and moderate activation (M = 0.33, SD = 0.22): t(58) = 15.58, p < .001, d = 2.03, and high origin and high activation ambiguities (M = 0.37, SD = 0.21): t(58) = 16.29, p < .001, d = 2.12.
Among the words of moderate origin ambiguity, we found differences between those of low activation (M = 0.26, SD = 0.16) and moderate (M = 0.32, SD = 0.17): t(58) = 7.52, p < .001, d = 0.98, low and high (M = 0.38, SD = 0.19): t(58) = 9.10, p < .001, d = 1.19, and moderate and high ambiguities: t(58) = 3.68, p < .001, d = 0.48. Among the words of high origin ambiguity, those of low activation ambiguity (M = 0.21, SD = 0.16) had significantly lower emotionality assessments than those of moderate (M = 0.33, SD = 0.22): t(58) = 6.58, p < .001, d = 0.85, and high activation ambiguities (M = 0.37, SD = 0.21): t(58) = 8.32, p < .001, d = 1.08. Similarly, words of high origin and moderate activation ambiguities had significantly lower ratings than those of high activation ambiguity: t(58) = 2.34, p = .02, d = 0.31. All of this section’s comparisons are presented in Fig. 3.

Figure 3. Interaction effects of origin and activation ambiguity for emotionality ratings of words from nine groups; for the clarity of presentation, panel A presents the comparisons inside different kinds of ambiguity (respectively, inside different intensities of origin ambiguity), while panel B depicts the comparisons that span across different intensities of origin ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.

Figure 4. Main effects of decision time for origin ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.
3.1.2. Reaction times
Before analysing participants’ reaction times, we transformed them from milliseconds into natural logarithms (NLs). Then, we used a repeated-measures ANOVA to check for differences among the groups. We obtained a significant main effect for origin ambiguity: F(2, 116) = 3.08, p = .05, η2 = .05. Further t-test comparisons showed that words of low origin ambiguity (M = 1269.44, SD = 376.54, NL: M = 7.09, SEM = 0.04) received significantly shorter reaction times than those of moderate origin ambiguity (M = 1305.38, SD = 434.42, NL: M = 7.12, SEM = 0.04): t(58) = 1.77, p = .04, d = 0.23, as well as those of high origin ambiguity (M = 1331.12, SD = 483.28, NL: M = 7.13, SEM = 0.04): t(58) = 2.66, p = .01, d = 0.34. However, the comparison between words of moderate and high origin ambiguity was not significant: t(58) = 1.22, p = .11, d = 0.16.
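The natural-logarithm transformation of reaction times can be sketched in Python as follows. The reaction times in the example are hypothetical values, not data from the experiments, and the function name `log_transform` is introduced only for illustration. The sketch also shows why a mean of log-transformed values (e.g. NL: M = 7.09) is lower than the logarithm of the raw mean (ln 1269.44 ≈ 7.15): the logarithm compresses long reaction times more than short ones.

```python
import math
from statistics import mean


def log_transform(reaction_times_ms):
    """Return natural logarithms of reaction times given in milliseconds."""
    if any(rt <= 0 for rt in reaction_times_ms):
        raise ValueError("reaction times must be positive")
    return [math.log(rt) for rt in reaction_times_ms]


# Hypothetical reaction times (ms) for one participant in one condition.
example_rts = [950.0, 1100.0, 1250.0, 2400.0]
log_rts = log_transform(example_rts)

mean_of_logs = mean(log_rts)               # what an "NL: M" value summarises
log_of_mean = math.log(mean(example_rts))  # logarithm of the raw mean
print(mean_of_logs <= log_of_mean)
```

For right-skewed data such as reaction times, the mean of the logarithms is never larger than the logarithm of the mean, so the two summaries should not be expected to match.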
We also found a significant main effect for activation ambiguity: F(2, 116) = 6.15, p = .003, η2 = 0.10. Reaction times for words of low activation ambiguity (M = 1302.23, SD = 449.07, NL: M = 7.11, SEM = 0.04) were significantly shorter than those of moderate ambiguity (M = 1342.68, SD = 470.32, NL: M = 7.14, SEM = 0.04): t(58) = 2.05, p = .05, d = 0.26. Furthermore, the words of moderate activation ambiguity had significantly longer reaction times than words of high activation ambiguity (M = 1261.04, SD = 376.00, NL: M = 7.09, SEM = 0.04): t(58) = 3.84, p < .001, d = 0.50. There were no significant differences between the low and high ambiguity dimensions: t(58) = 1.73, p = .09, d = 0.22.
We observed statistically significant interaction effects between the factors of origin and activation ambiguity: F(4, 232) = 5.26, p < .001, η2 = .08. We found significant differences in five comparisons; that is, the words of low origin ambiguity and moderate activation ambiguity had significantly shorter reaction times than words of high origin and moderate activation ambiguity: t(58) = 3.55, p < .001, d = 0.46. Next, the words of low origin ambiguity and high activation ambiguity had significantly shorter reaction times than words of both high origin and high activation ambiguity: t(58) = 2.75, p = .008, d = 0.36, and words of low origin and low activation ambiguity: t(58) = 2.95, p = .005, d = 0.39. Finally, the words of high origin ambiguity and moderate activation ambiguity received significantly longer response times than words of high origin ambiguity and low activation ambiguity: t(58) = 4.39, p < .001, d = 0.57, and words of high origin and high activation ambiguity: t(58) = 3.08, p = .003, d = 0.40.
3.2. Webcam-based eye-tracking results
3.2.1. Number of fixations
For the eye-tracking data, we analysed observations from 39 participants. We excluded one participant’s data because of their monotonous answers, low quality of provided data, and extremely short reaction times. To check for differences in the mean number of fixations on a word among the nine groups, we conducted a repeated-measures ANOVA. We obtained a main effect for origin ambiguity: F(2, 72) = 7.44, p = .001, η2 = 0.17. The number of fixations on words of low origin ambiguity (M = 1.89, SD = 0.44) was significantly lower than for moderate (M = 2.36, SD = 0.82): t(38) = 3.73, p < .001, d = 0.60, and high origin ambiguity words (M = 2.27, SD = 0.60): t(38) = 3.38, p = .002, d = 0.72.
We also obtained a main effect for activation ambiguity: F(2, 72) = 7.96, p < .001, η2 = 0.18. The number of fixations was the highest for words of low activation ambiguity (M = 2.32, SD = 0.55) and significantly higher than for moderate (M = 2.15, SD = 0.55): t(38) = 1.94, p = .03, d = 0.31, and high activation ambiguity (M = 2.03, SD = 0.46): t(38) = 2.86, p = .007, d = 0.25 (Fig. 7).

Figure 5. Main effects of decision time for activation ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.
The interaction effect between the factors of origin and activation ambiguities was not statistically significant: F(4, 144) = 0.26, p = .90, η2 = 0.007.
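The η2 values reported alongside the F statistics are numerically consistent with partial eta squared recovered from F and its degrees of freedom, although the text does not name the variant used. A minimal Python sketch under that assumption follows; the function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic (assumed variant):
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics for the number of fixations, as reported in this section.
origin_main = partial_eta_squared(7.44, 2, 72)
activation_main = partial_eta_squared(7.96, 2, 72)
interaction = partial_eta_squared(0.26, 4, 144)
print(round(origin_main, 2), round(activation_main, 2), round(interaction, 3))
```

Recomputing effect sizes in this way is a quick consistency check on reported ANOVA results; it does not replace the original analysis.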
3.2.2. Duration of fixations
To check for differences in the mean duration of fixations on words among groups, we conducted a repeated-measures ANOVA. We obtained a main effect for origin ambiguity: F(2, 72) = 19.26, p < .001, η2 = 0.35. The mean time of fixations was significantly shorter for words of low origin ambiguity (M = 209.29, SD = 30.69) than for those of moderate (M = 247.34, SD = 38.59): t(38) = 6.29, p < .001, d = 1.01, and high origin ambiguity (M = 252.51, SD = 26.06): t(38) = 6.06, p < .001, d = 0.97.
We also observed significant main effects for groups of words with activation ambiguity, namely F(2, 72) = 21.81, p < .001, η2 = 0.38. Words of low activation ambiguity (M = 253.46, SD = 32.84) had significantly longer mean fixation times compared with words of moderate (M = 236.06, SD = 28.43): t(38) = 3.04, p = .004, d = 0.49, and high ambiguity (M = 218.83, SD = 18.51): t(38) = 7.06, p < .001, d = 1.13. Furthermore, words of moderate activation ambiguity received significantly longer fixations than those of high activation ambiguity: t(38) = 3.45, p = .001, d = 0.54 (Fig. 8).
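The two eye-tracking measures used here, the number of fixations on a word and the mean duration of those fixations, can be derived from a list of fixation records. The Python sketch below is illustrative only: the record layout, the condition labels, and the function name `summarise_fixations` are assumptions, and the example pools fixations across participants, whereas a repeated-measures analysis would first aggregate within each participant.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fixation records: (participant, trial, condition, duration_ms).
fixations = [
    ("p01", 1, "low_origin", 210.0),
    ("p01", 1, "low_origin", 190.0),
    ("p01", 2, "high_origin", 260.0),
    ("p01", 2, "high_origin", 240.0),
    ("p01", 2, "high_origin", 250.0),
]


def summarise_fixations(records):
    """Return {condition: (mean fixations per trial, mean fixation duration)}."""
    counts = defaultdict(lambda: defaultdict(int))  # condition -> trial -> count
    durations = defaultdict(list)                   # condition -> durations (ms)
    for participant, trial, condition, duration_ms in records:
        counts[condition][(participant, trial)] += 1
        durations[condition].append(duration_ms)
    return {
        condition: (mean(list(counts[condition].values())), mean(durations[condition]))
        for condition in durations
    }


summary = summarise_fixations(fixations)
print(summary)
```

Trials without any fixation on the word do not appear in such records, so they would need to be added as zero counts before averaging the number of fixations.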

Figure 6. Mean decision times of nine groups of word stimuli.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.

Figure 7. Main effects for the number of fixations on word stimuli from groups of (A) origin ambiguity and (B) activation ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.

Figure 8. Main effects for mean durations of fixations on word stimuli from groups of (A) origin ambiguity and (B) activation ambiguity.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.

Figure 9. Interaction effects between origin ambiguity and activation ambiguity for the mean durations of fixations on word stimuli.
Note: Error bars show standard deviations; black horizontal lines show statistically significant comparisons. *p < .05; **p < .01; ***p < .001.
Furthermore, there was a significant interaction effect for origin and activation ambiguities: F(4, 144) = 5.04, p < .001, η2 = 0.12. Words of low origin and low activation ambiguities (M = 239.76, SD = 47.92) had significantly shorter fixation times than words of high origin and low activation (M = 270.81, SD = 51.51): t(38) = 2.84, p = .007, d = 0.46. Words of low origin and moderate activation ambiguity (M = 212.98, SD = 60.43) had significantly longer times than those of low origin and high activation (M = 175.11, SD = 41.44): t(38) = 3.16, p = .003, d = 0.51. Word stimuli of low origin and moderate activation ambiguities (M = 212.98, SD = 60.43) had significantly shorter fixation times than those of moderate origin and moderate activation (M = 248.63, SD = 45.92): t(38) = 2.93, p = .006, d = 0.48, as well as those of high origin and moderate activation ambiguity (M = 247.90, SD = 32.71): t(38) = 3.31, p = .002, d = 0.53. Finally, words of low origin and high activation ambiguity (M = 175.11, SD = 41.99) received significantly shorter fixations than words of moderate origin and low activation (M = 242.91, SD = 52.73): t(38) = 5.71, p < .001, d = 0.93, as well as high origin and high activation (M = 238.81, SD = 42.70): t(38) = 5.88, p < .001, d = 0.94 (Fig. 9).
4. Discussion
We confirmed most of our hypotheses. In Experiment 1, we obtained the predicted main effects of the origin and activation ambiguities. Similar to previous studies (Wielgopolan & Imbir, Reference Wielgopolan and Imbir2022), increasing intensity of word origin ambiguity was accompanied by gradually lower ratings of the emotionality of the words, as well as longer reaction times. For activation ambiguity of words, we observed that emotionality ratings were higher with increasing levels of ambiguity. These results are rather intuitive, and they clearly underline the specific properties of those two kinds of ambiguities: origin ambiguity is built on the space created by two negatively correlated dimensions; if we perceive opposite characteristics in a single stimulus, we either feel two things at the same time – which might be cognitively demanding – or we switch between seeing one characteristic and seeing the other, which also might be exhausting (cf. Vaccaro et al., Reference Vaccaro, Kaplan and Damasio2020). Either way, there is a cost associated with perceiving ambiguity in opposite things (i.e. usually in order to try to compare those characteristics and reduce the ambiguity, as described for attitudinal ambivalence; Van Harreveld et al., Reference Van Harreveld, Rutjens, Schneider, Nohlen and Keskinis2014) that might result in additional cognitive processing of the stimulus (Monteith et al., Reference Monteith, Devine and Zuwerink1993) and thus lead to relatively lower emotionality ratings. That cost may not be paid when we see and feel ambiguity consisting of positively correlated dimensions, for instance arousal and subjective significance, which may work in the same direction (cf. [masked for review], in review). In that case, emotionality may even be enhanced when an individual is looking at such stimuli and perceiving both characteristics.
In the interaction effects between the origin and activation dimensions for emotionality rating results, a pattern is clearly visible and repeated throughout all the origin levels: within each of them, emotionality ratings rise with increasing activation ambiguity levels. We also obtained these results for the main effect of activation ambiguity. However, it can be seen that the ratings for low-origin words are significantly higher than for all the other word groups. It seems logical that, when origin ambiguity is low, activation ambiguity sets the tone. This changes when origin ambiguity is moderate or high: these emotionality ratings, while they still repeat the pattern of increases, are significantly lower. This is in line with the previously proposed theory (Imbir, Reference Imbir2016b) that origin is accompanied by activation, even when they are both ambiguous. Nevertheless, we would like to argue that origin is an emotional space that is hierarchically higher than activation: it sets the tone, gives the message about the activation, and, apparently, causes its ambiguity’s intensity (Wielgopolan & Imbir, Reference Wielgopolan and Imbir2022). Results from responses to ambiguous word stimuli show that those two factors – origin ambiguity and activation ambiguity – while having their own main effects and specific patterns, may also interact and change each other’s properties.
As for reaction times, there are visible discrepancies in the results: across origin ambiguity levels, they patterned quite differently from the emotionality ratings. For low origin ambiguity, increasing levels of activation ambiguity elicit gradually shorter reaction times. This pattern is opposite to the pattern seen in the emotionality ratings: word groups of higher emotionality elicited shorter reaction times. This actually seems plausible in the case of low origin ambiguity words and therefore for low ratings of words on dimensions of automaticity and reflectiveness. The intensity of origin ambiguity – the only one consisting of opposite dimensions – is low. Meanwhile, what is present is only the ambiguity created by two positively correlated dimensions, which sets the tone for the results and allows humans to answer quickly and assess words as highly emotional or not.
For moderate and high origin ambiguity, we see that with words differing in activation, the ambiguity’s intensity creates the pattern observed in the main effect of the activation, with moderate activation obtaining the longest reaction times. It seems that while for the emotionality ratings the intensity of origin ambiguity shapes results by lowering emotionality assessments, for the actual reaction time – the time needed to react and to take action rather than think and assess – activation ambiguity produces most of the variability with a few differences between low and high origin ambiguities as exceptions.
Per our eye-tracking results, we confirmed our hypotheses, linking the specifics of visual attention to the processing of emotional words (as in previous studies, in which the duration of fixations was interpreted as an indicator of cognitive processing and of more attention towards the stimuli; Hayhoe, Reference Hayhoe2004; Rayner et al., Reference Rayner, Schotter and Drieghe2014). As origin ambiguity intensity increased, the number of fixations and the average fixation duration both increased as well; this is in line with previous results using different word lists (Wielgopolan & Imbir, Reference Wielgopolan and Imbir2023), and it seems plausible. Words of high origin ambiguity are perceived automatically and inspire reflectiveness at the same time; it seems natural that our participants needed to return their gaze to them and further process them to assess their emotionality and make a decision. This effect, observed in the emotional space of origin, could perhaps be compared to the functioning of two systems (cf. Kahneman, Reference Kahneman2013) – of experiential and systematic processing (Strack & Deutsch, Reference Strack and Deutsch2004) – which exchange tasks between them to optimise functioning. Perhaps when we look at ambiguous words, we do the same: we see with automaticity but also employ a degree of reflection, and we switch quickly from perceiving one dimension to another, which would be similar to one of the proposed models of feeling ambivalence (Cacioppo & Berntson, Reference Cacioppo and Berntson1994; Vaccaro et al., Reference Vaccaro, Kaplan and Damasio2020), a process that apparently both takes time and involves fixations.
This would also be closely connected with previous models describing relations between managing attention and eye-tracking results (Liu et al., Reference Liu, Chen and Chang2010): longer fixations were linked to better learning and higher cognitive load (Bednarik & Tukiainen, Reference Bednarik and Tukiainen2006; for review, see Rosch & Vogel-Walcutt, Reference Rosch and Vogel-Walcutt2012). Furthermore, if the fixation duration is, in fact, an indicator of cognitive processes (Raney et al., Reference Raney, Campbell and Bovee2014) – in this case, of the time spent reading and thus decoding the message (Liu, Reference Liu2014) – then it seems intuitive that decoding increasingly ambiguous words may become more and more difficult.
Similarly, the results for activation ambiguity also came out as we predicted, and they were somewhat opposite to the origin dimension results: the higher the activation ambiguity, the fewer and shorter the fixations that were elicited. Looking at words with activation ambiguity – while it has the aforementioned ambiguous part because of the mixture of two different kinds of activation characteristics in it – may be easier and not require one to revisit the AOI as many times as for words high on origin ambiguity. We also proposed that perhaps what we see might be a derivative of the hyper-scanning phenomenon, where people look at something and produce a lot of very short fixations, along with long saccades, so while they do cover a lot of ground, so to speak, they do it very fast and in abrupt movements (Horley et al., Reference Horley, Williams, Gonsalvez and Gordon2003). We propose that this strategy of looking at words might be a result of the words’ activation itself – a different kind of arousal that it elicits – which allows humans to react quickly (cf. Russell, Reference Russell1980). This could again be caused by what we discussed in the behavioural results – the fact that the construction of activation ambiguity is different, consisting of two positively correlated dimensions. It seems that while people notice it (Wielgopolan & Imbir, Reference Wielgopolan and Imbir2022), it does not cause the same need for further cognitive processing, which would elongate the fixations (Bednarik & Tukiainen, Reference Bednarik and Tukiainen2006; Liu, Reference Liu2014).
One of the limitations of this study could be the fact that it was conducted online. That said, we did prepare the procedure for those studies in a similar manner to one that could be conducted in person; for example, we ensured that our participants received instructions from an experimenter. These conditions were made as close as possible to conditions in a stationary laboratory setting. We did use a webcam-based eye-tracking method instead of a stationary one, but to ensure that those results were reliable, we meticulously prepared the stimuli by choosing words and presenting words in a large font. We also facilitated two calibrations during the experiments – in addition to the software checking that participants’ position and calibration remained satisfactory at all times during the procedure – and we carefully checked the quality of the gathered data. While a stationary laboratory replication of our experiments would be interesting, it seems that online methods of conducting studies are developing quickly in positive directions to the extent that online results compare well with stationary studies (Wisiecka et al., Reference Wisiecka, Krejtz, Krejtz, Sromek, Cellary, Lewandowska and Duchowski2022) and they provide researchers with great new opportunities (e.g. sampling wider audiences).
Furthermore, we excluded the dimension of valence from our study design because we wanted to obtain the effects of origin and activation ambiguities only, uncontaminated by valence. Origin and activation may have a significant impact on human functioning (Imbir & Pastwa, Reference Imbir and Pastwa2021; Jarymowicz & Jasielska, Reference Jarymowicz and Jasielska2012), but they are still understudied phenomena. We introduced the possibility of emotional ambiguity in spaces of origin and activation, and to map the properties of the ambiguities, we needed to distinguish them from valence and ambivalence to create a clear picture for the origin and activation spaces only. It is important to mention that creating a reliable list of stimuli becomes more and more difficult with an increasing number of experimentally manipulated dimensions. Using word lists seems to be the most precise method to test various and reliable (because of the previously prepared affective norms) stimuli, but it is a method that might have reached its limits. In a recent study, the interactions between three dimensions reached the maximum of what might be understood and interpreted (Imbir et al., Reference Imbir, Duda-Goławska, Jurkiewicz, Pastwa, Sobieszek, Wielgopolan and Żygierewicz2022a). For this reason, deciding which spaces/dimensions to consider may be inevitable.
Future studies could also include mapping the consequences of ambiguous words. While there is some research on ambivalent words and how we remember them (Brainerd, Reference Brainerd2018; Brainerd et al., Reference Brainerd, Chang and Bialer2021, Reference Brainerd, Chang, Bialer and Liu2022; Chang & Brainerd, Reference Chang and Brainerd2023), as well as on explaining the mechanism of the ambivalence and semantic ambiguity (Brainerd et al., Reference Brainerd, Chang and Bialer2021; Chang & Brainerd, Reference Chang and Brainerd2023), none has yet been done on stimuli that are ambiguous in the spaces of origin and activation. Preparing the stimuli and studying how participants perceive them were only the first step, and the next steps should include attempting to see more of what people do when they process words with origin or activation ambiguity. Are such words remembered better than unidimensional stimuli? If they are rated differently in terms of emotionality, might they disrupt some processes, such as cognitive control, and not others? What kind of results would we get in research using these words during classical cognitive tasks, such as an emotional Stroop, go/no-go, or N-back task?
Our study is one of the very first attempts to research dimensions of ambiguity other than valence. It seems, however, that those two spaces have unique characteristics impacting how participants perceive them (e.g. processing and managing visual attention when looking at them) and assess their emotionality. Furthermore, as we see in earlier unidimensional theories, ambiguity and valence are also connected with one another: they may interact and change patterns of results. The differences we observed in this study might provide a foundation for future psychological interventions or linguistic practices, such as creating text containing specific characteristics or performing a sentiment analysis for constructs other than valence only.
Acknowledgments
We would like to thank Radosław Wcześniak for help in gathering the data. The studies were funded by the Grant of the Excellence Initiative Research University. This work was supported by Ministerstwo Edukacji i Nauki.
Data deposition
The datasets from both of our experiments are publicly available in the figshare repository: https://doi.org/10.6084/m9.figshare.21971963.v1.
Ethics statement
The study received a positive opinion granted by the Research Ethics Committee of the Faculty of Psychology of the University of Warsaw (opinion no. 18/06/22).
Competing interest
The authors declare none.