Valence sound symbolism across language families: a comparison between Japanese and German

Vowels are associated with valence, so that words containing /i/ (as in English meet) compared with /o/ (as in French rose) are typically judged to match positively valenced persons and objects. As yet, valence sound symbolism has been mainly observed for Indo-European languages. The present research extends this to a comparison of Japanese-speaking and German-speaking participants. Participants invented pseudo-words as names for faces with different emotional expressions (happy vs. neutral vs. sad vs. angry). For both Japanese-speaking and German-speaking participants, vowel usage depended on emotional valence. The vowel I was used more for positive (vs. other) expressions, whereas O and U were used less for positive (vs. other) expressions. A was associated with positive emotional valence for Japanese-speaking but not German-speaking participants. In sum, emotional valence associations of I (vs. rounded vowels) were similar in German and Japanese, suggesting that sound symbolism for emotional valence is not language specific.


Introduction
A cheesy way of inviting people to smile for a picture consists in asking them to 'say cheese'. The reason is that articulating the vowel in cheese, /i/, leads to a facial expression that resembles smiling (especially when the /i/ is lengthened). This resemblance is consistent with associations between vowels and valence. Participants associate pseudo-words containing /i/ rather than /o/ with positive meaning (Rummer et al., 2014). In the present research, we examine the cross-linguistic generalizability of this phenomenon by comparing associations between vowels and emotional valence for participants speaking unrelated languages: German and Japanese.

Sound symbolism
In words like slurp or bang, the phoneme sequences imitate the denoted meaning. Although such imitative words are comparatively rare in Indo-European spoken languages, they are more prevalent in other language families (Vigliocco et al., 2014). Many spoken languages contain ideophones, a word class depicting sensory qualities (Dingemanse, 2012). In Japanese ideophones, for example, consonant voicing depicts mass, so that koro denotes a small object rolling and goro denotes a large object rolling. The more general property of language that sublexical features of word forms (e.g., phonemes) are associated with word meaning is called sound symbolism or iconicity (for reviews, see Dingemanse et al., 2015; Lockwood & Dingemanse, 2015; Nuckolls, 1999; Perniss et al., 2010). In addition to its prevalence in lexicons, sound symbolism has also been demonstrated experimentally. Vowels, for example, have been found to be associated with size (Sapir, 1929). To denote large (vs. small) objects, participants tend to choose pseudo-words that contain /a/, such as MAL, over pseudo-words that contain /i/, such as MIL (Newman, 1933; Thompson & Estes, 2011). In addition to size, sound symbolism has been demonstrated for various other dimensions, for example, shape (Ćwiek et al., 2022; Köhler, 1929), color (Cuskley et al., 2019; Simner et al., 2005), speed (Kuehnl & Mantau, 2013; Monaghan & Fletcher, 2019), taste (Pathak et al., 2020), personality (Sidhu et al., 2019), and complexity (Lewis & Frank, 2016).
Valence sound symbolism can be explained by an articulatory mechanism relating to facial muscle tension (Körner & Rummer, 2022a). Facial muscle tension for articulation and for emotional expressions overlaps, so that the zygomaticus major muscle is active both when articulating /i/ and when smiling (Hardcastle, 1976; see also Rummer et al., 2014; Whissell, 2003). The association between zygomaticus activity and positive valence could have extended, via proprioceptive feedback during articulation, to the vowel /i/, so that the articulation of /i/ is associated with positive valence. In contrast, the articulation of rounded vowels entails contracting muscles that are antagonistic to the ones responsible for lip spreading (Leanderson et al., 1971). Lip rounding could therefore be associated with negative valence or less positive valence (Rummer et al., 2014). Empirically, articulatory similarity (specifically facial muscle tension) rather than acoustic similarity predicts vowel-valence associations (Körner & Rummer, 2022a). Thus, valence sound symbolism, at least for /i/ versus rounded vowels, seems driven by articulatory vowel properties.
If valence sound symbolism is caused by this articulatory mechanism, it should occur for all languages that use vowels whose articulation resembles smiling and ones whose articulation inhibits smiling. As yet, however, valence sound symbolism has been mainly examined for Indo-European languages (English, European Portuguese, German, and Russian), with Mandarin Pinyin (Yu et al., 2021) as the only exception (other studies lack critical comparisons [e.g., Miron, 1961], employed some vowels that did not occur in examined languages [Taylor & Taylor, 1962], or examined vowels in isolation, i.e., without word context [Ando et al., 2021]). The present research makes a first step toward testing the cross-linguistic generalizability of valence sound symbolism by comparing participants from two linguistically unrelated languages: German and Japanese.

Language comparisons
Psychological research in general (Henrich et al., 2010), and sound symbolism research in particular, is biased toward examining Western participants and languages. Among the studies that did compare sound symbolism for unrelated languages, both similarities and differences have been observed. The largest study, examining sound symbolism in the basic vocabulary of more than 4,000 languages, observed, for example, that a large portion of languages use nasal sounds in words for nose (Blasi et al., 2016; see also Johanson et al., 2020). Additionally, size sound symbolism (Huang et al., 1969; Shinohara & Kawahara, 2010; see also Blasi et al., 2016) and shape sound symbolism (Ćwiek et al., 2022) have been observed across many languages (for other cross-linguistic similarities, see, e.g., Dingemanse et al., 2013; Winter et al., 2022). However, differences between languages have also been observed, for example, concerning valence associations. Nasal consonants at word beginnings have been found to be associated with positive valence in speakers of some Germanic languages but with negative valence in speakers of Chinese (Louwerse & Qu, 2017; for other differences between languages, see, e.g., Taylor & Taylor, 1962; for a mixture of similarities and differences, see Athaide & Klink, 2012). In sum, there is evidence for cross-linguistic generalization for some sound symbolic associations, but also evidence for language-specific associations (see also Imai & Kita, 2014).
In the present experiment, we compare emotional valence sound symbolism across two unrelated languages, German (an Indo-European language) and Japanese (a Japonic language). Although several sound symbolism phenomena have been demonstrated in both languages, for example, associations with size (Shinohara & Kawahara, 2010), color (Asano & Yokosawa, 2011), and shape (Ćwiek et al., 2022;Kawahara et al., 2019), some studies observe different and especially more sound symbolic associations for speakers of Japanese compared with Indo-European languages (e.g., Iwasaki et al., 2007; see also Saji et al., 2019). Similarly, ideophones are underdeveloped in German as well as other languages from the Indo-European language families (see, e.g., Dingemanse & Majid, 2012), but very prevalent in Japanese, so that, according to Kakehi and colleagues (Kakehi et al., 1996, xi) in Japanese "the occurrence of iconic words […] is anything but marginal. Such forms are indispensable to daily communication." Thus, although some associations are similar, German and Japanese differ in their prevalence of sound symbolism. Judging from previous research, therefore, it is unclear whether or not to predict the same vowel-valence associations across the two languages.
Judging from theoretical considerations, however, we make the same predictions for /i/ and rounded vowels. As smiling is universally used to express joy (e.g., Scherer & Wallbott, 1994), the muscle tension to express positive affect and the muscle tension to articulate /i/ should overlap in all languages where /i/ occurs and involves activity of the zygomaticus major muscle. Accordingly, the proposed mechanism for valence sound symbolism (overlapping muscle activity for articulation and emotional expressions) predicts that /i/ is universally associated with more positive valence than vowels whose articulation is incongruent with smiling, specifically rounded vowels.
To test this hypothesis, we employed the experimental paradigm from Rummer and Schweppe (2019) in which participants are asked to invent pseudo-words to denote specific objects or people (for similar paradigms, see Berlin, 2006; Shinohara et al., 2016; Vinson et al., 2021; Whissell, 2000). This paradigm contains fewer constraints than typically employed paradigms where participants have to rate or match experimenter-selected pseudo-words. When using experimenter-selected pseudo-words, any aspect of the pseudo-word might influence judgments. For example, position of the target letter in pseudo-words has been found to influence judgments (e.g., Maschmann et al., 2020; Nielsen & Rendall, 2013). Moreover, when comparing speakers of different languages, such incidental pseudo-word features might influence speakers of different languages differently. Therefore, using a paradigm where linguistic stimuli are as unconstrained as possible, as is the case when participants invent pseudo-words, is likely to be least biased, which seems especially important for cross-linguistic studies.
In the present study, participants invented pseudo-names for faces that differed in emotional expression. Specifically, we compared vowel usage for faces with positively valenced emotional expressions with neutral expressions as well as two negatively valenced emotional expressions: anger and sadness. Comparing the two negative expressions enables us to explore whether, in addition to emotional valence, arousal (high for anger and low for sadness) also influences vowel usage.
As participants typed in the pseudo-words, we examined vowel usage on the grapheme (instead of phoneme) level. Both languages have a close grapheme-to-phoneme mapping, so that vowel graphemes correspond to one phoneme (in Japanese) or to one of a few similar phonemes (in German). We examined how frequently the vowels A, E, I, O, and U, which constitute all Japanese vowels, were used in invented names depending on both participant language and emotional expression of the depicted face. We predicted the vowel I to be associated with positive emotional valence and O and U with negative emotional valence, for both Japanese-speaking and German-speaking participants. We had no specific predictions for A and E but included these vowels for exploratory purposes.

Participants
Participants were recruited through social media or approached in person and invited to participate. Those who were recruited online received a link to the study; those who agreed to participate when asked in person participated on site (mostly in cafés) using their own or the experimenter's laptop. A total of 134 participants, 76 German-speaking and 58 Japanese-speaking, completed the study (8 additional participants, 4 from each country, started the study but terminated less than 10% into the study). Of these, participants who reported a native language other than the expected language (German in Germany and Japanese in Japan; N = 18) and participants who provided existing names or existing words instead of self-invented words in more than 50% of the trials (N = 17) were excluded from all analyses, resulting in a final sample size of 99 participants (49 Japanese-speaking, 50 German-speaking; 39 female, 58 male, 2 other gender; M age = 35, SD age = 12).
This yields a power of 1 − β = .80 (with α = .05) for finding an effect of d_z = 0.28 for within-participants effects of emotional expressions. For the exploratory question whether there is an interaction between participant origin and emotional expression, the study has a power of 1 − β = .80 (with α = .05) for finding an effect of ηp² = .02. Final sample size depended on logistical constraints and was determined before any data analysis was performed. We report all data exclusions, all manipulations, and all measures. Materials, data, and analysis code are available at https://osf.io/bdrsh/.
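The sensitivity value for the within-participants effect can be checked roughly with a normal approximation to the two-sided paired t-test. The sketch below is not the computation used for the study; the function name `min_detectable_dz` is hypothetical, and an exact noncentral-t calculation would give a marginally larger value.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_dz(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest paired-samples effect size d_z detectable with the given power.

    Assumption: normal approximation to the two-sided paired t-test,
    d_z = (z_{1 - alpha/2} + z_{power}) / sqrt(n).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) / sqrt(n)


# Final sample of 99 participants, alpha = .05, power = .80
print(f"{min_detectable_dz(n=99):.2f}")  # prints 0.28
```

Under this approximation, 99 participants give a minimal detectable effect of about d_z = 0.28, in line with the value reported above.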

Materials
Participants were asked to invent names for faces taken from the Karolinska Directed Emotional Faces (Lundqvist et al., 1998; for European faces), and from the Taiwanese Facial Expression Image Database (Chen & Yen, 2007; for East Asian faces). From each database, eight male and eight female persons were selected, and for each person, one picture of each of four facial expressions (happy vs. neutral vs. angry vs. sad) was selected. The pictures were cropped to show only the face (chin to hair line and ear to ear) and were converted to gray scale (where necessary, brightness was adjusted). Each participant saw one randomly selected picture per person.

Procedure
After providing informed consent, participants were asked to invent a name for each of 32 ensuing faces. The names should not exist in a language they knew and should be at least two syllables long. Faces were presented separately and in random order (two for each combination of gender, cultural background, and emotional expression). For each face, participants were to invent a name, then to articulate this name, and finally to type it; however, some participants did not consent to have their voice recorded and some participants preferred to have the experimenter type in their responses. As additionally the quality of many audio files was poor, a phonemic transcription of the spoken pseudo-words was infeasible, which is why we report only grapheme-based analyses. All phases of the experiment were self-paced.
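The trial structure described above (32 faces, two for each combination of gender, cultural background, and emotional expression) can be illustrated with a short sketch. How the randomization was actually implemented is not stated, so the balanced shuffle below is an assumption, and the labels `KDEF` and `TFEID` as well as the function `draw_trials` are used here only for illustration.

```python
import random
from collections import Counter
from itertools import product

DATABASES = ("KDEF", "TFEID")   # European vs. East Asian faces (labels assumed)
GENDERS = ("female", "male")
EXPRESSIONS = ("happy", "neutral", "sad", "angry")
PERSONS_PER_GROUP = 8           # eight persons per database and gender


def draw_trials(rng: random.Random) -> list[dict]:
    """Return 32 trials: one picture per person, two per design cell."""
    trials = []
    for database, gender in product(DATABASES, GENDERS):
        # Each expression is used twice among the eight persons of a group.
        expressions = list(EXPRESSIONS) * (PERSONS_PER_GROUP // len(EXPRESSIONS))
        rng.shuffle(expressions)
        for person, expression in enumerate(expressions, start=1):
            trials.append({"database": database, "gender": gender,
                           "person": person, "expression": expression})
    rng.shuffle(trials)  # faces are presented in random order
    return trials


trials = draw_trials(random.Random(1))
cells = Counter((t["database"], t["gender"], t["expression"]) for t in trials)
print(len(trials), set(cells.values()))  # prints 32 {2}
```

Each of the 16 design cells then contains exactly two faces, and every person appears with exactly one expression per participant.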
After inventing names for 32 faces, participants were asked, as a manipulation check, to rate the same faces (in new random order) for valence. Specifically, for each face, they answered the question What in your opinion describes the facial expression of this person? (translated) responding by clicking one of five numbers (1 = very positive; 2 = positive; 3 = neutral; 4 = negative; 5 = very negative). Finally, participants provided demographic information and could comment on the study.
As the experiment included nonindependence due to both repeated measures within participants and repeated measures of stimuli (for different participants), we report linear mixed-effects analyses using R (R Core Team, 2021; version 4.1.2) and the packages lme4 (version 1.1-30; Bates et al., 2015) and lmerTest (version 3.1-3; Kuznetsova et al., 2017). We used a maximal random effects structure (Barr et al., 2013). When this resulted in negative eigenvalues, random effects were removed until the issue was resolved. As significance tests, we report Type III Analysis of Variance with the Satterthwaite method for calculating degrees of freedom (for more information on this method, see Kuznetsova et al., 2017). As there is no generally accepted effect size measure for linear mixed-effects analyses, we report ηp² and d_z, calculated from participant-level data.
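For readers unfamiliar with d_z, the following sketch shows the standard definition for paired data: the mean of the per-participant differences divided by the standard deviation of those differences. The function name and the numbers are hypothetical and are not taken from the study's data or analysis scripts.

```python
from statistics import mean, stdev


def cohens_dz(condition_a: list[float], condition_b: list[float]) -> float:
    """Cohen's d_z: mean of paired differences divided by their sample SD."""
    if len(condition_a) != len(condition_b):
        raise ValueError("both conditions need one value per participant")
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / stdev(diffs)


# Hypothetical participant-level means (NOT data from the study):
# mean occurrences of I per invented name for happy vs. neutral faces.
happy = [0.6, 0.5, 0.7, 0.4, 0.6]
neutral = [0.4, 0.5, 0.5, 0.3, 0.5]
print(round(cohens_dz(happy, neutral), 2))
```

Because d_z is based on difference scores, it requires exactly one aggregated value per participant and condition, which is why the effect sizes are computed from participant-level data rather than from the trial-level mixed models.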

Results
For the manipulation check, valence evaluations were entered into a 4 (emotional expression: happy vs. neutral vs. sad vs. angry; within participants) × 2 (participant language; between participants) factorial linear mixed-model analysis. There was no main effect of participant native language on valence evaluations (F(1, 97) = 0.59, p = 0.445, ηp² = .006, 90% CI = [.000, .056]). However, confirming the validity of the manipulation, the emotional expression did influence valence judgments (F(3, 3,063) = 3,604.73, p < 0.001, ηp² = .932, 90% CI = [.921, .940]). Specifically, faces with the two negative emotional expressions did not significantly differ in valence, whereas all other pairwise comparisons show significant differences (see Table 1). In addition to this main effect of emotional expression, the interaction between participant language and emotional expression was also significant (F (3
For the main analysis, examining how participant language and emotional expression influenced which vowels were used when inventing pseudo-words, each pseudo-word was coded by a native speaker blind to condition. Real words in the target language or in English as well as words that were repeated more than twice were excluded from analyses (8.2% of the words). Hiragana and Katakana morae were transliterated using the Hepburn system (using the R package stringi, version 1.7.6; Gagolewski, 2022) and accents were removed. Repeated consecutive vowel graphemes were replaced by single graphemes (e.g., Obaata was changed to Obata), and the number of occurrences for each vowel grapheme per word was calculated (in the preceding example, the value for A is 2, for O it is 1, and for all other vowels, it is 0). The mean vowel occurrence per invented word was then entered into a 5 (grapheme: A vs. E vs. I vs. O vs. U; within participants) × 4 (emotional expression: happy vs. neutral vs. sad vs. angry; within participants) × 2 (participant language: Japanese vs. German; between participants) factorial linear mixed model. The three-way interaction was significant, indicating that participant language and emotional expression influenced the usage of different graphemes differently (F(12, 13,541) = 4.00, p < 0.001, ηp² = .030, 90% CI = [.008, .038]; for lower-order effects, see the Supplementary Material). Frequencies of grapheme occurrences depending on emotional expression and participant language were then analyzed separately for each vowel.
I was used more frequently in pseudo-words for people with happy facial expressions than for people with other facial expressions. Among the other three emotions, I occurrences did not differ significantly (see Table 2). Thus, replicating previous valence sound symbolism findings (e.g., Rummer & Schweppe, 2019), I was associated with positive emotional valence.
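The vowel-coding step described above (removing accents, collapsing repeated consecutive vowel graphemes, and counting occurrences per vowel) can be sketched as follows. This is an illustrative Python reimplementation under the stated rules, not the R code used for the study; the function names are hypothetical, and transliteration from Hiragana or Katakana is assumed to have happened beforehand.

```python
import re
import unicodedata
from collections import Counter

VOWELS = "AEIOU"


def strip_accents(word: str) -> str:
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def vowel_counts(word: str) -> dict[str, int]:
    """Count A, E, I, O, U after collapsing repeated consecutive vowels."""
    cleaned = strip_accents(word).upper()
    # Collapse runs of the same vowel grapheme: 'OBAATA' -> 'OBATA'.
    collapsed = re.sub(r"([AEIOU])\1+", r"\1", cleaned)
    counts = Counter(ch for ch in collapsed if ch in VOWELS)
    return {vowel: counts[vowel] for vowel in VOWELS}


print(vowel_counts("Obaata"))
# {'A': 2, 'E': 0, 'I': 0, 'O': 1, 'U': 0}
```

For the example from the text, Obaata is reduced to Obata and yields 2 for A, 1 for O, and 0 for all other vowels. Averaging these counts per participant and emotional expression gives the dependent variable entered into the mixed model.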
For the vowel E, there were no significant effects. Specifically, neither the main effect of language (F(1, 99) Table 3).
In contrast to the other vowels, for A, there was a significant interaction of language and emotional expression (F(3, 2798) = 4.04, p = 0.007, ηp² = .045, 90% CI = [.009, .084]; see Fig. 6). Whereas, for German-speaking participants, there was no significant influence of emotional expression on A occurrences (F(3, 73) = 1.00, p = 0.400, ηp² = .021, 90% CI = [.000, .057]), for Japanese-speaking participants, the influence of emotional expression was significant (F(3, 1,417) = 14.70, p < 0.001, ηp² = .209, 90% CI = [.110, .296]). Specifically, for Japanese-speaking participants, A was more frequently used for people with happy emotional expressions compared with all other examined expressions. Additionally, A was used more frequently for people with neutral expressions than for people with negative expressions. For the two negative expressions, the frequency of A did not differ (see Table 3). In sum, in contrast to German-speaking participants, for Japanese-speaking participants, A was associated more with positive and less with negative emotional valence compared with neutral valence.

Discussion
The aim of the present work was to gain a deeper understanding of valence sound symbolism by comparing participants speaking two unrelated languages. Using the articulation-based explanation of valence sound symbolism (Körner & Rummer, 2022a; built on Rummer et al., 2014; Rummer & Schweppe, 2019), we predicted that I, because its muscle tension overlaps with smiling, would be associated with positive emotional valence, whereas vowels that involve antagonistic muscle tension (the rounded vowels O and U) would be associated with less positive emotional valence. When inventing names for people with different facial expressions, Japanese-speaking and German-speaking participants preferentially used I in names for people with happy facial expressions compared with both neutral and negative (angry and sad) facial expressions. Conversely, O and U were used less for people with happy facial expressions compared with neutral and negative expressions. None of these results were moderated by participant language, indicating that valence sound symbolism generalizes across the two employed languages: German and Japanese.
Another extension compared to previous research, which mostly examined two (e.g., Rummer & Schweppe, 2019; Yu et al., 2021) or three vowels (Körner & Rummer, 2022a), was that the present research examined occurrences of all five Japanese vowels. Exploratory analyses indicated that E is not strongly associated with emotional valence in either language as the usage of E did not differ across emotional expressions. However, A was associated with positive emotional valence for Japanese-speaking participants but not for German-speaking participants. Thus, except for A, the present results indicate that emotional valence associations in these two languages are similar.
The association of A with positive emotional valence for Japanese speakers (although not German speakers) might seem surprising because, in previous research, /i/ has been contrasted with another a-type vowel, /ʌ/, and the latter seemed to be a negatively associated vowel (Yu et al., 2021; for a similar result using syllables and /a/ instead of /ʌ/, see Tarte, 1982). However, the previously employed paradigm did not test whether /ʌ/ is associated with negative valence more strongly than with positive valence. In Yu et al. (2021), the task consisted in indicating whether a word containing /i/ compared with /ʌ/ was more positive. Therefore, it is possible that both vowels are associated with positive rather than negative valence, only /i/ more strongly than /ʌ/. Testing this reasoning in the present data, we find an interaction between vowel (I vs. A) and emotional expression (see the Supplementary Material), indicating that, for Japanese-speaking participants, I is more strongly associated with positive (compared with other) emotions than A. Thus, although A is more strongly associated with positive than neutral or negative emotional valence, this valence association is less strong than for I. In sum, both findings can be reconciled; /a/ and /ʌ/ could be sound symbolically less positive than /i/, but /a/ need not be associated with negative valence but instead could be neutral or somewhat positive in its valence association.
In general, the present results seem driven by positive (rather than negative) emotional valence. That is, vowel usage for faces with positive expressions was different from the rest, whereas there were neither significant differences between neutral and negative faces nor between the two types of negative faces. Accordingly, in the present research, vowel usage was not influenced by arousal as it did not differ for anger (a high arousal emotion) and sadness (a low arousal emotion). The finding that positive instead of negative emotional valence drives valence sound symbolism is similar to the results reported in Rummer and Schweppe (2019), where the simple comparisons also generally resulted in significant differences between positive and both neutral and negative valence, but no significant differences between the latter. Thus, rounded vowels were not specifically associated with negative valence but rather less strongly with positive valence. Although early research on valence sound symbolism postulated an association with negative valence rather than a less strong association with positive valence for rounded vowels (Rummer et al., 2014), the described mechanism is more consistent with the present findings. This mechanism rests on a facilitation (vs. inhibition) of smiling, specifically activation (vs. inhibition) of the zygomaticus major muscle. Rounded vowels, by involving the contraction of zygomaticus antagonists, are associated with less positive valence than other vowels but not with negative valence. In other words, valence sound symbolism seems driven by the contraction (vs. inhibition) of smiling muscles, so that positive valence drives the observed valence sound symbolism effect for I compared to rounded vowels.
The major caveat of the present research is that we examined vowels only on the (Latinized) grapheme level. Both examined languages have the five vowels /a/, /e/, /i/, /o/, and /u/. The Japanese vowel system comprises only these five vowels. Although the German vowel system is larger, the German vowels /a/, /e/, /i/, /o/, and /u/ are similar to the respective Japanese vowels. The only exception is /u/, which involves slightly different articulation; in German, /u/ is a close-back rounded vowel ([uː]), whereas in Japanese, /u/ is also close-back but unrounded ([ɯ̟]) or compressed ([ɯ̟ᵝ]). Accordingly, when taking only the coarse five vowel grapheme distinction into account, Japanese and German vowels can be compared. Still, for a more complete examination of vowel-valence associations, future research should examine vowels on the phoneme instead of the grapheme level. This would be useful for German and other languages that contain more vowel phonemes than graphemes, and it is imperative for languages, such as English, where there is no close grapheme-to-phoneme mapping.
The present manipulation uses pictures of emotional facial expressions. Positive facial expressions entail smiling so that facial mimicry might have led to participants' smiling when looking at positive expressions, which might in turn have facilitated I usage in pseudo-names for positive expressions. Although we cannot rule out this possibility in the present study, previous research has observed valence sound symbolism for /i/ compared to rounded vowels when mimicry was inhibited, for example, when participants invented names while holding a pen between their lips (which blocks contraction of the zygomaticus major; Rummer & Schweppe, 2019, Exp. 2); and when mimicry was impossible because no faces were presented, for example, when participants invented words for valenced objects (e.g., coffin vs. dolphin; Rummer & Schweppe, 2019, Exps. 3 and 4), or when participants judged the competence of a person known only by user name (Garrido & Godinho, 2021). Thus, although in the present study it might have increased the effect size, mimicry is not necessary for valence sound symbolism.
Although sound symbolism is a vibrant research area, the psychological mechanisms that drive sound symbolism are frequently unclear (Sidhu & Pexman, 2018). Probably the broadest distinction of mechanisms is between associations whose origin is incidental co-occurrences (also called conventional sound symbolism; Hinton et al., 1994) and associations that are driven by psychologically meaningful processes (synesthetic sound symbolism; Hinton et al., 1994). Incidental associations could stem from accidental clustering of specific sublexical features for related meanings. Statistical learning could then lead to associations between these word form features and the depicted meaning, which might, in turn, lead to an increasing number of words being coined and persisting that fit this association. Incidental clustering might be similar in related languages but should be less similar in unrelated languages.
In contrast, psychologically meaningful sound symbolism phenomena rest on general psychological processes that can result from ecological or embodied experiences (Körner et al., 2022). For example, pitch height overlaps between sounds emitted by small objects and high vowels (Ohala, 1984). Accordingly, size sound symbolism might originate from ecological co-occurrences between (small vs. large) object size and (high vs. low) auditory pitch elicited by objects or animals. Whenever these experiences are universal (independent of, say, geographic and cultural aspects), they should have a similar probability of leading to sound symbolic associations across unrelated language families. Conversely, finding that a sound symbolism phenomenon occurs in unrelated language families can be seen as an indication for a psychologically meaningful association. Thus, although statements about wider prevalence of valence sound symbolism require evidence from a much larger number of unrelated languages, the present result lends initial support for the argument that valence sound symbolism could reflect a psychologically meaningful association.
Data availability statement. All data, analysis scripts, and materials can be found at https://osf.io/bdrsh/.