Disentangling word stress and phrasal prosody: A view from Georgian

This paper investigates the interaction of word stress and phrasal prosody in Georgian by studying the distribution of acoustic cues (duration, intensity, F0) in controlled data. The results show that initial syllables in Georgian words are marked by greater duration than all subsequent syllables, regardless of syllable count and phrasal context. After excluding domain-initial strengthening as an alternative explanation, this finding provides evidence in favor of fixed initial stress. Likewise, initial syllables are marked by greatest intensity, but the consistent gradual drop in intensity throughout the word suggests that this effect may not be stress-related. The F0 results align with the existing accounts: individual lexical words form ACCENTUAL PHRASES marked by a low pitch accent on the initial syllable and a high final boundary tone on the final syllable. Additionally, new evidence for a phrasal accent, aligned with the penult, is presented. F0 targets are shown to be completely absent in the context of post-focal deaccenting, which shows that F0-marking in Georgian is reserved for phrasal prosody and is not intrinsic to stress-marking. These results help account for the facts related to word stress, phrasal intonation, and their interplay in Georgian, the object of debate in the literature.


Introduction
When it comes to the issue of stress in an understudied language, many questions emerge: does it have word stress? How is its placement determined? What acoustic parameter (syllable/vowel duration, F0, intensity) does the realization of word stress chiefly rely upon? Answering these may not be straightforward. To provide just a brief illustration, word stress in Hunzib, a small Tsezic language of Dagestan, has been described as free and cued by F0 (Bokarev 1967: 474), fixed on the initial syllable and not relying on F0 (Gamzatov 1975: 18), penultimate and often accompanied by high F0, but not for all speakers (van den Berg 1995: 28), and initial but with numerous exceptions, driven by morphological factors (Isakov & Khalilov 2012: 78). Finally, it has been suggested that Hunzib does not have word-level stress, and, instead, initial syllables carry phrasal accents (Kibrik & Kodzasov 1990: 332).
As the example of Hunzib demonstrates, another important question concerns the interaction of word-level stress and the expression of phrasal intonation. This issue is a notoriously difficult one and has not been settled even in some better-studied languages. One reason for this is that in many languages individual words are isomorphic to the smallest prosodic phrases, which does not allow for a clean separation between the two. This issue has been shown to arise in languages as diverse as Korean (Jun 1993; 1998) and Chickasaw (Gordon 2003; 2004; 2005), among many others. If lexical words and prosodic phrases correspond, the methodological challenge is to determine what constitutes a word-level or a phrase-level property. It is often assumed that, in languages without lexical tone, changes in F0 reflect phrasal prosody, while other parameters, such as duration and intensity, have a closer connection with word-level stress. This is not necessarily the case, however, since both intensity and duration can interact with F0. Specifically, intensity values are dependent on F0 values, and duration can increase, e.g., to accommodate several tonal targets in cases of tonal crowding. Furthermore, greater duration can also result from domain-initial strengthening, which is independent of word-level stress or intonation (Fletcher 2010).
To take a closer look at another example, consider Korean. There is no agreement as to whether word stress exists in Korean, and, if it does, where its location is; initial or second (Lee 1973), second (Huh 1985), and final syllables (Polivanov 1936, as cited in Lee 1990) have been argued to regularly carry stress. More fundamentally, though, it is unclear if the phenomenon that has been labelled 'word stress' is actually word-level stress or phrase-level prominence. Contrary to the previous accounts, Jun (1993) shows that stress placement in a word depends on its position in an ACCENTUAL PHRASE (AP), which suggests that the phenomenon at hand is phrasal in nature. Based on instrumental evidence, Jun (1995) further shows that whether the initial or second syllable is perceived as prominent is determined by a combination of factors, such as syllable count, syllable weight, and the position of the phrase in a larger utterance; she also shows that the main acoustic cue that stress/prominence relies on is F0. Taken together, these factors suggest that what is described as 'word stress' in Korean, in fact, fits the profile of a phrasal intonational F0 target.
The language that this paper focuses on, Georgian (Kartvelian), also poses interesting challenges with respect to the questions raised above. Like Korean, Georgian has been variably analyzed as having either word-level stress, phrase-level stress, or both. There is no unanimity about the acoustic cues that mark stress in Georgian. Initial (Tschenkeli 1958; Tevdoradze 1978, a.o.), antepenultimate (Akhvlediani 1949; Gudava 1969, a.o.), and penultimate (Zhghenti 1958) syllables have been described as carrying stress by different authors, with more than one stress locus argued to be possible in longer words. Native speakers of Georgian have no consistent intuitions about stress placement, other than that stress never targets the ultima. There are no minimal pairs based on stress and no regular variation in stress placement in declensional or conjugational paradigms. Authors who advocate for the existence of word stress in Georgian acknowledge its acoustic weakness and often remark on the uncertainty of their observations (Robins & Waterson 1952; Zhghenti 1959; Tevdoradze 1978). Finally, lexical words in Georgian regularly correspond to APs, which means that the challenges that arise from the isomorphism of individual words and prosodic phrases affect Georgian as well.
Based on experimental evidence, this paper argues for stress in Georgian being fixed on the initial syllable and cued primarily by syllable duration, and, possibly, intensity (the latter also interacts with phrasal prosody). With respect to F0, the findings confirm the facts about Georgian that have been established in the literature: APs in Georgian carry phrasal intonational F0 targets on initial and final syllables. In neutral broad-focus declaratives, initial syllables host low pitch accents (L*), which may be manifested as dips in F0 or low plateaus, and right edges of APs carry high final boundary tones. Additionally, there is evidence for penultimate syllables hosting low phrase accents (L), which constitute the lowest F0 point per AP. The latter result aligns with a previously established fact of Georgian intonational phonology: penultimate syllables of predicates in narrow focus contexts and questions carry a low F0 target, identified as a low phrase accent (Bush 1999; Vicenik & Jun 2014; Borise 2017). The penultimate F0 targets (the one described in the current paper and the low phrase accent described in the literature) likely constitute two subtypes of the same phenomenon.
This paper also addresses the challenge of isomorphism between lexical words and prosodic phrases and shows that there is a limited number of contexts in Georgian in which lexical words do not correspond to APs one-to-one. One such context is the parts of utterances that follow narrowly focused constituents. While post-focal deaccenting is optional in Georgian (Vicenik & Jun 2014), it is possible to find examples where it applies systematically and post-focal constituents are stripped of all tonal targets. Another context is the grouping of APs into INTERMEDIATE PHRASES (ips). This is also an optional process that may apply to semantically/syntactically connected adjacent words, e.g., a noun and a modifying adjective. When combined into an ip, individual APs retain their pitch accents, but phonologically, an extra layer of prosodic phrasing emerges between the levels of APs and full INTONATIONAL PHRASES (IPs). The results presented in this paper show that initial syllables still carry word-level stress, cued by duration, even if the word is (i) found in the domain of post-focal deaccenting or (ii) embedded in an ip. At the same time, the lack of tonal targets in the context of post-focal deaccenting convincingly demonstrates that F0 is only used in phrasal prosody in Georgian.
This paper is structured as follows. Section 2 discusses the existing research on word stress in Georgian, introspection-based (2.1) and experimental (2.2), presents the key properties of Georgian intonational phonology (2.3), and sums up the known facts (2.4). Section 3 provides the information about the experiment, the research questions (3.1), and the design and stimuli (3.2). Section 4 reports the results for syllable duration (4.1), intensity (4.2), and F0 patterns (4.3). Section 5 brings in additional data from a supplementary study (5.1), divided into datasets 1a and 1b (5.2-5.3) and datasets 2a and 2b (5.4-5.5). Section 6 offers a discussion of the results, and Section 7 concludes.

Georgian prosody: Previous work
The prosodic properties of Georgian have received considerable attention in the literature, with the existing descriptions based both on introspection and instrumental analysis. Nevertheless, there is no agreement as to the existence of word stress in Georgian or the rules governing its distribution. Initial, antepenultimate, and/or penultimate syllables are most often postulated as possible stress loci, with potentially more than one of these carrying stress in longer words. The question about the size/type of the prosodic domain in which the 'stress' in question is assigned (i.e., whether it is a lexical/prosodic word or a larger constituent, such as a prosodic phrase) has not been settled either.

Introspection-based accounts
According to Tschenkeli (1958: LX), stress in Georgian is found on the initial syllable in di- and trisyllabic words, and is harder to locate in longer words, though there, too, it is often initial. Tevdoradze (1978: 40) also argues for fixed initial stress, but notes that secondary stress may occur in longer words: on the penult in tetrasyllables, antepenult in pentasyllables, and antepenult or pre-antepenult in hexasyllables. Antepenultimate stress placement is advocated by Ioseliani (1840: 145), Gorgadze (1912: 3), Akhvlediani (1949: 135), and Gudava (1969: 106). Gorgadze notes that in longer words/phrases, the initial syllable receives secondary stress; if the antepenult consists of a vowel only, stress targets the pre-antepenult: sá.i.dum.lo 'mystery', mí.i.rbi.na '(he) came running'. Zhghenti (1958: 262) describes the Khevsuri and Mokheuri dialects of Georgian as regularly assigning stress to penults. Departing from most other works, Robins & Waterson (1952: 58) propose that stress assignment in Georgian follows a rhythmic pattern (while avoiding final syllables), with alternating non-adjacent syllables carrying stress (primary or secondary): disyllables have initial stress, trisyllables can be stressed on the first or second syllables, tetrasyllables either carry stress on the second syllable or on the first and third syllables, and pentasyllables either stress the first and third or the second and fourth syllables.
In numerous accounts, Georgian stress placement is described as dependent on syllable count: initial in disyllables and antepenultimate or penultimate in longer words (Marr 1925: 13; Rudenko 1940: 24; Vogt 1971: 15), or initial in di- and trisyllables and antepenultimate in longer words (Dirr 1904: 3; Janashvili 1906: 5; Akhvlediani 1949: 132); Dirr (1904: 3) also notes that these rules apply regardless of the morphological structure of a word. In words over four syllables long, a secondary stress on the initial syllable is possible; both are obligatory in words over six syllables long. A similar approach is adopted by Skopeteas & Féry (2016), who take disyllables to carry stress on the first syllable, and words of four syllables or longer to carry primary stress on the antepenult and secondary stress on the initial syllable. According to Aronson's grammar (1990: 18), in words up to four syllables long, stress falls on the antepenult or the initial syllable, while in longer words both are stressed. Finally, according to Hewitt (1995: 28), in trisyllables, the initial syllable carries stress; in longer words, stress is either antepenultimate or initial.
In contrast, some maintain that Georgian does not have word-level stress, and its prosodic organization only includes tonal targets that are assigned within prosodic phrases. This view goes back to Gorgadze's (1912: 13) notion of 'syntactic groups', Marr's (1925: 14) 'accentual complexes', and Zhghenti's (1953: 162; 1963: 144) 'rhythmic groups' as domains of 'stress' assignment in Georgian. Some evidence supporting this view comes from traditional Georgian poetry, which is based on syllable count and not alternation of stressed and unstressed syllables (Gachechiladze 1968). Finally, so-called mixed approaches advocate for there being both word stress and phrasal F0 targets in Georgian. This view is maintained by Chikobava (1942: 302) and Tschenkeli (1958: LXI), who point out that word stress in contemporary Georgian is considerably weaker than phrasal prosodic targets.

Experimental investigations
There have been several experimental studies of Georgian prosody, both word- and phrase-level. The conclusions that have been reached vary, similarly to how introspection-based reports do. In one of the earliest studies, Selmer (1935) reports on an instrumental investigation of word stress in Georgian, based on recordings of one speaker pronouncing 27 Georgian words (twenty disyllables, six trisyllables, and one tetrasyllable), some iterated twice, with the total stimuli count being 36. The experimental items were uttered in isolation. Measurements of F0 curves and vowel duration are reported. Selmer (1935) notes that the initial syllable invariably carries an F0 peak, with the average rise being 2.64 st (semitones). In disyllables, the two vowels are almost equal in duration, while in trisyllables the second vowel is the shortest, with the two others being comparable in duration. With respect to vowel quality, Selmer notes that, in the initial syllable, vowel quality does not have a significant effect on vowel duration, which suggests that the durational effect on the initial syllable cannot be explained, e.g., by the intrinsically greater duration of low vowels. Overall, Selmer cautiously interprets his results as consistent with Vogt's initial assessment, later published as Vogt (1936; 1971), according to which di- and trisyllables are stressed on the initial syllable.
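Selmer's average rise is stated in semitones, a logarithmic measure of the ratio between two F0 values. As a minimal sketch, assuming the standard definition of 12 semitones per doubling of frequency, the conversion between Hz and semitones can be written as follows. The function names and the 120 Hz starting value are illustrative assumptions and are not taken from any of the studies cited here.

```python
import math


def semitone_interval(f_start_hz: float, f_end_hz: float) -> float:
    """Return the interval from f_start_hz to f_end_hz in semitones.

    Standard definition: 12 semitones per doubling of frequency.
    Positive values are rises, negative values are falls.
    """
    if f_start_hz <= 0 or f_end_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f_end_hz / f_start_hz)


def shift_by_semitones(f_hz: float, semitones: float) -> float:
    """Return the frequency reached by moving `semitones` away from f_hz."""
    return f_hz * 2.0 ** (semitones / 12.0)


# A rise of 2.64 st from a hypothetical starting value of 120 Hz
# (the 120 Hz figure is an assumption made for illustration only).
peak_hz = shift_by_semitones(120.0, 2.64)
print(round(peak_hz, 1))
```

On this definition, a rise of 2.64 st corresponds to an increase of roughly 16% in F0, whatever the starting value.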
According to the results of Zhghenti's (1953; 1959) production experiment, all syllables in Georgian words other than the final two are high in prominence, which Zhghenti (1953; 1959) takes to be expressed in F0 values and intensity, while the final two syllables (or the final one in disyllables) are less so. Zhghenti's results are based on the analysis of F0 traces of individual words two to six syllables long, uttered in isolation. The total number of stimuli or speakers is not reported, but a number of F0 traces are discussed: disyllabic (n=6), trisyllabic (n=6), tetrasyllabic (n=7), pentasyllabic (n=4), and hexasyllabic (n=2). Zhghenti interprets his results as indicative of initial stress in di- and trisyllables; he takes both the initial and the second syllable to be stressed in tetrasyllables, and refrains from interpreting the results for penta- and hexasyllables. Zhghenti's (1953; 1959) results are summarized in Table 1.

Table 1 (Zhghenti 1953; 1959)
σ count | Interpretation of measurements | Stressed σ
2σ | High F0 on the 1st σ | 1st
3σ | High F0 on the 1st σ, high intensity on the 1st and 2nd σ | 1st
4σ | High F0 on the 1st & 2nd σ, high intensity on the 1st, 2nd, and […] | 1st & 2nd
5σ | High F0 and intensity on the 1st, 2nd, and 3rd σ | ?
6σ | High F0 and intensity on the 1st, 2nd, 3rd, and 4th σ | ?
Other experimental studies, such as Alkhazishvili (1959), argue that 'stress' placement in Georgian interacts with information structure. This suggests that the prosodic phenomenon in question is phrasal in nature, since word-stress placement typically does not depend on information-structural factors. Alkhazishvili (1959) argues that discussing 'stress' in Georgian is only possible with reference to particular information-structural contexts, and proposes that three such contexts determine 'stress' placement:
• Type I: broad-focus utterances, with neutral word order (e.g., with transitive predicates, SOV) (1);
• Type II: utterances with narrow focus on the preverbal constituent (2).
These utterance types, in Alkhazishvili's analysis, vary with respect to the distribution of "subject" and "predicate" prosodic phrases within them, which correspond to the more conventional notions of TOPIC and COMMENT, respectively. A comment includes the verb and the immediately preverbal focused constituent (if present), while a topic includes all other material in a clause. Stress placement within a phrase is determined by its type.

(1) [Topic Giorgi-m] [Comment pex-i ar ga-a-ndzr-i-a.]
    Giorgi-ERG foot-NOM NEG PV-VER-move-SM-AOR.3SG
    'Giorgi didn't move'

(2) [Comment Omaraʃvil-ma da-i-xsn-a] [Topic gatʃ'irvebi-dan samartal-i.]
    Omarashvili-ERG PV-VER-save-AOR.3SG hardship-from court-NOM
    'It was Omarashvili that led the court out of the difficult situation.'

(3) [Comment Ga-vid-nen k'idev dɣeni da tveni.]
    PV-go-IPFV.3PL more day.PL and month.PL
    'More days and months went by.'
    (Alkhazishvili 1959)

The results are based on the analysis of 21 recorded utterances (Type I = 12, Type II = 6, Type III = 3), pronounced by a male and a female native speaker. Two phoneticians, one of whom was a native speaker of Georgian, acted as analysts. In topic phrases, which have an overall rising intonational pattern, initial stress was identified by the analysts, but it is not specified what acoustic cue this conclusion was based on. No acoustic evidence for antepenultimate stress was identified in topic phrases. In comment phrases, the picture is more complex. Here, in most cases, the analysts also identified initial stress, but about 20% of comment phrases were identified as having antepenultimate stress. Acoustic evidence for this judgement is scarce, and the small sample size did not allow Alkhazishvili to reach a conclusion about its nature. The differences in stress perception between native and non-native speaker analysts are not reported.
Finally, Jun et al. (2007) and Vicenik & Jun (2014: 156) report on a preliminary production study, which involved four speakers of Georgian and test words embedded in carrier phrases (disyllabic = 3, trisyllabic = 4, pentasyllabic = 1), each uttered twice. They found that the initial syllable in Georgian is characterized by higher intensity and longer duration. They also report a high-low tonal contour that spans the antepenult and penult, which they take to be a manifestation of phrase accent. Based on these results, they suggest that word stress in Georgian is fixed on the initial syllable, while the antepenult and penult are loci of phrasal intonational F0 targets. Borise & Zientarski (2018) arrive at the same conclusion (initial word stress and phrasal F0 targets anchored to the right edge of prosodic domains) based on a larger dataset (one speaker, 179 words of 1-6 syllables embedded in carrier phrases).
Each prosodic word in Georgian (defined as a lexical word, which may be accompanied by clitics, like postpositions or discourse particles) forms an AP. This is based on the fact that prosodic words in Georgian carry final boundary tones, which means that they also form minimal prosodic phrases, such as APs. As part of the unmarked intonational pattern of all-new, broad-focus declarative utterances, each AP, except for the right-most one, carries a rising F0 contour. Vicenik & Jun (2014) analyze it as a low pitch accent L* on the initial syllable of the AP followed by a high final boundary tone on the final syllable, Ha (where 'a' indicates that the boundary is part of the AP). Typically, downstep applies to each successive Ha. This pattern is illustrated in Figure 1, with glosses provided in (4). Importantly for our purposes, the final high boundary tone in Figure 1 is phrasal, and not associated with word stress; the (steepest) rise in F0 and the F0 peak are typically contained within the final syllable. The full inventory of pitch accents and boundary tones can be found in Vicenik & Jun (2014).
A prosodic constituent larger than an AP is an intermediate phrase. Ip-formation in Georgian is optional and may apply to two or three syntactically/semantically related adjacent APs, such as a noun and an adjective/demonstrative. Each of the APs within an ip retains its own pitch accent, but the presence of an ip is signaled via a change in the F0 pattern: non-ip-final APs carry an H* La tonal pattern instead of L* Ha. The ip-final AP carries an L* pitch accent and an H- final boundary tone, which overrides that of the AP. Ip-formation is illustrated in (5a) and Figure 2, where the ip corresponds to the nominal phrase lamaz-ma kalbat'on-ma 'beautiful-ERG lady-ERG' (boxed). Lack of ip-formation is illustrated in (5b) and Figure 3, where lamaz-ma 'beautiful-ERG' and kalbat'on-ma 'lady-ERG' form two independent APs; note that the OV vs. VO order in (5a) vs. (5b) has no bearing on the intonational contour of the subject. While the presence of an ip does not break the isomorphism between lexical words and prosodic phrases (APs), being embedded in another layer of prosodic phrasing may affect the phrasal prosodic make-up of the APs contained in an ip (e.g., because an AP is adjacent to a higher-level prosodic boundary). Ip-containing contexts will be investigated in Datasets 1a-b of the supplementary study. Finally, the largest prosodic constituent, an intonational phrase (IP), corresponds to a clause and typically carries a final low boundary tone L%, as shown in Figures 1-5 (other IP boundary tones, like H% and HL%, are attested in other contexts; for a full tonal inventory proposed for Georgian, see Vicenik & Jun (2014)).
The prosody of focus-marking in Georgian is not going to be discussed in detail here (see Skopeteas, Féry & Asatiani 2009; Skopeteas & Féry 2010; 2014; Borise 2019, a.o.), but one of the focus-related phenomena that has a bearing on breaking the isomorphism between lexical words and prosodic phrases is post-focal deaccenting/dephrasing. In many languages, the part of the utterance that follows a narrowly focused constituent exhibits little or no evidence for the presence of phrasal tonal targets (Ladd 1980; 1996; Jun 1993; Rahmani, Rietveld & Gussenhoven 2018, a.o.), or, if the tonal targets are present, the tonal range is dramatically reduced (Harnsberger & Judge 1996). If analyzed as eliminating tonal targets, including phrasal boundaries, post-focal deaccenting provides a context where the isomorphism between words and prosodic phrases does not hold. As such, it is an environment that makes it possible to distinguish word- vs. phrase-level prosody. In Georgian, post-focal deaccenting is optional: the constituents that follow a narrowly focused one may either retain their tonal targets (often with a reduced tonal range), or be stripped of their tonal targets (Vicenik & Jun 2014: 179). The former is illustrated in (6a) and Figure 4, and the latter in (6b) and Figure 5; in Figure 4, creaky voice does not allow for pitch tracking on the final word tve-ʃi 'month-LOC'. The subscript 'F' in (6) indicates narrow focus. The L pitch target on the verb in Figure 4 is a phrase accent, discussed in detail in the remainder of this section. Also, note that the focused subjects form ips with the verbs that follow them. There is a strong preference for preverbal focus placement in Georgian, and focused constituents often form a prosodic constituent with the verb. In such cases, post-focal deaccenting applies to the material that follows the verb.
In addition to pitch accents and boundary tones, Vicenik & Jun (2014) argue for there being another F0 target in Georgian: a phrase accent. In the autosegmental-metrical (AM) literature, phrase accents have been variably analyzed as boundary tones for mid-level prosodic phrases (e.g., phonological phrases), or as F0 targets found between the rightmost pitch accent and a final boundary tone (Bruce 1977; Pierrehumbert 1980; Ladd 1983; Grice, Ladd & Arvaniti 2000). The distribution and properties of phrase accents in Georgian have been addressed in several studies, and it has been established that they anchor to penultimate syllables in certain contexts (Bush 1999; Müller 2005; Vicenik & Jun 2014; Borise 2017). Penultimate placement makes phrase accents reminiscent of antepenultimate/penultimate word stress, as postulated by some analyses.
Let us briefly review the properties of Georgian phrase accents. In a study of the prosody of yes/no questions in Georgian, Bush (1999) notes that penultimate syllables of verbs in yes/no questions are always marked by a low tone before a sharp rise on the final syllable. He takes the low F0 target on the penult to be (part of) a phrase accent, though he notes that the precise anchoring of the low tone to the penultimate syllable is atypical of a phrase accent. Müller (2005), in her study of the prosodic right periphery in yes/no questions in Georgian, also notes the low F0 target on the penult of the yes/no question as a whole and takes it to be a phrase accent. Note that, since the questions in Bush's study were verb-final, Müller's findings also replicate those of Bush. Similarly, Vicenik & Jun (2014) analyze the penultimate syllable in yes/no questions, wh-questions, and narrow focus contexts as carrying a phrase accent. They specify further that the low target on the penult is preceded by a high target on the antepenult, and analyze the phrase accent as H+L. Finally, the issue of the phrase accent is also taken up in Borise (2017), where two conclusions are reached. First, it is shown that the phrase accent is associated with the verb, as opposed to the right edge of the question. This is demonstrated with the help of questions in which the verb is clause-initial or -medial, but the low F0 target is still associated with the verb. Second, it is shown that the high F0 target on the antepenult does not always accompany the low tone on the penult, which confirms the intuition that the low tone constitutes the main tonal element of the phrase accent. In line with these conclusions, phrase accents are labelled L in this paper. The realization of the phrase accent is illustrated in (7) and Figure 6 (as well as in Figure 4 above).

(7) ʃe-tʃ'am-a Manana-m alubal-i?
    PV-eat-AOR.3SG Manana-ERG cherry-NOM
    'Did Manana eat the/a cherry?'

Most importantly for our purposes, the existence of a phrase accent in Georgian is another factor to consider when interpreting the instrumental results.

Summary
There is no unanimity on the nature and distribution of stress in Georgian, and stress interacts with phrasal prosody in a complex way. It seems uncontroversial that shorter words (di- and trisyllables) carry stress on the initial syllable, but the picture is less clear in longer words, where many authors note the presence of another stress locus on the antepenult or penult, with no agreement as to which of the loci carries main stress. Additionally, factors like information structure may systematically affect the F0 contours, which also influences the perceived location of word stress. Lexical words are commonly isomorphic to APs, which makes it difficult to distinguish word- and phrase-level prominence, but there are two contexts in Georgian where the isomorphism may not hold: (i) the domain of post-focal deaccenting and (ii) ip-formation. Several types of tonal targets are distinguished, including pitch accents, boundary tones, and phrase accents.

Research questions
Cross-linguistically, the expression of stress typically relies on duration, F0, and intensity-related measures (Beckman 1986; Sluijter & van Heuven 1996; Gordon & Roettger 2017). In the existing studies, duration is most often mentioned as an acoustic cue for stress, followed by F0, intensity, and formant and spectral qualities (Gordon & Roettger 2017). In turn, the expression of phrasal prosody commonly relies on F0 targets, aligned with stressed syllables or edges of prosodic domains, though it may also be cued by other effects: most commonly, final lengthening (Edwards, Beckman & Fletcher 1991), initial strengthening (Hock 1988), and/or glottalization (Dilley, Shattuck-Hufnagel & Ostendorf 1996). The questions that this paper aims to answer, therefore, are the following:
(i) Is there consistent evidence that certain acoustic cues (F0 targets, durational and intensity effects) are realized on the syllables that have been identified as possible stress loci in Georgian (initial, antepenultimate, penultimate)?
(ii) How is the distribution of these cues best interpreted phonologically (i.e., as word- or phrase-level)?
(iii) How does this evidence relate to the existing descriptions and earlier experimental results available for Georgian?
The experiment addresses (i) and allows for formulating preliminary responses to (ii). The supplementary study provides further evidence relevant for answering (ii). The data from both investigations are considered together to address (iii).

Stimuli and design
The experiment builds on previous instrumental work, especially Vicenik & Jun (2014) and Borise & Zientarski (2018). The stimuli consisted of Georgian words (n=172) two to five syllables long, of CV syllable structure, where onsets contain a single voiced consonant (a nasal, a liquid, or a voiced stop or fricative), and the vowel is any of the five vowels of Georgian /i, u, ɛ, ɔ, a/. CV syllable structure was selected for two reasons. First, it avoids complex onsets, which are very common in Georgian and often include both voiced and voiceless consonants, which would disrupt pitch tracking. Second, it unifies syllable weight across stimuli, since syllable weight has not been mentioned as relevant for stress assignment in the existing literature on Georgian. The test words included nouns, adjectives, and participles. Given the observation, made in the existing grammars, that morphological structure is not a relevant factor in stress placement in Georgian, both mono- and polymorphemic stimuli were used. A representative sample of the stimuli is provided in Table 2. A full list of stimuli is available in the supplementary materials. The stimuli were embedded in one of three carrier phrases. In them, the test word acted as the object in an SOV clause, and corresponded to an AP.
Seven native speakers of Georgian participated in the study: two males (M1, M2) and five females (F1-F5).
All participants were natives of Tbilisi, aged 22-35 (mean age 26.8). Speaker M2 was recorded in Tbilisi, Georgia, speaker F2 in College Park, Maryland, and the other five speakers in Cambridge, Massachusetts. Of the speakers recorded in the US, two (F1 and F2) have lived there for eight and seven years, respectively, but still use Georgian on a daily basis with their families. Speaker F5 arrived in the US a year prior to the recording time, and speakers M1, F3, and F4 arrived in the US less than a year prior to the recording time. The data from speakers F2 and M2 was collected using a lavalier microphone in a quiet classroom; the other five speakers were recorded in a sound-proof booth using a head-worn microphone. All data was recorded at a sampling rate of 44,100 Hz and 16 bits per sample; both microphones had a frequency response range of 20 Hz-20 kHz.
The same set of stimuli was used for all speakers. Each speaker was provided with a randomized list of all stimuli and, as a separate list, the three carrier phrases. They were instructed to pick each word from the list, one by one, insert it into one of the carrier phrases, also picked consecutively, and pronounce the resulting combination. The two lists were kept separate to avoid the effect of reading. Each stimulus was iterated three times in a row (i.e., each type contributed three tokens). These two factors (different verbs accompanying different experimental stimuli, and the fact that each stimulus was iterated three times) were intended to discourage a contrastive reading on the consecutive stimuli. The stimuli that were pronounced with list intonation (the first and/or the second iteration of the carrier phrase with the same stimulus) were discarded; rising F0 at the end of the carrier phrase was taken to be the main indicator of list intonation. Since no additional context was provided for the stimuli embedded in the carrier phrases, their information-structural status was taken to be that of neutral/broad focus declaratives.
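The ordering that results from this procedure can be sketched as follows. This is only an illustration under stated assumptions: in the experiment the speakers themselves combined words and carrier phrases, whereas the hypothetical function build_schedule below generates the same kind of sequence programmatically, and the word and carrier labels are placeholders rather than the actual stimuli.

```python
import random
from itertools import cycle


def build_schedule(words, carriers, repetitions=3, seed=None):
    """Return a list of (word, carrier, repetition) triples.

    Words are shuffled once; carriers are assigned consecutively,
    wrapping around when the carrier list is exhausted; each
    word-carrier pairing is repeated `repetitions` times in a row.
    """
    rng = random.Random(seed)
    order = list(words)
    rng.shuffle(order)
    schedule = []
    for word, carrier in zip(order, cycle(carriers)):
        for rep in range(1, repetitions + 1):
            schedule.append((word, carrier, rep))
    return schedule


# Placeholder labels only: the actual stimuli and carrier phrases
# are not reproduced here.
words = ["word_1", "word_2", "word_3", "word_4"]
carriers = ["carrier_A", "carrier_B", "carrier_C"]
for row in build_schedule(words, carriers, seed=1):
    print(row)
```

Because the carrier changes from one stimulus to the next, consecutive stimuli never share a verb unless the carrier list has wrapped around, which is the property the design relies on to discourage contrastive readings.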
14 After eliminating disfluent tokens (due to pauses, speech errors, throat clearing, etc.), the final dataset consisted of 1,146 word types, corresponding to 3,252 word tokens and 11,388 syllables. A breakdown of the dataset by speaker is provided in Table 3,15 and by syllable count in Table 4. The data was manually annotated in Praat (Boersma & Weenink 2021) by trained research assistants, who labelled words and syllables using the segmentation criteria in Machač & Skarnitzl (2009). Duration, maximum intensity, and mean F0 of each syllable, as well as F0 at four fixed points throughout a syllable (25%, 50%, 75%, right edge), were measured using a Praat script based on Elvira-García (2014); all F0 measurements were performed in Hz. Note that the intensity analysis excluded data from speakers F2 and M2 because they were recorded with a lavalier microphone with gain normalization; the total item counts used for the investigation of intensity are provided in Table 10 in Section 4.2. Statistical analysis of the data was carried out using the glmer function in the lme4 package (Bates et al. 2015) for R (R Core Team 2020).
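The per-syllable measurements described above (duration, mean F0, and F0 at 25%, 50%, 75% and the right edge) can be sketched as follows. This is not the Praat script used in the study: the F0 track is assumed to be available as two parallel lists, the fixed-point values are obtained by simple linear interpolation, and the function names and toy values are hypothetical.

```python
from bisect import bisect_left


def interpolate_f0(times, f0, t):
    """Linearly interpolate an F0 track (Hz) at time t (s); `times` must be sorted."""
    if t <= times[0]:
        return f0[0]
    if t >= times[-1]:
        return f0[-1]
    i = bisect_left(times, t)
    t0, t1 = times[i - 1], times[i]
    return f0[i - 1] + (f0[i] - f0[i - 1]) * (t - t0) / (t1 - t0)


def syllable_measures(times, f0, start, end):
    """Duration (ms), mean F0 and F0 at four fixed points of one syllable."""
    inside = [v for t, v in zip(times, f0) if start <= t <= end]
    points = {
        frac: interpolate_f0(times, f0, start + frac * (end - start))
        for frac in (0.25, 0.5, 0.75, 1.0)   # 1.0 = right edge of the syllable
    }
    return {
        "duration_ms": (end - start) * 1000,
        "mean_f0": sum(inside) / len(inside) if inside else None,
        "f0_points": points,
    }


# Toy F0 track (assumed values, not data from the study)
times = [0.00, 0.05, 0.10, 0.15, 0.20]
f0 = [200, 190, 180, 190, 200]
measures = syllable_measures(times, f0, 0.0, 0.2)
```

Maximum intensity per syllable could be obtained analogously by taking the maximum of an intensity track within the same interval.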

Results
Typical responses, for the sake of illustration, are provided in Figure 7 and Figure 8.

Duration
The duration results are visualized in Figure 9, and mean values per syllable are provided in Table 5. They show that the initial syllable has greater duration than all subsequent syllables in words of all syllable counts (two to five syllables).16
16 Syllable duration rather than vowel duration alone was measured here because the durational effect of stress may affect the consonant(s) in the stressed syllable, either in addition to or instead of the vowel: e.g., stressed syllables are marked by lengthened onsets in Estonian (Gordon 1995; Lehiste 1966) and codas in Welsh (Williams 1999). For more examples and discussion, see Gordon and Roettger (2017). The inherent durational differences between onset consonants of different types (stops, liquids, and nasals) are captured by the statistical model with the random intercept ITEM (which also captures any inherent differences between vowel types).
Before carrying out the statistical analysis, a mixed-effects model was fit to the data to test for the effect of polysyllabic shortening (Lehiste 1972). SYLLABLE DURATION was taken as the dependent variable, SYLLABLE NUMBER (1st, 2nd, etc.; categorical factor) and SYLLABLE COUNT (per word) as fixed effects, and SPEAKER and ITEM as random intercepts.17 A main effect of SYLLABLE COUNT was obtained (χ²(3)=88.7, p<0.0001). Therefore, to control for the effect of SYLLABLE COUNT, manifested as polysyllabic shortening, the model testing for durational differences, introduced below, was run separately for words of each syllable count, in order to have a group-specific intercept for each group.
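As a sanity check on the reported statistic, the upper-tail probability of a chi-square variable with 3 degrees of freedom has a closed form, so the p-value for χ²(3)=88.7 can be verified without a statistics library. The sketch below assumes that the reported value is an ordinary chi-square test statistic (e.g., from a likelihood-ratio comparison of nested models); the function name `chi2_sf_df3` is hypothetical.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 df."""
    if x <= 0:
        return 1.0
    # Closed form for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(88.7)   # far below the reported threshold of 0.0001
```

The same function returns roughly 0.05 at the conventional critical value of 7.815, which is a quick way to confirm that the formula is applied correctly.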
For a mixed-effects model analysis, SYLLABLE DURATION was taken as the dependent variable, SYLLABLE NUMBER as a fixed effect, and SPEAKER and ITEM as random intercepts. The model was run three times, with the initial syllable, penult, and antepenult (the positions described as locations of stress in the previous literature) as intercepts. Including random slopes improved the fit for some of the models: SYLLABLE NUMBER by SPEAKER and by ITEM for disyllables, SYLLABLE NUMBER by SPEAKER for trisyllables, and SYLLABLE NUMBER by ITEM for tetrasyllables. As the results in Table 6, Table 7, and Table 8 show, only initial syllables are significantly different in their duration from all other syllables in words of all syllable counts.18 This corresponds to the picture presented in Figure 9 and Table 5, where the main durational difference is shown to be between the initial syllable and all subsequent syllables. Note that in Table 7 and Table 8 the syllable counts where (ante)penults correspond to initial syllables are omitted because these results are presented in Table 6.
Anticipating the discussion of the duration results in Section 6, two follow-up tests were carried out to exclude alternative accounts of the results. The first one checked whether the durational effect on the initial syllable is driven by vowel quality; the second one verified whether this effect results from an increase in the duration of the consonant, vowel, or both segments.
17 Including random slopes led to non-convergence of the model.
18 The duration of stay in the US did not give rise to significant inter-speaker differences. A mixed-effects model with DURATION (of the initial syllable, antepenult, or penult) as a dependent variable, SPEAKER LOCATION (US LONG-TERM vs. OTHER) as a fixed effect, and SPEAKER and ITEM as random effects did not detect any significant differences between speakers from different locations (initial: p=0.54, β=0.08, t=0.61; antepenult: p=0.65, β=-0.06, t=-0.46; penult: p=0.55, β=-0.08, t=-0.6). Similarly, a model with F0 as a dependent variable (other parameters held constant) did not detect significant differences between the two populations either (initial: p=0.42, β=-0.21, t=-0.8; antepenult: p=0.42, β=-0.21, t=-0.82; penult: p=0.56, β=-0.15, t=-0.59). No significance was obtained for intensity, either (other parameters held constant; initial: p=0.17, β=-0.078, t=-1.38; antepenult: p=0.17, β=-0.09, t=-1.37; penult: p=0.34, β=-0.07, t=-0.96).
The first test checked whether non-high vowels occur more frequently in initial syllables than in subsequent syllables, which would affect the results, since non-high vowels have inherently greater durations. For this, the initial syllables in the dataset were coded for vowel height: NON-HIGH ([a, o, e]) and HIGH ([i, u]). The mean duration of NON-HIGH initial syllables equals 220 ms, and that of HIGH initial syllables is 210 ms. A linear mixed-effects model, with DURATION as a dependent variable, VOWEL HEIGHT as a fixed effect, and ITEM and SPEAKER as random intercepts, revealed no significant difference between the two vowel heights (p=0.12, β=0.04, t=1.55). Accordingly, the durational effect on the initial syllable cannot be explained away as stemming from vowel quality.19
The second test checked the relative contribution of the consonants and vowels in initial syllables to their duration. To this end, relative duration of the segments in the initial syllable was compared to that of the segments in the second syllable. A subset of stimuli from the experiment (n of tokens = 1,228), representative of the full dataset, was selected, and the initial and second syllables were annotated for segment duration.20 Since all syllables in the study are of CV shape, the ratio of the two segments in different syllables can be compared. The segment duration results are provided in Table 9.
Table 9 (fragment; only part of the table is preserved): … (27), 83 (20), 96 (24)
As Table 9 shows, the increase in duration that the initial syllables receive, as compared to the second syllables, does not result from an increase in the duration of just one segment; instead, both receive an increase in duration. A linear mixed-effects model, with SEGMENT DURATION as a dependent variable, SEGMENT POSITION as a fixed effect, and ITEM and SPEAKER as random intercepts, shows that the duration of consonants in the initial syllables is significantly greater than that of consonants in the second syllables (p<0.0001***, β=-0.22, t=-22.15). Vowels in the initial syllables are also significantly greater in duration than vowels in the second syllables (p<0.0001***, β=-0.1, t=-10.98). These results confirm that the greater duration of the initial syllable does not stem from greater duration of the initial consonant alone. The significance of this fact is further taken up in Section 6.
19 There is some evidence that a minimal durational difference between vowels of different heights is expected in Georgian: Shosted & Chikovani (2006: 262) found that Georgian /u/ and /ɔ/ largely overlap, and, likewise, /i/ and /ɛ/ have similar F1 values. Given that vowel height is a predictor of inherent vowel duration, the fact that (some of) the high and non-high vowels have comparable F1 values is likely to contribute to the results.
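The logic of the second follow-up test, comparing each segment of the initial CV syllable with its counterpart in the second syllable, can be illustrated descriptively. The sketch below is not the mixed-effects model reported above: it only computes mean differences, and the durations in `toy_tokens` are invented placeholder values in ms, not measurements from the study.

```python
from statistics import mean


def segment_lengthening(tokens):
    """Mean lengthening (ms) of C and V in syllable 1 relative to syllable 2."""
    c_diff = mean(t["c1"] - t["c2"] for t in tokens)
    v_diff = mean(t["v1"] - t["v2"] for t in tokens)
    return {
        "consonant_ms": c_diff,
        "vowel_ms": v_diff,
        # True only if both segments of the initial syllable are longer on average
        "both_longer": c_diff > 0 and v_diff > 0,
    }


# Placeholder durations (assumptions), one dict per C1V1C2V2 token
toy_tokens = [
    {"c1": 100, "v1": 110, "c2": 80, "v2": 95},
    {"c1": 90, "v1": 105, "c2": 85, "v2": 100},
]
result = segment_lengthening(toy_tokens)
```

A pattern where only `consonant_ms` is positive would be compatible with lengthening of the initial segment alone, whereas lengthening of both segments is the pattern the text reports.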

Intensity
Table 10 provides the numerical breakdown of the subset of the data used for the study of intensity (with the data from speakers M2 and F2 excluded). The intensity results (maximum intensity per syllable) are shown in Figure 10, and averaged values per syllable are provided in Table 11. They show that the initial syllable has greater intensity than all subsequent syllables in words two to five syllables long (note that while the magnitude of differences is small, it is likely to be perceptually substantial).
For a mixed-effects model analysis, MAX INTENSITY (per syllable) was taken as the dependent variable, SYLLABLE NUMBER (1st, 2nd, etc.) as a fixed effect, and SPEAKER and ITEM as random intercepts. The model was run separately for words of each syllable count, with the initial syllable, penult, and antepenult as intercepts. Including random slopes improved the fit for some of the models: SYLLABLE NUMBER by SPEAKER and by ITEM for disyllables, and SYLLABLE NUMBER by SPEAKER for trisyllables.
20 A representative subset of the full dataset was created in the following way. 15 words of each syllable count (di-, tri-, tetra-, and pentasyllables; total=60/172 tokens) were selected for by-segment annotation. Since the first two syllables were the target of the additional analysis, their parameters were the crucial selection criteria. With respect to the C1V1C2V2 template, words with both identical and non-identical C1 and C2 and V1 and V2 were selected (in roughly equal ratios, i.e., per sample of stimuli of a given syllable count, ca. ¼ = CxVaCxVa, ¼ = CxVaCyVa, ¼ = CxVaCxVb, ¼ = CxVaCyVb).
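The four-way classification of the first two syllables used to build the by-segment subset (footnote 20) amounts to checking whether C1 = C2 and whether V1 = V2. A minimal sketch, with the hypothetical function name `template_type`; the two calls use the first two syllables of gogo and banan-ebi purely as illustrations, without implying that these words were part of the subset:

```python
def template_type(c1, v1, c2, v2):
    """Classify a C1V1C2V2 sequence into one of the four template types."""
    second_c = "Cx" if c1 == c2 else "Cy"   # same vs. different consonant
    second_v = "Va" if v1 == v2 else "Vb"   # same vs. different vowel
    return f"CxVa{second_c}{second_v}"


gogo_type = template_type("g", "o", "g", "o")      # identical C and identical V
banana_type = template_type("b", "a", "n", "a")    # different C, identical V
```

Sampling roughly a quarter of the words from each of the four types then yields the balanced subset described in the footnote.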
As the results in Table 12, Table 13, and Table 14 show, most differences in intensity are significant or borderline significant. Note, though, that Table 13 demonstrates lack of significance between the penult and ultima, which aligns well with what we see in Figure 10 and Table 11: the penult and ultima consistently have comparable intensity values. Significant differences between other adjacent syllables correspond to the gradual drop in intensity between the syllables other than penult and ultima, also shown in Figure 10 and Table 11.

F0 values
Figure 11 visualizes the F0 contours that span the test words of different syllable counts, and Table 15 provides the mean F0 values per syllable. As Figure 11 shows, words of all syllable counts have an overall falling-rising F0 contour. The F0 target on the initial syllable (L*) is realized as a sharp dip from the high target (Ha) on the preceding word, and the high target on the ultima (Ha) is realized as a steep rise. In words of all syllable counts, the lowest F0 values are found on the penult. Therefore, the penultimate syllable acts as a turning point between the falling and rising subparts of the F0 contour. The L* Ha tonal contour is expected to span APs in broad-focus declaratives, as discussed in Section 2.3, but the low target on the penultimate syllable has not been discussed in this context before. Its distribution, though, fits well with that of a low phrase accent, which targets penultimate syllables of predicates in focus contexts, as discussed in Section 2.3.
For a mixed-effects model analysis, MEAN F0 (per syllable) was taken as the dependent variable, SYLLABLE NUMBER as a fixed effect, and SPEAKER and ITEM as random intercepts. As was the case with the duration and intensity measurements, including random slopes improved the fit for some of the models: SYLLABLE NUMBER by SPEAKER and by ITEM for disyllables, and SYLLABLE NUMBER by SPEAKER for trisyllables. The model was run separately for words of each syllable count. In turn, the initial syllable, penult, and antepenult were taken as intercepts.
As the results in Table 16, Table 17, and Table 18 show, most comparisons turned out to be significant, except those between the initial and second syllables in tri-, tetra- and pentasyllabic words, second and third syllables in disyllabic words, and the third and fourth syllables in pentasyllabic words.
To sum up, the results show that greater duration consistently marks initial syllables, but not penults or antepenults. The intensity values fall gradually throughout the word and are the lowest on the penult and ultima. The F0 contours attest to the presence of low F0 targets on the initial syllable (L*) and the penult (L), and a high one on the ultima (Ha). These F0 results differ somewhat from the typical declarative broad-focus F0 contour, described in Section 2.3 as L* Ha, based on Vicenik & Jun (2014).

Stimuli and design
The test words in the experiment regularly correspond to APs, as was shown in (8). Accordingly, the isomorphism between words and phrases does not allow for distinguishing between word- and phrase-level prosodic effects. To circumvent this, as well as the potential confound created by the final Ha at the end of a preceding AP, data from a supplementary study was additionally brought in. The study had been conducted to elicit natural word orders and prosodic realizations in different focus contexts, but, importantly for the purposes of the current paper, the data also contains examples of ip-formation and post-focal deaccenting.
The study consisted of 30 scenarios, based on 14 transitive and 16 intransitive verbs, such as A girl picked flowers last summer or Birds returned home last spring, rendered as picture prompts. Personal names and common nouns (Mariami, children, etc.) were used as subjects and objects. Lexical items containing no/few voiceless segments were preferred but non-CV syllables and consonant clusters were allowed. For each scenario, five questions were constructed, aimed at eliciting broad, VP-, subject, and object focus, as well as contrastive focus on either subject or object. A sample picture prompt is provided in Figure 12, with the questions listed in (9).
(9) b. 'What did the girl pick last summer?'
c. 'Who picked flowers last summer?'
d. 'What did the girl do last summer?'
e. 'Did the girl pick cherries last summer?'
Participants were presented with picture prompts on a laptop screen (pseudo-randomized), each accompanied by a question, and were tasked with answering the question based on the picture. They were instructed to speak clearly, use natural intonation, and avoid single-word replies, but otherwise were free to construct their own responses. Eight native speakers of Georgian participated in the study: two males (M3, M4) and six females (F6-F11), aged 20-35 (mean age 26.9). All speakers were from Tbilisi and had a complete or in-progress university degree. The recordings were performed in Tbilisi, Georgia, using a head-worn microphone. All data were recorded at a sampling rate of 44,100 Hz and 16 bits per sample.
For the purposes of the current paper, four datasets were compiled from the data obtained during the focus study, used here as a supplementary one. First, to ensure full comparability of the results with those from the main experiment, only nominals, two to five syllables long, of CV syllable structure, where onsets consist of a single voiced consonant, were considered. Dataset 1 was compiled to investigate the prosody of nouns that form part of an ip together with a modifying adjective. Because of the restrictions on the phonological shape of the noun, only one nominal phrase from the supplementary study was selected: gogo 'girl' accompanied by the adjective mousvenari 'mischievous, antsy', which appeared in two scenarios: A mischievous girl stole cherries last summer and A mischievous girl fell from a ladder last week. In total, the phrase mousvenari gogo occurred in the participants' responses 49 times. These occurrences were further restricted to those in which the phrase mousvenari gogo occurred (i) not under narrow focus (i.e., in broad, VP-, object, or contrastive object focus), and (ii) not clause-finally (i.e., in SOV or SVO clauses), which yielded 35 occurrences. Of these, in 8 instances the words mousvenari and gogo were realized as independent APs (cf. Figure 3) and were also excluded. The remaining 27 occurrences of mousvenari gogo formed ips (cf. Figure 2), and the realizations of gogo were selected for the same analysis as nominals in the main experiment as Dataset 1a.
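The selection steps for Dataset 1a (exclude narrow focus on the phrase, exclude clause-final position, keep only realizations as a single ip) form a simple filter chain. The sketch below assumes a hypothetical record format (`Occurrence`) and uses four invented toy records; it does not reproduce the actual 49 → 35 → 27 counts.

```python
from dataclasses import dataclass


@dataclass
class Occurrence:
    focus: str          # e.g. "broad", "VP", "object", "contrastive_object", "subject"
    clause_final: bool  # True if the phrase is clause-final
    phrasing: str       # "ip" (single ip) or "separate_APs"


# Focus contexts in which the subject phrase is not under narrow focus
ALLOWED_FOCUS = {"broad", "VP", "object", "contrastive_object"}


def select_dataset_1a(occurrences):
    """Apply the focus, position and phrasing restrictions in turn."""
    candidates = [
        o for o in occurrences
        if o.focus in ALLOWED_FOCUS and not o.clause_final
    ]
    return [o for o in candidates if o.phrasing == "ip"]


# Toy records (assumptions), one per response containing the target phrase
toy = [
    Occurrence("broad", False, "ip"),
    Occurrence("subject", False, "ip"),          # narrow focus on the phrase: excluded
    Occurrence("VP", True, "ip"),                # clause-final: excluded
    Occurrence("object", False, "separate_APs"), # two independent APs: excluded
]
selected = select_dataset_1a(toy)
```

The same chain, with the focus set changed to subject and contrastive subject focus and the phrasing criterion replaced by deaccenting, would describe the selection for Dataset 2a.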
To supplement Dataset 1a, following a suggestion from a reviewer, nominal phrases that do not adhere to the CV condition on syllable structure were also considered, which allowed for analyzing the nominal phrase lamaz-ma kalbat'on-ma 'beautiful-ERG lady-ERG'. Keeping the remaining restrictions in place (the nominal phrase occurring not under narrow focus, not clause-finally, and as a single ip) yielded 33 occurrences of lamaz-ma kalbat'on-ma 'beautiful-ERG lady-ERG'. The realizations of the noun kalbat'on-ma, extracted from these examples, formed Dataset 1b. Given the variable syllable composition, which can impact duration values, the noun was annotated by segment (in addition to the by-syllable annotation), and only the intensity and F0 values of the vowels were analyzed. Datasets 1a and 1b were analyzed separately.
Dataset 2 was aimed at investigating the prosody of post-focal deaccenting. For this, objects in the context of subject focus or contrastive subject focus, in SFVOX word orders, were selected ('X' stands for a clause-final adverbial that buffers the object from the right edge of the clause, to ensure lack of clause-final effects on the object). In Dataset 2a, the same restrictions on the phonological shape of the noun applied as did with Dataset 1a, and only one noun, banan-ebi 'banana-PL', was selected for analysis. It occurred as the object in SFVOX word orders 12 times, of which, in 2 instances, it was not deaccented (cf. Figure 4). These occurrences were excluded, and the remaining 10 instances, in which banan-ebi 'banana-PL' was deaccented (cf. Figure 5), were subjected to the same analysis as nominals in the main experiment.
Additionally, following a reviewer's suggestion, nouns that do not have CV syllable structure but otherwise adhere to the same restrictions as those in Dataset 2a were extracted from the results of the supplementary study. Several such nouns were available, all trisyllabic (43 occurrences in total): alubl-eb-s 'cherry-PL-DAT', mankana-s 'car-DAT', muraba-s 'jam-DAT', murab-eb-s 'jam-PL-DAT', q'vavil-i 'flower-NOM', q'vavil-eb-s 'flower-PL-DAT'. They were pooled together as Dataset 2b, annotated by segment, and the intensity and F0 values of vowels were analyzed (duration was left out, as in Dataset 1b, given the mismatches in syllable structure). Datasets 2a and 2b were analyzed separately. A summary of the four datasets is provided in Table 19.

Dataset 1a (ip-formation)
A typical realization of an ip, formed by the noun gogo 'girl', accompanied by the adjective mousvenari 'mischievous', is provided in Figure 13, with glosses in (10).
Because gogo 'girl' is a disyllabic word, and the initial syllable is the only baseline for comparison, the three acoustic parameters (duration, intensity, F0) will be considered side by side. Figure 14 provides the visualization of the acoustic parameters, and the mean values are summarized in Table 20. The durational and F0 values of the two syllables in gogo exhibit a pattern comparable to that of disyllabic stimuli in the main experiment (except for the lack of a sharp fall from the preceding Ha on the initial syllable), but not the intensity values: here, the initial syllable is lower in intensity than the second one.
Given the small size of Datasets 1a-b and 2a-b, no statistical analysis was carried out; while this makes the evidence more tentative, the overall trends are clear from the figures and summary tables.

Dataset 1b (ip-formation)
A typical realization of the noun kalbat'on-ma 'lady-ERG' and adjective lamaz-ma 'beautiful-ERG', when they form a single ip, was provided in (5a) and Figure 2, repeated here as (11) and Figure 15, for convenience.
(11) ((Lamaz-ma)AP (kalbat'on-ma)AP)ip (k'aba)AP (mo-i-zom-a)AP.
beautiful-ERG lady-ERG dress.NOM PV-VER-try-AOR.3SG
'A beautiful lady tried on a dress.'
Due to variable syllable structure, only vowels (and not full syllables) in kalbat'on-ma 'lady-ERG' were considered, and only the (maximum) intensity and (average) F0 per vowel were measured. These parameters are visualized in Figure 16, and the mean values are summarized in Table 21. The pattern of the intensity values is somewhat similar to that obtained for tetrasyllabic words in the main experiment: the initial syllable has the greatest intensity, which then falls throughout the word; in contrast with the main experiment, it rises again on the ultima. The rise on the ultima is also reminiscent of the intensity pattern on gogo 'girl' in Dataset 1a. The averaged F0 pattern closely follows the F0 shape found on individual examples, like Figure 15, and that found in Dataset 1a: a flat F0 contour on the first three syllables is followed by a sharp rise on the ultima (the absence of the F0 fall on the first syllable is, again, due to lack of a preceding Ha).
One of the typical realizations of banan-ebi 'banana-PL', when in the domain of post-focal deaccenting, is provided in Figure 17, with glosses and translation in (12). Note that the verb in Figure 17 is deaccented together with the other post-focal material, and does not carry pitch targets, unlike in Figure 5. Examples of both kinds, i.e., those where the verb was part of the focal and post-focal domains, were represented in the sample.

(12) Manana-ERG VER-buy-AOR.3SG banana-PL-NOM last week-DAT
'MananaF bought bananas last week.'
The duration and intensity results are visualized in Figure 18 and an averaged F0 contour is provided in Figure 19. Duration, maximum intensity, and mean F0 values, averaged per syllable, are provided in Table 22. The duration results are comparable to those obtained for tetrasyllables in the main experiment, in that the initial syllable has greater duration than all other syllables, though there is more variability in the durational values of the subsequent syllables. The intensity values show a similar trend to those of tetrasyllables in the main experiment, in that the initial syllable has the greatest intensity, and the final two syllables have similar intensity values. With respect to F0, the fall in F0 within the initial syllable is due to the presence of a high final boundary Ha on the preceding verb in some of the stimuli (cf. Figure 5). There is no evidence for F0 targets.
The words that comprised Dataset 2b (alubl-eb-s 'cherry-PL-DAT', mankana-s 'car-DAT', muraba-s 'jam-DAT', murab-eb-s 'jam-PL-DAT', q'vavil-i 'flower-NOM', q'vavil-eb-s 'flower-PL-DAT'), being deaccented post-focally, received realizations similar to that of banan-ebi 'banana-PL' in Dataset 2a. This was illustrated for muraba-s 'jam-DAT' in (6b) and Figure 5, repeated as (13) and Figure 20, for convenience.
(13) ('Who made jam last week?')
((BebiaF)AP (a-ket-eb-d-a)AP)ip muraba-s ts'ina k'vira-s.
grandma.NOM PV-make-SF-SM-IPFV.3SG jam-DAT last week-DAT
'GrandmaF made jam last week.'
As in Dataset 1b, due to mismatches in syllable structure, the acoustic parameters of individual vowels, not syllables, were considered, and included only intensity and F0 measurements. They are visualized in Figure 21 and the summary of the mean values is provided in Table 23. As in the previous datasets, intensity values are greatest on the initial syllable and drop on subsequent syllables. The F0 contour is low and flat, presenting no evidence for F0 targets.
Let us summarize the results of the supplementary study (Datasets 1a-b and 2a-b). When embedded into an ip, an AP/lexical word is still characterized by greater duration of the initial syllable and a falling-rising or rising F0 pattern. The intensity marking of the initial syllable is absent in the disyllabic gogo 'girl' but present in the longer kalbat'on-ma 'lady'. In turn, after undergoing post-focal deaccenting, lexical words carry a flat F0 contour, but initial syllables are still marked by greater duration and intensity.

Discussion
Let us bring together the results of the main experiment and the supplementary study. The most consistent finding is that the initial syllable has greater duration than all subsequent syllables, in words of all syllable counts and in all phrasal contexts. The fact that this is the case in words that form individual APs (the main experiment), APs that are part of an ip (Dataset 1a-b), and words that do not form an AP, as in post-focal deaccenting (Dataset 2a-b), demonstrates that the durational effect is independent of phrasal prosodic phenomena. By itself, though, this does not yet mean that the durational effect is necessarily linked to word-level stress. Alternatively, greater duration of the initial syllable can also result from initial strengthening, a phonetic process that applies to left edges of prosodic domains and is independent of stress. However, the two phenomena have different phonetic signatures: initial strengthening affects the duration of the absolute initial segment but does not extend to the vowel in the initial syllable, while word stress can contribute to greater duration of the initial syllable on the whole (Fougeron & Keating 1997; Byrd, Krivokapić & Lee 2006; Barnes 2008). Therefore, the relative contribution of the consonant and vowel to the duration of the initial syllable was also investigated in Section 4.1. The results in Table 9 show that the greater duration of the initial syllable results from greater duration of both the consonant and the vowel, as compared to their counterparts in the second syllable. This would be unexpected under an account based on initial strengthening alone, which only targets the absolutely domain-initial (here, word-initial) segment. With initial strengthening as an alternative explanation ruled out, the most fitting interpretation for the durational effect on the initial syllable is that it marks word-level stress. Note that the current findings contrast with the results of Selmer (1935), who found that, in disyllables and trisyllables, the initial and final syllables were of equal duration. This discrepancy is likely due to the fact that Selmer's stimuli were not embedded in carrier phrases and were subject to phrase-final lengthening.
The initial syllable is also often (though not always) marked by greatest intensity, as compared to all other syllables, but there is also a more pronounced difference in intensity values between each syllable and the subsequent one. At the same time, intensity rises on the ultima in Datasets 1a and 1b, i.e., in those cases where the ultima carries an ip-final high H- tone. This suggests that intensity may also mark the location of word-level stress, but, unlike duration, intensity values are affected by and 'parasitic on' the phrasal prosodic context.
In terms of F0 properties, the results from the main experiment and Datasets 1a and 1b align: in both, individual words carry an F0 contour that steeply rises on the final syllable (and may include a fall on the initial syllable, if preceded by a Ha). This aligns with the existing analysis of Georgian prosody (Vicenik & Jun 2014), according to which individual APs in Georgian carry an L* Ha tonal contour. Likewise, ip-final APs in Datasets 1a and 1b carry an L* H- contour (similar in shape, but with the final boundary tone marking the right edge of the ip). Additionally, the data discussed here presents evidence for another low tonal target on the penultimate syllable, the turning point between the falling and rising parts of the contour. Note that the F0 results are comparable to those of Zhghenti (1953; 1958), who found that F0 values drop on the penult of his stimuli. The fact that the ultima, in Zhghenti's results, did not have high F0 values, unlike in the current study, is likely due to the fact that his stimuli were produced in isolation and carried a low phrase-final tone. Finally, the words in Datasets 2a and 2b presented no evidence for tonal targets. This is expected in the context of post-focal deaccenting and, most importantly, shows that F0 targets in Georgian are phrasal/post-lexical in nature.
The results from the main experiment provide evidence that the penultimate syllable in Georgian is reserved for F0 targets that are part of the right-edge intonational make-up of a phrase, which is independent from stress (cf. Gordon 2014). The next question is what type of an F0 target it is. The inventory of F0 targets available in autosegmental-metrical theory includes pitch accents, phrase accents, and boundary tones (Liberman 1975; Bruce 1977; Pierrehumbert 1980; Ladd 1983; Gussenhoven 1984; Pierrehumbert & Beckman 1988, a.o.). Pitch accents are anchored to syllables that carry a degree of stress, while boundary tones are aligned with edges of prosodic domains. It is unlikely that the F0 target on the penults is a pitch accent: pitch accents align with stressed syllables, but there is no evidence from the distribution of duration and intensity that penults in Georgian carry stress. It is equally unlikely that the F0 target on the penults is part of a complex final boundary tone, 'crowded out' to the penultimate syllable. This is because final syllables in Georgian can accommodate bitonal boundary tones (e.g., HL%) without placing one of the tones on the penult (Vicenik & Jun 2014; Borise 2017).
The remaining prosodic category is a phrase accent. As discussed in Section 2.3, penultimate syllables in Georgian, in certain information-structural conditions (questions and narrow focus contexts), carry a low phrase accent L. I suggest that the resemblance between phrase accents, described in the literature, and the low F0 target found on penultimate syllables in the current study is not accidental. Given their identical distribution, the most parsimonious approach to the two F0 targets is to treat them as two subtypes of the same phenomenon. This, in turn, means that the distribution of the phrase accent in Georgian is broader than has been described before and is not limited to predicates in questions and narrow focus contexts. The exact information-structural conditions that may influence its distribution merit further investigation.
If viewed against the backdrop of the existing literature, which allowed us to hypothesize that Georgian stress may be initial, penultimate, or antepenultimate, the current results demonstrate that Georgian has initial word-level stress, while the penult houses a phrasal tonal target. Notably, the results provide no evidence for antepenultimate stress. This is consistent with Alkhazishvili (1959: 402), who, similarly, found no acoustic evidence for antepenultimate stress.
These conclusions have implications for the theory of word stress. Recall that stress in Georgian is morphophonologically 'inert': speakers do not have consistent intuitions about stress placement, and stress placement is not subject to regular variation in declensional/conjugational paradigms. The fixed nature of Georgian stress, as advocated here, may be a contributing factor to this 'inertia'. Languages with fixed stress are known to have a weaker acoustic expression of stress than languages with variable/contrastive stress placement (Cutler 2005; Dogil 1999; Fónagy 1966; Janota 1967; Jassem 1962; Rigault 1970). Speakers of languages with fixed stress have weaker intuitions about stress placement (including in other languages), the so-called 'stress-deafness' (Dupoux, Peperkamp & Sebastián-Gallés 2001; Peperkamp & Dupoux 2002; Peperkamp, Vendelin & Dupoux 2010, a.o.). The fact that Georgian falls into the category of languages with fixed stress, therefore, may explain the reported 'weakness' of Georgian stress, and the lack of consistent intuitions on the part of the speakers. More broadly, these results suggest that other languages described as having 'weak' stress may, in fact, have fixed, morphophonologically 'inert' but still identifiable stress (instead of not having the category of stress at all).

Conclusion
Based on instrumental evidence, this paper identified and provided an interpretation for some of the acoustic cues that are regularly utilized in Georgian prosody: duration, intensity, and alignment of F0 targets. First, it established that initial syllables in Georgian words are marked by greater duration than subsequent syllables. This is the case in words two to five syllables long, whether they form independent APs, form APs embedded in an ip, or are subject to deaccenting/dephrasing in the post-focal portion of the utterance. After excluding initial strengthening as an alternative explanation, this effect is best phonologically interpreted as cuing stress, fixed on the initial syllable. This result aligns well with the existing literature on Georgian prosody, which consistently lists the initial syllable as the locus of prosodic prominence and a possible stress locus. The intensity results support the evidence from duration, in that initial syllables both in independent APs and in deaccented words often have greater intensity than all subsequent syllables, though the intensity values also drop throughout the word, which calls into question the link between intensity and stress. Also, in contrast with duration, the intensity values are sensitive to phrasal prosodic context: e.g., ip-final syllables have consistently higher intensity. This shows that, while also acting as a cue for word stress, intensity-marking interacts with prosodic phrasing. With respect to F0 effects, the results confirmed the presence of a low pitch accent on initial syllables and a high final boundary tone at the right edges of APs, as described in the literature. They also showed that penultimate syllables carry a low intonational F0 target.
Based on what is known about the phrasal prosody of Georgian, this F0 target is best analyzed as a phrase accent. Finally, the results showed that post-focal deaccenting eliminates F0 targets (but not durational or intensity effects). This demonstrates that F0-marking in Georgian is reserved for phrasal prosody and is not intrinsic to stress-marking. Overall, the findings contribute to distinguishing word- and phrase-level prosodic phenomena in Georgian and resolving the conflicting views that surround word stress placement in the language.

Figure 4: The intonational contour of an utterance with narrow focus on Mamuka (personal name), with no post-focal deaccenting

Figure 11: Mean F0 values at four points per syllable in stimuli of all syllable counts; smoothed at 0.6; tick marks correspond to the left edges of respective syllables (e.g., '1' marks the left edge of the first syllable).

Figure 12: Sample picture prompt

(9) a. What happened last summer?
b. What did the girl pick last summer?
c. Who picked flowers last summer?

Figure 13: A realization of gogo 'girl' as part of an ip

Figure 17: A realization of banan-ebi 'banana-PL' within the domain of post-focal deaccenting

Figure 20: The intonational contour of an utterance with narrow focus on bebia 'grandma', with post-focal deaccenting on muraba-s 'jam-DAT'

Figure 21: Intensity and an average F0 contour (smoothed at 0.6) of words in Dataset 2b.

Table 1: Stress placement in Georgian according to syllable (σ) count

Stressed vowels/syllables commonly have greater duration than unstressed ones (De Jong & Zawaydeh 1999; Eriksson & Heldner 2015; Garellek & White 2015 a.o.). Stressed syllables may also carry an F0 target or a particular tonal contour, though it is not always clear whether this F0 specification is lexical (word-level) or post-lexical (phrasal) in nature. Stressed syllables may be identified by intensity-based measurements, such as overall intensity (Remijsen & Van Heuven 2005; Vogel, Athanasopoulou & Pincus 2016), or frequency-sensitive intensity

Table 3: The final dataset broken down by speaker

Table 4: The final dataset broken down by syllable count

Table 5: Mean syllable duration (ms) and standard deviation (in brackets, ms) in words 2-5 syllables long

Table 7: Results of statistical analysis of syllable duration with the penultimate syllable as the intercept

Table 8: Results of statistical analysis of syllable duration with the antepenultimate syllable as the intercept

Table 9: Duration of individual segments in the initial and second syllables (ms) and standard deviation (in brackets, ms)

Table 10: The dataset used for intensity measurements, by syllable count

Table 11: Maximum syllable intensity, averaged per syllable number (dB), and standard deviation (in brackets, dB) in words 2-5 syllables long

Table 15: Mean F0 values (Hz) and standard deviations (in brackets, Hz) per syllable in stimuli of all syllable counts

Table 18: Results of statistical analysis of F0 for all syllables, with the antepenult as the intercept

Table 19: Summary of the four datasets based on the supplementary study

Table 23: Duration (ms), maximum intensity (dB), and mean F0 values (Hz), averaged per syllable in the words in Dataset 2b (with standard deviations bracketed)