This site is for NowPhon 1, 2015.  For NowPhon 2, 2016, please go here: https://blogs.uoregon.edu/nowphon2016/

June 4

EMU Gumwood Room

9:00 – 9:45 – Registration / breakfast

9:45 – 10:00 – Introductions

10:00 – 10:30 – Melissa A. Redford (U of Oregon) — Speaking: A functional approach

Speaking is an intentional activity. It is also a complex motor skill, one that exhibits protracted development and the fully automatic character of an overlearned behavior. Together these observations suggest an analogy with skilled behavior in the non-language domain. This analogy is used here to argue for a model of production that is grounded in the activity of speaking and structured during language acquisition. The focus is on the plan that controls the execution of fluent speech; specifically, on the units that are activated during the production of an intonational phrase. These units are schemas: temporally structured sequences of remembered actions and their sensory outcomes. Schemas are activated and inhibited via associated goals, which are linked to specific meanings. With repeated use, schemas may fuse together over developmental time to form larger units, thereby affecting the relative timing of sequential action in the participating schemas. In this way, the hierarchical structure of the speech plan and the ensuing rhythm patterns of speech are a product of development. Individual schemas may also become differentiated during development, but only if their subsequences take on additional meaning. The necessary association of action and meaning gives rise to assumptions about the primacy of certain linguistic forms in the production process. Overall, schema representations connect usage-based theories of language to the action of speaking.

10:30 – 11:00 – Bryan Gick (U of British Columbia) — Phonetic universals live in the body, not the mind

This talk begins with basic observations showing that humans don’t need a brain, or experience, to produce some of our most complex motor behaviors. Experimental results and simulations using biomechanically realistic models in the ArtiSynth platform (www.artisynth.org; e.g., Fels et al. 2003; Stavness et al. 2012; Gick et al. 2014) will be presented showing that the body offers only a very small inventory of reliable postures for speech sounds and emotional expression. A model based on recent research in neurophysiology is described in which this inventory comprises independent neuromuscular structures, or “modules”, in the body. I show how these modules emerge through use as part of a learner’s strategy to optimize the biomechanics of speech production, and how similar biomechanical properties may be seen in a range of body structures used for speech, including the lips/face, palate, larynx and tongue. Note that this model gives robust and highly predictive results with no brain, no experience, and no anatomically defined body parts/articulators (e.g., “lips”, “jaw”, etc.). The result is a theory of embodied phonetics built on an inventory of highly specialized body structures, each of which is “discovered” by the nervous system and used to serve a specific phonetic function.

11:00 – 11:30 – Michal Temkin Martinez & Ivana Müllner (Boise State U) — A production study of spirantization in Modern Hebrew verbs

In this talk, we report the results of an experiment examining variation in the production of Modern Hebrew Spirantization (MHS). MHS is characterized by the alternation of the stops [p], [b], and [k] with [f], [v], and [χ], respectively: fricatives generally occur in post-vocalic position and stops occur elsewhere. This alternation is especially noticeable in verbal paradigms, where a given root segment may occur in different syllable positions, as in [lifgoʃ] ‘to meet’ and [pagaʃ] ‘he met’. However, there are exceptions to MHS. Exceptional segments are non-alternating [p], [b], [k], [f], [v], and [χ] which, for historical reasons, can surface as stops in post-vocalic position or as fricatives elsewhere. Some exceptional segments are represented differently from their alternating counterparts (i.e. alternating [k]/[χ] is represented by one grapheme, non-alternating [k] by another, and non-alternating [χ] by yet another). The frequency of exceptions to MHS in the modern lexicon has led to the acceptability of non-alternation in segments that ought to alternate (Adam 2002, Temkin Martinez 2010). In a perception experiment, Temkin Martinez (2010) found that variation was more acceptable in post-consonantal position than in other positions. In the current experiment, 48 native speakers of Modern Hebrew participated in a sentence-completion task containing both real and nonce verbs. Variation patterns in the production of both real and nonce verbs matched those reported in Temkin Martinez (2010), with post-consonantal position driving the effect of word position. Additionally, nonce verbs showed higher rates of non-alternation for segments whose alternating and exceptional counterparts have distinct orthographic representations.

11:30 – 12:00 – Sarah Greer & Steve Winters (U of Calgary) — The perception of coolness: the social uses and interpretations of creaky voice

Many researchers have noted that young, female speakers of contemporary English commonly use creaky voice (e.g., Yuasa, 2010; Wolk et al., 2012; Podesva, 2013), perhaps even to a greater extent than previous generations did (Greer & Winters, 2014). We were interested in whether women use this voice quality in a different way (or to a different extent) than men, and in what might be motivating women to use it more than has previously been observed. To this end, we examined the social uses and interpretations of voice quality with respect to six social styles: coolness, authoritativeness, youthfulness, attractiveness, femininity, and masculinity. In a production task, we asked speakers to produce a set of sentences in each of these styles, to quantify the extent to which creaky voice indexed each style. In a companion perception task, listeners then evaluated whole sentences along these dimensions in a “direct comparison task”: they heard the same sentence, produced by the same speaker, in different voice qualities, and were asked to specify which production was “cooler,” “more authoritative”, etc. The results of the production task showed that both sexes used creaky voice to index “coolness” and “masculinity”, even though female speakers produced significantly more creaky voice than men. In the perception task, listeners judged stimuli containing greater percentages of creaky voice as “cooler” and more “masculine”. Additionally, perceived “attractiveness” aligned closely with perceived “femininity” and “masculinity” in women and men, respectively. The combined results of both studies therefore show that, ironically, creaky voice is perceived as a “masculine” voice quality, even though it is used more often by women than men. The increased use of creaky voice by young women may thus stem from a desire to appear more “masculine”, tapping into the sociolinguistic status afforded to men.

12:00 – 1:30 – Lunch

1:30 – 2:00 – Benjamin Tucker (U of Alberta) — How do listeners process spontaneous speech?

Research on speech comprehension has largely focused on “laboratory speech”: speech elicited by asking a speaker to read a list of words, sentences, or stories (Cutler, 1998; Warner, 2012). Research on laboratory speech has uncovered many aspects of speech perception and comprehension; however, it is severely limited, in that we know very little about the process by which actual conversational speech is recognized. Relatively little research in phonetics and psycholinguistics has focused on spontaneous, conversational speech (with the exception of sociophonetic research), even though this is the type of speech most often found in day-to-day communication (Ernestus & Warner 2011). One difficulty spontaneous speech presents is that speakers often produce “reduced” speech: a phrase like “Do you have to?” may be produced such that, when the sounds are transcribed, the result is something like [dætǝ] (additional audio examples of reduction can be found at <http://goo.gl/0MN2es>). This type of reduction is extremely common in everyday speech: a recent study of conversational speech found that, on average, 25% of words differ from their “correct” dictionary pronunciation (Dilts, 2013). Research on this topic has consistently shown that reduced speech is more difficult to process (e.g., Ernestus et al., 2002; Tucker, 2011; Van de Ven et al., 2012). I describe and discuss several investigations of speech production and spoken word recognition that explore how listeners recognize spontaneous speech.

2:00 – 2:30 – Charlotte Vaughn & Tyler Kendall (U of Oregon) — Bootstrapping techniques for vowel formant estimation

Recent years have seen increased focus on methodologies for acoustic vowel study. Much of this work has focused explicitly on techniques for vowel analysis, such as the development and evaluation of vowel normalization procedures (e.g., Clopper 2009, Fabricius et al. 2009, Flynn 2011, Thomas and Kendall 2007), resulting in more rigorous methods. However, sources of error exist in other facets of acoustic vowel research, and these other potential problems have been addressed less frequently. Specifically, there are clear limitations on the accuracy and precision of vowel measurements (e.g., Harrison 2004, 2007), as a function of linear predictive coding (LPC) methods as well as, of course, noise in the acoustic signal being studied. For instance, different measurement points and different LPC settings (such as those for the formant analysis procedure in Praat) are known to yield different results (Boersma & Weenink 2013, Duckworth et al. 2007). Researchers are generally well aware of the need to consider inter-analyst differences in their acoustic work. Yet less research has explicitly or quantitatively studied the extent to which these differences matter for the outcome of an investigation (Duckworth et al. 2007, Harrison 2004, 2007). In this presentation, we consider the sources of error in common formant extraction techniques, investigating the extent to which the delimitation of vowel boundaries and software (Praat) settings influence the formant values obtained. To do this, we report the results of a vowel measurement simulation in which, rather than extracting a single measurement for each vowel, thousands of measurements are taken per vowel, with varied settings, at jittered measurement locations (seeded by measurements from a human analyst); vowel tokens are then treated as distributions of probable formant frequencies instead of simple points or vectors in scatter plots. Such an approach, we argue, yields important insight into the bounds of measurement error in vowel analysis.
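For readers who want to try this approach, the core resampling loop can be sketched in a few lines of Python. This is a minimal illustration of the jittered-measurement idea described in the abstract, not the authors’ implementation: the measure_formant helper is a hypothetical stand-in for a real LPC formant tracker (such as Praat’s Burg algorithm), and the jitter and ceiling values are invented for illustration.

```python
# Minimal sketch of bootstrapped formant measurement (illustrative only;
# not the authors' code). Each vowel token is re-measured many times at
# jittered time points with varied LPC ceilings, so the token becomes a
# distribution of plausible (F1, F2) values rather than a single point.
import random

def measure_formant(wav_path, time_s, lpc_ceiling_hz):
    """Hypothetical stand-in for a real LPC formant tracker (e.g. Praat's
    Burg algorithm); returns fake (F1, F2) values in Hz so the sketch runs."""
    rng = random.Random(hash((wav_path, round(time_s, 4), lpc_ceiling_hz)))
    return 500 + rng.gauss(0, 30), 1500 + rng.gauss(0, 80)

def bootstrap_token(wav_path, seed_time_s, n=1000,
                    jitter_s=0.01, ceilings=(5000, 5250, 5500)):
    """Resample one vowel token: jitter the analyst-seeded measurement
    point and vary the LPC ceiling, collecting n (F1, F2) samples."""
    samples = []
    for _ in range(n):
        t = seed_time_s + random.uniform(-jitter_s, jitter_s)
        samples.append(measure_formant(wav_path, t, random.choice(ceilings)))
    return samples  # a distribution of probable formant frequencies

f1s, f2s = zip(*bootstrap_token("vowel.wav", seed_time_s=0.125))
```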

2:30 – 3:30 @ 255 Straub Hall – Coffee and pastries

3:30 – 5:30 @ EMU Oak Room – Business Meeting

5:30 – 7:30 @ EMU Gumwood Room – Posters & Reception

7:30 – … @ Falling Sky Brewing House – Dinner

June 5

145 Straub Hall

9:30 – 10:00 – Coffee and bagels

10:00 – 10:30 – Melissa Baese-Berk (U of Oregon) — Disruptions to perceptual learning of non-native speech sounds

The relationship between speech perception and production is complex. There is preliminary evidence suggesting that simply producing tokens during training can disrupt some aspects of perceptual learning (e.g., Leach & Samuel, 2007). I will present a series of experiments examining factors that may influence this disruption. Specifically, I examine whether the disruption is truly due to producing tokens during training, or is instead attributable to shifting between perception and production tasks during training. Further, we examine whether experience with the contrast may alleviate the disruption. Native Spanish listeners were taught a non-native phonemic contrast in Basque. In Experiment 1, we examine discrimination performance after training in perception alone, or training in perception plus production; in the latter, listeners repeat tokens on every trial. Listeners are either naïve, with no experience with the trained contrast, or late learners who have spoken Basque for several years. In Experiment 2, rather than repeating the training tokens, listeners read an unrelated letter aloud on each perceptual training trial. In Experiment 3, listeners respond to the unrelated letter with a button press, rather than reading it aloud. The results suggest that the disruption of perceptual learning is influenced by multiple factors, including experience with the contrast and the production of the specific training token.

10:30 – 11:00 – Ashley Farris-Trimble (Simon Fraser U) — No activation without representation! The PhoProLab’s look at representations and how we access them

In order to recognize a word, the listener must match the incoming speech signal with a lexical representation. This task, already a complex one, can be further complicated by phonological processes that alter the phonetic form. In this talk, I present the results of several experiments that examine how listeners process words in which multiple phonological processes interact opaquely. I will also examine whether a participant’s production of opaque forms is related to their perception of the same words. I’ll close with a discussion of the methods and pitfalls involved in studying abstract representations, both in adults and in children’s developing phonologies.

11:00 – 11:30 – Gunnar Hansson (U of British Columbia) — Long-distance phonotactics: Constraints, learning, and constraints on learning

I will describe different strands of an ongoing research project which investigates to what extent the cross-linguistic typology of non-adjacent interactions in segmental phonology (e.g. consonant harmony, long-distance dissimilation) is shaped by cognitive limitations—inductive biases, heuristics, etc.—on the ability of learners to detect and internalise such dependency patterns. From a formal-theoretical perspective, one strand of this work suggests that current constraint-based analyses, which posit an abstract relation of surface correspondence as the vehicle for non-adjacent phonological interactions (e.g. my own previous work on consonant harmony), are on the wrong track. Instead, a return to some notion of “tier” or “projection” seems warranted, though as a property not of phonological representations (as in autosegmental phonology, e.g. feature geometry, underspecification, etc.) but rather of the phonotactic constraints themselves. Moreover, such notions may find support in recent developments within computational phonology (e.g. the Tier-based Strictly Local class of formal languages). A different strand of the project involves using experimental methods to uncover and explore relevant learning biases in a controlled laboratory setting, such as with artificial language learning experiments or speech perception experiments. I will briefly report on a few studies in this vein (some complete, others underway or in the planning stages) which investigate the sensitivity of non-adjacent phonotactic dependencies to such factors as locality relations, trigger-target similarity or the nature of intervening segmental material.
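To make the Tier-based Strictly Local idea concrete, here is a toy sketch (my illustration, not Hansson’s analysis): a long-distance constraint such as sibilant harmony can be stated as a strictly local ban on adjacent symbols once only the sibilants are projected onto a tier, so the locality lives in the constraint rather than in the representation.

```python
# Toy Tier-based Strictly Local (TSL-2) checker, for illustration only.
# Project the tier segments out of the string, then ban certain 2-grams
# on the projected tier.
TIER = {"s", "ʃ"}                    # segments visible on the tier
BANNED = {("s", "ʃ"), ("ʃ", "s")}    # disharmonic pairs, adjacent on the tier

def tsl2_ok(word):
    """True if no banned pair is tier-adjacent (sibilant harmony holds)."""
    tier = [seg for seg in word if seg in TIER]
    return all(pair not in BANNED for pair in zip(tier, tier[1:]))

print(tsl2_ok("sonosa"))  # True:  tier is s...s, harmonic
print(tsl2_ok("sonoʃa"))  # False: tier is s...ʃ, disharmonic no matter
                          #        how much material intervenes
```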

11:30 – 12:00 – Kaori Idemaru (U of Oregon) — Beer or pier? Online tuning of phonetic categories in perception

Speech processing requires sensitivity to long-term regularities of the native language, yet demands that listeners flexibly adapt to perturbations that arise from talker idiosyncrasies such as a non-native accent. This talk describes the statistical learning of correlations between the acoustic dimensions that define phonetic categories as a mechanism for tuning in to the specific characteristics of the current input. In our experimental paradigm, native English listeners hear “accented” words (e.g., beer and pier) in a word recognition task. In these “accented” words, the way fundamental frequency (F0) relates to voicing categories (signaled by voice onset time, VOT) is manipulated such that the F0/VOT correlation is the opposite of what listeners are accustomed to from their long-term experience with English. Our results show that short-term experience with this kind of speech influences listeners’ online perception of voicing categories: the role of the F0 dimension in defining stop voicing categories is eliminated or greatly diminished. This indicates very rapid online learning of the distributional statistics of acoustic dimensions as they define speech categories in the immediate speech context. Our results also show that this learning is contingent on experience: listeners do not readily transfer the learned pattern to the processing of a new speech category, or even to the same speech category appearing in a new lexical context. This type of learning, operating at the fine-grained level of acoustic dimensions, may explain the adaptive nature of our phonetic representations, which enable invariant perception of a variable speech signal.
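The manipulated cue correlation can be shown schematically as follows (the numbers are invented for illustration and are not the study’s stimulus values): in typical English productions, longer VOTs co-occur with higher onset F0, and the “accented” stimuli reverse that pairing while leaving VOT itself intact.

```python
# Schematic of the cue manipulation (illustrative values only; not the
# study's stimulus specifications). English listeners expect long-VOT
# "pier" to carry higher onset F0 than short-VOT "beer"; the "accented"
# condition reverses the F0/VOT pairing.
canonical = {                      # the familiar English correlation
    "beer": {"vot_ms": 10, "f0_hz": 200},   # short VOT, lower F0
    "pier": {"vot_ms": 70, "f0_hz": 280},   # long VOT, higher F0
}
accented = {                       # F0 cue now points the "wrong" way
    "beer": {"vot_ms": 10, "f0_hz": 280},
    "pier": {"vot_ms": 70, "f0_hz": 200},
}
```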

12:00 – 1:30 – Lunch

1:30 – 2:00 – Molly Babel (U of British Columbia) — Attention and expectations in abstractions versus phonetic detail

Theories about speech seem to have moved beyond the battle between abstraction and phonetic detail. Instead, researchers acknowledge the coexistence of multiple levels of linguistic knowledge. In a model of language that accommodates both episodes and abstraction, the question becomes: when does a listener exploit one level of representation over another? The literature suggests that phonetic variability, competing task demands, and cognitive load all contribute to which level of representation “wins” in a particular perceptual experience. In this talk I provide an overview of recent experiments from my lab that examine the when and why of this negotiation between abstraction and episodes. The results suggest that listeners’ goals and expectations, along with task demands and instructions, moderate the use of phonetic detail in speech perception.

2:00 – 2:30 – Volya Kapatsinski (U of Oregon) — Addition and subtraction

I report on ongoing work exploring the role of first-order product-oriented schemas in the acquisition of alternations. Previous work (Kapatsinski 2013, Lg) has shown that examples of singular-plural pairs like butʃ~butʃi favor mit~mitʃi (a better match to the form-meaning pairings observed in training) over mit~miti (a better match to the change), suggesting an important role for generalization within paradigm cells in the acquisition of alternations (see also Pater & Tessier 2006, ICPhS). Current work extends the investigation to addition and deletion alternations. For example, I show that about half of the participants presented with a simple deletion language exemplified by baluka~baluk, sinoku~sinok, kareku~karek, lanipi~lanip prefer kupa~kupak over kupa~kup, as long as the first-order schema CVCVk characterizes most plurals. Most participants exposed to a simple k-addition language exemplified by balu~baluk, sino~sinok, kare~karek, lani~lanik prefer kupaka~kupak over kupaka~kupakak, suggesting extraction of a CVCVk or CVCVC schema. These results continue to suggest generalization over form-meaning pairings within paradigm cells (Bybee 2001, Kapatsinski 2013), i.e. cell-specific phonotactics or first-order schemas. However, the fact that half of the participants in the deletion language do learn and consistently apply a simple deletion rule suggests that the time course of learning proposed in Kapatsinski (2013) and Nesset (2008) for acquiring morphophonology is in need of revision: second-order schemas (arbitrary paradigmatic mappings) need not involve mappings between previously acquired first-order schemas, and may compete with first-order schemas during production. The results also point to possible diachronic pathways from addition to subtraction and vice versa.
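The notion of a first-order, product-oriented schema as cell-specific phonotactics can be illustrated with a toy template check (my sketch, not Kapatsinski’s model): a CVCVk schema over plural forms accepts kupak while rejecting both the change-based kup and the over-applied kupakak.

```python
# Toy sketch of a first-order product-oriented schema as cell-specific
# phonotactics (illustrative only): candidate plurals are evaluated
# against a CVCVk template, regardless of what change relates them to
# the singular.
import re

V = "aeiou"
CVCVK = re.compile(rf"^[^{V}][{V}][^{V}][{V}]k$")  # the plural schema

def fits_plural_schema(form):
    """True if a candidate plural matches the CVCVk schema."""
    return bool(CVCVK.match(form))

print(fits_plural_schema("kupak"))    # True:  schema-conforming plural
print(fits_plural_schema("kup"))      # False: change-based deletion output
print(fits_plural_schema("kupakak"))  # False: blindly re-applied k-addition
```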

2:30 – 3:00 – Wrap-up

3:00 – … – Socializing