Posted on 2014/11/14 by

Poetry As Waveforms, An Experiment

The Idea

Faced with the task of engaging with waveforms, I decided to use a program that I have used for many years now for the purpose of audio recording: Adobe Audition 3.0. My idea was to have my peers and colleagues read three different poems, record the readings with Audition, and then compare how the waveforms differed from person to person for each poem. I decided upon William Wordsworth’s “I Wandered Lonely As A Cloud,” William Shakespeare’s “Sonnet 130,” and finally the first forty lines of John Milton’s Paradise Lost.

The Setup

An eight-year-old Toshiba Satellite laptop with a pirated version of Adobe Audition 3.0.
A Tapco LINK FireWire 4X6 audio interface (basically the channel between the microphone and the computer). I kept the input gain on the audio interface set at 3/4 for every recording.
An EV N/D767a vocal microphone using an XLR to XLR connection.
Room 666 of the Concordia University Library building in Montréal, QC.
November, 2014.

The Method

Encountering people at random on the sixth floor of the Library Building, I posed to each one of them the simple question, “Want to read some poetry?” If they agreed, I brought them to the recording room and handed them the poems, which I had printed beforehand on plain white paper, all in the same font size and style. I then instructed each person to stand three inches away from the microphone and to begin reading. The readers were only allowed to look at the texts seconds before they began reading them. I also instructed them to continue reading even if they made a mistake. I did not tell any of the readers any details about my project, or its purpose, until they had finished reading (not that I was so sure about what the purpose was myself!). Each reader read the three poems in the same sequence – Milton, Wordsworth, Shakespeare.

The Process

The process of this boot camp was actually quite entertaining. Many individuals became quite anxious at the sight of a microphone and at least three out of the twelve participants left the room fanning themselves in an attempt to relax and cool down. The most entertaining incident was with my friend and colleague Alanna Bartolini who jokingly compared the process to both a “shotgun wedding” and “being Shanghaied,” yet then after reading, declared her will to hunt down others so that they could (jokingly) suffer a similar fate, which she did indeed do. I recorded twelve individuals and then decided to stop as I believed I had plenty of data with which to work.

The Participants

With the exception of Judith Herz and David McGimpsey, whom Alanna hunted down and who were therefore not obtained randomly, the participants were encountered at random: Carolyn Gilmore, Courtney Church, Megan Fitz-James, Stephi Stavropoulos, Lauren Turner, Alanna Bartolini, Jess Elkaim, Ian MacDonald, Meredith Evans, and one male volunteer who decided to remain anonymous – we’ll call him John.

The Waveforms

Please note that some of these visuals “overlap” one another.

Paradise Lost Waveforms

PL 1 Alt

PL 2

PL 3

Wordsworth Waveforms

WW 1

WW 2

WW 3

Shakespeare Waveforms

130 1

130 2

130 3

Observations

Time

Okay, down to business. At first glance, the most obvious characteristic of the Paradise Lost waveforms is how similar the length of each reading is from student to student. Alanna was the fastest reader with a time of 1:04 and Lauren was the slowest with a time of 1:24. Interestingly, every Concordia professor read this poem even slower than Lauren – Judith and Meredith each finished at 1:30 and David finished at 1:37. Though Judith had to run off and so only did the Paradise Lost recording, Meredith and David both also exceeded the longest student reading in each of the subsequent recordings of Wordsworth and Shakespeare.
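To put those times side by side, here is a small worked example in Python. It uses only the five Paradise Lost times quoted above (the other students' times are not listed here, so they are left out); the helper name `to_seconds` is just an illustrative choice.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'm:ss' string such as '1:24' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Paradise Lost reading times as reported in the post.
times = {
    "Alanna": "1:04",    # fastest student
    "Lauren": "1:24",    # slowest student
    "Judith": "1:30",    # professor
    "Meredith": "1:30",  # professor
    "David": "1:37",     # professor
}
seconds = {name: to_seconds(t) for name, t in times.items()}

# Express every reading relative to the fastest one.
fastest = min(seconds.values())
for name, secs in sorted(seconds.items(), key=lambda kv: kv[1]):
    print(f"{name:9s} {secs:3d} s  ({secs / fastest:.2f}x the fastest reading)")
```

On these numbers, the slowest professor reading (97 seconds) is roughly one and a half times as long as the fastest student reading (64 seconds).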

Amplitudes

I suppose the second most prominent characteristic of these waveforms is how the amplitudes of the waveforms differ from person to person in all three recordings. Anonymous John’s wave amplitudes are literally off the charts. Judith, Ian, Megan, Courtney and Carolyn’s wave amplitudes are most often significantly larger than those of David, Stephi, Lauren, Alanna, Jess and Meredith. I’ve concluded from the live observation of these recordings that these differences in amplitudes can be attributed to each reader’s enthusiasm – the more enthusiastic each reader was, the louder their voice was, and so they created larger wave amplitudes. Yet enthusiasm does not necessarily mean a more accurate reading. Judith, Alanna, David and Meredith’s readings of Paradise Lost were all technically more accurate than the rest, in the sense that they pronounced every word of PL correctly, whereas in every other reading there were at least one or two errors. Yet, out of these “perfect” readers, only Judith’s wave amplitudes fall into the significantly larger group. I should also note that Alanna’s reading was the quickest, and the professors’ readings the longest, yet it was these four people that read PL “perfectly.”
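One way to turn "larger" and "smaller" amplitudes into a number is the root-mean-square (RMS) level of the samples, expressed in decibels relative to full scale. The sketch below is not something done in this experiment; it assumes samples already scaled to the range -1.0 to 1.0, and it uses two synthetic sine waves as stand-ins for a loud reader and a quiet reader.

```python
import math

def rms(samples):
    """Root-mean-square level of samples scaled to the range -1.0..1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def to_dbfs(level):
    """Express a linear level (1.0 = full scale) in decibels."""
    return float("-inf") if level <= 0 else 20 * math.log10(level)

def sine(amplitude, freq_hz=220, rate=8000, seconds=1.0):
    """Synthetic test tone; a stand-in for a voice, not real data."""
    n = int(rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]

loud = sine(0.8)   # hypothetical "enthusiastic" reader
quiet = sine(0.2)  # hypothetical "quiet" reader

difference_db = to_dbfs(rms(loud)) - to_dbfs(rms(quiet))
print(f"{difference_db:.2f} dB")  # a 4x amplitude ratio is about 12 dB
```

A single RMS figure per recording would make it possible to rank the readers by loudness instead of eyeballing the height of each track.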

Spikes

Interestingly, many of the PL waveforms are riddled with spikes in wave amplitude, especially those of Megan, Courtney, Stephi and Carolyn, whereas in the WW and Shakespeare recordings, spikes in amplitude are few and far between in general. Listening to these spikes, I determined that some of them can be attributed to a strong “P” sound – many musicians actually use pop guards or pop filters when recording vocal audio to prevent these particular amplitude spikes (fun fact: back in my earliest recording days I followed my friend’s advice to stretch some dollar store leggings across a wire coat hanger to create a makeshift pop guard). Yet, apart from these “P” spikes, many of the spikes can be attributed to a simple, seemingly random increase in vocal volume.
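Spikes like these could also be flagged automatically rather than by ear. The following is a minimal sketch under assumptions of my own: it splits the samples into fixed windows, takes the peak of each window, and marks any window whose peak is more than `factor` times the median peak. The window size and the factor of 3 are arbitrary illustrative values, and the test signal is synthetic.

```python
from statistics import median

def window_peaks(samples, window):
    """Largest absolute sample value in each consecutive window."""
    return [
        max(abs(s) for s in samples[i:i + window])
        for i in range(0, len(samples), window)
    ]

def find_spikes(samples, window=400, factor=3.0):
    """Indices of windows whose peak exceeds factor x the median window peak."""
    peaks = window_peaks(samples, window)
    if not peaks:
        return []
    typical = median(peaks)
    if typical <= 0:
        return []
    return [i for i, p in enumerate(peaks) if p > factor * typical]

# Synthetic signal: steady low-level alternation with one sudden loud sample.
samples = [0.1 * (-1) ** i for i in range(4000)]
samples[2100] = 0.9  # the "spike", e.g. a strong "P"

print(find_spikes(samples))  # window index 5 covers samples 2000..2399
```

Counting flagged windows per recording would give a rough spike tally for comparing the PL readings against the WW and Shakespeare ones.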

Patterns

Another glaring characteristic of the PL waveforms is that they are, for the most part, totally dissimilar to each other. Looking at the vocal pauses in said waveforms (represented by the points in each waveform where the wave amplitude = 0), the patterns of the rise and fall of individual sections of the waveform, and finally the spikes in amplitudes, I find it very difficult to describe any sort of similarities or common patterns across different readers’ waveforms. However, if one takes a look at the waveforms of the Wordsworth reading, it is very easy to see a common pattern. Every person spoke in such a way as to create a waveform that repeats a pattern of high to low wave amplitude – this pattern is perhaps most obvious in Jess, David and Megan’s Wordsworth waveforms. If one listens to said waveforms, one will recognize that this repeating rise-fall pattern is actually representative of each line of WW’s poem – in other words, readers began each line of the poem at a loud volume which then slowly tapered off to a quiet volume by the end of the line. Additionally, a similar pattern can be detected in the waveforms for the Shakespeare reading, yet it is not nearly as obvious or universal as in the WW reading. This pattern can be seen clearly in anonymous John, Jess and Courtney’s Shakespeare readings, but the other Shakespeare readings have relatively constant wave amplitudes throughout. However, amazingly, some of the Shakespeare waveforms show a constant “up-down, up-down” amplitude fluctuation (for lack of a better explanation), seemingly revealing visually the unstressed, stressed syllable pattern of iambic pentameter that resides in Shakespeare’s sonnet. This pattern is especially apparent in Megan, Jess and Ian’s readings of Sonnet 130. One can also observe this pattern in the PL recordings, particularly in Ian and Jess’s waveforms.
Finally, readers seem to have higher average amplitudes in the later recordings than they do in the earlier recordings, particularly Lauren, Meredith, and Alanna.
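The pauses above are described as points where the wave amplitude = 0. In a real room recording the level rarely reaches exactly zero because of background noise, so a sketch for locating pauses needs a small threshold instead. The threshold and the minimum pause length below are assumed values for illustration, and the signal is synthetic.

```python
def find_pauses(samples, rate, threshold=0.02, min_seconds=0.25):
    """Return (start, end) times in seconds where |sample| stays below threshold."""
    pauses = []
    start = None  # index where the current quiet run began, if any
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if start is None:
                start = i
        else:
            # A quiet run just ended; keep it only if it lasted long enough.
            if start is not None and (i - start) / rate >= min_seconds:
                pauses.append((start / rate, i / rate))
            start = None
    # Handle a quiet run that lasts until the end of the recording.
    if start is not None and (len(samples) - start) / rate >= min_seconds:
        pauses.append((start / rate, len(samples) / rate))
    return pauses

# Synthetic example: one second of "speech", half a second of silence, then speech.
rate = 1000
speech = [0.5 if i % 2 else -0.5 for i in range(1000)]
samples = speech + [0.0] * 500 + speech

print(find_pauses(samples, rate))  # [(1.0, 1.5)]
```

The minimum length matters because a voice waveform passes through zero many times a second; only quiet stretches lasting a noticeable fraction of a second should count as pauses. Lining up pause times across readers would be one way to test whether the line-by-line pattern in the WW readings really is shared.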

Speculations

Time

In each and every reading, professors took longer to read than even the slowest reading student. Given this fact, we might speculate that university professors have become accustomed to a slower, instructional style of speech, one that a listener can follow, take notes on etc. We also might speculate that professors are used to reading poetry in a dramatic style – indeed, Judith Herz’s reading of PL is spine-tingling! I have posted it below. This dramatic style, filled with pauses and eloquent, un-rushed speech necessarily takes longer to perform. Yet in terms of accuracy of pronunciation, Alanna’s reading suggests that a quicker reading can still be an eloquent, accurate reading, though perhaps not as dramatic.

Judith Herz’s reading of PL. I have converted it to mp3 format for compatibility.

Amplitudes

If you want a summary of what I learned about amplitudes, it is as follows: Enthusiasm = higher vocal volume = bigger wave amplitudes. Riveting. Here’s what I found more interesting: As far as I am concerned, there is no doubt that Judith Herz’s reading of PL created in me the largest affective response of all the PL readings. Alanna expressed that she had “chills” after Judith had finished reading, and I must admit, for reasons I cannot explain, that I could not help but keep my hands over my mouth to stifle some unknown emotion as she read. Let us recall that Judith’s reading was the only “perfect” reading with large wave amplitudes throughout. Other readers did show wave amplitudes of similar height to Judith’s, but there are very few pauses in their recordings and their recordings contain errors in pronunciation. There is something about the combination of good reading, high wave amplitudes and short pauses that lends Judith’s reading its superior dramatic quality.

Spikes

Why are the PL readings riddled with amplitude spikes and other readings relatively stable in terms of amplitude? If we accept that PL is much more difficult to read than “I Wandered…” or “Sonnet 130,” it would seem that the more difficult the text is (indeed, I chose the first forty lines of PL in hopes that readers would flub their readings), the more mistakes are made and the more spikes we have in the waveforms. After listening to these spikes, however, I noted that they are not the points at which the readers make their mistakes; instead, the spikes are seemingly random. Thus, we might speculate that the difficulty of a text disrupts the flow of a reading and that this somehow leads readers to raise the volume of their voices much more often than in a simpler text.

Patterns

As with spikes, we must consider the difficulty of the readings when we analyze the waveform patterns. Given the hesitation and stumbling over words in the majority of the PL readings, it comes as no surprise that the waveforms are essentially incomparable with respect to patterns. The fact that one can recognize common waveform patterns in both the Wordsworth and Shakespeare readings suggests that the readers read these poems in a more similar manner than they did PL, a phenomenon that suggests people read easier texts aloud in similar ways. In addition, the repeating pattern of high amplitude declining to low amplitude, which in fact represents the reading of individual lines of the WW poem, suggests that the form/spacing of a poem on paper influences how one controls the volume and inflections of one’s voice when reading it aloud. It would be interesting to see if these patterns would repeat if I printed each poem, but specifically the WW poem, as one giant paragraph with no deliberate line spacing or punctuation and repeated the experiment. With respect to the meter of the poems, we can see the unstressed, stressed syllable pattern in the Shakespeare reading and the PL reading, and I have included just below a close-up visual of each of Ian’s waveforms that illustrates this more clearly. There is thus something to be said about how iambic pentameter forces one to speak in a certain way – one inevitably foregrounds this unstressed/stressed pattern which would otherwise stay in the background (say, if one were to read Shakespeare or PL silently to oneself). Another observation is that the absence of a pattern in wave amplitude (i.e. a generally constant amplitude throughout a waveform) can be attributed to very little accentuation or inflection of words on the part of the reader, perhaps suggesting that certain readers fail to lend much more to a poem when they read it aloud compared to when they read it silently.
Finally, the overall increase in average wave amplitude (i.e. average amplitudes are lowest in the PL reading, slightly higher in WW reading and even higher in Shakespeare reading) shows that the more poems a reader read (remember, each reader read the three poems in the same order), the higher the volume of their voice became, a phenomenon that suggests that the more they read, the more readers gained confidence and became assertive about their reading.

Ian’s Waveforms – Note how his PL reading and his Shakespeare reading each have many “thin” wave peaks. I believe these to represent individual stressed syllables. Ian’s WW waveform differs and has thicker wave peaks because he seemingly stresses, or does not stress, several syllables in a row, for there is no iambic pentameter to control his syllable stressing. It is not the easiest thing to observe – zoom in and look closely! The phenomenon is most obvious if you compare Ian’s PL recording with his WW recording – the Shakespeare recording is not as good an example.

Ian’s Paradise Lost Waveform (Thin peaks)

Ian PL

Ian’s Shakespeare Waveform (Thin Peaks)

Ian Shakes

Ian’s Wordsworth Waveform (Thick peaks)

Ian WW

Error

A major mediating factor in this project is the Audition interface. Though I could alter the vertical width of each track manually (bigger width = easier interpretation of each waveform), I could not figure out how to make all the tracks a uniform width, which might have been a problematic factor when it came time to start interpreting the waveforms. Also, I could only fruitfully analyze so many waveforms (about five) at a time because of the size of my computer screen, and so I was constantly scrolling and switching between windows while analyzing. In retrospect I should have printed out the waveforms so I could view them all at once on paper. The quieter readers produced waveforms that were difficult to decipher in terms of amplitude patterns, spikes etc., for they maintained almost the exact same volume throughout their readings and sometimes exhibited minimal accentuation or inflection. I could have boosted the amplitudes with some Audition tools to make the waveforms more readable, or turned up the input gain on the audio interface during recording to accomplish the same goal, but I opted not to do these things in favor of maintaining a constant recording process for each reader and a constant interpreting process for myself. Also, there was plenty of potential for error/inconsistency in the recording process itself, the main problem being the distance of each reader from the microphone (it was hard to get all readers to maintain my suggestion of staying three inches away from the microphone throughout the entire reading). We must also not forget the influence that I had on the readers – perhaps if a reader were reading something aloud while alone in an attempt to improve their comprehension of a text, they would not have read it like they did in this experiment.
Finally, and this is a big one, a lot of the assertions I have made in this speculation section have been based on my VISUAL interpretation of these waveforms, yet the amplitudes of these waveforms do in fact have numerical values.
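For anyone wanting to get at those numerical values, here is a minimal sketch using Python's standard `wave` module. It assumes each reading has been exported as a 16-bit mono PCM WAV file, which may not match the format actually used in this project; the function names are illustrative.

```python
import struct
import wave

def read_wav_mono16(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, samples in -1.0..1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    ints = struct.unpack("<%dh" % (len(frames) // 2), frames)
    return rate, [v / 32768.0 for v in ints]

def summarize(path):
    """Duration, peak and mean absolute amplitude of one recording."""
    rate, samples = read_wav_mono16(path)
    return {
        "seconds": len(samples) / rate,
        "peak": max((abs(s) for s in samples), default=0.0),
        "mean_abs": sum(abs(s) for s in samples) / len(samples) if samples else 0.0,
    }
```

Calling `summarize` on each exported file (the file names would be whatever the recordings were saved as) gives one row of numbers per reader and poem, which could then be compared directly rather than judged from the height of a track on screen.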

What I Probably Should Have Done

In retrospect, considering that the major differences in these readings were between professors and students, it might have been a more fruitful endeavor to compare the readings of varying age groups, or varying levels of expertise in the field of English etc., and to see what results these sorts of comparisons might have revealed. I also should have tried to have an equal number of male and female readers, so that I could have speculated on the ways in which gender impacts a waveform. As this experiment stands, there was insufficient variety of readers with respect to age, expertise, gender… what you will… to speculate about how different types of people read aloud… sorry.

What Did I Even Learn, Man?

Frankly, I am both satisfied and dissatisfied about the fruits of my labor. I thought that the evidence of iambic pentameter in the waveforms was really interesting, and was not something I anticipated at all. I also think that some of my speculations have merit – how frequent amplitude spikes in more difficult texts seem to say something about the nature of stumbling over one’s words – how dramatic readings take longer and are filled with more pauses – how professors tend to read more dramatically or in an instructive fashion… I could go on, but what does that all even mean anyway? A case study of twelve people is hardly something with which one can start making universal, useful claims – thus we have the source of my dissatisfaction. What I think I can take away from this experiment is a series of questions about the nature of reading aloud and recording readings as waveforms. Does reading something aloud add something to a text, as in, something beneficial to a listener/reader that would otherwise not have been experienced (e.g. can reading aloud contribute to a deeper understanding of a text?)? Do different styles of reading have bigger impacts with respect to these ideas? Is the waveform itself a beneficial visualization of vocal audio for scholars? Has my study of waveforms helped me reach any new or beneficial conclusions/speculations? Does the process of being recorded alter the way in which one would go about speaking if they were speaking only to themselves, to a friend, to a stranger? Does saying something in a certain way affect its meaning/how a listener receives the message? Can we detect rhetorical or dramatic styles of speech or their absence through the study of waveforms? I think that this experiment allows me to answer a simple, tentative “yes” to every one of these questions, but to go beyond that is going to take much more work!
