Benjamin Teitelbaum

Brown University

May 2008




This article was conceived as part of a larger study. Please contact the author if you would like further information.



Describing Music


The title of this paper may seem curious to some. Some understand description to be an act devoid of human intervention, an unaltered mirroring of The External. Description is indiscriminate, astylistic, and apolitical. Music theorists perpetuate this understanding by juxtaposing description and analysis: Music theory pedagogue Michael Rogers writes that “the goal of description…is to collect information,” whereas “analysis seeks to answer ‘how’ and ‘why’ questions” (2004:74-75). Rogers sees description as a necessary, but separate prerequisite to analysis. As he sees it, it is only once we begin to analyze, to move beyond mere description, that we show our intellectual, stylistic, and political selves.

In the following pages I discuss modes of thought that shape our experience of musical sound, that are historically situated, politically motivated, and offer mutually contradicting accounts of reality. Thus, this paper focuses on analytical thinking. Why not then title the paper “Analyzing Music”? Because my interest is not only in analysis, but also in the ways analyses are understood and presented. I am interested in how music theorists, lay and professional, imbue their analyses with power by calling them description.

This paper is devoted to highlighting a topic and accompanying method for future research. In this paper I argue that musical thinking builds from a plurality of descriptive modes. This assertion undermines notions of description as non-analytic by suggesting that there is no essential account of The External. I also argue, however, that this plurality is not always apparent, and that communities can, at times, elevate a single descriptive mode. This act of essentializing a descriptive mode not only gives rise to understandings of description as non-analytic, it also reflects and reinforces power relations in a society. I draw from musicologists, cognitive scientists, and political theorists all commenting on Western individuals or communities, and the conclusions I draw from their work should be regarded accordingly.

Thinking Musically through Descriptive Modes

Music philosopher Stephen Davies writes of the process of becoming familiar with a new piece of music. He creates a hypothetical situation in which a woman, Cecilia, hears a Mozart sonata for the first time. Because of her unfamiliarity with the genre, she is unable to engage the work analytically. Her commentary on the sonata is thus relegated to insubstantial and unrevealing description. All she can say is:

that some tune begins here and ends there, that later it is repeated, that parts of it are recalled in the bits between the main tunes, that it has an expressive character which distinguishes it from the other main themes…(1994:69)

The method by which Cecilia describes the piece is not, as Davies implies, an unbiased account of Mozart’s work. Sections, cadences, and melodic figures dominate her description. Why doesn’t she mention the other musical elements, such as rhythm, or timbre—the musical element psychologists claim to be most perceptually salient? [1] Do her selections result from innate tendencies of human cognition? Does the sound object itself possess qualities that encourage this kind of framing? Rather, I believe, her selections were framed by what I call a descriptive mode. Descriptive modes act like Anderson’s schemata (1977), Lakoff’s idealized cognitive model (1987), and Theberge’s conceptual infrastructure (1997:166). They are systems of thought shaping categorization and organization of the outside world, bringing meaning and understanding to even our most basic sensory intake. What I seek to emphasize in distinguishing “descriptive modes” is that these systems are communicative. Though they impact our private mental procedures, descriptive modes are public arenas: They can be shaped, reflected upon, imposed and rejected. A closer look at musical practice highlights the nature and implications of these modes.

Davies argues that basic musical thought (that of recognizing intervals, simple form, etc.) occurs through an astylistic, unbiased, and therefore solitary descriptive mode. But converging research from various subfields of music scholarship suggests otherwise. Lerdahl and Jackendoff (1983) argue that trained musicians possess multiple types of musical knowledge. They identify separate cognitive patterns of categorization designated for different levels of rhythmic, melodic, and harmonic density in Western, tonal compositions. They claim, thus, that musical thinking, the type that gives way to initial descriptions of sound, is informed by a plurality of descriptive modes.

While they posit the existence of multiple descriptive modes, Lerdahl and Jackendoff still claim that all these modes function together as a whole (1983:278). Later scholars revised this stance, suggesting instead that musical thinking comprises conflicting and contradictory accounts. Mary Louise Serafine argues that, in the West, the categories used to describe and analyze music play a marginal role in listening experience (1988:7). Serafine claims that music notation caused the disconnect between the categories we hear and the categories we use to analyze. Scales and chords, the two musical elements best represented in Western notation, are also the elements most often used in formal and casual analysis (1988:42-67). Serafine thus provides an example in which two descriptive modes, two methods and frames for describing musical sound, come into conflict with each other. Fiske (1996) elaborates upon Serafine’s analysis, noting that the two contrasting descriptive modes derive from different sensory experiences with music: one aural (the sound itself), and the other visual (notation). Given that the two sensory experiences maintain different descriptive modes, and given that musical experience always engages multiple senses (Baker 2001, Raffman 1993, and Kalakoski 2001) [2], we can assume that there will always be diversity and conflict within our understanding of music.

Heteroglossia, Music, and Power

Why do Serafine’s listeners choose one descriptive mode over another? And what is the significance of the fact that, while employing a descriptive mode tied to visual experience, we claim to speak about sound? Linguistic and political theory can shed light on these questions. Musical understanding and language function in similar ways. Both allow us to organize and find meaning in the physical world; thus both are descriptive modes. Both index political boundaries and social status, and both are seldom reflected upon in everyday practice. Further, our understanding and treatment of these descriptive modes reveal much about extant power relations in society.

Russian linguist Mikhail Bakhtin argues that language use is always diverse. In any language community, individual speakers communicate through multiple linguistic genres. Bakhtin titles this collective of genres heteroglossia. Professional, regional, economic, and gender-based factors can shape these genres. Each genre upholds a distinct view of reality, and these views, despite emanating from a single individual, can be contradictory (Bakhtin 1981:289-290). Heteroglossia is not often recognized by speakers. This, Bakhtin suggests, is no accident: Given that dominant groups often seek to impose a particular notion of reality on others, their rule is threatened by the existence of multiple worldviews. Thus, by highlighting one linguistic genre as primary, dominant groups elevate a single worldview, one that likely reinforces their position of power.

Bakhtin’s concept of heteroglossia has proved useful to music scholars in the past. McKay (2007) and Cone (1974) use the term to describe the way in which Western composers combine multiple voices in songs. Kaminsky (2005) uses the term with reference to the multiple and occasionally contradictory ways people talk about folk music. But I seek to apply Bakhtin’s concept beyond instances of verbal language in and about music. Like language, musical understanding is multifarious. People use contrasting descriptive modes to understand an infinite variety of music. But as with language, this diversity often remains hidden. Albeit in ways different from the case of language, the move to mask the diversity of musical understanding can also speak to ongoing power struggles.

Bakhtin highlights the prominent role of nationalism in obscuring heteroglossia. Nation-states, often seeking to instill ideologies of unity and solidarity, will try to eradicate examples of diversity from the public consciousness. Dominant forces often control central venues of public discourse (religious and educational institutions, media) and are therefore capable of declaring “official” or “standard” descriptive modes. This campaign to recast diversity is typically successful, at least as far as our public consciousness is concerned. Post-Marxist theorists Antonio Gramsci and Pierre Bourdieu claim that subordinate people still maintain a more accurate understanding of their situation. This understanding exists in private settings, however, and is not always articulated clearly to others, or even to ourselves.

Gramsci once wrote, “Various philosophies or conceptions of the world exist, and one always makes a choice between them. How is this choice made? Is it merely an intellectual event, or is it something more complex? And is it not frequently the case that there is a contradiction between one’s intellectual choice and one’s mode of conduct? Which therefore would be the real conception of the world: that logically affirmed as an intellectual choice? Or that which emerges from the real activity of each man, which is implicit in his mode of action?” (1971:631). He argues that the personal realm of thought coexists with a notion of reality held by most members of a society. Often, these two perspectives, called internal and external thought, will contradict each other. A person’s lived experience may provide them with one definition of their situation, while the public worldview will indicate another. A central measure of domination, for Gramsci, is the ability of one group to impose its own account of reality on the public discourse. This allows a dominant group to promote a false explanation of subordinate peoples’ lived experiences, and in doing so solidifies existing authority and stifles resistance (Crehan 2002:115-119).

Bourdieu echoes Gramsci’s thoughts. He recognizes a rift between the public presentation of our actions and the actions themselves. Bourdieu claims that this behavior stems from our use of two contrasting definitions of reality, definitions he refers to as “official” and “practical” respectively (1977:33-38). He is interested in how and why we use these two definitions, particularly how and why certain definitions are “officialized.” He writes that the point of officializing a definition is to “transmute ‘egoistic’, private, particular interests into disinterested, collective, publicly avowable, legitimate interests” (1977:40). Thus, groups, often dominant groups, are able to conceal and preserve self-serving accounts by elevating and essentializing those accounts in the public consciousness.[3] All those who define their situation in ways different from the official are “condemned to appear unreasonable in seeking to impose [their] private reason” (ibid.). But the act of domination runs deeper than this. Groups can exercise power through officializing a definition because doing so “presupposes the competence required in order to manipulate the collective definition of the situation” (ibid.).

Combined, the theories of Bakhtin, Gramsci, and Bourdieu provide insight into discussions of musical thought in contemporary Western society. The above-mentioned studies by Lerdahl and Jackendoff, Fiske, and Serafine all point to the existence of diverse genres of musical thought, a situation mirroring the one Bakhtin describes in language. As with Bakhtin’s heteroglossia, music research claims that this diversity is contained both in communities and within individuals’ cognitive processing. However, Serafine argues that, in Western society, one genre, one descriptive mode, is elevated above all others. That mode is treated as the official and often exclusive means for conceiving of music in public arenas. As Bourdieu and Gramsci imply, and as Serafine’s example shows, this account may conflict with people’s lived experiences. The struggle over establishing an official mode is waged through mediums of public discourse, and the interests with control over those mediums have an advantage in the fight.

Taking contemporary America as an example, we can see that the music industry and the music academe both thrive on the prominence of a single descriptive mode—the former for the purpose of advertising, the latter for legitimation. By neglecting certain descriptive modes in favor of those advocated by these institutions, music fans, performers, and scholars reflect their subordinated social standing. In doing so, they also enhance existing power relations by displaying their subordinated position to others, instilling the single descriptive mode in the public consciousness and shackling alternative definitions in private obscurity.

Looking Forward

Janet McIntosh writes, “Power is everywhere…not only as an apparent universal that recurs across societies and history, but as an academic mantra for our times” (1997). Considering the impact of power-related scholarship on the social sciences, the lack of power-related research in music theory and music cognition is noteworthy. [4] My discussion thus far has highlighted a potential topic for study that would expand and integrate the fields of music cognition and theory into cross-disciplinary discourse on power and society. I will now turn my attention from the topic of power and musical understanding to possible research methods.

Serafine, Bakhtin, Bourdieu, and Gramsci all indicate the presence of unspoken and even unconscious descriptive modes. How can a researcher gain access to these thoughts? Declaring an understanding of other people’s private thoughts is a dubious path for a researcher (or anyone) to take. Still, there is a possibility of achieving some insight into these tacit realms of understanding.

Antonio Gramsci wrote extensively about humans’ relation to the physical world. Gramsci thought that humans and nature exist in a complex of relations involving ourselves, other people, and the environment. To change our personality, we change or reframe our relationship to other people or the environment (in Crehan 2002:115-119). Cognitive scientist Mark L. Johnson takes this a step further, theorizing that descriptive modes emerge through our interaction with the physical world.

In his book The Body in the Mind (1987), Johnson argues that bodily interaction with the external world gives rise to image schemata. Image schemata allow us to understand this external world as meaningful. They provide us with a fundamental understanding of basic concepts such as “in and out” or “start/finish,” as well as general categories like “triangle” or “container.” They are “constantly operating in our perception, bodily movement through space, and physical manipulation of objects” (1987:23). Whereas we may have a sensory experience of a three-dimensional object, it is our image schemata that let us understand what three-dimensionality is.

Johnson’s image schemata are an example of a particular type of descriptive mode. These modes are analogous to Gramsci’s “implicit” understanding of reality and Bourdieu’s “practical definitions.” They contrast with accounts of reality gained through public discourse. They are important to Bourdieu and Gramsci because they reflect a definition of reality often muted by descriptive modes imposed by dominant forces. Bourdieu and Gramsci suggest that our image schemata, descriptive modes that bear a more appropriate account of certain aspects of our world, can be concealed by official descriptive modes.

Image schemata not only offer a key tool for analyzing power struggles, they also present the best opportunity for researchers to gain access to unspoken descriptive modes. Unlike other tacit modes that build from past experiences, image schemata possess tangible examples of their origins: We can observe people as they interact with the physical world. Music scholars can look to both psychology and embodied experience to better understand the image schemata operating in musical practices. How do individuals perceive sounds? How are physical objects used in making and enjoying music? Once scholars have collected some information about image schemata, they can then adopt a critical view of the dominant methods of description. Do people hear the sounds they describe? Are the sound relationships implied through our musical instruments the same as those advocated in official music theory? [5] Instances of conflict between image schemata and official descriptive modes present choice opportunities for studying the effect of power relations on musical practice.

Conclusion: What Age of Musical Change?

In this paper I have highlighted an approach to the study of music and power. In conclusion, I will argue that the opportunities for applying this approach have never been better. The digital era has radically reshaped musical practices. The advent of digital technology introduced new notions of musical space, transmission, performance, interaction, and composition. The changes, viewed from within Johnson’s theory of image schemata, have the potential to alter descriptive modes. Our tactile and motor interaction gained through the production of musical sound has been expanded by the emergence of sound-modifying technology, as well as digital and virtual instruments. Visual representations of musical sound emerge in an intractable array of variation, from the images accompanying music listening software, to the displays found in electroacoustic music performances and installations. Technology has multiplied potential performance and listening spaces, introducing new sights, sounds, smells, and tastes to musical practices.

But has this diversification of musical practice produced a diversification in musical thinking? If so, is that thinking represented in the public musical discourse? The digital era offers unprecedented possibilities for investigating the issues mentioned in this paper. While scholars have devoted considerable attention to music, repression, and emancipation on a global scale, I would like to see these topics addressed on the most local of scales: in the minds of individuals.



1. See Crowder and Pitt (1992) for more about timbre and cognition.

2. Note that these scholars were expanding on earlier studies that advocated a bimodal (auditory and tactile) approach to musical imagery, such as Mikumo (1994) and Zatorre and Beckett (1989).

3. Perhaps this is what Maurice Bloch means when he writes, “The pre-requirement for the establishment of ideology turns out, in fact, to be a systematic and furious assault on non-ideological cognition” (1989:129).

4. Some notable exceptions to this trend include Baker (2008) and Kerman (1985).

5. See Theberge (1997).




Anderson, Richard C. 1977. “The Notion of Schemata and the Educational Enterprise: General Discussion of the Conference.” In Schooling and the Acquisition of Knowledge, eds. Richard C. Anderson, Rand J. Spiro, and William E. Montague. Hillsdale, NJ: Lawrence Erlbaum.


Baker, Geoffrey. 2008. Imposing Harmony. Durham: Duke University Press.


Baker, James M. 2001. “The Keyboard as Basis for Imagery of Pitch Relations.” In Musical Imagery, eds. Rolf Inge Godøy and Harald Jørgensen. Lisse, The Netherlands: Swets and Zeitlinger.


Bakhtin, Mikhail. 1981. The Dialogic Imagination: Four Essays, translated by Caryl Emerson and Michael Holquist. Austin: University of Texas Press.


Bloch, Maurice. 1989. Ritual, History and Power: Selected Papers in Anthropology. London: The Athlone Press.


Bourdieu, Pierre. 1977. Outline of a Theory of Practice, translated by Richard Nice. Cambridge: Cambridge University Press.


Cone, Edward T. 1974. The Composer’s Voice. Berkeley: University of California Press.



Crehan, Kate. 2002. Gramsci, Culture and Anthropology. Berkeley and Los Angeles: University of California Press.


Crowder, Robert G., and Mark A. Pitt. 1992. “Research on Memory/Imagery for Musical Timbre.” In Auditory Imagery, ed. Daniel Reisberg. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.


Davies, Stephen. 1994. “Musical Understanding and Musical Kinds.” In Musical Worlds, ed. Philip Alperson. University Park, PA: The Pennsylvania State University Press.


Fiske, Harold. 1996. Selected Theories of Music Perception. Lampeter: The Edwin Mellen Press Ltd.


Gramsci, Antonio. 1971. Selections from the Prison Notebooks, eds. Quintin Hoare and Geoffrey Nowell Smith. London: Lawrence & Wishart.


Ives, Peter. 2004. Language and Hegemony in Gramsci. London: Pluto Press.


Johnson, Mark L. 1987. The Body in the Mind. Chicago: University of Chicago Press.


Kaminsky, David. 2005. “Hidden Traditions: Conceptualizing Swedish Folk Music in the Twenty-First Century.” Ph.D. Dissertation, Harvard University.


Kalakoski, Virpi. 2001. “Musical Imagery and Working Memory.” In Musical Imagery, eds. Rolf Inge Godøy and Harald Jørgensen. Lisse, The Netherlands: Swets and Zeitlinger.


Kerman, Joseph. 1985. Musicology. London: Fontana.


Lakoff, George. 1987. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. Chicago: University of Chicago Press.


Lerdahl, Fred, and Ray Jackendoff. 1983. A Generative Theory of Tonal Music. Cambridge, Mass.: MIT Press.


McIntosh, Janet S. 1997. “Cognition and Power.” Available online,


Markman, Arthur B., Takashi Yamauchi, and Valerie S. Makin. 1997. “The Creation of New Concepts: A Multifaceted Approach to Category Learning.” In Creative Thought: An Investigation of Conceptual Structures and Processes, eds. Thomas B. Ward, Steven M. Smith, and Jyotsna Vaid. Washington, DC: American Psychological Association.


McKay, Nicholas. 2007. “One for All and All for One: Voicing in Stravinsky’s Musical Theater.” Journal of Musical Meaning 5.


Mikumo, Mariko. 1994. “Motor Encoding Strategy for Pitches of Melodies.” Music Perception 12.


Raffman, Diana. 1993. Language, Music, and Mind. Cambridge, Mass.: MIT Press.


Rogers, Michael R. 2004. Teaching Approaches in Music Theory: An Overview of Pedagogical Philosophies. Carbondale: Southern Illinois University.



Serafine, Mary Louise. 1988. Music as Cognition: The Development of Thought in Sound. New York: Columbia University Press.


Theberge, Paul. 1997. Any Sound You Can Imagine: Making Music/Consuming Technology. Hanover, NH: University Press of New England (for Wesleyan University Press).


Zatorre, Robert J., and Christine Beckett. 1989. “Multiple Coding Strategies in the Retention of Musical Tones by Possessors of Absolute Pitch.” Memory and Cognition 17.