Joshua D. Sites

Joshua D. Sites is a PhD candidate in The Media School at Indiana University. He studies the perception and cognition of popular music, as well as the impact of music embedded in other media.

Contact information:
sitesj at

Robert F. Potter

Robert F. Potter is a professor at The Media School at Indiana University. He studies attention and emotional response to audio.

Everything Merges with the Game: A Generative Music System Embedded in a Videogame Increases Flow

by Joshua D. Sites, Robert F. Potter


Flow, "the optimal experience," is a state of pleasant deep focus on a task at hand that is both challenging and rewarding. Naturally, flow is a hot topic in videogame research. The purpose of this study is to determine what impact a generative music system embedded in a videogame would have on the subjective experience of flow. In this study, participants played two versions of a Tetris-like game. One version had a generative music system for the music soundtrack, and the other had a traditional interactive soundtrack. Participants who played the version with the generative music system first reported more flow than those who played the traditional soundtrack version first. Participants were not made aware of the manipulation, nor were they capable of correctly identifying any differences between the two versions when asked. Another purpose of this study was to begin to bring together disparate bodies of literature: musicology, experimental psychology and music theory. These fields have much to offer each other, but are not often enough connected.

Keywords: flow, generative music, GameFlow, algorithmic music, game design, music


Flow is "the optimal experience" (Csíkszentmihályi, 1990). In 1990 Csíkszentmihályi, the person who named the concept of flow, referred to it as a pleasant state of deep concentration where the outer world melts away and a person focuses intently on the task at hand due to its difficulty being closely paired with their ability. This means the person is always challenged, yet able to succeed. Flow originated as a way to describe the pleasurable deep concentration observed in painters and sculptors. Research on flow has identified the phenomenon in several situations: people have reported experiencing flow while participating in a wide range of hobbies, from playing chess to rock climbing. People have also reported flow while using media, such as videogames (Sherry, 2004) or listening to music (Diaz, 2011).

This study connects previous research on the concept of flow to questions concerning the impact of music in videogames. Specifically, it tests whether the experience of flow can be affected by manipulating the music soundtrack of a videogame. Nakamura and Csíkszentmihályi (2002) describe the experience of flow as:

  • Intense focus and concentration on the activity at hand
  • Synthesis of awareness and action
  • "Loss of reflective self-consciousness" (p. 90)
  • A sense of control over the situation, either overt or through anticipating the next action
  • Loss of sense of time
  • "Experience of the activity as intrinsically rewarding, such that the end goal is just an excuse for the process" (p. 90).

While describing a flow state may be relatively easy, measuring flow presents a unique challenge for researchers because of its very nature: people in a state of flow lose their sense of self and become deeply focused on the task at hand. In plainer words: flow research often tasks the participant with reflecting deeply on a period of time during which they forgot they existed. This makes collecting reliable information about the nuances of the experience of flow particularly difficult. While the cumulative experience of flow may be obvious to the person experiencing it, teasing apart the exact moments a person is in flow, or the components of those time periods that make flow possible, is difficult. A primary issue facing flow research is not whether people experience it, but how to measure or indicate flow in a way that illuminates what can modulate the experience of it. Naturally, game designers seek to create enjoyable games that lead to players experiencing flow. One element of game design of particular interest to this study is game music soundtracks.

Composers of videogame music use any combination of common compositional techniques. Generally speaking, these techniques vary on a spectrum of linearity. Recorded music, which describes the vast majority of contemporary Western popular music, is completely linear. It can be played back repeatedly and each successive presentation will be identical to the last. This is an example of one end of the linearity spectrum.

On the other end of the spectrum is generative music. Generative music is not composed as a specific piece of music in the same sense as linear music. Instead, it is a carefully constructed system of rules that outputs ever-changing music. Generative music systems are also often guided or impacted by non-musical inputs that perturb the system. This presents unique attributes compared to linear music when embedded in an interactive medium such as a videogame. Unlike other linear forms of entertainment like movies, videogames are open-ended and each player's experience is unique, even in the most rigidly structured games. If the system of rules in a generative music system utilizes inputs from the game and the player's interactions with it, the system can always be synchronous with the player's gameplay. Each time a person plays the game, the gameplay will be different because the player makes different choices or is confronted by different challenges presented by the game. In turn, the music that the generative music system outputs will be different because it reflects each individual and unique moment in the gameplay.

Generative Music

Wind chimes are a classic example of a generative music system (Dorin, 2001). A composer does not specify which chimes are struck when, but the composer does design the physical properties of the wind chime - such as materials, orientation and organization of chimes - and places it strategically to receive an appropriate breeze. Herber (2008) explicates this idea further: "the process of composition can be understood as the conception and organization of musical ideas, whereas an instrument provides the equipment necessary to realize such a work" (pg. 103). In a sense, generative music composition is not about the "concept and organization of musical ideas" (Herber, 2008, pg. 103) but the concept and organization of metamusical ideas into a system.

There are many scholars who have written at length about generative music (Herber, 2008; Herber, 2010; Wooller, Brown, Miranda, & Berry, 2005). In the context of this study, a limited definition of generative music will be offered in order to empirically test the effects of generative music on flow in players of a videogame. In this study, generative music is conceptually defined as music that arises from a dynamic system designed by a composer. The composer of a generative music system does not create specific melodies, harmonies, or rhythmic components. Instead, the composer of a generative music system organizes metamusical possibilities (Herber, 2008) and allows the dynamic system to function without direct control.

Herber (2008) describes a way for music soundtracks of interactive media such as videogames to better reflect the aesthetic and structure of the media: blending the concepts of instrument and composition. "This allows a piece of music to play, or undergo a performance like a traditional composition. [...] This treatment allows the musical output of the work to be modified in the course of an interaction" (Herber, 2008, pg. 104). There are contemporary examples of generative music systems in videogames, too: Spore (Buskirk, 2008), No Man's Sky (Joyce, 2016), and Proteus (Solberg, 2015) all use generative music systems to create some or all of the music in the game.

In this study, the generative music soundtrack is created by utilizing a series of algorithms that interact dynamically with the player's input to the videogame. In the context of this experiment, the generative music system is an algorithm coded into a Tetris-like game called QuatrEno, created specifically for this study (Ingerson & Herber, 2015). Data, coming from the player's manipulation of the controls and the current game state of QuatrEno, is input to the generative music system's algorithms. These algorithms create music that reflects the player's interactions with the game. For example: as the blocks stack higher on the screen, indicating risk of failure to the player, the harmonic content of the soundtrack gradually becomes more strained and tense. Other examples of player interactions that impact the music system are players moving the blocks horizontally, slamming them down in place, or rotating the blocks. These granular kinds of interactions affect the music in a way that a traditional interactive music system cannot reproduce. Due to the random nature of the gameplay, such as what shape of block appears at the top of the screen, combined with the unique human responses of the player via the game interface, the generative music soundtrack is not expected to ever repeat itself exactly the same way. In other words, the generative music system permits more granular shifts in the music soundtrack to better reflect the player's progress. In this way, generative music in the game meets a key requirement of flow (Sweetser & Wyeth, 2005). In a more traditional approach to game soundtracks, however, the game would be programmed with a limited number of discrete musical modes, such as "normal" and "panic," and switch between them. There would be no gradient of tension as the blocks slowly pile up on the screen, like in the generative system discussed in this study.
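The difference between a gradient of tension and a switch between discrete modes can be illustrated with a short sketch. This is not the QuatrEno implementation: the function names, the 20-row board and the 0.75 panic threshold are all assumptions made purely for illustration.

```python
# Hypothetical sketch; not the QuatrEno source code. The board height and
# the panic threshold are assumed values chosen only for illustration.
BOARD_ROWS = 20          # assumed number of rows on the board
PANIC_THRESHOLD = 0.75   # assumed fill ratio that triggers "panic"


def fill_ratio(stack_height: int) -> float:
    """Fraction of the board occupied by the stack, clamped to [0, 1]."""
    return max(0.0, min(1.0, stack_height / BOARD_ROWS))


def generative_tension(stack_height: int) -> float:
    """Continuous tension value that rises gradually as the stack grows."""
    return fill_ratio(stack_height)


def interactive_mode(stack_height: int) -> str:
    """Discrete mode: one abrupt switch from 'normal' to 'panic'."""
    if fill_ratio(stack_height) >= PANIC_THRESHOLD:
        return "panic"
    return "normal"


if __name__ == "__main__":
    for height in (0, 5, 10, 15, 20):
        print(height, generative_tension(height), interactive_mode(height))
```

In this sketch the generative mapping changes with every added row, whereas the mode-based mapping stays constant until the threshold is crossed. A real generative system would map the tension value (and other inputs such as rotations or horizontal moves) onto musical parameters rather than returning it directly.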

To represent this more traditional approach to the soundtrack, this study uses a comparative condition to generative music created by recording the audio output of the generative music system while the composer played the game. This recording was then re-purposed as a linear music version of the soundtrack. Specifically, three different versions of linear music soundtrack were made to prevent spurious effects from a single instantiation of the linear version of the soundtrack in that experimental condition. In the linear condition, participants were randomly assigned to games containing one of these three versions.

It should be noted that the linear version of the game soundtrack is actually interactive: it responds to game state changes such as the "panic" mode described above. While this is not truly linear, it is certainly more linear than the generative version of the soundtrack. The decision to use an interactive soundtrack instead of a purely linear one was made to better represent common practices in gaming. Interactive soundtracks are common, but completely linear soundtracks are almost unheard of. This distinction is made here, but for the sake of clarity and parsimonious writing, the term "linear" is used throughout this study to describe the more linear, interactive version of the game soundtrack.

Flow in Videogames

The concept of flow has been applied to the study of videogames in several ways. Some research focuses on the concept as an industry tool for creating better games (Jegers, 2009; Sweetser & Wyeth, 2005). Others try to build a theory of flow in videogames (Cowley, Charles, Black, & Hickey, 2008; Sherry, 2004) and then test their theories using game players, collecting qualitative responses (Bowman, 1982) and quantitative data (Weber, Tamborini, Westcott-Baker, & Kantor, 2009; Weber, Alcia, & Malthik, 2009). Sherry (2004) believes flow could have powerful explanatory power for the subjective experience of playing videogames, claiming they meet Csíkszentmihályi's "explication of activities that are most likely to create the flow state," and as a result should elicit flow regularly in players (pg. 339). Videogames "(a) have concrete goals and manageable rules [...], (b) provide action that can be adjusted [...] by our capabilities, (c) provide clear feedback [...], (d) have abundant visual and aural information that helps screen out distraction" (Sherry, 2004, pg. 339). Sweetser and Wyeth (2005) expand upon Sherry by building a new model, called GameFlow, to specifically address flow in videogames. Sweetser and Wyeth (2005) critique game design and evaluation, saying they focus on usability instead of player enjoyment. While claiming the standing definitions of flow are valid, Sweetser and Wyeth (2005) describe a lack of an integrated model that describes player enjoyment as a whole. They claim that GameFlow is universal across all videogame experiences, and not impacted by genre of game, gaming system or any other factors.

While their definition of GameFlow is not substantially different from Csíkszentmihályi's (1990) original concept, the phrasing is unique and suggests a small shift in application. A task that can be completed is unique to Sweetser and Wyeth's (2005) definition. It suggests that an unending game would not be a good candidate for flow, nor an open world or sandbox game since they lack a clear goal. Csíkszentmihályi's (1990) definition of flow, on the other hand, includes someone working on a Japanese rock garden - a task with no concrete end or goal. Sweetser and Wyeth (2005) would have to exclude such an activity if using their GameFlow heuristic.

Sweetser and Wyeth (2005) map concepts from videogame literature to flow. The concepts used by Sweetser and Wyeth (2005) that are most relevant to generative music systems in videogames are concentration, control, feedback and immersion. According to Sweetser and Wyeth (2005), concentration requires that games provide stimuli from different sources that are worth attending to, that grab the player's attention and keep it; players should not be distracted from game tasks they need to concentrate on. They define control as the sense of agency and impact players feel they have in the game world. Feedback is defined as the response the game makes to player input that indicates the player's progression through the game (Sweetser & Wyeth, 2005). Control and feedback form a reciprocal relationship in the sense that control exerted over a game prompts feedback from the game, which then prompts further and/or different control and so on. Lastly, and perhaps most importantly, is immersion. Immersion is defined as the player becoming less aware of their (real) surroundings, less concerned with day-to-day life, experiencing an altered sense of time and feeling emotionally and viscerally involved in the game (Sweetser & Wyeth, 2005).

To test their model, Sweetser and Wyeth (2005) looked at two games with similar release dates, genre and gameplay and asked expert reviewers to rate them in terms of elements of GameFlow. They offer no explanation of the experts' qualifications, limiting the ability to ascertain the veracity of these analyses. As such, although GameFlow is promising conceptually, it lacks rigorous testing.

Jegers (2009) critiques flow as being too general to be useful as a goal of game design. Perhaps unwittingly demonstrating this, Chen (2007) suggests game designers "mix and match" the constituent parts of flow while designing a game. Chen's (2007) suggestion does not provide any direction to researchers or game designers because there is no explanation of how a developer could mix and match. While GameFlow is lacking in terms of operationalization, it does successfully offer a guide for game designers for where to seek the elements of flow. Jegers (2009) also establishes Pervasive GameFlow, which is meant to approach GameFlow from the standpoint of pervasive games. The genre of 'pervasive games' blurs the distinction between game and reality. An example of a pervasive game is Niantic, Inc.'s (2016) Pokémon Go. In the game, players navigate the real physical world to play. Their locations are tracked via their phones' GPS systems. The game interface is an overlay on a map with points of interest. Once a player reaches a point of interest, they can tag it in the game. Teams compete to gain control of points of interest by battling other teams in the game. The game world never stops or ends. Instead, it is an always-on and ever-changing game. Jegers' (2009) Pervasive GameFlow is also not a tool to measure flow, but a guide for game development.

Music Listening and Flow

Music has been explored by researchers for its ability to elicit flow. Similar to other flow experiences, a fundamental aspect of flow in music is matching the listener's skill and the difficulty of the music (Diaz, 2011). Research on music and flow tends to focus on music performance or creation (for examples, see Csíkszentmihályi, 1990; de Manzano et al., 2010; Wrigley & Emmerson, 2011). However, Diaz (2011) looked at the relationship between attention, flow and music listening. Diaz approached it in a novel way: by using "adaptations from research on meditation and attention, along with protocols established in research in music listening and affective response" (2011, pg. 43). Critiquing common methodologies used to study flow, Diaz (2011) described how music listening presents a unique challenge: open-ended interviews, structured questionnaires and the Experience Sampling Method (ESM, Larson & Csíkszentmihályi, 1983) have all been used in other flow research focused on physical activities, but are not as helpful in music listening. Flow already "does not easily lend itself to experimental control" and music listening is "less explicitly observable" than other activities previously examined for flow (Diaz, 2011, pg. 43). In fact, music listening is a unique activity for flow research because the listener has a passive relationship to the stimulus. Flow is typically only associated with physical activities. Music listening, however, can be entirely sedentary. Despite this, it requires cognitive ability to interpret and understand the music. Diaz (2011) described cognitive ability as attention and discrimination and the adequacy of the stimulus judged by how interesting it is to the listener so much as to warrant their attention and discrimination.

Diaz (2011) tested the impact of a mindfulness exercise prior to listening to music on participants' experience of flow. Participants were placed into one of two experimental groups: those that did a mindfulness exercise prior to music listening, and those that did not. "Mindfulness has been described as the process of bringing a specific quality of attention to moment-by-moment experience" (Diaz, 2011, pg. 44 citing Kabat-Zinn, 1994) that is not judgmental or elaborative and is centred in the present (Bishop, 2004). People regularly practicing mindfulness exercises saw improvements in three subsystems of attention: 'alerting,' 'orienting,' and 'conflict monitoring' (Jha, Krompinger, & Baime, 2007). Through manipulating mindfulness, Diaz believed they indirectly manipulated attention. Because flow can be characterized as a heightened affective experience, and continuous response measurement (CRM) had been used in prior music listening studies looking at heightened affective experiences, Diaz (2011) argued that CRM was a valid choice to measure flow and had respondents use a dial-turn system to indicate their experience of flow over time.

Diaz (2011) noted that participants reported experiencing flow at similar time periods during the music listening task. Additionally, a significant positive correlation was reported between participants in the mindfulness experimental group and CRM data corresponding to flow compared to participants who were not in the mindfulness experimental group and their CRM data corresponding to flow. However, Diaz (2011) voiced concern over the definition of flow they presented to participants as being "unidimensional" and too similar to other constructs like attention or concentration. The key finding, however, was that lessening distractions by administering a mindfulness exercise prior to music listening increased an individual's likelihood of experiencing flow.

While the results of Diaz (2011) are interesting, they are difficult to generalize because only trained musicians were used as participants. Literature shows that musicians' brains are structured and respond differently than those of the general population (Gaser & Schlaug, 2003; Münte, Altenmüller, & Jäncke, 2002; Pantev et al., 1998; Schmithorst & Wilke, 2002). This suggests that musicians could be a poor choice for participants in terms of being able to generalize to the population as a whole.


Flow is established through an enjoyable experience or task, where the player must be able to focus intently (Csíkszentmihályi, 1990). Linear music systems in videogames can be distracting because they do not consistently match the interactive gameplay (Herber, 2010). This study argues that linear music soundtracks impair a player's ability to focus, and in this way, affect flow. Generative music systems take real time, non-musical input and turn it into a real time musical output (Wooller et al., 2005). Matching input from the game to musical output will better allow players to focus on the game goals and allow flow to be achieved more easily. Therefore:

H1: There will be a main effect for Soundtrack Type such that players will have a greater experience of flow playing a game with a generative music soundtrack than playing a game with a linear music soundtrack.

Specifically, since the scale developed for this study is based primarily on Engeser and Rheinberg's (2008) Flow Short Scale (FSS), validated as a measure of flow during videogame play, the following operationalization of H1 is offered:

H1a: There will be a main effect of Soundtrack Type on self-reported experiences of flow such that flow questionnaire (FQ) levels will be greater following game rounds with generative music soundtracks compared to game rounds with linear soundtracks.

Just as watching television and playing videogames induce real emotions, so does listening to music. De Manzano et al. (2010) observed increased electrodermal activity (EDA) associated with self-reported states of flow. In an effort to replicate these findings and demonstrate the effect is related to the experience of flow while playing videogames, electrodermal activity will be recorded in this study and the following hypothesis is made:

H1b: There will be a main effect of Soundtrack Type on tonic electrodermal activity such that EDA will be greater during game rounds with generative music soundtracks compared to game rounds with linear soundtracks.

If linear music disrupts flow and generative music eliminates that disruption, then participants are likely to have a more positive subjective impression of the game after playing the generative soundtrack version. If participants are randomly assigned to which version of the game they play first, and asked in between versions to report how excited they are to play the game a second time, it is expected that those reports will differ significantly. Therefore:

H2: There will be a main effect of Order on self-reported excitement to play again. Participants who play the version of the game with the generative music soundtrack in the first block will have higher mean self-reported excitement to play again compared to those who play the version of the game with the linear music soundtrack in the first block.

Finally, we expect that the experience of flow will be related to game performance. Since participants are expected to be more likely to enter into a flow state during the generative music version of the game:

H3: Participants will achieve higher average videogame scores during the generative music soundtrack.



This experiment employed a 2 (Soundtrack Type) x 3 (Flow Questionnaire Measurements) x 2 (Order) mixed design. 'Soundtrack Type' was a within-subjects independent variable and referred to the type of music system in roughly 15-minute blocks of gameplay. There were two levels to the Soundtrack Type variable: linear and generative. Within the linear soundtrack level there were three different versions of the linear music stimuli, and each participant was randomly assigned to one of them. This controlled for any possible spurious effect from a specific iteration of the linear soundtrack. 'Order' was a between-subjects independent variable and referred to which version of the game soundtrack participants were exposed to first.


The game stimulus was a Tetris-like game created specifically for this experiment. Tetris has been known to induce a trance-like state in players, making it a likely candidate for flow (Rollings & Adams, 2003). The general visual elements of the game were consistent between versions of the stimulus. The Soundtrack Type changed between blocks of gameplay for each subject. Each block lasted approximately 15 minutes. A round was defined as a player starting the gameplay, and playing until losing—until the stack of QuatrEno blocks reached the top of the screen and the game reset. A participant completed as many consecutive game rounds as time permitted in that block, with the accompanying musical stimulus. The participant's goal was to beat their previous Score.

The generative soundtrack was composed to respond in real time to the participant's input to the game in addition to the relative state of the game, such as the number of QuatrEno blocks on the screen. The linear soundtrack was a recording of instances of the generative soundtrack, so the musical aesthetics were identical between versions of the stimulus. The linear soundtrack also had two modes: normal and panic. Normal mode occurred during typical gameplay; panic mode indicated to the participant that they might fail the game soon. These modes followed gameplay, but not in a granular fashion: there was a comparatively dramatic shift between modes rather than a gradual change. Furthermore, unlike the generative music soundtrack version of the stimulus, the music the player heard from round to round was not unique.

Dependent Measures

Dependent variables for this study included self-reported experience of flow, psychophysiological measure of arousal, self-reported excitement to play the game again after the distractor task and scores within block.

Self-reported flow: Participants responded to a questionnaire to provide data about their experience of flow. The participants responded to each prompt on a 7-point Likert scale, with 1 labeled "not at all agree," 4 labeled "partly agree," and 7 labeled "very much agree." This questionnaire was adapted from Engeser and Rheinberg's (2008) Flow Short Scale. Elements of Kalyanaraman and Sundar's (2006) attitude assessment were also incorporated to track participants' attitudes toward the game. The questionnaire was also based in part on Diaz's (2011) questionnaire on music listening. Because the self-report measure used in this study is a revised version of several previously used scales, it is hereafter referred to as the Flow Questionnaire (FQ).

Psychophysiological arousal: Arousal is also a dependent variable and is operationalized by recording electrodermal activity levels from the medial side of the participant's foot during gameplay. While electrodes are more commonly placed on the participant's hand to capture electrodermal activity data, this study required participants to use their hands to interact with the game stimuli. Use of hand muscles would interfere with electrodermal activity data collection (Potter & Bolls, 2012). The medial sides of the participant's feet were used because there is also a high concentration of eccrine sweat glands in that location, similar to the palms of the hands. Eccrine sweat glands are innervated by the sympathetic nervous system and respond to stressors in the environment (Potter & Bolls, 2012). Raw EDA data were aggregated as the mean microsiemens value for each second. These values were then converted into change scores from a 5-second baseline recorded prior to each block of gameplay.
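The aggregation described above can be sketched as follows. This is a minimal illustration rather than the analysis script used in the study: the function names and the sample rate are hypothetical, and the sketch assumes the change score is each per-second mean minus the mean of the 5 seconds recorded before the block.

```python
from statistics import mean


def per_second_means(samples, sample_rate):
    """Average raw EDA samples (microsiemens) within each whole second.

    `sample_rate` is the number of samples per second (an assumed input);
    any trailing partial second is dropped.
    """
    n_seconds = len(samples) // sample_rate
    return [
        mean(samples[i * sample_rate:(i + 1) * sample_rate])
        for i in range(n_seconds)
    ]


def change_scores(baseline_seconds, block_seconds):
    """Subtract the mean of the baseline seconds from each block value.

    `baseline_seconds` is assumed to hold the per-second means for the
    5 seconds before the block; `block_seconds` holds those of the block.
    """
    baseline = mean(baseline_seconds)
    return [value - baseline for value in block_seconds]
```

For example, with a baseline of five seconds at 2.0 microsiemens, block values of 3.0 and 4.0 would become change scores of 1.0 and 2.0.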

Self-reported excitement: Participants responded to a questionnaire regarding their excitement to play more QuatrEno after the distractor task.

Scores: In the stimulus, one point was earned by having a row of the screen completely filled with QuatrEno blocks, removing them from the screen. The final score for each round of gameplay was recorded as that round's Score.


Thirty-nine media school undergraduate students participated in this experiment and received extra credit in the class they were recruited from. Students who did not wish to participate were given an alternate way to earn the same amount of extra credit. Data collected from participants unable to play a videogame on a computer and/or unable to hear would have been excluded from analysis because they are not able to interact with the stimulus or experience the experimental manipulation of the generative music system in the stimulus. Despite this requirement, no participants were excluded on these grounds.


The stimuli were presented on a Windows desktop computer with MediaLab (Empirisoft, 2012) installed. Participants were seated approximately 3 feet from a large-screen 42-inch TV that served as the monitor for the computer. Participants sat in a recliner chair with a lap desk that held the keyboard and mouse. Participants wore a pair of circumaural closed-back Sennheiser headphones in order to hear the auditory portion of the stimulus while also reducing the ambient noise level.

Electrodermal activity was recorded via a BIOPAC system. Disposable electrodes were used to collect electrodermal activity.


Upon arrival at the lab, participants read the informed consent form, asked the experimenter any questions they had and signed the form to agree to participate. Next, participants were led to the lab and seated in front of a television monitor, and the procedure for measuring skin conductance was explained again. Electrodes were placed on the medial side of the participant's left foot.

Participants read the prompt on the laptop describing the concept of flow and were instructed to attempt to stay relatively still during the experiment, moving only to the extent necessary to play the game. During each 15-minute block of gameplay the experimenter monitored the passage of time. The blocks of gameplay were broken up to collect self-reported data. However, in order not to interrupt possible flow states, the experimenter limited interaction with the participant to notifying them to proceed to the questionnaire on the MediaLab screen.

The participant filled out the FQ after completing the first round of gameplay. This established an initial level of self-reported flow for the participant, and was used as a basis for comparison to the subsequent measures. Next, the participant played consecutive rounds of gameplay for at least 6 minutes while the experimenter watched on a monitor in an adjacent room. After the 6-minute mark, the experimenter waited until the end of a round that lasted at least 1 minute. The experimenter then entered the room and asked the participant to complete the FQ. After the 6-minute mark in the gameplay block, if the participant did not successfully play a round for at least one minute, the experimenter was prepared to administer the questionnaire at the conclusion of the third attempt. This circumstance did not occur with any participant, however.
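The timing rule for administering the FQ can be expressed as a small decision procedure. The sketch below is one possible reading of that rule, not a tool used in the study: the function name and parameters are hypothetical, and it assumes that an "attempt" is any round that ends after the 6-minute mark.

```python
def fq_trigger_round(rounds, min_play=360.0, min_round=60.0, max_attempts=3):
    """Return the index of the round after which the FQ is administered.

    `rounds` is a list of (start, end) times in seconds, measured from the
    start of the current stretch of gameplay. Returns None if no round
    qualifies. All names and defaults are illustrative assumptions.
    """
    attempts = 0
    for index, (start, end) in enumerate(rounds):
        if end < min_play:
            continue  # round finished before the 6-minute mark
        attempts += 1
        long_enough = (end - start) >= min_round
        # Administer after a round of at least 1 minute, or fall back to
        # the end of the third attempt after the 6-minute mark.
        if long_enough or attempts >= max_attempts:
            return index
    return None
```

For instance, given rounds ending at 200, 350, 380 and 450 seconds, the round ending at 380 seconds lasted only 30 seconds, so the questionnaire would follow the 70-second round ending at 450 seconds.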

Following the completion of the FQ, the participant again played consecutive rounds of gameplay for at least 6 minutes and the experimenter repeated the data collection procedure. After the participant had played for roughly 15 minutes in the first block, the experimenter had the participant complete a brief distractor task. The experimenter then asked the participant to complete the questionnaire designed to measure excitement levels in anticipation of playing the game a second time. The participant then completed the second block, in which the same data collection procedure was followed as in the first. At the conclusion of the second block of gameplay, the participant answered another short questionnaire designed to ascertain the perceived value of the game stimulus, as well as to determine if participants were aware of the experimental manipulation. Upon completion of the questionnaire, the experimenter returned to the room and removed the electrodes from the foot. The experimenter asked the participant if they had any questions. Finally, the experimenter informed the participant that the experiment had concluded and dismissed them.

Results

H1 predicted a main effect of Soundtrack Type on the experience of flow. Flow is operationalized in two ways: a participant's responses to the FQ scale and EDA data. Cronbach's alpha for the 14 elements of the FQ scale was high (α = .962) and therefore a single mean value for FQ was calculated.
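For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from a respondents-by-items table, using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the example responses are hypothetical and are not the study's data.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items table of scores."""
    n_items = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 7-point responses: five respondents (rows), four items (columns).
example = [
    [5, 6, 5, 6],
    [3, 3, 4, 3],
    [7, 6, 7, 7],
    [2, 3, 2, 2],
    [4, 5, 4, 5],
]
alpha = cronbach_alpha(example)
```

When alpha is high, as here, the items move together closely enough that averaging them into a single score per measurement loses little information, which is the rationale for calculating a single mean FQ value.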

H1a was tested using a 2 (Soundtrack Type) x 3 (Flow Questionnaire Measurements) x 2 (Order) repeated measures ANOVA. There is not a statistically significant main effect of Soundtrack Type on self-reported Flow, F(1, 37) = 0.73, p = 0.788. H1a is not supported. However, the data do trend in the expected direction: collapsed across the three measurements per block, the generative music soundtrack (M = 4.865, SD = .169) was rated higher on the FQ than the linear music soundtrack (M = 4.828, SD = .197).

Visually, the Soundtrack Types appear to be much more divergent at the second and third FQ measurements compared to the first (see Figure 1). This may be the result of a practice effect: participants were told that the first round of gameplay preceding the first FQ measurement in each block was for practice. Participants may not have taken the game seriously during that round, thus equally inhibiting their ability to reach flow regardless of the Soundtrack Type. This would explain why the first FQ measurements are so similar and the subsequent ones are divergent.


Figure 1: Soundtrack Type x Flow Questionnaire Measurement (n.s.)

An interaction emerged when the first FQ measurement, taken following the practice round of gameplay, was dropped and only the two subsequent measurements were considered. Therefore, to test the hypothesis on the FQ scale data, the final two collected measures per block were subjected to a 2 (Soundtrack Type) x 2 (FQ measurement) x 2 (Order) repeated measures ANOVA. This returned a significant interaction of Soundtrack Type by Order, F(1, 37) = 7.405, p = .010, such that participants first exposed to the generative music soundtrack version of the stimulus reported higher levels of flow across both blocks of gameplay than those that first experienced the linear music soundtrack stimulus. This interaction can be seen in Figure 2.


Figure 2: Soundtrack Type x Order interaction after dropping the first FQ measure. F(1, 37) = 7.405, p = .010.

There is a main effect of Order on the self-reported experience of flow, F(1, 37) = 4.090, p = .050, such that participants that played the generative music soundtrack version of the stimulus first reported higher on the FQ scale (M = 5.190, SE = .231) than those that played the linear music soundtrack version of the stimulus first (M = 4.503, SE = .249).

Additionally, the FQ measure split flow into 14 dimensions: absorption, actions happening of their own accord, perceived challenge, clarity of mind, ability to concentrate, control, flow, fluidity of action, fun, goal clarity, interactivity, lost in thought and the sensation of passage of time. When analyzing the individual dimensions to investigate H1, only two dimensions showed a main effect of Soundtrack Type that approached statistical significance: clarity of mind (p = .085) and fun (p = .102). Both were trending toward the predicted direction. Given that both of these dimensions only approach significance, and these are two out of the 14 dimensions of the questionnaire, this hypothesis is not supported by the self-reported data.

To test H1b, the EDA data were subjected to a 2 (Soundtrack Type) x 30 (EDA measurement) x 2 (Order) repeated measures ANOVA. EDA data were sampled once every second, but the data were reduced to 30 points for each of the blocks of gameplay for simplicity of analysis. The EDA data collected do not demonstrate a main effect of Soundtrack Type, F(1, 37) = 1.119, p = 0.297. This hypothesis is not supported by the data collected.
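To make the data reduction step concrete, the sketch below shows one plausible way to collapse a once-per-second EDA series into 30 points per block. The text does not state how the reduction was performed; averaging within equal-width time bins is an assumption made here for illustration, and the function name and example series are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence

N_POINTS = 30  # number of points per block used in the analysis


def reduce_series(samples: Sequence[float], n_points: int = N_POINTS) -> List[float]:
    """Average a 1 Hz series into n_points roughly equal-width time bins.

    Bin-averaging is an assumption; the article does not specify the
    reduction method.
    """
    if len(samples) < n_points:
        raise ValueError("need at least one sample per bin")
    reduced = []
    for i in range(n_points):
        # Integer bin edges that spread any remainder across the bins.
        start = i * len(samples) // n_points
        stop = (i + 1) * len(samples) // n_points
        reduced.append(fmean(samples[start:stop]))
    return reduced


# Hypothetical 15-minute block sampled once per second (900 samples).
block = [0.01 * t for t in range(900)]
points = reduce_series(block)
```

With a 15-minute block, each of the 30 points would then summarize 30 seconds of signal, which smooths out momentary fluctuations but also hides any response tied to a single round of gameplay.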

H2 predicted a main effect of Order on participants' self-reported excitement to return to playing the game after the distractor task. It was expected that participants that played the generative soundtrack version first would be more excited than those who played the linear soundtrack version first. The self-reported excitement data were submitted to a 2 (Soundtrack Type) x 2 (Order) repeated-measures ANOVA. This hypothesis is not supported by the results, F(1, 37) = 1.667, p = 0.205. Though not significant, means were in the predicted direction, with participants that played the generative music soundtrack version of the stimulus first rating their excitement as higher (M = 3.619, SD = 1.627) than participants that played the linear version of the soundtrack first (M = 2.944, SD = 1.626).

H3 predicted that participants would achieve higher game scores while playing the generative music soundtrack version of the stimulus. To test this, the first three scores within each soundtrack version of the game were submitted to a 2 (Soundtrack Type) x 3 (Score) x 2 (Order) repeated measures ANOVA. The results for the main effect of Soundtrack Type are not significant, F(1, 37) = 1.958, p = .170. None of the interactions were significant.

Discussion

None of these hypotheses were supported. However, additional analyses suggest this may be due to an overly simple initial supposition of how the impact of generative music would manifest itself in game players. The music system impacts the participants' subjective experience, but not via simple main effects. There was an observed interaction between Soundtrack Type and Order, such that the version of the game that participants played initially impacted their experience of flow on the next block of gameplay. Participants that played the generative music version of the game first experienced more flow during both blocks of gameplay than participants that played the linear music soundtrack version of the game first. While this is not how the hypotheses were formally stated, the overarching question of "does a generative music system in a videogame impact the player's experience of flow?" is addressed by this result. The answer is "yes, it does."

In addition to the benefits of the generative music system on the participants' experience of flow, the generative music system was unobtrusive. Of the 39 participants, only 11 (28.2 percent) reported noticing any difference at all between the two versions of the stimulus. Of the 11 that reported a difference between the two versions of the game, only two (5.1 percent) of the participants suggested that the differences may be in the music soundtrack. This suggests that the experimental manipulation of Soundtrack Type was not apparent to the participants. While desirable in the context of this study, it seems counter-intuitive on the surface given how functionally different the two music systems are in terms of their responsiveness.

Generative music is one of the forefronts of experimental music and is fundamentally different from linear music (Herber, 2010). Generative music is "composed" in real time by a system that is responding to non-musical inputs that perturb a dynamic system. It should never repeat itself exactly. Linear music is what most people are familiar with: music composed and/or recorded that has a definite beginning and end and is repeatable. As such, generative music is sometimes associated with angular, complex and intellectualized music that is opaque or unenjoyable to the average listener. This association may make game designers apprehensive to utilize generative music systems in their videogames.

However, given the above results, generative music systems offer clear benefits in videogames without being distracting to players. Generative music systems may act differently than linear music systems, but they do not have to be atonal, arrhythmic, or even particularly aesthetically challenging compared to linear music. The two can even sound similar enough to each other that they are indistinguishable in the context of the videogame, yet still impact the player's subjective experience of the game.

Because the type of soundtrack was not obvious to the participants, it is logical to suggest that the knowledge they built about the game stimulus with regard to what it is and how it works is at least partially implicit. In this context, implicit means that they understand what the game looks and sounds like and how it acts. Interpolating Lang's (2014) Dynamic Human-Centered Communication Systems Theory and Bailey and Lang's (2015) operationalization of the energy-efficient brain, it is reasonable to assert that participants perceive much of the game, including the soundtrack, as stable in the environment and that this is a judgment made early in the player/game interaction. Just as Bailey and Lang (2015) found that participants were more likely to be unaware of changes made to objects that should be stable in the environment, participants in this study may have perceived the whole of the game as stable, other than the random selection of the blocks that fall from the top of the screen. That could explain why most participants were unable to correctly identify the experimental manipulation.

The overall goal of this study was to begin combining the bodies of literature regarding flow, videogames and generative music. To do so necessitated bringing together research from the Humanities and both qualitative and quantitative Social Science. The flow research straddles both qualitative and quantitative Social Science. The generative music research is entirely couched in the Humanities. In fact, there are no examples of scientific research on generative music prior to this study. The videogames research covers all three categories: Humanities, qualitative and quantitative Social Science. These modes of scholarship are radically different, and getting them to "speak" to each other is no small task.

Limitations and future directions

Flow, by its very nature, is difficult to measure. The methods employed here, and in any other flow research, are a balance between richness of data and invasiveness into the participants' experience of flow. Until it is possible to measure or indicate flow without interrupting the participants' experience of flow, flow research will be based on inference and a deft touch in the design of the study.

The decisions to block the gameplay and also how and when to collect data were not made lightly; however, they are still imperfect because the procedure forces the participant to leave the game and fill out the flow questionnaire. Participants were able to play the game stimulus for an extended period of time without interruption, but that means the self-reported data collected about the experience of flow requires the participant to reflect on the past several minutes instead of moment-by-moment. Of course, Diaz (2011) avoided this by using CRM, but as previously discussed, CRM necessitates active participation to respond. This kind of active self-reflection is antithetical to the experience of flow.

Between the administrations of the flow questionnaire, most participants played multiple rounds of the Tetris-like game. It is possible that participants experienced flow in some of those rounds, but not others. The method of collecting self-reported data employed here does not permit knowing in which specific rounds participants experienced flow. Without the ability to tease this apart further, it is harder to appreciate the exact conditions of the game and the participant that lead to the experience of flow. However, this method of carefully balancing the timing and iterations of self-reported data collected shows promise. As reported, the questionnaire tool shows extremely high agreement between its answers. This suggests the questionnaire is powerful and meaningful. Curiously, it is not seemingly different from other flow questionnaires in terms of the questions asked or the scales used to quantify the data, but the reliability of the scale was quite high (Cronbach's alpha = 0.962). What may have been a factor here is how and when the questionnaire was administered. Both the questionnaire and the stimulus were shown on the same screen and required subjects to use the same input devices (keyboard and mouse). Additionally, while participants typically played multiple games of QuatrEno within the roughly 6-minute segments of gameplay, the length of time they were tasked to remember is generally shorter than in most flow research. Another possible factor is that the questionnaire was administered in the context of interrupting gameplay at natural stopping points (between rounds) as opposed to the end of the activity. This may have made it easier for participants to reflect on the experience of flow.

The decision to collect the self-reported data infrequently compared to the number of rounds of gameplay necessitated a simplistic analysis of the EDA data. Without the ability to home in on specific rounds of gameplay that elicited flow in a participant, the EDA data needed to be analyzed holistically over entire blocks of gameplay. Future research could prioritize examining the relationship between EDA and the experience of flow without complicating the procedure. Connecting the experience of flow to specific psychophysiological responses could be highly valuable to future flow research. Generally speaking, collecting common measures of psychophysiological data is only mildly intrusive from the participants' viewpoint, and once the equipment is set up and the participant has become acclimated to it, it requires no thought from the participant.

Another issue in this experiment is the choice of stimulus: QuatrEno. It seems that, overall, it was too adept at enabling a state of flow in participants, which may have made it difficult to identify any main effect of Soundtrack Type. Reflecting on Sweetser and Wyeth's (2005) work, the stimulus used in this study features many of the criteria for their GameFlow concept. The game grabs and maintains attention through the serial introduction of game pieces. There are no subtasks that might be distracting. The challenge level of the game gradually increases over time. Participants were presented with a game design that was familiar enough (it is essentially a Tetris clone) so as to need minimal, if any, training on how to play. The game stimulus provided immediate feedback and the game mechanics were simple and easy to learn. While GameFlow has not been empirically tested, the arguments made by Sweetser and Wyeth (2005) supporting it are in line with the findings of other scientific flow research. Ideally, pilot studies should be done to find a game that, with a traditional soundtrack, only sometimes gets participants to a state of flow. That way the effects of the soundtrack manipulation will be more obvious. Results in this experiment were impacted not only by a largely universal experience of flow across participants, but also by a ceiling effect. This ceiling effect would have obfuscated any possible main effect of Soundtrack Type because participants in both conditions experienced flow.

Another factor that may have impacted the results of this study and obfuscated a main effect of Soundtrack Type is a practice effect. Because the effect related to the Soundtrack Type was only observed in the interaction of Soundtrack Type and Order, it is possible that by the second block of gameplay, nearly all participants were experiencing flow to some degree regardless of the Soundtrack Type. By the second block of gameplay, participants were more adept at playing the game and may have felt more confident in their abilities to play it, given that they had prior experience from the first block of gameplay. Practice effects were not measured, but should be in future research.

Lastly, in a concerted effort for environmental validity, what this study refers to as a linear music soundtrack is actually an interactive music soundtrack comprised of sections of linear music. Actual linear music systems are not commonly used in videogames, but interactive music soundtracks comprised of sections of linear music are common. This preference for environmental validity made the differences between the Soundtrack Types smaller because both were responsive to some degree, as opposed to generative and completely linear.

Future research into the relationship of flow and generative music systems in videogames is promising given these initial findings in this exploratory research. Now that an interaction has been uncovered, it can inform game designers of the value of a generative music system during the first several minutes of gameplay. These results suggest that flow can be achieved faster with a generative music system compared to a linear music system in videogames.


References

Bailey, R. L., Lang, A., & Rubenking, B. E. (2015). A Dynamic, Human-Centered Conceptualization of Flow, Presence and Transportation States.

Biocca, F., David, P., & West, M. (1994). Continuous response measurement (CRM): A computerized tool for research on the cognitive processing of communication messages. Measuring psychological responses to media messages, 15-64.

Bishop, S. R., Lau, M., Shapiro, S., Carlson, L., Anderson, N. D., Carmody, J., Velting, D. (2004). Mindfulness: A proposed operational definition. Clinical psychology: Science and practice, 11(3), 230-241.

Bowman, R. F. (1982). A Pac-Man theory of motivation. Tactical implications for classroom instruction. Educational Technology, 22(9), 14-17.

Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: the self-assessment manikin and the semantic differential. Journal of behavior therapy and experimental psychiatry, 25(1), 49-59.

Buskirk, E. V. (2018, January 13). Spore Will Include Music Generator Made in Part by Brian Eno. Retrieved July 31, 2018, from

Chen, J. (2007). Flow in Games (and Everything Else). Communications of the ACM, 50(4), 31-34.

Choi, B., & Baek, Y. (2011). Exploring factors of media characteristic influencing flow in learning through virtual worlds. Computers & Education, 57(4), 2382-2394. doi:

Collins, K. (Ed.). (2008). From Pac-Man to pop music: Interactive audio in games and new media. Aldershot, Hampshire, England; Burlington, VT: Ashgate.

Cowley, B., Charles, D., Black, M., & Hickey, R. (2008). Toward an understanding of flow in video games. Comput. Entertain., 6(2), 1-27. doi:10.1145/1371216.1371223

Csikszentmihalyi, M. (1990). Flow: The psychology of optimal experience. Praha: Lidové Noviny.

de Manzano, Ö., Theorell, T., Harmat, L., & Ullén, F. (2010). The psychophysiology of flow during piano playing. Emotion, 10(3), 301-311. doi:10.1037/a0018432

DeCastro, D. (2007). Quality Video Game Music Score, Considering Standards Set, and Personal Reflections. Retrieved from

Diaz, F. M. (2011). Mindfulness, Attention, and Flow during Music Listening: An Empirical Investigation. Psychology of Music, 41(1), 42-58.

Dillman Carpentier, F. R., & Potter, R. F. (2007). Effects of Music on Physiological Arousal: Explorations into Tempo and Genre. Media Psychology, 10(3), 339-363. doi:10.1080/15213260701533045

Empirisoft. (2012). MediaLab.

Engerson, C., & Herber, N. (2015). QuatrEno.

Engeser, S., & Rheinberg, F. (2008). Flow, performance and moderators of challenge-skill balance. Motivation and Emotion, 32(3), 158-172. doi:10.1007/s11031-008-9102-4

Gaser, C., & Schlaug, G. (2003). Brain structures differ between musicians and non-musicians. The Journal of Neuroscience, 23(27), 9240-9245.

Herber, N. (2008). The composition-instrument: Emergence, improvisation and interaction in games and new media Collected Work: From Pac-Man to pop music: Interactive audio in games and new media. Pages: 103-125. (AN: 2008-05059).

Herber, N. F. (2010). Amergent music : Behaviour and becoming in technoetic & media arts. (Bibliographies Theses Non-fiction), University of Plymouth. Retrieved from Available from EBSCOhost edsble database.

Jegers, K. (2009). Pervasive GameFlow: Identifying and exploring the mechanisms of player enjoyment in pervasive games.

Jin, S.-A. A. (2011). 'I Feel Present. Therefore, I Experience Flow:' A Structural Equation Modeling Approach to Flow and Presence in Video Games. Journal of Broadcasting & Electronic Media, 55(1), 114-136. doi:10.1080/08838151.2011.546248

Jin, S.-A. A. (2012). "Toward Integrative Models of Flow": Effects of Performance, Skill, Challenge, Playfulness, and Presence on Flow in Video Games. Journal of Broadcasting & Electronic Media, 56(2), 169-186. doi:10.1080/08838151.2012.678516

Joyce, C. (2016, October 05). How One of 2016's Most Talked-About Video Games Brought Generative Music to the Masses. Retrieved July 31, 2018, from

Kabat-Zinn, J. (1994). Wherever you go, there you are: Mindfulness meditation in everyday life: Hyperion.

Lang, A. (2014). Dynamic human-centered communication systems theory. The Information Society, 30(1), 60-70.

Larson, R., & Csikszentmihalyi, M. (1983). The experience sampling method. New Directions for Methodology of Social & Behavioral Science.

Münte, T. F., Altenmüller, E., & Jäncke, L. (2002). The musician's brain as a model of neuroplasticity. Nature Reviews Neuroscience, 3(6), 473-478.

Nakamura, J., & Csikszentmihalyi, M. (2002). The concept of flow. Handbook of positive psychology, 89-105.

Pantev, C., Oostenveld, R., Engelien, A., Ross, B., Roberts, L. E., & Hoke, M. (1998). Increased auditory cortical representation in musicians. Nature, 392(6678), 811-814.

Potter, R. F., & Bolls, P. (2012). Psychophysiological Measurement And Meaning: Cognitive And Emotional Processing Of Media (Routledge Communication Series).

Reeves, B., & Nass, C. (1996). How people treat computers, television, and new media like real people and places: CSLI Publications and Cambridge university press.

Rollings, A., & Adams, E. (2003). Andrew Rollings and Ernest Adams on game design: New Riders.

Schmithorst, V. J., & Wilke, M. (2002). Differences in white matter architecture between musicians and non-musicians: a diffusion tensor imaging study. Neuroscience letters, 321(1), 57-60.

Sherry, J. L. (2004). Flow and Media Enjoyment. Communication Theory, 14(4), 328-347. doi:10.1111/j.1468-2885.2004.tb00318.x

Solberg, D. (2015, November 17). How the magic sounds of Proteus are making their way into the real world. Retrieved July 31, 2018, from

Trevino, L. K., & Webster, J. (1992). Flow in Computer-Mediated Communication: Electronic Mail and Voice Mail Evaluation and Impacts. Communication Research, 19(5), 539-573. doi:10.1177/009365092019005001

Weber, R., Alicea, B., & Mathiak, K. (2009). The dynamic of attentional networks in mediated interactive environments. A functional magnetic resonance imaging study. Manuscript submitted for publication.

Weber, R., Tamborini, R., Westcott‐Baker, A., & Kantor, B. (2009). Theorizing flow and media enjoyment as cognitive synchronization of attentional and reward networks. Communication Theory, 19(4), 397-422.

Wooller, R., Brown, A. R., Miranda, E., Diederich, J., & Berry, R. (2005). A framework for comparison of process in algorithmic music systems.

Wrigley, W. J., & Emmerson, S. B. (2011). The experience of the flow state in live music performance. Psychology of Music, 41(3), 292-305. doi:10.1177/0305735611425903

©2001 - 2018 Game Studies Copyright for articles published in this journal is retained by the journal, except for the right to republish in printed paper publications, which belongs to the authors, but with first publication rights granted to the journal. By virtue of their appearance in this open access journal, articles are free to use, with proper attribution, in educational and other non-commercial settings.