1 Introduction

This paper specifically attempts to reconcile two aims. The first is the aim that John Dewey makes explicit at the beginning of his book Art as Experience, initially published in 1934: “the task is to restore continuity between the refined and intensified forms of experience that are works of art and the everyday events, doings, and sufferings that are universally recognized to constitute experience” (p. 2). The second aim is the one posed by Maria Brincker’s paper on the aesthetic stance (2015), where she contends: “aesthetic experiences are indeed a special subset of perceptual experiences, but distinguished through these relative dynamic relations rather than object features and attitudes alone […] [I]t cannot be stressed enough that under a process-oriented framework aesthetic experiences need not be ‘all-or-none’. Rather, an analysis of the aspects contributing to specific experiences could be used to elucidate rather than eradicate borderline cases and their respective temporal and contextual structures” (p. 132).

To overcome this tension, I will offer a theoretical model and discuss some cognitive processes that work as constraints [Footnote 1] on a progressively enacted aesthetic rhythm able to drive the unfolding of the experiences from which it emerges. [Footnote 2] However, these processes are by no means necessary or sufficient conditions for every aesthetically relevant experience. In fact, I am sympathetic to the view that we cannot speak of one type of aesthetic experience (Gallagher, 2021). The aesthetic presents certain particularities depending on the specific activity that enacts it, on whether we are speaking of a performer or a member of the audience, and on the agent’s sociocultural context. Accordingly, the aesthetic rhythm has to remain open to additional or substituting cognitive processes related to more specific aesthetic contexts. [Footnote 3] For this reason, it is essential to note that in this paper my interest lies in the raw contextual dynamics that play a role in the emergence of the aesthetic in its most general sense. I will not restrict the aesthetic component to particular events or objects, such as artworks; rather, I will focus on the impact of the aesthetic on ordinary, non-glamorous, yet potentially relevant, aspects of everyday life. I take aesthetics as a framework to address, explore, and discuss the co-regulation of the ‘how’ and the ‘what’ in experience. As Stephen Kaplan [Footnote 4] claimed: “[a]esthetics is not the reflection of a whim that people exercise when they are not otherwise occupied. Rather, such reactions appear to constitute a guide to human behavior that has far-reaching consequences. Many everyday behaviors, such as organizing one’s work space and arranging and maintaining one’s home, may reflect factors of this kind” (1988, p. 26).

In order to fulfill this goal, I will adopt an enactive perspective while also taking into consideration research from fields such as cognitive science, dynamic systems theory, and other non-representational, situated, and embodied philosophical perspectives, such as ecological psychology. Despite the tensions and differences between enactivism and ecological psychology, both approaches emphasize the role of bodily aspects, as well as of social, material, and cultural elements in the constitution of cognition (Chemero, 2009; Gallagher, 2005; Gibson, 1979; Malafouris, 2013; Varela et al., 1991). In the specific case of aesthetics, there have been some extremely significant contributions (Brincker, 2015; Carvalho, 2019; Gallagher, 2021; Noë, 2016; Stamatopoulou, 2018). Regardless of their different interests and slightly different theoretical frameworks, most of these works pay special attention to change and dynamic unfolding at different levels.

Following the lead of these and other relevant works, and paying special attention to some aspects of Dewey’s aesthetic theory – considered among the most significant precursors of temporally extended, embodied, and situated aesthetics [Footnote 5] – I will offer an enactive model focused on some cognitive dynamics leading to the progressive enactment of an aesthetic rhythm that constrains experience. In Section 2, I discuss Dewey’s notion of aesthetic experience and propose a notion of aesthetic rhythm as a subtype of cognitive rhythm. Specific aspects of the model will be discussed in Section 3. Finally, in Section 4, I present the conclusions and introduce some potential future directions for research on aesthetics and related fields.

2 From aesthetic experience to aesthetic rhythm

One of the biggest issues we face when dealing with the term ‘aesthetic experience’ is the implicit assumption that the ‘aesthetic’ is one specific type of experience, detached from other non-aesthetic experiences. This idea points to a modularity of experience that goes against current ideas in cognitive science that emphasize the importance of global dynamics and interactions (Chialvo, 2010; Fries, 2015; Thompson & Varela, 2001). However, abandoning the concept of aesthetic experience leaves us powerless to deal with instances of experience capable, for example, of offering “affordances that short-circuit in a way that comes back to the perceiving agent, disrupting ordinary engagements, and creating possibilities that are not realizable in current or established frameworks” (Gallagher, 2011, p. 113). This and other similar characterizations of aesthetic engagements refer to particularly rewarding and meaningful experiences. Are we to believe that they are instantaneous all-or-nothing events with shallow cognitive roots? This hardly seems reconcilable with our own personal experiences. Aesthetic episodes often take time to develop and they are not under complete voluntary control. In the same vein, Dewey (1980) argued that when we see something pictorially (which might also be taken to mean aesthetically), “it is seen as a related part of a perceptually organized whole. Its values, its qualities as seen, are modified by the other parts of the whole scene, and in turn these modify the value, as perceived, of every other part of the whole” (p. 141). As this unified whole progressively emerges from the general stream of experience, it becomes an experience – which in Deweyan terms amounts to saying that it becomes an aesthetic experience. And Dewey claims that no aesthetic experience at all would be possible, were it not for the surrounding rhythms of nature.

For Dewey, rhythm is not a metaphorical concept. Dewey (1980) defines it as an “ordered variation of manifestation of energy” (p. 170). Rhythms tie together the environment and the phenomena that take place within it. Some examples of natural rhythms include ponds moving in ripples, the waving of branches in the wind, or the beating of a bird’s wings (Dewey, 1980, p. 161). These and other natural rhythms, like the cycles of plants, the alternation of seasons, or animal migrations, have always affected human existence. As Vincent Barletta (2020) argues, “rhythm for Dewey is always already there to condition our being and serve as the ground for what we see, feel, and do” (p. 110). We are always already partaking of this rhythmic fabric of nature and it scaffolds and constrains our cognitive processes; we live within this “kinetic and indivisible relation between organism and environment that serves as the ground of experience” (Barletta, 2020, p. 111). Yet, while sensorimotor interaction is a precondition for general experience, it is not sufficient for having an aesthetic experience (Crippen, 2017, p. 190).

While discussing the effect of a painting, Dewey (1931) contends: “in every adequate union of sensory and motor actions, the background of visceral, circulatory, respiratory functions is also constantly called into action. In other words, integration in the object permits and secures a corresponding integration in organic activities” (p. 122). These bodily aspects not only modulate and are modulated by the sensorimotor engagement, but also affect the contents of experience, for “eye activities arouse allied muscular activities which in turn not merely harmonize with and support eye activities, but which in turn evoke further experiences of light and color, and so on” (Dewey, 1931, p. 122). But what is this ‘adequate union’ that leads to the particular rhythmic form of aesthetic experience? The answer is an integration of doing and undergoing in a relationship (Dewey, 1980, p. 46). Only when “doings and undergoings fall into a rhythmic connection of ‘means-consequence’” (Crippen, 2017, p. 190) does the experience become a unified whole that is at the same time a summing up and fulfillment of what precedes it, carrying expectations tensely forward (Dewey, 1980, p. 179). This generates a mutual dependence within experience, whereby “[t]he living creature undergoes, suffers, the consequences of its own behavior. This close connection between doing and suffering or undergoing forms what we call experience. Disconnected doing and disconnected suffering are neither of them experiences” (Dewey, 1920, p. 86). The form of Dewey’s aesthetic rhythm is that of a rearrangement of energies that we perceive as a progressive integration of doings and undergoings by overcoming variations and tensions. These tensions are a consequence of our exploratory actions within the course of the experience.
And, while we explore, we are intimately affected by an artwork: “there are released old, deep-seated habits or engrained organic ‘memories’, yet these old habits are deployed in new ways, ways in which they are adapted to a more completely integrated world so that they themselves achieve a new integration. Hence, the liberating, expansive power of art” (Dewey, 1931, p. 121). Differently put, “if the experience is aesthetic in Dewey’s sense, it will pull affective, cognitive, motor, and perceptual capacities into unity, albeit partly by challenging entrenched habits” (Crippen & Schulkin, 2020, p. 111). These challenges to old habits, along with the exploration of the situation and the achievement of a series of fulfillments, pull the experience into a unified narrative whole, while lending it a highly dramatic structure that makes it stand out from general experience (Crippen, 2017, p. 191). Given these situated, dynamic, and embodied aspects of the rhythm of an aesthetic experience, it seems that “Dewey has, in effect, written an enactive account of aesthetics” (Crippen, 2016, p. 246). However, despite Dewey’s emphasis on rhythm as a connecting form, he focuses almost exclusively on the temporal aspect, and not so much on how different rhythms combine and affect one another. At this point, the concept of entrainment fits perfectly.

Entrainment is a term from dynamic systems theory that denotes a process in which the frequencies of two or more oscillators exhibit a tendency toward a pattern of synchronization, either through a process of mutual influences or as one adapts to the other(s). Unlike the related but slightly different notion of resonance, entrainment does not immediately disappear once the oscillators have been separated, and can take place in systems with significantly different frequencies (see Pikovsky et al., 2001 for an analysis of entrainment, resonance, and synchronization). [Footnote 6] Entrainment can bring about absolute coordination – whereby the phase or frequency of two or more processes becomes transitorily locked – but it can also result in relative coordination. This means that entrainment can be subtle and even go unnoticed, while still being a relevant dynamic phenomenon in all types of interactions. Entrainment has been identified in both non-living and living systems – e.g., a system of pendulums, a murmuration of starlings, or a group of dancers. In the case of human cognition, researchers speak of perceptual, autonomic, physiological, motor, and social entrainment (Trost & Vuilleumier, 2013). Their particularities, as well as the ways in which they interact with one another, are still open to discussion; however, it has been proposed to consider them as different manifestations of the same phenomenon (Trost et al., 2017). What we already have is research showing non-linear couplings between environmental, brain, and bodily oscillations that are, at least partially, explicable through processes of entrainment (see Lakatos et al., 2019 for a review on neural entrainment; Azzalini et al., 2019 for a review on bodily oscillations affecting brain processes; Fusaroli, 2015 and Chemero, 2016 on the emergence of collective social entrainment).
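The contrast between absolute and relative coordination can be made concrete with a toy simulation. The following is only a minimal sketch under stated assumptions: it uses two mutually coupled phase oscillators of the Kuramoto type, which is one common textbook idealization and not a model proposed in the works cited here. The function names `phase_difference` and `drift_rate`, the frequencies, and the coupling values are all hypothetical choices made for illustration. In this idealization, when the coupling exceeds the mismatch between the natural frequencies the phase difference settles to a constant (phase locking); when it falls slightly short, the phase difference lingers near a preferred relation and then slips, which is one way of picturing relative coordination.

```python
import math


def phase_difference(omega_a, omega_b, coupling, duration=60.0, dt=0.001):
    """Euler-integrate two mutually coupled phase oscillators.

    Illustrative assumption: each oscillator is pulled toward the other
    with strength coupling / 2 (a two-oscillator Kuramoto-type model).
    Returns the unwrapped phase difference theta_b - theta_a at each step.
    """
    theta_a, theta_b = 0.0, 0.0
    history = []
    for _ in range(round(duration / dt)):
        pull = math.sin(theta_b - theta_a)
        theta_a += dt * (omega_a + 0.5 * coupling * pull)
        theta_b += dt * (omega_b - 0.5 * coupling * pull)
        history.append(theta_b - theta_a)
    return history


def drift_rate(history, dt=0.001, tail=0.5):
    """Mean rate of change (rad/s) of the phase difference over the
    final `tail` fraction of the run. A value near zero indicates locking."""
    start = int(len(history) * (1.0 - tail))
    elapsed = (len(history) - 1 - start) * dt
    return (history[-1] - history[start]) / elapsed


if __name__ == "__main__":
    slow = 2 * math.pi * 1.0   # hypothetical 1.0 Hz oscillator
    fast = 2 * math.pi * 1.2   # hypothetical 1.2 Hz oscillator
    for k in (2.0, 1.0):       # coupling above, then below, the mismatch
        rate = drift_rate(phase_difference(slow, fast, coupling=k))
        print(f"coupling={k}: mean phase drift = {rate:.3f} rad/s")
```

The sketch deliberately ignores noise, amplitude, and more than two oscillators, all of which matter for the nested bodily, neural, and environmental rhythms discussed in this paper; it is meant only to show how the ratio of coupling strength to frequency mismatch separates a locked phase relation from a slowly slipping one.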

Consequently, a notion of rhythm as “an evolving pattern of oscillations able to entrain other oscillations” (Vara Sánchez, 2020a, p. 88) offers the possibility to accommodate nested interactions between different oscillatory activities coming from the body, brain, and environment while retaining the temporal aspect of rhythms and emphasizing the variability of the rhythmic form. This relational definition considers rhythms to be particular patterns that emerge from the interaction of two or more oscillatory elements. Speaking of human beings, we can focus, for example, on the emergent rhythm of the contraction of the heart – caused by the interaction of electric impulses generated by cells of the sinoatrial node. But this rhythm can also be regarded as part of a bigger rhythm, along with other mechanical rhythms related to respiration and gastric activity that are reciprocally regulated. And the resultant bodily rhythm can be, in turn, considered an element that is part of a much more complex rhythm, together with brain and environmental oscillations, all of which enact nested dynamic constraints that affect the whole cognition through entrainment and have an effect on experience (Vara Sánchez, 2020b). [Footnote 7] Yet, this does not mean that we can register a unitary, constant oscillation in the body and the brain. The various oscillations are nested in such a way that variations in one of these local rhythms affect rhythmicity as a whole. The different oscillations that we find in the body, brain, and environment not only serve their specific functions, but become part of an ongoing set of rhythmic constraints constitutive of cognition, for there is always a multi-layered rhythm intertwining us within the world. This rhythm can be quite simple if we are just lying in bed, with nothing particular in mind, or more complex if we are playing the piano.
In any case, the temporary rhythmic layers that emerge with certain tasks constrain and are constrained by the pre-existing ones that were already part of the rhythmic pattern. A cognitive rhythm is not a fixed property, but an emergent interaction able to drive experience and the underlying levels of cognitive processes that enact it. That is, a rhythm registered during a cognitive process exerts a local-to-global and a global-to-local influence on its different components (see Thompson & Varela, 2001; Thompson, 2007; Di Paolo et al., 2010).

Getting back to aesthetics, the question is obvious: what constitutes a cognitive aesthetic rhythm? I will devote the next section to answering this question. For now, I will just note that a raw aesthetic rhythm typical of a non-artistic interaction may present an embodied and situated dynamic interaction between, on the one hand, the sensorimotor and affective processes taking place in brain regions at faster timescales and, on the other, the attentional and narrative processes taking place at the level of brain networks at longer timescales. This interaction will evolve differently depending on whether it remains on the pre-reflective side of experience, it reaches the reflective side of experience, or it becomes fully reflective. I will suggest that while this form constitutes the essential rhythm present in certain aesthetic experiences, an aesthetic rhythm – in line with the definition of rhythm – does not conform to any predetermined structure, but always remains open to additional or substituting cognitive processes specific to certain types of aesthetic experiences.

3 A dynamic model of aesthetic rhythm

In this section, I outline a model of aesthetic rhythm able to constrain pre-reflective and reflective experience at the three relevant timescales defined by Francisco Varela (1999): basic or elementary events (up to 100 ms), the integrative scale (up to a few seconds), and the narrative timescale (more than a few seconds). I focus on a set of cognitive processes with cumulative effects that initially emerge as a pre-reflective aesthetic rhythm with the potential to become a reflective aesthetic rhythm in which previous processes enact a new dynamic pattern.

3.1 Pre-reflective aesthetic rhythm

I suggest that the pre-reflective aesthetic rhythm that occurs in certain common experiences is mainly constituted by two interacting, embodied and situated dynamics that become reciprocally linked constraints. Both components are subject to bodily and situated influences but, at the same time, the activity of each is also driven by that of the other in a non-linear way. The dynamics are the following: (1) asymmetric interactions between sensorimotor and affective processes at the level of brain regions, and (2) changes in functional correlation between narrative and attentional brain networks. While the first dynamic mainly originates from processes that correspond to the elementary and integrative timescales, the second one is consistent with processes registered at the narrative timescale. Yet, these two dynamics should be regarded as the inner and outer borders of the same emergent system. Changes at one pole have effects at the other one; taken together, they enact a dynamic landscape able to accommodate changes coming from the environment, brain, and body, taking place at different timescales.

3.1.1 Sensorimotor and affective asymmetry

Incubation is one word used by Despina Stamatopoulou (2018) to describe the dynamic scaffolding of the background affectivity that takes place during the unfolding of an aesthetic experience. Stamatopoulou (2011, 2018) has worked on embodied approaches to different person-world interactions, paying particular attention to the dynamics of the engagement within the unfolding of the action. Stamatopoulou (2011) argues that infants’ scribbling enacts explorative behavior with an expressive potential, whereby the engagement in action contributes to the self-regulation of experience. These actions allow infants to experience unexpected contingencies as pleasurable, and this works as an affective force that glues the integrated embodied system between person and the other/object (p. 167). Stamatopoulou contends: “the child’s intentional/affective attitude, in reciprocal relation with the medium and the world and constrained by the child’s developing embodied nature, creates/constructs meaningful content in the scribbling pattern. In this sense, referential content is not predetermined but emergent through the schematization process of the emergent embodied self” (2011, p. 186). This dynamic contributes to symbolic development and maybe to aesthetic development too. In the specific case of aesthetic experiences, Stamatopoulou (2018) suggests: “by means of an affective-motivational attitude (that echoes the positivity of synchronization) and by means of mimesis as an embodied imaginative act that enacts lived significance to the unfolding action (expressive (re)enactment), we enter into an intensified meaning constructive action […] which constitutes ‘praxis’—not mere action” (p. 183). 
What I find particularly relevant here is Stamatopoulou’s emphasis on the aesthetic experience as an incubated experience in which we experience things through a background affectivity that keeps the ongoing perceptual engagement flexible enough to allow us to shift between different attentional perspectives.

Among the many possible mechanisms with the potential to be involved in the integration of sensorimotor and affective aspects (see Colombetti, 2014 for an enactive perspective on the subject), I would like to take into consideration the one proposed by Gil B. Carvalho and Antonio Damasio in a recent preprint. Carvalho and Damasio (2019) hypothesize that non-synaptic transmission plays a particular role in affectivity. The evolutionary persistence and pervasiveness of this type of molecular signaling in particular areas of the nervous system – including those linked to information about bodily states such as the limbic system, the brain stem, and parts of the autonomic nervous system, despite its lesser spatiotemporal efficiency – points to an essential role in affectivity. They claim: “Whereas movement and perception require precision and gain from speed, the world of affect – moods, and plenty of feelings – can tolerate some vagueness and slowness, perhaps even benefit from them. […] This would be in keeping with the fact that NST [non-synaptic transmission] occurs over relatively long timescales – seconds, minutes or even longer –, as opposed to the milli- or even microseconds range involved in synaptic transmission” (Carvalho & Damasio, 2019, pp. 16–17). Carvalho and Damasio argue that these two mechanisms are complementary: synapses predominate in areas that mediate the faster and more precise ‘what’ of the neural function – e.g., sensory perception, skeletal muscle contraction, fine-grained cognition – whereas non-synaptic transmission predominates in regions involved in the slower and less precise ‘how’ – e.g., affect, arousal (Carvalho & Damasio, 2019, p. 17). Following their theory, the sensorimotor and the affective can be regarded as two aspects of cognition with different but complementary temporalities. 
That is, some processes related to their functions originate in different areas of the nervous system and work at different timescales, but the one-to-many diffusive nature of non-synaptic transmission modulates a huge number of one-to-one synapses, while synaptic transmission also shapes the activity of non-synaptic cells. The alleged ‘slowness’ and ‘vagueness’ of affective non-synaptic transmission allows our changing actions to be consistent yet not bound by underlying affectivity. It is normal for our actions to be faster responses to a changing environment while our moods and emotions, in non-pathological states, usually emerge and change at a much slower pace.

The sustained activation of non-synaptic and synaptic mechanisms, along with other processes that differentiate affective from sensorimotor processes, may lead to an asymmetric background in which neither kind of process is able to accommodate itself to the other’s activity. While sensorimotor processes adapt to the evolving situation in a faster and more precise way, they are entrained only to a level of relative coordination by the continuously changing affective background which, for its part, tends to cause more progressive and smoother changes. Likewise, affective components enact affective processes entrained by a sensorimotor scenario that keeps changing due to new emerging features coming from the environment or from other cognitive processes. The persistence of this dynamic may generate an amplifying loop in which the effect of what has just happened destabilizes what is already happening, so that the reciprocally caused asymmetry keeps expanding and is constantly in need of additional processes to alleviate it. As a consequence, sensorimotor and affective aspects become interwoven, and the ongoing activity acquires an expressive quality. Borrowing Stamatopoulou’s words (2018), there is a progressive enactment of lived significance to the unfolding action, which leads us to intensified sense-making activity.

3.1.2 Narrative and attentional correlation

The aforementioned asymmetry between affective and sensorimotor processes has the potential to trigger a dynamic response by slower cognitive dynamics intended to balance it. Meaningful and powerful experiences, such as aesthetic experiences, are not usually immediate. They require time to unfold. [Footnote 8] This is the way they become richer, more nuanced, and more context-specific. In a sense, the processing of the affective/sensorimotor asymmetry constitutes the reservoir out of which the experience builds itself in the most appropriate way for this precise engagement. The affective/sensorimotor asymmetry generates raw aesthetic snapshots in which the sensorimotor and the affective are blended but in need of contextual integration. This asymmetry constrains slower processes in a local-to-global direction, and starts driving brain networks either to pay attention to aspects coming from the environment or to take into consideration some patterns rich in affective information. These dynamic changes in brain networks constitute the other process that is part of the pre-reflective rhythm.

Brain networks are groups of functionally related brain regions. The precise dynamics, borders, or even number of brain networks are far from having been settled. Yet, there seems to be some consensus as to the fact that the interaction between these networks is an essential factor to understand cognition (Bressler & Menon, 2010; Petersen & Sporns, 2015). Brain networks present particular patterns of correlation and anticorrelation depending on aspects such as the predominant cognitive process, body-related circumstances, and the current task. Among the various networks, the default mode network and the dorsal attentional network were believed to anticorrelate depending on whether the prevalent cognitive focus is oriented internally or externally (Fox, 2005). The default mode network is involved in processes such as self-reflection, autobiographical memory, future event simulation, conceptual processing, and spontaneous cognition (see Raichle, 2015 for a review). This network also seems to be involved in transitions between cognitive states – i.e., switching from perceiving to desiring, from knowing to feeling, etc. – (Smith et al., 2018) and some of its hubs are constrained by bodily oscillations such as heartbeats, driving the level of selfhood and agency of the experience (Babo-Rebelo et al., 2016). However, during cognitive tasks that demand external attention, its activity usually decreases. The network that takes its place is the dorsal attentional network, part of the ‘task positive network’. It would seem that cognition toggles between them as a way to optimize resources. Nonetheless, recent research has shown that the pattern and degree of the anticorrelation between the default mode network and components of the attentional networks varies enormously depending on circumstances such as the age of the agent, the degree of cognitive impairment, and the task at hand (Esposito, 2018; Fornito, 2012; Golland, 2008). 
Moreover, in some cases, there is no anticorrelation at all, but rather a correlation between parts of the task positive network and the default mode network. Examples include autobiographical future planning, creativity, memory recall, working memory guided by information, and social working memory (see Dixon, 2017 for a review); that is, processes in which we exert voluntary control over internally oriented processes. Neuroaesthetic findings suggest that six-second engagements with representations of intensely moving artworks might be another instance of correlation between the default mode network and task positive activity (Vessel et al., 2012; see Brincker, 2015 for a discussion of these results from an embodied aesthetics point of view). Belfi and colleagues (2019) have followed this line of inquiry, studying the effects of different exposure times (1, 5, and 15 s) to high, medium, and low moving images in regions of the default mode network and other brain areas. They concluded that the registered “dynamics suggest that the DMN tracks the participant’s internal state during continued engagement with aesthetically pleasing experiences, as well as during disengagement from non-pleasing stimuli” (p. 595).

Although much remains to be understood about brain network dynamics, following all the aforementioned results, I suggest that we can expect the default mode network to play a significant role in most types of aesthetic experiences, even if they involve what have usually been considered to be ‘task-positive tasks’. This significant role could take the form of either a higher correlation with attentional networks or an increased amplitude in activation patterns. Changes in brain networks shape aesthetic experience in a global-to-local direction, constraining the fine-grained sensorimotor/affective processes at the faster pole of the rhythm. A correlation between the default mode network and attentional networks may lead to changes in cognitive processing and to the enactment of memories, or direct our attention toward particular aspects of the aesthetic engagement. These and other potential outcomes would require more sensorimotor and affective processing in addition to what was already part of the ongoing experience.

Considering pre-reflective aesthetic rhythm as a whole, I suggest that this kind of rhythm will be triggered if an object or event from the environment entrains our attention in a narratively meaningful way; in other words, if the sensorimotor/affective raw aesthetic components that come to our attention resonate with the narrative background and partially destabilize it. At this cognitive ‘sweet spot’, we are moved by the environmental engagement, but neither shaken nor grazed. This is clearly a context-specific spot. Our ongoing moods, expectations, and previous experiences will modulate the attentional requirements and some potential outcomes of this nascent experience. However, the enactment of this rhythm only means that an experience might become aesthetic, although on many occasions it will not. Whether or not it becomes an aesthetic experience depends on the successful engagement of a reciprocal constraint between faster and slower dynamics. This would require the right amount of asymmetry coming initially from the sensorimotor/affective pole – too little makes the experience, at most, interesting; too much would trigger a non-aesthetic focus, most likely, restricted to one specific aspect. Only the right amount will change the correlational equilibrium between brain networks or affect their dynamics in such a way as to integrate the asymmetry with previous experiences or with other relevant attentional aspects of the ongoing interaction. If this integration, in turn, affects the background asymmetry, we will have a system of two processes entrained to an unstable equilibrium and whose non-periodic and unforeseeable tensions affect other cognitive processes and our experience during an aesthetic engagement. Dewey (1980) argued that in his definition of “rhythm as ordered variation of manifestation of energy, variation is not only as important as order, but it is an indispensable coefficient of esthetic order. 
The greater the variation, the more interesting the effect, provided order is maintained” (p. 170). I see pre-reflective rhythm as the dynamic at which this variation originates. ‘Variation’ is another name for the unstable equilibrium enacted when different aspects of the experience do not perfectly entrain to what causes the experience or to one another. This instability is an emergent quality of the experience resulting in a reciprocal modulation at different time scales and it is what makes an experience (at least) minimally aesthetic. Accordingly, this is the first significant borderline between certain aesthetic and non-aesthetic experiences.

The sustained production of this tension is necessary for the experience to continue. It will evolve as long as what we perceive resonates with affective processes that demand an attentional and narrative readjustment; and the ensuing sensorimotor/affective engagement, in turn, resonates again in a slightly divergent way with the attentional/narrative background. To put it in more dynamic terms, what is required is a recursive relation in which the positive feedback loops that amplify salient aspects of the experience subdue, by a small margin, negative feedback loops that try to stabilize and harmonize the experience (see Lewis, 2005 for a relevant dynamic model of emotions). A transient and fragile seesaw between competing dynamics in which chaos prevails by a small margin lies at the core of the pre-reflective rhythm of certain aesthetic experiences. This is a rhythm initially inaccessible to voluntary control and reflective consciousness, which nevertheless shapes the phenomenal content of experience. The longer the amplifying dynamics that generate asymmetry slightly overcome the stabilizing processes that try to bring order to the experience, the more brain, bodily, and other sensorimotor aspects will be entrained to the rhythm, and the more likely it will be that the reflective side of the experience is reached.

3.2 Reflective aesthetic rhythm

I contend that the reflective rhythm consists of a set of processes that emerge in some aesthetic experiences. Two interconnected processes constitute this rhythm: (1) an aesthetic affordance that invites exploratory actions and activities through its mineness, and (2) the enactment of a set of metastable states between attentional and narrative processes that allow effortless transitions between cognitive perspectives on the ongoing experience.

3.2.1 Aesthetic affordance

Maria Brincker (2015) has introduced a remarkable framework for aesthetic perception from an embodied and situated point of view. She has identified several key dynamic aspects that constitute the ‘aesthetic stance’. Among them, the notion of aesthetic affordance seems central to her hypothesis: “an affordance of perceptual engagement but yet non-action, which opens up possibilities for using our minds – and brains – in ways we do not in our regular practically engaged modes of perception” (Brincker, 2015, p. 123). These ‘un-actable’ affordances (Footnote 9) preclude a goal-oriented reaction, opening us up “to an otherwise difficult intimacy with the perceptual experience and virtual other” (p. 125). This idea resonates with Shaun Gallagher’s characterization of the perception of certain artworks as involving non-realizable, non-practical, and non-interactionable affordances able to come back and make the one having the experience aware of his possibilities, by disrupting ordinary engagements (Gallagher, 2011, p. 109). It should be noted, though, that Brincker’s focus lies on the experience of beholders in artistic contexts. She makes it clear that ‘aesthetic affordances of non-interaction’ belong to a particular aesthetic stance, but are not constitutive of all aesthetic experiences. They are “neither necessary nor sufficient for aesthetic engagement: e.g. we daily look at images with goal-directed eyes, and often take an aesthetic stance towards practical objects” (2015, p. 124). For example, we might say that certain sculptures “invite asymmetric, non-interactive modes of perception, in that the beholder perceives the beheld but not the other way around” and that “this asymmetry and lack of reciprocity in the aesthetic affordances precisely invites a different kind of engagement” (2015, p. 123).
However, not only do certain art-related aesthetic experiences require active engagement, but in many non-artistic aesthetic experiences, as in other everyday activities, overt actions contribute to the integration of the experiences (see Crippen, 2016 and Crippen, 2017 for a discussion of the similarities between Deweyan and enactivist views of this issue). In the same vein, I suggest that aesthetic affordances affect us in a meaningful way that we cannot completely anticipate (Footnote 10), and I believe that this is partly due to their significant mineness.

Roy Dings (2018) argues that there are three aspects which determine whether affordances invite an action or not: valence, force, and mineness. Mineness is the “extent to which an affordance is experienced as being close to ‘who I am’ or, more precisely, ‘who I take myself to be’” (p. 691). Dings draws this notion of mineness from Slors and Jongepier (2014), who define it as a product of “the external structure of experience; i.e. the way in which each experience is connected with and embedded in a context of other experiences” (p. 194). Slors and Jongepier contend: “the mineness of experiences may be accounted for in terms of their holistically fitting into a background of earlier and co-temporal experiences, thoughts, memories, proprioceptions, interoceptions, etc.” (Slors & Jongepier, 2014, p. 201). Arguably, the mineness of an affordance will be more salient if there are no pressing needs constraining us. This is not to say that in order for us to perceive aesthetic affordances as inviting we have to show disinterest toward the world; rather, aesthetic affordances will be most at hand when we are left to wonder and wander. It will be more likely for us to perceive the aesthetic affordances of a pond if we are neither thirsty, nor tired, nor lost. Aesthetic affordances do not need to entice us with a life-changing experience. The strength of their mineness will reside in its univocity: an object has to be experienced as the object. According to Slors and Jongepier (2014), the mineness of an experience “manifests itself in the absence of any further thought. The ‘naturalness’ of their occurrence, the fact that their occurrence makes perfect sense, given other earlier and co-temporal thoughts and perceptions, is what endows them with mineness” (p. 210). Considering aesthetic affordances, it seems that what we perceive is an invitation to this naturalness. 
We perceive aesthetic affordances as opportunities to modulate, soothe, enhance, rewrite, explore, feel, forget, or merely reflect upon aspects of the narrative self, such as memories, interests, likings, desires, or habits in a socially situated, extended, intersubjective, and embodied context.

Stamatopoulou (2018) argues that “the asymmetry on action tendency between the perceiver and the art object” as well as “the relational-interactive structure of the empathy processes that echoes the social” afford “the ‘subordination of the goal-directed action’ into ‘the means of the action’s unfolding’, and it is this relational action-focus that shapes experience” (p. 176). Yet, while I agree that this quality of the experience is aesthetically relevant, I do not think that asymmetry of action is a prerequisite. It may play a significant role in more contemplative aesthetic experiences. But the mineness arguably permeates and glues the experience, contributing to the subordination of goal-directed actions into the means of the action’s unfolding. It is the mineness of the affordance that invites us to enjoy the specific experience of skiing, walking, or having dinner. And it invites us not so much to a goal, but to an unfolding that becomes an opportunity for us to explore and reconstruct ourselves by going beyond what we usually do, beyond our habits, beyond what we are usually comfortable with. We are not invited to a sensorimotor engagement in order to get something specific, but to experience this engagement in its uniqueness. And this promise of uniqueness that beckons us at the reflective border is partially dependent on the underlying processes of asymmetry and integration that have taken place on the pre-reflective side. The affordance has been affectively and narratively incubated by the work of hubs linked to the default mode network. Their sustained activity as part of the pre-reflective aesthetic rhythm ensures the background integration of different aspects, leading to the emergence of the experience as an experience of the here and now that, nonetheless, resonates with past episodes, opens us up to future ones, and permeates social cognitive processes.
In other words, the level of mineness cannot be attributed to one particular feature of an object or event, but only to a multi-layered interaction that encompasses the whole experience. The emergence of this mineness is the threshold at which the aesthetic rhythm is enacted on the reflective side of the experience. The entrainment between faster sensorimotor/affective dynamics and slower brain network correlations leading to a growing tension was the first threshold and it marked the emergence of minimally aesthetic experiences. This second border distinguishes minimally reflective aesthetic experiences. The experience reaches this threshold when there is enough mineness to make us perceive an aesthetic affordance as inviting. The level of mineness, up to a point, correlates with the tension generated on the pre-reflective side. That is, the greater the tension related to processes of generation of asymmetry and its integration into the narrative self is, the more mineness an object or event from the environment is endowed with and the greater the urgency is to act in order to soothe this tension.

However, only in some cases does accepting an aesthetic affordance lead to an almost immediate soothing of the cause of the underlying asymmetry, so that the tension decreases. Imagine it is a spring afternoon and you are working at your desk when the sunlight comes in through the window. After a while, you feel like changing the music playlist you are listening to for another one full of songs that you listened to last summer. Suddenly, you feel a slight wave of heat rushing through your body and resume working in a better mood. We find here an environmental element – sunlight – generating a tension that affects attention in a narratively relevant way – e.g., lack of concentration, memories from the past, anticipation of the summer to come. The persistence of these background processes will invite us, more and more intensely, to a meaningful action: listening to music that makes us feel like we are at the beach. Accepting this call diminishes the tension and gives rise to a pleasant feeling that permeates the whole experience. This could be considered a minimally reflective aesthetic experience. But what happens if, when you change the playlist, the experience becomes even more relevant, inviting further actions such as dancing and singing, while triggering richer evocations of an idealized summer? In this case, the initially perceived affordance – playing summer music – will have turned out to be just the ‘highest realized affordance’ (see Gallagher, 2020) of a much more complex field of nested affordances. We realize that the mineness previously concentrated in one affordance, rather than disappearing, has sprawled over the whole engagement. Cognitive processes working at longer timescales become even more entrained to the environment, since now there are more aspects in the engagement to account for, and, consequently, they will keep on driving the aesthetic rhythm.

Accepting the invitation of the initial affordances, in some cases, makes the agent open to other affordances. I suggest that this openness is the consequence of the emergence of the second relevant feature of the reflective aesthetic rhythm: the enactment of a metastable regime.

3.2.2 Aesthetic metastability

In dynamic systems theory, metastability is a type of coordination wherein the parts of a system combine moments of integration and moments of segregation. In cognitive science, metastability may denote a regime of activity in which the brain spontaneously transitions between periods of integration of neural activity, named dwells, that allow spatially dispersed areas to work together, and moments of segregation, named escapes, when the global pattern of activity reshuffles itself (see Tognoli & Kelso, 2014 for a discussion on the role of metastability in cognition). The activity of certain brain networks is believed to either reduce – e.g. the attentional network – or increase – e.g. the default mode network – global metastability (Hellyer et al., 2014). The anatomical, temporal, and functional overlapping of these networks with other dynamics and aspects of cognition produces an oscillatory landscape in which metastability is easily lost and slowly regained. In other words, metastability is a fragile rhythmic equilibrium of maximum potentiality in which weak stimuli suffice to provoke relevant cognitive changes. Environmental engagements sometimes require quick responses, and for this reason, cognitive resources have to be readily at hand; metastability contributes to this responsiveness. According to several models and theories, metastability is the predominant dynamic regime at resting state (Hansen et al., 2015; Deco, 2017), while it has also been associated with cognitive flexibility and information processing (Córdova-Palomera, 2017). Kelso sums up its importance: “[M]etastability guarantees that the living brain […] never finds itself frozen for any length of time in a particular coordination state: no energy barriers need to be crossed to visit self-organized metastable tendencies.
For this reason, it seems likely that natural selection has latched on to this aspect of self-organization, favouring metastability as necessary for adaptive behaviour” (Kelso, 2012, p. 914).
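The alternation of dwells and escapes described above can be made more concrete with a toy simulation. What follows is only an illustrative sketch, not part of the model proposed in this paper: it integrates the extended HKB relative-phase equation, a standard textbook case of metastable coordination in Kelso's work, in which the relative phase between two coupled oscillators either locks or keeps drifting while lingering near the place where a stable state used to be. The parameter values, the function names, and the velocity threshold used to count a "dwell" are all assumptions chosen for illustration.

```python
import math


def simulate_relative_phase(delta_omega, a=1.0, b=1.0, phi0=0.1,
                            dt=0.001, steps=20000):
    """Euler-integrate d(phi)/dt = delta_omega - a*sin(phi) - 2*b*sin(2*phi).

    phi is the relative phase between two coupled oscillators and
    delta_omega is the difference between their intrinsic frequencies.
    Returns the final phase and the list of phase velocities.
    """
    phi = phi0
    velocities = []
    for _ in range(steps):
        velocity = (delta_omega
                    - a * math.sin(phi)
                    - 2.0 * b * math.sin(2.0 * phi))
        velocities.append(velocity)
        phi += velocity * dt
    return phi, velocities


def dwell_fraction(velocities, threshold=1.0):
    """Share of time steps in which the phase moves slower than threshold."""
    slow = sum(1 for v in velocities if abs(v) < threshold)
    return slow / len(velocities)


# Small frequency difference: a stable fixed point exists and the phase locks.
locked_phi, locked_v = simulate_relative_phase(delta_omega=1.0)

# Frequency difference just above the point where the fixed points vanish:
# the phase never settles, but it slows down markedly (a dwell) each time it
# passes the former attractor, then speeds away (an escape).
meta_phi, meta_v = simulate_relative_phase(delta_omega=3.0)

print("locked: final phase velocity =", locked_v[-1])
print("metastable: slowest and fastest velocity =", min(meta_v), max(meta_v))
print("metastable: fraction of time spent dwelling =", dwell_fraction(meta_v))
```

In the second run the system has no stable state to fall into, yet it spends a disproportionate share of its time in a narrow range of phases. This is the sense in which a metastable regime combines tendencies toward integration and toward segregation without ever being "frozen" in a coordination state.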

This metastable regime has been hypothesized to generate metastable behavior (Kelso, 1995). Bruineberg and Rietveld (2014), drawing from previous research (Footnote 11), have discussed its role in skilled actions: “[w]hile being skillfully engaged with a specific task, it is important that we can be affected by affordances on the horizon of our field and rapidly switch to another kind of adequate activity when something in the environment changes. Metastable dynamics are important for understanding the brain, because metastability is a prerequisite for the system to be able to effortlessly switch between different patterns” (p. 8). Metastability, thus, is characterized as a fluid state of quasi-equilibrium that affords both increased responsiveness to changes coming from any part of the brain-body-environment system and the possibility of reacting to these changes through sensorimotor and attentional shifts partially dependent on habits and skills. These characteristics seem to resonate with some relevant features of Dewey’s (1980) aesthetic experiences: the embodied and situated nature of aesthetic experience and rhythm as a progressively integrated transition between doing and undergoing. Yet, there is one significant difference. In skill-demanding activities metastability ensures the responsiveness that allows the enactment of the right sensorimotor habit at the right time, but in fully reflective aesthetic experiences metastability would have to afford the possibility of an exploration that affects the one who has the experience.

According to Crippen and Schulkin (2020): “Dewey’s account frames aesthetic experience as a mode of skilled exploration” (p. 109). Brincker (2015) considers that an aesthetic stance “might allow us to open ourselves to an otherwise difficult intimacy with the perceptual experience and virtual other” (p. 131). Stamatopoulou (2018) speaks of the “back and forth between perspectives and shifting attentional modes [that] enables the transformation of the awareness of how things are, here and now (deep immersion) to become awareness of how things seem, as relational imagined possibilities of the situated event” (p. 183). Gallagher (2021) contends: “The aesthetic experience of the performer […] is the unified experience that is both (a) an attunement to the character being portrayed (the music being played, the dance being danced) and (b) the self-awareness of the performer in the meshed cohesive gestalt of the performance itself” (p. 136). Arguably, all these examples refer to different aspects of a common exploratory quality that makes certain types of aesthetic experiences unique and meaningful. I would like to go further and suggest that this exploratory feature not only constitutes one differential aspect of fully reflective aesthetic experiences, but that this aspect is constrained by previously enacted components of the pre-reflective aesthetic rhythm.

If, instead of easing the tension, an interaction with an aesthetic affordance provokes the emergence of other affordances, the aesthetic experience becomes fully reflective due to the emergence of an embodied and situated metastable coordination regime in which slight changes coming from the body, the brain, and the environment modulate the strength of the different invitations. In this situation, the potential paths of actions are not exclusively dictated by pre-existing habits and skills. Certainly, there are aesthetic engagements in artistic and non-artistic contexts in which skills and complex sensorimotor habits play a very significant role (Footnote 12). And, according to Dewey (1931), an aesthetic experience necessarily entails a reconfiguration of habits. However, I think that this reconfiguration of habits is a side effect of the differential aspect of aesthetic metastability: the influence of affective and narrative processes in the determination of the metastable states. This fact constitutes the system’s initial repertoire when the aesthetic experience becomes fully reflective. The progressive correlation between default mode network and attentional networks, driven by the sensorimotor and affective asymmetry that has been taking place since the start of the experience, modulates the initial possibilities when we enter aesthetic metastability. And these possibilities will be unique at every aesthetic experience because the pre-reflective processes have made the experience rich, nuanced, and context-specific through a tense integration of asymmetries constrained by a changing environment and non-linear processes. In other words, when I become aware that I am having an aesthetic experience, this experience has already begun and the pre-reflective roots that condition its reflective possibilities will always be different.
The reflective aesthetic rhythm, therefore, is continuous with the pre-reflective one in terms of the cognitive processes involved, but is radically discontinuous with regard to its dynamics and their effects on experience. Metastability is not so much an additional layer added to preexisting ones, but an emergent dynamic regime in which components that were already at work enter a different organization as a result of being pushed beyond a critical threshold by the emergence of several concurring affordances.

Once we start to explore these different affordances, we will experience how these interactions affect narrative processes, capture our attention, or provoke changes in the environment, the body, or the brain. And we will become aware of how these modifications, in turn, shape the landscape of affordances, generating new focuses of attention. Yet, paying attention to any of these new features will again affect the whole experience, leading to the emergence of more salient aspects related to what is now being experienced, inviting new narrative resources that will have to be integrated into the whole. Attentional and narrative activities will keep shaping each other. Even if this reciprocal influence happens in many experiences, I suggest that the sustained and accumulating nature of these fluctuations is one specific aspect of aesthetic metastability. In the case of more specific or complex aesthetic experiences that require a specific attunement or the enactment of certain skills, or that are subject to tight sociocultural constraints, we might also find a metastable regime. Yet, there will be additional or substituting cognitive processes, beyond the narrative and attentional ones, shaping the scope of the metastable quasi-equilibrium. In general terms, aesthetic metastability affords an exploration, but what we explore are the consequences of the very same exploration on what we perceive and how it is affecting us. That is, the rhythm of a fully reflective aesthetic experience is made of integrations, disruptions, and tensions between the ‘how’ and the ‘what’ of the experience and their impact on ourselves.

Beyond this exploratory aspect, the differential aspect of a metastable regime is that it presents phases of integration and moments of segregation. Regarding aesthetic metastability, these spontaneous fluctuations add an element of unexpectedness to the experience. While the experience lasts, neither the evolution nor the outcome will be under our complete control. This seems to be consistent with Dewey’s claim that aesthetic experiences entail a deployment of old habits in new ways adapted to the ongoing engagement. Certainly, once we realize we are having an aesthetic experience, we can force ourselves to focus on an aspect of the experience because we desperately want to get something out of it, but this emphasis on dissecting what is going on will put aesthetic metastability at risk: the attentional overload could cause a shrinking of the narrative component, making the experience more analytic than aesthetic. And this is because metastability also implies fragility. Metastable transitions can afford us unexpected insights that invite us to go toward an unforeseen path with the potential to expand and change previous habits, but they may also generate a tension that has to be overcome, or bring us to an aesthetic standstill in which the dominant affective and attentional aspects are not relevant anymore for the current engagement. Moreover, no matter how deep or moving the experience is, just a touch on our shoulder, an unexpected sound, or one distracting emotion suffices to draw our attention a little bit too much, making metastability disappear and, with it, our experience. Unexpectedness and fragility are two phenomenological aspects often attributed to aesthetic experiences, and they can be explained by an underlying regime of metastability.

To conclude, the dynamic system enacted by environment, brain, and body at the reflective side of a fully reflective aesthetic experience may be considered a rhythmic regime of aesthetic metastability in which the potential states of the engagement will be initially determined by the pre-reflective processes and, afterwards, by concurring cognitive processes and the meaningful highlights enacted during the experience that will produce sense and meaning by backstitching aspects of the present experience into previous ones. During a fully reflective aesthetic experience we will walk, interact with others, smile, frown, and think about or just feel what we are doing; and while all these actions can be relevant in themselves, they will also update our narrative self and add layers to the experience, granting to the one who lives it a feeling that this whole experience is his.

4 Concluding remarks and future directions

In this paper, I have introduced an enactive theoretical model of the raw dynamics behind the emergence of certain aesthetic experiences from the stream of general experience. The model focuses on three dynamically relevant nodes on the pre-reflective and the reflective sides of experience that might contribute to shunting general experience toward the different kinds of aesthetic experiences that we face in everyday situations. This proposal relies on some aspects of John Dewey’s aesthetic theory, while also taking into consideration embodied and situated philosophical research on cognition and aesthetics, and empirical results from the cognitive sciences.

The three nodes that mark the transition between non-aesthetic and qualitatively different aesthetic experiences are:

(1) The enactment of a reciprocal constraint, open to bodily and environmental influences, between faster sensorimotor and affective dynamics taking place at the brain region scale and slower attentional and narrative processes at the brain network scale. The entrainment of these components to an object or event from the environment happens at the pre-reflective level of experience and draws a border that distinguishes non-aesthetic from minimally aesthetic experiences.

(2) Being invited by an aesthetic affordance to an interaction. This invitation is extended by the mineness of the affordance, which is a consequence of the preceding pre-reflective aesthetic rhythm. Engaging with this affordance means that the aesthetic experience becomes, at least, minimally reflective.

(3) The emergence of a metastable regime. Global metastability distinguishes the aesthetic rhythm that we find in fully reflective aesthetic experiences. It is a consequence of pre-reflective processes being pushed beyond a critical level when acting on an aesthetic affordance leads to the emergence of several affordances that, instead of soothing the underlying tension, invite us with meaningful and similar force.

These dynamics may be complemented or replaced by more context-specific cognitive processes. In the case of an aesthetic beholder engaged in looking at a painting, Brincker argues that we enter an aesthetic stance in which the experience affords a halt to the ongoing environmental interactions: a pause that causes an openness and a vulnerability of the perceiver linked to the lack of any goal-directed attitude (Brincker, 2015). Yet, to reach this point, she contends that temporally extended low-level physiological and emotional responses must contribute to the specificity of the encounter. That is, certain contemplative and more passive aesthetic experiences, at a certain point of the unfolding, would take a different path from other aesthetic experiences. Yet, the pre-reflective aspects and the metastability on the reflective side of experience could still be relevant. In the case of necessarily active aesthetic experiences, such as those of a performer or an artist, it has been proposed that a process of double attunement occurs within a meshed architecture that incorporates a vertical axis of minded and embodied-affective processes and a horizontal axis of extended and contextual scaffolding (Gallagher, 2021). Despite certain terminological and conceptual differences, I see Gallagher’s model as a particular case of skilled, socio-culturally constrained aesthetic engagement; that is, a model in which the raw narrative and attentional elements have been replaced or complemented by far more refined and context-specific processes.

The aim of this paper, though, was to reconcile the idea of strong continuity between aesthetic and non-aesthetic experiences posed by Dewey (1934) with the suggestion from Brincker (2015) of trying to elucidate, rather than to eradicate, the different dynamic aspects contributing to specific experiences. The model I have discussed acknowledges the existence of thresholds between non-aesthetic and aesthetic experiences and even between different types of aesthetic experiences. Accordingly, I contend that we cannot speak of a unitary type of aesthetic experience and that there are both tensions and continuities between different types of non-aesthetic and aesthetic experiences. Minimally aesthetic experiences and engagements that we experience as interesting are similar in terms of attention, but different in terms of their consequences for the narrative self. Both skillful interactions and fully reflective aesthetic experiences lead to the enactment of metastability, but while the former are strictly constrained by previously acquired habits and skills, the latter are much more context-specific due to the contribution of pre-reflective aspects of the experience.

The discussion of continuities and thresholds is of particular interest in non-artistic contexts. Being at a museum or knowing that we are listening to a song from a certain band may constrain the experience even before it begins. However, exploring the borders and continuities between the aesthetic and the non-aesthetic allows aesthetics – as a research field – to address, explore, and understand how certain interactions take place, as well as their roots, dynamics, and potential outcomes. If we agree about the transformative character of certain aesthetic experiences – i.e. about their capacity to challenge habits, and test beliefs and points of view – then the study of the mechanisms and dynamics that are part of different types of aesthetic experiences can be applied to other contexts. For example, educational or social-integrative activities could benefit from the enactment of aesthetic constraints aimed at fostering active engagement, creativity, or social bonding. In addition, aesthetics could also help us to detect and identify undesired entrainments that are too strong to be avoided by the time they reach conscious awareness. Aesthetically relevant experiences are far too common, influential, and powerful a resource for individuals and society to be regarded as something detached from everyday life.