Student Production of Pencasting E-Learning Videos: What Drives Engagement?

This research paper shares innovative practice on a final year undergraduate module at a British university in which students create e-learning videos about key theories and concepts within their disciplinary field, Communication and Media. To develop understanding of factors affecting engagement, it analyses two student videos published on a class YouTube channel, one of which is the most popular video on the channel, driving thirty times as many subscribers as the other. The videos use pencasting, an animation technique which has been shown to improve engagement, to visually represent and explain educational concepts and theories. This paper sets out current thinking on video as an educational tool, student video production, and the characteristics of engaging video content. Next, the module and assessment design are shared, together with an outline of teaching to support the pencasting production element. In conclusion, educators are encouraged to consider designing assessments in which students produce e-learning videos about key concepts and theories within their field of study, and five practical suggestions are offered for creators (both students and faculty) to improve engagement: (1) create videos with a high proportion of visual representation, focusing on smooth, continuous flow approaches such as pencasting; (2) provide practical value through clear and simple explanation; (3) consider viewer emotional responses to the video; (4) create thumbnails that articulate the visual representation approach employed; and (5) employ an extensive range of tags to improve performance in search results.


Introduction
In recent years, a plethora of digital video platforms such as YouTube, Google Video and Vimeo have led to much greater and easier access to a wealth of video material, much of which is freely available to learners and teachers. The educational YouTube channel Khan Academy for example, which has 5.8 million subscribers, hosts thousands of educational videos about complex science, social science and humanities topics such as kinetic molecular theory, macroeconomics, medieval history, media bias and advanced coding skills.
During the 2010s, video became a more prevalent feature across the major social networks such as Facebook, Twitter and Instagram, and Chinese services such as RenRen, WeChat and Sina Weibo. During 2020, global lockdowns in response to the coronavirus pandemic prompted a surge in the use of educational video. Khan Academy, for example, claims that engagement with its content surged as a result and now stands at around 100 million watch minutes per day (Khan Academy, 2020). There have even been some notable cases wherein TikTok, a platform based entirely on short-form videos of no more than 15 seconds and notorious in some circles for its "clickbaity news and entertainment" (Fannin, 2019), has been used by teachers to engage learners through lockdown. In addition, video is increasingly produced for, and hosted by, the virtual learning environments of schools, colleges and universities around the world.
This paper shares insights from the teaching design and student output of a final year undergraduate module in the Department of Communication and Media at the University of Liverpool, Viral Video. The module employs active learning strategies to support students to become creators of e-learning videos about their field of study. The videos are published on a class YouTube channel and promoted by students through social networks. In so doing, students develop not only subject knowledge but also skills in the use of video as a communicative tool, and understanding of the features of e-learning video that stimulate learner engagement.

Literature Review
The use of video within education is long-established. In 1946, the American teacher and educational theorist, Edgar Dale, in one of the earliest texts on the use of audiovisual methods in teaching, introduced the 'Cone of Experience'. The model served as a visual device to summarise his classification system for the varied types of mediated learning experiences (Kovalchick and Dawson, 2004, p. 161), and articulate his contention that seeing and hearing learning content is more effective than simply reading it. This principle and the wider application and efficacy of video for educational purposes have since been explored in a wide variety of contexts. In a study assessing the effectiveness of a psychodrama training video, for example, Bashman and Treadwell (1995) show that memory of pictures is much better than memory of verbal names of those pictures. Since video makes use of both audio and visual processing, it stimulates greater engagement with the content than when only one sensory system is used, promoting deeper learning (Mitra et al., 2010).
Alongside the multi-modal nature of video, the benefits of educational video include its capacity to introduce students to new information, provide background information, and enable active learning (ibid.), and its novelty value, through its capacity to offer students and teachers a welcome break from traditional teacher-led learning activities (Altinpulluk et al., 2020). Some suggest that video use is driving increasing numbers of students to enrol with higher education institutions that provide e-learning environments (Ritzhaupt et al., 2015).
Rapid technological advances and ongoing pedagogic developments in recent years have led to a rich body of knowledge and a diverse lexicon around the use of video in education, articulated across a wide range of literature. Terms are often used interchangeably and definitions often overlap; however, the following section attempts to unpack the most common.

Types of educational video
The umbrella term educational video has been used extensively to describe a broad range of video used within an educational context. Education scholars Bruce and Chiu (2015) define educational videos as powerful technologies that dominate current e-learning environments. Educational video has also been defined as a useful new tool for learning (Kalantzis and Cope 2008) and for developing literacy (Beach et al. 2010).
Myllymäki et al. contend that educational videos can be utilized in multiple ways, such as recording and streaming online lectures, offering additional materials to supplement lecture-based teaching, and providing demonstrations and illustrations. Educational videos can replace or supplement face-to-face teaching; they may include slides with or without audio voice-over narration, or may be used to offer lecture summaries or explanations.
Video-based learning is defined as the "learning process of acquiring defined knowledge, competence, and skills with the systematic assistance of video resources" (Mikalef et al., 2016, p. 219; see also Giannakos et al., 2016). In a randomized controlled trial in medical education, Chotiyarnwong et al. (2020) found the efficacy of video-based learning on a par with traditional lecture-based learning, and that video-based learning reduces dependency on tutors, freeing them from lecture delivery to focus on more interactive forms of learning. In a similar context, Nazari et al. (2020) found that video-based learning reduced cognitive load requirements, resulting in fewer errors and better overall performance in post-learning assessments.
The educational psychologist Robin Kay uses the term video podcast to describe a wide range of video for educational use, such as recording and sharing explanation, lectures and talks, visits from guest speakers, supplementary materials for a course, PowerPoint summaries and administrative tasks (2012). In a comprehensive review of the literature, Kay highlights numerous benefits to the use of video podcasts, including improved learning performance, improved control over learning, improved convenience and accessibility to learning, improved study habits, positive affective
Both of the latter studies reported that students found video projects particularly enjoyable because they employed active learning, which they found to stimulate positive emotions and enhance motivation, and because the end product would have a use beyond the point of assessment. Thomas and Marks explain how video projects offer a "unique diffusion factor" (2014, p. 270), meaning students are much keener to share their learning with a wide range of people, sometimes long after the course has ended, than they are with written work. While students are able to share essays and other written outputs with others, the medium of video has a broader appeal and offers "a faster, easier, and more digestible medium to convey a message to a large audience" (ibid.).
Students often enjoy creating videos, and there is evidence such projects are effective at embedding subject knowledge as well as communication and employability skills. Creation is the most advanced element of the cognitive process set out in Bloom's taxonomy of learning. Created in the 1950s and updated in 2001, the taxonomy establishes a framework for planning curricula and assessment at all levels of education, and articulates expectations for progression of student learning. It maps a hierarchy of learning development, starting with remembering, and progressing through understanding, applying, analysing, evaluating and creating. In the framework, creating is the highest level of learning, defined as producing new or original work through designing, assembling, constructing, developing, formulating, authoring or investigating (Anderson and Krathwohl, 2001). Interestingly, Dale's 1946 'Cone of Experience' model also argued that the most effective form of learning involves concrete, direct and purposeful experiences.

E-learning video
E-learning video is a form of educational video which is widely recognised in practice within the online video industry and education communities, but has received little academic attention. The term e-learning video is used here to refer to short, focused videos used to support learning in easily digestible, bite-sized chunks. They are generally targeted at a specific audience, such as students of a particular level and/or discipline, and intended to support formal or informal learning on a topic. E-learning videos are generally designed to be widely accessible through publication on a public video-sharing platform, such as YouTube. Furthermore, they are designed to prompt broad engagement by encouraging discussion, comments and sharing.
E-learning videos may be produced by teachers or students as a way of sharing knowledge with others who may also be interested in the subject. This flexible conceptualisation of e-learning video production is intended to 'level the playing field' or reduce the distance between students and teachers. Through production of e-learning videos that facilitate peer learning, students can themselves become co-creators of learning, reflexively recognising their own evolving knowledge and skills, and employing these to support others. The approach, at its heart, draws on the Latin proverb docendo discimus, meaning "by teaching, we learn". Often attributed to Seneca the Younger (c. 4 BC to 65 AD), the concept of learning through teaching has been explored extensively by educational psychologists and others (see Fiorella and Mayer for a useful summary).
One of the world's largest open-access repositories of e-learning video is YouTube. In the book YouTube: Online Video and Participatory Culture, the platform's content archive is described as 'almost incomprehensively large and highly diverse' (Burgess and Green, 2018, p. 13). Overall, the platform hosts more than 5 billion videos, with 500 hours of new content uploaded, relentlessly, every hour (Omnicore, 2020). The YouTube Creator Academy reports that, every day, over a billion learning-related videos are viewed on the platform (YouTube, 2017). According to independent metrics site SocialBlade, some of YouTube's most popular e-learning channels include King of Random, wifistudy, TED Ed and a range of channels marketed to pre-school children (SocialBlade, 2020).

What drives engagement with online video?
There is much to learn about the reasons why large numbers of people engage with some online video content, but not with others, and there has been a surge in research in this area in recent years. A key factor is the content itself. The STEPPS framework proposes six contagious characteristics, namely Social Currency, Triggers, Emotions, Practical Value, Public Visibility, and Stories (Berger, 2013; see Figure 1). Drawing on a decade of marketing and psychology research, the framework maps the factors that affect user decisions about sharing media content. Some people, for example, share things that give them social currency, that make them seem intelligent or party to novel or exclusive information: quirky stories, unusual places to visit, invitations to exclusive parties and so on. People also like to share things that have practical value for others, like information on bargains or great days out, or reviews of terrible products or services. An important characteristic is emotion. Here, Berger demonstrates how people more often share things that stimulate high physiological arousal, such as anger, humour and awe, whereas things that stimulate low physiological arousal, such as sadness or mild contentment, are shared less often.

Figure 1: The STEPPS Framework
Image credit: Berger (2020)

Guo et al. (2014) analysed the professional codes used in the production of e-learning videos to discover which codes achieved the greatest student engagement, measured through two proxies: the length of time spent watching a video, and whether students attempted post-video quizzes. The team sampled 6.9 million video watching sessions across four courses on the edX MOOC platform, including courses from MIT, Harvard and UC Berkeley, and interviewed six edX staff involved in producing the courses. They found that the most engaging videos are short (less than six minutes) with a clear aim or focus. This marries with findings from a study exploring the use of video on distance learning courses offered by a Turkish university which found that dividing videos into manageable modular parts reduced learners' cognitive loads and increased engagement in the learning process (Altinpulluk et al., 2019).
Guo et al. further found that videos recorded with a more personal feel, as if the instructor is speaking directly to the viewer, are more effective than recordings of live lectures in a large hall or high-fidelity studio productions. They also found that inclusion of an instructor alongside visual learning materials improved engagement, and that students were more engaged when instructors spoke quickly with high enthusiasm.
Interestingly, the study also found that videos featuring pencasting animations were more engaging than videos featuring PowerPoint slides. Pencasts are a form of video that mimics hand-drawn illustrations on a canvas. The canvas is marked up as the video progresses, generally with an instructor narrating the animation, as if they were creating the illustration in real time. Often, pencasts have an informal, 'rough and ready' style; for example, errors in the illustration may be scrubbed out and re-drawn, rather than removed during the editing process. Text is presented in the creator's own handwriting, rather than stylised typefaces, and diagrams and other objects are often rather messy, more akin to a child's line drawing than a polished professional illustration. Guo et al. suggest that it is the smooth, continuous flow of pencasting that drives strong engagement: viewers tune in to the ongoing emergence of visual symbols that slowly build a complete picture, supporting the simultaneous audio narration.
Other features of educational videos have also been explored. Wang and Antonenko (2017) found that student recall of information was better when an instructor was present, and that instructor presence positively influenced student perceptions of learning and satisfaction and led to a lower level of self-reported mental effort. Meanwhile, Schroeder and Traxler (2017) found that, while students rated videos including a human hand guiding the animation as more humanlike and engaging, post-video test results showed that videos without this feature were more efficient at learning transfer.
Meanwhile, film theorist Wijnker and her colleagues, exploring the use of video resources in secondary science teaching in the Netherlands, found that videos that posed questions were associated with an increase in students' interest, and that highly informative videos with authoritative speakers were associated with an increase in students' self-reported conceptual knowledge gains.
Nahon and Hemsley (2013) argue that engagement is about more than the content itself, what they refer to as a 'bottom-up' strategy. They suggest that 'top-down' strategies such as global marketing campaigns, corporate gatekeeping and algorithms also play an important role in promoting certain content over others. The structure and consequences of such algorithms are tightly controlled by tech companies such as Google and Facebook, rendering them largely opaque to video creators and viewers, but YouTube claims that "thumbnails and titles act like billboards to help viewers decide to watch your videos" (YouTube, 2016), advising that well-designed thumbnails and titles can attract more views because they shape viewer expectations and help viewers understand whether the video will serve their needs or preferences.
This study takes up these points, asking: what factors drive engagement with e-learning videos?

Methods
To explore this question, the study focuses on the YouTube channel Media/Pool, run by the Department of Communication and Media in a British university. The methodological framework draws on Ivankova, Creswell and Stick's mixed methods sequential explanatory method, which entails "collecting and analyzing quantitative and then qualitative data in two consecutive phases within one study" (2006, p. 3).
International Journal of Management and Applied Research, 2020, Vol. 7, No. 3
Firstly, quantitative data was collected from the YouTube Analytics site to enable analysis of the total population of videos hosted on the Media/Pool channel. Secondly, a more detailed quantitative and qualitative textual analysis was undertaken on two videos with an identical title and topic focus. This approach thereby enabled the elimination of viewer perceptions of the title and topic as a potential cause of the significant variations in engagement between the two videos. Therefore, differences in video engagement would more likely be attributable to differences in visual representations of the concepts and theories explored in the videos, and engagement strategies.

Viral Video: The module
Viral Video is a 30-credit, year-long final year Communication and Media undergraduate module, designed and taught by the author since 2017. The module aims to provide students with the knowledge, skills and attributes to effectively employ video as a communicative tool. The module explores viral media theory and a range of popular genres including e-learning videos, digital activism, digital marketing and digital storytelling. The module offers practical support for students to develop creative and technical skills including cinematography, audio production, pencasting and other animation, post-production and publishing. In the first semester, students work on individual projects. In the second, students work in teams on projects co-designed with industry partners who act as 'live clients', such as local visitor destinations, start-up tech companies and social enterprises. Large numbers of international students take the module. Groups are mixed according to gender, nationality and skills, and as part of their preparation for group work, students explore theoretical frameworks and practical approaches to teamwork, leadership and intercultural skills.

The task
During semester one, students are asked to design and produce a 2-3 minute video, representing 30% of the overall module grade. They are able to choose from a selection of project briefs, one of which is an e-learning video that explains a communication or media theory or concept. The brief requires that e-learning videos are aimed at first year Communication and Media students, with the aim of helping them understand key concepts, theories and approaches from their field of study. This video should be in simple, easy to follow language, simplifying complex concepts, theories or approaches, with visual illustration and audio narration. Videos should demonstrate a range of features of engaging e-learning videos as discussed in class and identified in research literature.
Videos must be suitable for publication to the Media/Pool YouTube channel, to a playlist entitled Media/Geek. This is one of the most-viewed playlists on the channel, with views peaking each year around winter and spring exam seasons. Videos and associated promotional materials must comply with university social media policies in that they must not bring the institution into disrepute, must not discriminate, bully or harass etc. Students are able to set their own topic. In doing so, they are encouraged to look back over their study notes from first and second year and reflect on the concepts, theories and approaches they learned about, particularly those they may have found

Support
A series of four two-hour workshops offer students the opportunity to develop understanding of the e-learning genre, explore case studies, reflect on their own use of e-learning video, and engage in group discussion activities. Demonstrations are provided of the pencasting process and opportunities and equipment are provided for students to try out the process in small groups.
For the pencasting element, students are able to borrow low-cost graphics tablets and pens from the University, such as the Huion H610 Graphics Drawing Pen Tablet, which offer a smoother and more sensitive alternative to a handheld or touchpad mouse. Illustration software such as Autodesk SketchBook (currently free to educational institutions and students) is used to create the illustrations, and the process of creating the illustrations is captured using screen capture software such as Corel VideoStudio, Zoom or Camtasia, as shown in Figure 2. The resulting video files are imported into a video editing application such as Blackmagic Design's DaVinci Resolve (currently free to all), where other materials can be combined with the pencasting content, such as voiceovers, backing music tracks, vlog-style introductions and links, and title/end clips. Elsewhere in the module, students explore a range of other social video genres as well as the wider video production process, including pre-production planning involving the production of treatments, scripts, storyboards, shot lists, location plans, safety and consent procedures; production techniques including camera operation, cinematography and audio recording; and post-production techniques including editing, graphics, animation, colour grading, rendering and publishing; and post-

Figure 3: A screenshot from the Media/Pool video "What is Footing? Introducing Goffman's Participation Framework"
Image credit: Media/Pool (2017b); Permission for use granted by YouTube channel owner.

The channel
The

Discussion: The videos
The two videos selected for analysis are both entitled Semiotics for Beginners. Semiotics is the study of the sign process: the ways in which signs such as written and illustrative symbols, activities and behaviours represent certain meanings, which are interpreted by the observer in certain ways according to their cultural knowledge and experience. The videos were both published during the semester one assessment period of the 2017/18 academic year. Video 1 was published on 22 December 2017. At time of writing, it has attracted 1,273 views, ranking it 19th of the total 397 videos for view count, and has driven three subscriptions to the channel. Video 2 was published slightly later, on 23 January 2018. It is the most watched video on the channel, with 14,310 views, ranking 1st out of 397 videos on the channel, and driving 95 of the channel's 506 subscriptions, almost a fifth of the total and more than thirty times as many as Video 1. This section will explore the potential reasons for these variations, through discussion of the visual content and the engagement strategies employed.
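The proportions quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python using only the figures reported in this section; the variable names are illustrative and do not correspond to any YouTube Analytics export format.

```python
# Figures reported in the text for the two "Semiotics for Beginners" videos.
# Variable names are illustrative only.
video_1 = {"views": 1_273, "subscriptions": 3}
video_2 = {"views": 14_310, "subscriptions": 95}
channel_subscriptions = 506

# Share of all channel subscriptions driven by Video 2.
subscription_share = video_2["subscriptions"] / channel_subscriptions

# How many times as many subscriptions and views Video 2 attracted as Video 1.
subscription_ratio = video_2["subscriptions"] / video_1["subscriptions"]
view_ratio = video_2["views"] / video_1["views"]

print(f"Video 2 share of channel subscriptions: {subscription_share:.1%}")
print(f"Subscriptions, Video 2 vs Video 1: {subscription_ratio:.1f}x")
print(f"Views, Video 2 vs Video 1: {view_ratio:.1f}x")
```

On these figures, Video 2 accounts for roughly 19% of channel subscriptions, which is consistent with "almost a fifth", while the gap in views (around eleven-fold) is smaller than the gap in subscriptions (around thirty-two-fold).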
The videos share many similarities. They both employ many features of effective e-learning videos. They are short, both around three minutes long. Both provide a brief overview of semiotic theory. Both feature relatively fast-paced speech and enthusiastic presenters, and use pencasting as well as instructor presence.
Both videos begin with the channel's ident, a 3-second video clip of an iconic location within the local area, literally setting the scene and contextualising the videos for the viewer. Both videos then cut to title screens, a simple white typeface on a black background, first the playlist title Media/Geek, then their specific video titles, and the names of the creators. Around 13-15 seconds in, we meet the presenters, both dressed casually in typical student attire. Both begin, "Hello, and welcome to Media/Pool." At first glance both videos appear similar in content and approach, with both using simple examples and drawings to illustrate key concepts in semiotic theory, such as showing how the word 'cat' is an arbitrary sign which signifies a small furry feline, how the colours of traffic lights are widely understood to mean 'stop', 'get ready' and 'go', and how observing smoke might prompt you to look out for a nearby fire. However, closer analysis reveals some important differences.
In Video 1, the presenter is positioned against a blank background, her face well-lit with shadows encroaching around the edge of the screen. After a brief overview of the topic, it moves into a short discussion of two key theorists in the field, Swiss linguist Ferdinand de Saussure and American philosopher Charles Peirce, showing their Victorian portraits alongside the presenter. Around 40", Video 1 moves into pencasting, with a 'Semiotics' title pre-written along the top of the screen. Two columns are created, with a signifier of a set of traffic lights drawn and coloured on the left, and the signified meaning of each colour on the right (red = to stop, amber = to wait, green = to go).
Around 1'14" the two columns are 'rubbed out' and replaced with the word 'cat' and a small drawing of a cat, and at 1'20" the pencast is given a new subtitle 'Peirces triadic model' and the three concepts of iconic, symbolic and indexical signs are illustrated below by a road sign, traffic lights and smoke and fire, respectively. At 1'55" the pencasting ends, and for the final minute, we go back to the presenter, who introduces the concept of codes, describing how students might apply different speech codes when speaking to a lecturer and a baby, and illustrating different dress codes through photographs of people wearing office wear and sportswear. Finally, the presenter sums up the content and ends the video with thanks to the viewer and calls to action to like, comment or subscribe to the channel.
Video 2 also begins with a brief overview of semiotic theory, its importance and a brief reference to the two theorists highlighted in Video 1. The presenter is positioned in front of shelves of library books, accentuating the academic nature of the video, with electronic lights from above providing a natural look and feel. At around 58", the pencasting begins with the word 'Saussure' being written along the top of the screen, a column on the left titled 'Signifier' and a column on the right titled 'Signified'. The voiceover explains the relationship between these two concepts, as the words 'caution hot' are written on the left, and a fire is drawn on the right. Around 20" are then devoted to explaining the concept of arbitrariness and how Saussure as a linguist focused on words, which have no visual relationship with the concept they describe. Around 1'39" the word and illustration relating to cat are 'rubbed out' and replaced with another example based around traffic lights.
The video then goes on to create a new pencast exploring Peirce's definitions of iconic, symbolic and indexical signs. The video draws a connection between the work of Saussure and Peirce, showing how linguistics, or words, are symbolic signs, because they are arbitrary, thereby reinforcing earlier discussion of this term. There then follows a new pencast explaining iconic signs through an illustration of a traffic sign, and another explaining indexical signs through illustrations of fire and smoke. At this point the pencasts end, and attention shifts back to the presenter, who sums up the content and ends the video with thanks to the viewer and calls to action to like, comment or subscribe to the channel.
Video 2 therefore offers a significantly greater range of pencasting, firstly 're-using' the first pencast three times (twice 'rubbing out' an example and providing alternatives) plus a further four new pencasts, whereas Video 1 re-uses one pencast three times, and then offers further explanation of new concepts using photographs. This suggests that viewers prefer a greater range of pencasting content over other forms of visual representation such as photographs.
That post was liked seven times and shared once. Figure 6 suggests that this small number of shares provided a significant push to the sharing of these videos during their early days, particularly for Video 1. However, Video 2 has achieved a much greater number of views over a much longer timeframe. Why might this have been the case?
Video title and thumbnail (the small image used to represent the video in search results and video suggestions) play an important role in viewer decisions about whether to watch. In this analysis, since both videos have the same title, it can be assumed that the thumbnail is playing some role in driving the different engagement rates. The thumbnail for Video 1 is a screenshot of the presenter, with the video title overlaid in green. The thumbnail for Video 2 is a screenshot of pencasting content, showing the words Saussure, signifier and signified.
The number of times a thumbnail is offered to a viewer is known as an impression. As illustrated in Figure 7, data show that Video 1 has been offered as an impression 10,846 times, whereas Video 2 has been offered as an impression 89,693 times, more than eight times as often as Video 1. Furthermore, analysis shows that click-through rates of the two videos vary significantly. Click-through rates refer to the rate at which viewers who are offered a thumbnail of a video click on it, when presented with choices of videos to watch next, at the end of each video. Higher click-through rates indicate more popular videos. The click-through rate for Video 1 is 4.0%, whereas for Video 2 it is 9.6%. It therefore appears that viewers are clicking on the Video 2 thumbnail at a higher rate because they are more attracted to the pencasting format than the presenter-led format. YouTube's algorithms appear to work on an iterative basis whereby the more a video is clicked on, viewed and engaged with, the more YouTube chooses to promote it through future impressions.
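To illustrate how impressions and click-through rate compound, the sketch below multiplies the two reported figures for each video. This assumes that clicks can be approximated as impressions multiplied by click-through rate; the resulting click counts are rough estimates derived for illustration, not figures reported by YouTube Analytics or by this study, and the function name is hypothetical.

```python
def estimated_clicks(impressions: int, click_through_rate: float) -> float:
    """Approximate clicks as impressions x click-through rate (an assumption)."""
    return impressions * click_through_rate


# Impressions and click-through rates as reported in the text.
video_1_impressions, video_1_ctr = 10_846, 0.040
video_2_impressions, video_2_ctr = 89_693, 0.096

video_1_clicks = estimated_clicks(video_1_impressions, video_1_ctr)
video_2_clicks = estimated_clicks(video_2_impressions, video_2_ctr)

# Relative advantage of Video 2 on each measure.
impression_ratio = video_2_impressions / video_1_impressions
ctr_ratio = video_2_ctr / video_1_ctr
click_ratio = video_2_clicks / video_1_clicks

print(f"Estimated clicks, Video 1: {video_1_clicks:.0f}")
print(f"Estimated clicks, Video 2: {video_2_clicks:.0f}")
print(f"Impressions ratio: {impression_ratio:.2f}x")
print(f"Click-through rate ratio: {ctr_ratio:.1f}x")
print(f"Estimated clicks ratio: {click_ratio:.1f}x")
```

Under this assumption the two advantages multiply: roughly eight times the impressions combined with 2.4 times the click-through rate gives an estimated twenty-fold difference in clicks from thumbnails, which is in keeping with the iterative promotion effect described above.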
The quality of engagement with the two videos also appears to differ. Viewer responses to the two videos can be seen in the like/dislike icons and comments posted under each video. Video 1 has been 'liked' by 33
"i'm read this for more than 10 times ,but didn't understand, but this video teach me in just three minutes ...thanks gemma.."
As can be seen, for Video 2, comments frequently disclose viewers' prior frustration at the topic and the way it has been taught during their university or college classes. Viewers say that they have 'struggled' with the concepts, but now, having watched the video, feel relieved and thankful for the simple and straightforward explanation. The nature of these comments suggests that the video offers practical value, as does Video 1, but that Video 2 also stimulates an emotional response.
A further engagement strategy is tagging.

Conclusion
From this account of innovative practice at one UK university, and associated analysis of student e-learning videos published on YouTube, it can be concluded that decisions about visual representation, together with broader engagement strategies, play an important role in determining long-term engagement. The following recommendations are tentatively offered for e-learning video creators interested in improving engagement:
1. Create e-learning videos that contain a high proportion of visual representation, focusing on smooth, continuous flow approaches such as pencasting rather than static images such as photographs.
2. Provide practical value through clear and simple explanation, providing visual representations that closely align with and support audio narration.
3. Consider viewer emotional responses to the video, employ a supportive and approachable presentational style and aim to solve viewer problems and frustrations.
4. Create thumbnails that articulate the visual representation approach employed.
5. Employ an extensive range of tags to improve long-term performance in search results.
Furthermore, educators are encouraged to consider incorporating e-learning video design, production and publishing into their assessment repertoire, encouraging students to experiment with different approaches to visual representation and engagement, and helping them to develop communication skills through the use of video as a communicative tool.