Gif-making: an exercise in digital media archaeology
The gif is ubiquitous. And it’s a mess. A relic of dial-up days, it was one of the first image formats on the web. It has become an essential part of the language of online communication and a form of art in its own right; it’s even bleeding into other forms of media (see PBS Idea Channel’s discussion of the gif aesthetic in music videos).
Let me ask the naive question: what is a gif? How are gifs encoded, created, inscribed, and rendered by browsers? How can they be taken apart? How can they be made?
Kittler claims “we do not speak language, but language speaks us – and we have to participate in such systems of language, which are not of our own making. [… T]he practices of language [take] place in the switches and relays, software and hardware, protocols and circuits of which our technical systems are made”. (Parikka 70)
To study the gif, I’ve chosen to approach it as both an engineering project and a scholarly one. I want to make a tool to study gifs. Consider Latour’s comments about engineers:
Engineers constantly shift out characters in other spaces and other times, devise positions for human and nonhuman users, break down competences that they then redistribute to many different actants, build complicated narrative programs and sub-programs that are evaluated and judged. (Johnson 309)
To build this program (or any program) is to engage in this kind of engineering. The engineer conceptualizes actors, spaces, interactions – and then invites the user to participate in that system. Specifically, this tool is an attempt at a kind of media archaeology. I’m interested in cracking video open, and rendering its inner workings manifest.
[M]edia-archaeological theories are interested in going ‘under the hood’ to investigate the material diagrammatics and technologies of how culture is being mediatically stored and transmitted. (Parikka 65)
There are a number of tools readily available for creating and viewing gifs. Some even run in the browser (Giphy), and many image- and video-editing programs can export gifs (Photoshop, GIMP). If my interest were simply in making a short, embeddable, looping video, these would serve me well.
However, I’m interested in going under the hood, and that means stepping away from platforms. It won’t do to depend on a set of tools developed outside of academia that can’t be opened up – after all, there are a number of free and open tools that encourage exactly this kind of study.
In relation to our discussion of doors, consider the lock – specifically, digital locks: DRM. How are we going to do scholarship on media if we are alienated from the tools that produce and embed that media?
Here’s a list of the tools I’m using, along with their licenses:
– Programming: Emacs, the text editor whose role in the development of free software is hard to overstate (Richard Stallman founded the Free Software movement in the wake of a dispute over who owned it). It uses the GPL-3.0 license.
– Operating system: Linux, which uses the GPL-2.0.
– Version control: Git, which also uses the GPL-2.0.
– Programming language: Lua, which uses the MIT license.
– Framework (for graphics rendering): LÖVE, which uses the zlib/libpng license.
– Video encoding/decoding: FFmpeg, which exists in a legal gray area (it can decode so many proprietary video formats).
Why the attention to licenses? For one, there is an incredible diversity among software licenses; some of the licenses above are far more permissive than others. In particular, the GPL-3.0 requires that if any GPL-licensed code is used in another program, that program must also be licensed under the GPL-3.0. It has been described as virus-like:
In the views of Richard Stallman, Mundie’s metaphor of a “virus” is wrong as software under the GPL does not “attack” or “infect” other software. Stallman believes that comparing the GPL to a virus is an extremely unfriendly thing to say, and that a better metaphor for software under the GPL would be a spider plant: If one takes a piece of it and puts it somewhere else, it grows there too. (Wikipedia)
By using free software, it becomes much easier to “inscribe the act of investigation into the critical work” (Drucker). Not only because of the way free software licenses resist DRM, but also because they encourage the kind of remixing and disassembly that’s essential to the critical project:
“When we call software ‘free,’ we mean that it respects the users’ essential freedoms: the freedom to run it, to study and change it, and to redistribute copies with or without changes. […] In a world of digital sounds, images, and words, free software becomes increasingly essential for freedom in general.” (FSF)
All of the above software is free.
With a set of tools that I felt was adequately open, these were my guiding principles:
- the practical :: design a simple interface for exploring and creating looping video
- the scholarly :: document the design methodology
So let’s begin.
Once I have a video, I can use ffprobe (a command-line tool that’s part of the FFmpeg project) to get the video’s frame count, frames per second, and resolution. The first step in extracting frames from a video is to choose the time in the video to export frames from.
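To make that concrete, here’s a minimal sketch of calling ffprobe from Lua and parsing its key=value output. The probe function and the file name source.mp4 are illustrative assumptions of mine, not the project’s actual code; the ffprobe flags themselves are standard:

```lua
-- Probe a video's stream metadata by shelling out to ffprobe.
-- The flags select the first video stream and print plain key=value pairs.
local function probe(path)
  local cmd = string.format(
    'ffprobe -v error -select_streams v:0 ' ..
    '-show_entries stream=width,height,r_frame_rate,nb_frames ' ..
    '-of default=noprint_wrappers=1 "%s"', path)
  local pipe = io.popen(cmd)
  local info = {}
  for line in pipe:lines() do
    local key, value = line:match("^([%w_]+)=(.+)$")
    if key then info[key] = value end
  end
  pipe:close()
  return info
end

local info = probe("source.mp4")  -- hypothetical file name
print(info.width, info.height, info.r_frame_rate, info.nb_frames)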
“We need to be doing user interface design” (Drucker)
To choose the time, I began with a simple slider. The indicator shows the time in seconds:
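In LÖVE, a slider like this boils down to mapping a horizontal range of pixels onto the video’s duration. A minimal sketch, with placeholder geometry and a hard-coded duration standing in for the value ffprobe reports:

```lua
local sliderX, sliderW = 20, 760  -- placeholder geometry
local duration = 120              -- seconds; the real value comes from ffprobe
local selectedTime = 0

function love.mousepressed(x, y, button)
  if button == 1 and x >= sliderX and x <= sliderX + sliderW then
    -- map the clicked pixel linearly onto the video's duration
    selectedTime = (x - sliderX) / sliderW * duration
  end
end

function love.draw()
  -- the track, the indicator, and the time in seconds
  love.graphics.line(sliderX, 300, sliderX + sliderW, 300)
  local ix = sliderX + (selectedTime / duration) * sliderW
  love.graphics.line(ix, 290, ix, 310)
  love.graphics.print(string.format("%.1fs", selectedTime), ix + 4, 312)
end
```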
After selecting a time in the video to load, the program exports ten seconds of frames into PNG files. The image data from those files is then loaded into the program, where I can display it.
This is a good example of one of the difficulties in trying to program something that uses different processes simultaneously. I needed to:
– export individual frames from the source video (external process: FFmpeg)
– import each of those frames’ image data (internal process: LÖVE)
Modern computers can run multiple sequences of instructions at once, each assigned a specific task: threads. I like the imagery of the word. It gestures towards the origins of computing in weaving (Jacquard looms). By exporting frames in one thread and importing them in another, I allowed the program to continue exporting frames while it simultaneously imported the image data.
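Here’s a hedged sketch of that arrangement using love.thread. The file layout, the frames/ directory, and the polling strategy are illustrative assumptions, not the project’s actual code. The main thread spawns a worker that blocks on ffmpeg, then watches the save directory for finished frames:

```lua
-- main.lua: spawn the export thread, then poll for frames as they appear.
love.filesystem.createDirectory("frames")
local exporter = love.thread.newThread("export.lua")
exporter:start("source.mp4", 12.0)  -- hypothetical source path and start time

local frames, nextFrame = {}, 1

function love.update(dt)
  -- ffmpeg numbers its output sequentially, so wait for the next expected
  -- file to appear (love.filesystem.exists in older LÖVE versions). A real
  -- implementation would also guard against reading a half-written file.
  local name = string.format("frames/%04d.png", nextFrame)
  if love.filesystem.getInfo(name) then
    frames[nextFrame] = love.graphics.newImage(name)
    nextFrame = nextFrame + 1
  end
end
```

```lua
-- export.lua: runs in its own thread, so the blocking ffmpeg call
-- doesn't freeze the interface while the frames are written out.
local source, start = ...
require("love.filesystem")
local dir = love.filesystem.getSaveDirectory()
os.execute(string.format(
  'ffmpeg -ss %f -t 10 -i "%s" "%s/frames/%%04d.png"', start, source, dir))
```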
With the image data accessible to the program, the next step is to navigate through those images. There were two modes of navigation I wanted to implement (a sketch in code follows the list):
– looping playback: the black marker shows the position of the playback
– mouse navigation: when the cursor is over the timeline, the frame follows the cursor
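In code, the two modes come down to a single branch in love.update: scrub when the cursor is over the timeline, otherwise advance with the clock. A sketch under assumed names (frames, fps, and the timeline geometry are placeholders):

```lua
local fps = 30                                        -- taken from ffprobe
local timelineY, timelineH, timelineW = 560, 40, 760  -- placeholder geometry
local frames = {}  -- filled by the import step sketched above
local current, timer = 1, 0

function love.update(dt)
  if #frames == 0 then return end
  local mx, my = love.mouse.getPosition()
  if my >= timelineY and my <= timelineY + timelineH then
    -- mouse navigation: the frame follows the cursor
    local f = math.floor(mx / timelineW * #frames) + 1
    current = math.max(1, math.min(#frames, f))
  else
    -- looping playback: advance with the clock, wrapping at the end
    timer = timer + dt
    if timer >= 1 / fps then
      timer = timer - 1 / fps
      current = current % #frames + 1
    end
  end
end

function love.draw()
  if frames[current] then love.graphics.draw(frames[current], 0, 0) end
end
```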
This gif illustrates both:
Now that I can navigate through the video, I need to be able to create a loop. A loop necessitates two points in time, so how can I demarcate them and the space between them? There are all kinds of conceptual problems that I ran into here. To enumerate some of them:
– What happens when the second point is moved before the first?
– How to designate which marker is selected when two are overlapping?
– Which marker is moved when the timeline is clicked?
I decided to introduce the concept of selection: when the timeline is clicked, two markers are created – one at the location that was clicked, while the other is selected and dragged with the mouse until it’s clicked again at another location, completing the loop.
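Here’s a hedged sketch of that interaction. The xToTime helper and all the names are mine, standing in for whatever the program actually calls these things:

```lua
local duration, timelineW = 120, 760  -- placeholders
local function xToTime(x)             -- timeline pixels -> seconds
  return x / timelineW * duration
end

local loopStart, loopEnd = nil, nil
local placing = false  -- true while the second marker follows the mouse

function love.mousepressed(x, y, button)
  if button ~= 1 then return end
  if not placing then
    -- first click: drop one marker and pick up the other
    loopStart = xToTime(x)
    loopEnd = loopStart
    placing = true
  else
    -- second click: fix the marker in place, completing the loop
    loopEnd = xToTime(x)
    -- if the second point lands before the first, swap them
    if loopEnd < loopStart then
      loopStart, loopEnd = loopEnd, loopStart
    end
    placing = false
  end
end

function love.mousemoved(x, y)
  if placing then loopEnd = xToTime(x) end
end
```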
I’ve illustrated it here:
This last step was difficult, particularly because creating the loop markers and designating the loop playback period required converting between three axes:
– position of the markers on the timeline (measured in pixels)
– position of the frame in the loop (converted at frames per pixel)
– position of the frame in time (converted at frames per second)
The math for converting from one axis to another is relatively simple, but I found it difficult to juggle the abstractions well enough to put them in relation to each other.
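For what it’s worth, the conversions themselves look something like this, with fps, frameCount, and timelineW standing in for values that really come from ffprobe and the interface geometry:

```lua
local fps, frameCount, timelineW = 30, 300, 760  -- placeholder values

-- timeline pixels -> frame index (frames per pixel)
local function xToFrame(x)
  local f = math.floor(x / timelineW * frameCount) + 1
  return math.max(1, math.min(frameCount, f))
end

-- frame index -> timeline pixels, for drawing the markers
local function frameToX(frame)
  return (frame - 1) / frameCount * timelineW
end

-- frame index -> seconds in the source video (frames per second)
local function frameToTime(frame)
  return (frame - 1) / fps
end
```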
“The easier something is to use, the more work to produce it” (Drucker 34:05)
I felt it keenly in developing this program. I built it around an extremely simple use case, and yet I found it impossible to reason about this object without invoking very specific notions of space and time.
If there’s some trace of a knowledge model in how I approached this project, it’s probably the following: there is an ecology of software. An ecology in the sense that Jane Bennett describes in her discussion of Spinoza’s natura naturans: “a materiality that is always in the process of reinventing itself”.
There’s no guarantee that any of the tools I used to build this project will continue to be maintained – or that they will continue to work with each other. That’s the nature of software development: it’s in constant reinvention.
There’s plenty more to be done with this project:
– make it compatible with OS X and Windows
– create an interface for selecting a source file
– extract, loop and export audio
I feel as though I’ve only just scratched the surface of this medium – and I’m aware that there’s probably not much interest in the programmatic side of the project. Nevertheless, I’ve put a link up to the program here, where it will be updated as I continue to work on it.
The hidden cost of free software is learning how to use it. Knowing how to learn new software is a privilege, which is why I feel it’s so necessary for scholars in the digital humanities to be open about their interface design methodologies.
Works Cited
Bennett, Jane. “The Force of Things: Steps toward an Ecology of Matter.” Political Theory 32.3 (2004): 347-72. Print.
Drucker, Johanna. “Digital Humanities from Speculative to Skeptical.” Concordia University, Montreal. 9 Oct. 2015. Web. <http://www.mediahistoryresearch.com/digital-humanities-from-speculative-to-skeptical/>.
“The Free Software Definition.” GNU Project – Free Software Foundation. N.p., n.d. Web. 21 Oct. 2015.
“GNU General Public License.” Wikipedia. N.p., n.d. Web. 21 Oct. 2015. <https://en.wikipedia.org/wiki/GNU_General_Public_License>.
Johnson, Jim. “Mixing Humans and Nonhumans Together: The Sociology of a Door-Closer.” Social Problems 35.3 (1988): 298-310. Print.
“Overwhelming and Collective Murder.” YouTube. YouTube, n.d. Web. 21 Oct. 2015. <https://www.youtube.com/watch?v=ze9-ARjL-ZA>.
Parikka, Jussi. What Is Media Archaeology? Cambridge, UK: Polity, 2012. Print.