VisualBox

An OpenGL Music Visualiser

Alexander Conrad Stevens 41719882

Visualization, Computer Graphics & Data Analysis


Computer Graphics Project

April 2012.

Contents
1 Project Overview
  1.1 Introduction
  1.2 Aim of Project

2 The VisualBox Platform
  2.1 The Development Environment
  2.2 Sampling and Playing the Sound
  2.3 The Graphics Library
  2.4 Miscellaneous Python Libraries

3 Design of Visualisation
  3.1 Inspiration
  3.2 Design Direction

4 Implementation
  4.1 GStreamer: Playing and Decoding
  4.2 NumPy and the Fast Fourier Transform
  4.3 OpenGL, GLU and GLUT

5 Results and Conclusions
  5.1 Result
  5.2 What Could Be Improved?
  5.3 Conclusions

Appendices
A Program listings

Chapter 1 Project Overview


1.1 Introduction

Stimulation through sound is such a large industry in today's culture that there are many ways to satisfy the desire for audible stimulation: playing an instrument, playing music through a device, or attending live concerts. These certainly stimulate the listener's auditory senses, but they do not generally stimulate the visual senses. This is the gap that VisualBox, and music visualisers in general, aim to fill.

1.2 Aim of Project

The general idea of a music visualiser is to take properties of the music input (frequency, level, tempo, etc.) and convert them into a visual representation of the sound. This can be done in 2D or 3D; the focus of this project is 3D visualisation. The aim of the project is therefore to develop a 3D visualisation that takes at least frequency and level samples from the provided music and converts them into an aesthetically pleasing and entertaining 3D animation. The 3D world does not need to be interactive, as the music being played through the visualiser can be considered the interactive medium. The visual representation does, however, need to show the viewer clearly that the manipulation of the world is driven by the music.

Chapter 2 The VisualBox Platform


2.1 The Development Environment

For simplicity, and because of prior knowledge and experience with development on the Ubuntu operating system, it was decided that the beta of Ubuntu 12.04 would be used to develop VisualBox. The Ubuntu platform is quick and provides many libraries at the developer's disposal. Python will be the language of choice for developing the application: it is quick for prototyping simple (and complicated) operations on the fly, and it can provide good insight into what is happening within the program.

2.2 Sampling and Playing the Sound

Since VisualBox is being developed on Ubuntu, it made sense to use a multimedia framework that is installed by default and has a highly customisable pipeline. The obvious choice from a developer's standpoint is the GStreamer multimedia framework. It allows the modification of existing elements to play any audio file that GStreamer supports, as well as the ability to modify the decoder, muxer, sinks, pads, and many other aspects of the pipeline. This makes it possible to acquire decoded sound samples and play them at the same time.

2.3 The Graphics Library

Once again, since Ubuntu is the operating system of choice for VisualBox, libraries that are compatible with the system are needed. This automatically rules out Direct3D, since it is a Windows/Xbox exclusive technology. That leaves the Simple DirectMedia Layer (SDL) and OpenGL. SDL is an attempt to standardise input and output across operating systems and platforms, including audio, keyboard and mouse input; however, since the visualiser does not need keyboard or mouse input and already has a medium through which to play music, SDL is redundant here. This leaves OpenGL, along with its GLU and GLUT libraries, to develop upon. It also enables a more in-depth learning experience of how OpenGL functions, rather than working through a higher level of abstraction like SDL.

2.4 Miscellaneous Python Libraries

Libraries like SciPy/NumPy (algorithms and mathematics) and the threading library will be used to supplement the GStreamer (python-gst) and OpenGL (python-opengl) bindings. Use of these libraries is largely self-explanatory, except where noted.

Chapter 3 Design of Visualisation


3.1 Inspiration

Creating an entertaining and aesthetically pleasing visualisation can be relatively difficult. People have different opinions on what is pleasing, so a topic that appeals to most people had to be chosen. Outer space is a mysterious place, and often unpredictable; in many cultures people have a fascination with the void beyond their own little planet. From a designer's standpoint, the sun and the stars and their mysterious secrets can therefore be considered an interesting and appealing direction for a visualisation. That is why a star and black hole binary system was chosen to appeal to a large audience.

Figure 3.1: Example of Black Hole and Star Binary System

3.2 Design Direction

As can be seen in Figure 3.1, there are solar flares, solar activity on the surface of the star, and the gravitational effect of the black hole slowly devouring the larger star. In real time, one could imagine that the process is not quite so dynamic at a macro level; the activity on the surface of the star, however, is quite dynamic and can be considered a micro-level activity. If that micro-level activity were represented at a macro scale, a visualiser could effectively show dynamic properties of the music on the surface of the star. This could take the form of many particles or shapes skipping about and moving around the star according to the beat of the music.

VisualBox, though, assigns a frequency to each particle and has the level of that frequency dictate a miniature solar flare. This gives the effect of an exploding star for music with infrequent, heavy beats, and so satisfies the requirement that a viewer can tell the visualisation is driven by the music and not just a pre-set loop. Since each particle (or solar flare) has its own frequency, the jumping particles vary across the star. When a solar flare jumps high off the star and close to the black hole, one would expect the black hole to trap the flare in a spinning orbit until its impending doom. In VisualBox this is kept simple: if a particle strays too close to the black hole, it is trapped in orbit. Once in orbit, the particle retains its assigned frequency, but instead of jumping (or flaring), it speeds up and slows down according to the sound level. After a randomly assigned time the particle decays and is reinitialised on the surface of the star. Any particles in an idle state simply roam across the star's surface. A sketch of this particle life cycle is given below.

To populate the black void of space, simple stars are added; they have no purpose other than to add depth and a sense of vastness. To show the depth of the 3D scene to the viewer, the camera also rotates about the scene, using the centre point of the star as a reference.
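The life cycle just described can be summarised as a small per-particle state machine. The sketch below is purely illustrative: the class name, fields and thresholds are assumptions and do not correspond to identifiers in the VisualBox source.

import random

# Hypothetical summary of the particle life cycle described above.
class DesignParticle:
    def __init__(self, freq_index, capture_radius):
        self.freq_index = freq_index           # frequency bin held for the whole visualisation
        self.capture_radius = capture_radius   # half the Sun-to-Black-Hole distance
        self.trapped = False
        self.orbit_speed = 0.0
        self.lifetime = random.uniform(5.0, 15.0)

    def update(self, level, dist_to_black_hole, dt):
        if not self.trapped:
            flare_height = level               # level of the assigned frequency drives the flare
            if dist_to_black_hole - flare_height < self.capture_radius:
                self.trapped = True            # jumped too close: captured by the Black Hole
        else:
            self.orbit_speed = level           # sound level speeds the orbit up and slows it down
            self.lifetime -= dt
            if self.lifetime <= 0.0:
                self.trapped = False           # decay: re-emit on the surface of the star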

Chapter 4 Implementation
4.1 GStreamer: Playing and Decoding

GStreamer is a pipeline-based multimedia framework: a developer can take a stream input (file, device, etc.) and pass the media data down the pipeline to be modified, decoded, played, saved to a file, or simply discarded. This basic set of elements and plug-ins can be rearranged to suit the developer's interests.

Figure 4.1: Example of a simple audio pipeline in GStreamer

GStreamer also has a very well implemented element called Playbin2, which uses the GStreamer codecs already installed on the viewer's system to play any video or audio file with little to no effort from the developer. That satisfies the requirement that VisualBox play the music alongside the visualisation, but it does not link the decoded music to the visualisation. This is where the pipeline framework becomes extremely useful. The developer only has to make a separate bin/pipeline, intercept the original Playbin2 output (at the audio-playing sink, or ALSA element), and redirect that pipeline to a custom audio-playing sink and a decoded-output sink, similar to the layout shown in Figure 4.2.

Figure 4.2: Layout of the modified audio pipeline in GStreamer

The audio sink will play the music, while the decoded sink signals that a buffer can be pulled, allowing the decoded Pulse Code Modulated (PCM) audio data to be saved. This PCM data is channel interleaved (left-channel sample, right-channel sample, left, right, and so on) and, as specified in the GStreamer initialisation, has 16 bits of depth, as can be seen in Figure 4.3. The data is saved as an array of 16-bit integers. For parallelisation, the GStreamer component of VisualBox is run in a thread alongside the OpenGL component. A sketch of this pipeline arrangement using the Python GStreamer bindings is given below.

Figure 4.3: Example of a Decoded PCM Audio Data Sample
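The following is a minimal sketch of the tap described above, using the GStreamer 0.10 Python bindings named in the appendix. The element layout (tee, queues, autoaudiosink, appsink), the bin name and the file URI are illustrative assumptions rather than a copy of the VisualBox source.

import pygst
pygst.require("0.10")
import gst
import gobject

gobject.threads_init()

def on_new_buffer(sink):
    buf = sink.emit("pull-buffer")
    # buf.data holds interleaved 16-bit PCM (L, R, L, R, ...) ready for the FFT step

# Custom audio sink: a tee splits the decoded stream between the real
# audio output and an appsink that hands the raw samples back to Python.
audio_tap = gst.Bin("audio-tap")
tee = gst.element_factory_make("tee")
play_queue = gst.element_factory_make("queue")
audio_out = gst.element_factory_make("autoaudiosink")
data_queue = gst.element_factory_make("queue")
appsink = gst.element_factory_make("appsink")
appsink.set_property("emit-signals", True)
appsink.set_property("caps", gst.Caps("audio/x-raw-int,width=16,depth=16"))
appsink.connect("new-buffer", on_new_buffer)

audio_tap.add(tee, play_queue, audio_out, data_queue, appsink)
gst.element_link_many(tee, play_queue, audio_out)
gst.element_link_many(tee, data_queue, appsink)
audio_tap.add_pad(gst.GhostPad("sink", tee.get_pad("sink")))

# Playbin2 decodes the file; its audio-sink property is replaced with the bin above.
player = gst.element_factory_make("playbin2", "player")
player.set_property("audio-sink", audio_tap)
player.set_property("uri", "file:///path/to/song.mp3")   # placeholder path
player.set_state(gst.STATE_PLAYING)

gobject.MainLoop().run()   # signals only fire while a main loop is running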

4.2 NumPy and the Fast Fourier Transform

Now that the decoded data has been acquired, the PCM format needs to be converted into something workable; preferably, VisualBox needs a set of frequencies and their corresponding levels. This is where the NumPy library comes into play.

NumPy provides an algorithm for the Fast Fourier Transform (FFT), an efficient implementation of the Discrete Fourier Transform. The FFT essentially converts a signal into its frequency and level components. In the case of VisualBox, though, it does not need to know frequencies in Hertz or levels in decibels; it just needs to know the dominating regions of the music and visualise them.

Figure 4.4: Real example of the mirrored FFT output from VisualBox

On each iteration of the OpenGL loop, VisualBox checks the sampled data, deinterleaves the left and right channels, applies the FFT to this data, and saves the output to two arrays of floats containing the levels in order of ascending frequency. From this, the OpenGL visualiser can use the raw data to compute particle effects and displacements according to ascending frequency. The only drawback of this method is that not all samples are used: with a 44100 Hz sample rate divided by a frame rate of around 20 frames per second, but a buffer size of only 1152 16-bit samples, roughly half of the buffered data is captured, computed and utilised. However, waiting on the OpenGL loop is less CPU intensive, since FFTs and deinterleaving are only computed when OpenGL can actually display the next frame. A sketch of the deinterleave-and-transform step is shown below.
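The step can be sketched as follows; the function name is illustrative, and `raw` is assumed to be the byte string pulled from the GStreamer buffer.

import numpy as np

def spectrum_from_buffer(raw):
    """Convert one interleaved 16-bit PCM buffer into per-channel level
    arrays ordered by ascending frequency (illustrative sketch)."""
    samples = np.frombuffer(raw, dtype=np.int16)
    lftch, rhtch = samples[0::2], samples[1::2]   # deinterleave left/right channels
    # Magnitude of the FFT; only the first half is kept because the output
    # of a real-input FFT is mirrored about the Nyquist bin (see Figure 4.4).
    left = np.abs(np.fft.fft(lftch))[: len(lftch) // 2]
    right = np.abs(np.fft.fft(rhtch))[: len(rhtch) // 2]
    return left * 1e-5, right * 1e-5              # scale down by five orders of magnitude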

4.3 OpenGL, GLU and GLUT

The OpenGL component of VisualBox (the class OpenGL_Main) is where all of the graphical components are brought together and visualised. The OpenGL class is initialised and started from a thread using the start() function. This initialises GLUT with its display modes, display window, window-resizing function, draw function and key-press event handlers, and then moves on to the OpenGL initialisation component of VisualBox.

Within the initialisation function (InitGL()), the depth test is set to accept any fragment with a depth less than the stored depth (GL_LESS), and polygons are configured so that back faces (GL_BACK) are drawn as edge lines only (GL_LINE) to speed up rendering. GL_DEPTH_TEST is enabled so that objects are displayed in depth order, and the shade model is set to GL_SMOOTH to keep a smoother shade over objects. As a final touch, OpenGL is hinted to use the cleanest rendering technique rather than the most efficient one (glHint with GL_NICEST), as VisualBox aims to provide an aesthetically pleasing experience. The state described here is sketched below.
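A minimal PyOpenGL sketch of that state follows; it is based on the description above rather than on the VisualBox source. The clear colour and the choice of GL_PERSPECTIVE_CORRECTION_HINT as the hinted target are assumptions, since the report only names glHint and GL_NICEST.

from OpenGL.GL import *

def init_gl():
    """Sketch of the OpenGL state described in the text."""
    glClearColor(0.0, 0.0, 0.0, 1.0)                    # black void of space (assumed)
    glEnable(GL_DEPTH_TEST)                             # display objects in depth order
    glDepthFunc(GL_LESS)                                # keep fragments nearer than the stored depth
    glPolygonMode(GL_BACK, GL_LINE)                     # back faces drawn as edge lines only
    glShadeModel(GL_SMOOTH)                             # smoother shading over objects
    glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_NICEST)   # favour quality over speed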

Figure 4.5: In order: Solar Texture, Star Sprite Texture, Particle Sprite Texture Next, the starmap is initialised into memory as static vertices that are randomly spawned over a large area of the scene. Then the particles that are spawned over the sun are initialised with random lifetimes, random inclinations, and random colours (tending to the red/orange spectrum) using the SphericalEmit() function. These particles are also assigned a unique frequency that theyll hold for the lifetime of the visualisation. Once this is complete, the textures can nally be initialised using InitTexturing(); where the solar texture is the only clamped texture, but all are loaded with alpha channels. These textures can be seen in Figure 4.5. Finally, the OpenGL draw sequence can start.
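A possible reading of SphericalEmit() is sketched below: it spawns a particle at a random point on the star with a reddish colour and a random lifetime. The field names, colour range and lifetime bounds are assumptions, not values from the VisualBox source.

import math
import random

def spherical_emit(radius):
    """Spawn a particle at a random point on a sphere of the given radius."""
    inclination = random.uniform(0.0, math.pi)        # angle from the pole
    azimuth = random.uniform(0.0, 2.0 * math.pi)      # angle around the star
    position = (radius * math.sin(inclination) * math.cos(azimuth),
                radius * math.sin(inclination) * math.sin(azimuth),
                radius * math.cos(inclination))
    colour = (1.0, random.uniform(0.2, 0.6), 0.0)     # tend towards red/orange
    lifetime = random.uniform(2.0, 10.0)              # time before a trapped particle decays
    return {"pos": position, "colour": colour, "life": lifetime,
            "incl": inclination, "azim": azimuth}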

The main draw function starts with a buffer clear and then translates everything 3 units into the scene. This effectively positions the camera, ready to rotate the whole scene by half a degree every frame. Alpha testing is then enabled, with a fragment passing only if its alpha is greater than 0.1. The blend function is set up in the layout recommended for transparency; this is mainly for the background stars and the particles as they pass over each other and over solid objects.

Once these preliminary configurations have been set up, the ARB point sprites can be loaded. These point sprites are essentially vertices in space that are hardware rendered with a single texture instead of a pixel or quad. A quadratic is set up to dictate how distance should affect the size of a star or particle; the quadratic is fairly arbitrary in this case and was chosen to give the best-looking effect for distant stars. The maximum point size is then specified and the actual point size set to that maximum. The only times the sprites should fade out of the scene are when they are too small to be perceived (when their size is less than 3.0, via GL_POINT_FADE_THRESHOLD_SIZE_ARB) or when the sprite is no longer on the screen. With the point sprites set up, the stars can finally be rendered into the scene: the star sprite texture is bound to the point sprites and the pre-calculated star-map vertices are sent inside a glBegin(GL_POINTS) block. The vertices for the stars are kept static throughout the whole visualisation. A sketch of this point-sprite configuration is given below.

The particles orbiting the Black Hole and the Sun, however, are dynamic. Generation of the dynamic particles starts only when GStreamer has signalled that a buffer is ready to be pulled and has been stored. Once this has happened, VisualBox deinterleaves the stereo channels that make up the single decoded audio buffer into two arrays of data called lftch and rhtch. The data is then passed through a Fast Fourier Transform and scaled down by five orders of magnitude. The particle texture is bound to the following point sprites, and the point sprite size is set to 2. When the point sprites for the particles are sent to the buffer, every even particle uses the left channel's frequencies and every odd particle uses the right channel's; this provides balance and beat for the various streams within the music. Each particle is then given a nudge (through randomNudge()) according to the frequency assigned to it at initialisation. This nudge simulates a solar flare, expelling the particle by a displacement dictated by the level of its associated frequency. If the particle does not enter the capture radius of the Black Hole (half the distance between the Black Hole and the Sun), it returns to an orbit position on the surface of the Sun and continues to randomly shift and roam along the surface.
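The point-sprite state described above can be sketched as follows with PyOpenGL. The attenuation coefficients, the maximum point size and the helper names are placeholders; only the fade threshold of 3.0 comes from the text.

from OpenGL.GL import *
from OpenGL.GL.ARB.point_sprite import *        # GL_POINT_SPRITE_ARB, GL_COORD_REPLACE_ARB
from OpenGL.GL.ARB.point_parameters import *    # glPointParameter*ARB and related constants

def setup_point_sprites(max_size=64.0):
    """Configure ARB point sprites with distance attenuation and a fade threshold."""
    glEnable(GL_POINT_SPRITE_ARB)
    glTexEnvf(GL_POINT_SPRITE_ARB, GL_COORD_REPLACE_ARB, GL_TRUE)
    # Quadratic attenuation: sprite size falls off with distance from the camera.
    glPointParameterfvARB(GL_POINT_DISTANCE_ATTENUATION_ARB, (1.0, 0.0, 0.01))
    glPointParameterfARB(GL_POINT_SIZE_MAX_ARB, max_size)
    glPointParameterfARB(GL_POINT_FADE_THRESHOLD_SIZE_ARB, 3.0)   # fade when smaller than 3.0
    glPointSize(max_size)

def draw_stars(star_vertices, star_texture):
    """Draw the static star map as textured point sprites (immediate mode, as in the report)."""
    glBindTexture(GL_TEXTURE_2D, star_texture)
    glBegin(GL_POINTS)
    for x, y, z in star_vertices:
        glVertex3f(x, y, z)
    glEnd()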

However, if the particle does get trapped in the grip of the Black Hole, it assumes an orbit around the Black Hole, with a speed dictated once again by the sound level of the particle's associated frequency. Once trapped, the particle starts a countdown; when the count reaches its designated lifetime, the particle is removed from the Black Hole and re-designated upon the Sun using SphericalEmit() again.

Finally, the Sun, which is a simple gluSphere, is textured with the solar texture using texture coordinates generated by gluQuadricTexture, and displayed at the [0, 0, 0] coordinate; it is considered a static object for the entirety of the visualisation. The Black Hole, however, has no texture (since it is supposed to be black) and randomly orbits the Sun at a randomly incremented radius and rotation. The position to which the Black Hole moves next is calculated by the blackHoleNudge() function. All of the calculations in VisualBox use a polar coordinate system that is converted to Cartesian coordinates to calculate the vertex positions around the Sun and the Black Hole; the Black Hole uses this same system to orbit the Sun. A sketch of such a nudge is given below.
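The following is a hypothetical reading of blackHoleNudge(), based only on the description above: the orbital radius and rotation are randomly incremented in polar form and then converted back to Cartesian coordinates. The increment ranges are placeholders and the orbit is kept in a single plane for brevity.

import math
import random

def black_hole_nudge(radius, azimuth):
    """Drift the Black Hole around the Sun by randomly nudging its polar position."""
    radius += random.uniform(-0.02, 0.02)          # small random change in orbital radius
    azimuth += random.uniform(0.0, 0.05)           # always creep forward around the Sun
    x = radius * math.cos(azimuth)
    z = radius * math.sin(azimuth)
    return radius, azimuth, (x, 0.0, z)            # polar position converted to Cartesian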

Chapter 5 Results and Conclusions


5.1 Result

VisualBox provides an entertaining experience while listening to music. It successfully utilises the frequency and corresponding level to calculate motion that can be clearly seen by the viewer. However, due to the method of sending the point sprites to the buffer, the performance of VisualBox was not as high as expected on lower-end graphics hardware; the program was developed on an Intel Core i5-2557M with HD 3000 graphics. This limited the number of point sprites on screen to about 1,200 at 20 frames per second, instead of a potential 12,000.

Figure 5.1: Close-up of the stars rendered in VisualBox

Python was also considered a limiting factor, since it is an interpreted language and the amount of data passed around VisualBox is significant. Removing one of the audio channels from the deinterleaving and FFT process actually increased the frame rate of VisualBox. It is expected that doing this much array manipulation in a language like C would be far faster.

Figure 5.2: Close-up of the Sun and Black Hole with particles

As can be seen in Figure 5.2, the addition of particles randomly orbiting the Sun and the Black Hole proved to be a worthwhile inclusion. The particles help to obscure the stretching of the solar texture (visible in the centre of the Sun if observed closely), and the particles around the Black Hole show that there is actually a medium on an otherwise almost black object, since they pass around and behind it.

Finally, Figure 5.3 clearly shows that all of the elements come together in a neat and entertaining fashion. The user does not need to provide much input to the visualisation, and the camera slowly revolves about the Sun, giving a sense of depth and dimension.

Figure 5.3: A still of the whole scene from within VisualBox

5.2 What Could Be Improved?

As discussed in the previous section, Python was considered a restriction in terms of performance, so a compiled language like C would be used instead of an interpreted language like Python. OpenGL Vertex Buffer Objects (VBOs) should also have been used, instead of the slow process of looping through an array of vertices and sending them to the buffer with glBegin() and glEnd(); a sketch of this change is given below. A true gravitational physics model, rather than a simple displacement algorithm driven by the frequencies and levels, would also provide a much more dynamic and fluid experience: particles could follow real paths and use the sound level to accelerate the flow of particles towards the Black Hole.
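As an illustration of the suggested change, the sketch below uploads the star-map vertices once into a VBO and draws them with a single glDrawArrays() call. The vertex count and coordinate range are placeholders; this is a sketch of the improvement, not code from VisualBox.

import numpy as np
from OpenGL.GL import *
from OpenGL.arrays import vbo

# Upload the static star-map vertices once, instead of resending them every frame.
star_vertices = np.random.uniform(-50.0, 50.0, (1200, 3)).astype(np.float32)
star_vbo = vbo.VBO(star_vertices)

def draw_stars_vbo():
    """Draw the whole star map with one call from the vertex buffer object."""
    star_vbo.bind()
    try:
        glEnableClientState(GL_VERTEX_ARRAY)
        glVertexPointer(3, GL_FLOAT, 0, star_vbo)
        glDrawArrays(GL_POINTS, 0, len(star_vertices))
    finally:
        glDisableClientState(GL_VERTEX_ARRAY)
        star_vbo.unbind()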

Another improvement would be the addition of a beat-detection algorithm to find the beats per minute of a song. Which algorithm is best to use is still a topic of debate, since different styles of music have different beat signatures; one simple energy-based approach is sketched below. With the combination of all of these improvements, a truly fluid particle motion, with possibly tens or hundreds of thousands of particles, could be implemented. This could potentially look like the inspirational Black Hole and Sun binary system shown in Figure 3.1.
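One commonly used option, and not necessarily the best as noted above, is an energy-based detector that flags a beat when the newest buffer is noticeably louder than the recent average. The sensitivity and history length below are placeholder values.

import numpy as np

def is_beat(channel, history, sensitivity=1.3, max_history=38):
    """Flag a beat when the newest buffer's energy clearly exceeds the recent average.
    `channel` is one deinterleaved array of PCM samples; `history` is a list of past
    buffer energies kept by the caller."""
    samples = channel.astype(np.float64)
    energy = float(np.mean(samples * samples))
    beat = len(history) > 0 and energy > sensitivity * np.mean(history)
    history.append(energy)
    if len(history) > max_history:       # roughly one second of 1152-sample buffers at 44.1 kHz
        history.pop(0)
    return beat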

5.3 Conclusions

Overall, VisualBox was a success, albeit with a few improvements still to be made. It is a three-dimensional visualiser that utilises the frequency and level of sound samples to display an aesthetic and entertaining visualisation synchronised with the beat of the music. The code can now act as a base for developing even more complicated visualisations: ports can be made quite simply to other programming languages, and a plug-in could be created for various media players. VisualBox can be considered a prototype and a learning experience for other programmers looking into the world of GStreamer and OpenGL.

Appendix A Program listings


Currently, VisualBox uses the Bazaar revision control system and stores its code on Launchpad. Install Bazaar, Python GStreamer, Python OpenGL, and SciPy/NumPy using:

    sudo apt-get install bzr python-gst0.10 python-opengl python-numpy

Then get the source code using:

    bzr clone lp:alex-stevens/+junk/VisualBox

The code can also be viewed online at:

http://code.launchpad.net/alex-stevens/+junk/VisualBox

The revision referenced in this version of the document is revision 36.
