Data Visualization with Perl and SVG

G. Wade Johnson

YAPC::NA 2010

SVG

How many people here are familiar with SVG?

Despite its maturity, many people really don't know much about it.

SVG

SVG is an XML-based vector graphics image format
specified by the W3C

Unlike raster graphics, all objects are drawn with high-level descriptions, not individual colored pixels. These objects keep their identities in the completed drawing, which allows effects to be attached to individual objects.

SVG Features

I could spend a long time trying to list all of the individual elements and features of SVG and still not really do them justice. We'll just hit the high points for now.

SVG Maturity

SVG has been implemented in a number of non-browser applications over the last decade. This has allowed programmers, designers, and artists to make use of the technology and help it improve. All of the major browsers except one have significant portions of the specification implemented. Microsoft has announced that IE 9 will have support for SVG.

Data Visualization

What do we mean by Data Visualization?

What do most people mean by the term?

DV: Business Graphics?

Most business people and students probably think of the standard set of presentation graphics tools when trying to decide how to present data.

DV: Scatter Plots?

Scatter plots are great for data that includes random error. Often you can get an idea of potential curves to fit from looking at the plot itself.

DV: Histograms?

Histograms are similar to bar graphs, but they normally serve to quantize discrete or continuous data into categories.
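
To make the idea of quantizing into categories concrete, here is a minimal sketch in plain Perl. The function name `histogram_counts`, the bin width of 10, and the sample values are my own illustrative choices, not part of the talk's demos.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX qw(floor);

# Count how many values fall into each bin of the given width.
# Returns a hash mapping each bin's lower bound to its count.
sub histogram_counts {
    my ($bin_width, @values) = @_;
    my %count;
    $count{ floor( $_ / $bin_width ) * $bin_width }++ for @values;
    return %count;
}

# Sample data (illustrative only), grouped into bins ten units wide.
my $width = 10;
my %bins  = histogram_counts( $width, 61, 67, 77, 83, 84, 82, 50, 44, 47, 54 );

# Print one text bar per bin, lowest bin first.
for my $low ( sort { $a <=> $b } keys %bins ) {
    printf "%3d-%-3d %s\n", $low, $low + $width - 1, '#' x $bins{$low};
}
```

Each count could just as easily become the height of a rectangle in a drawing instead of a row of `#` characters.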

DV: Radar Graphs?

Radar graphs seem to be falling out of favor, but you still see them every now and then.

DV: Sparklines?

Sparklines are word-sized graphics designed
by Edward Tufte for use within text. They can
display trends, the win/loss record of your
favorite ball team, or any other data
that would benefit from a quick summary.

Data Visualization

Different meanings to different people.

I'm sure that most of you have used one or another of these tools in the past. There are plenty of others. People like Edward Tufte and Stephen Few have categorized and studied many ways of visualizing data (good and bad) for years.

Improvement

How can SVG improve on basic charts?

As we saw in the examples, SVG can easily duplicate the functionality of any data visualization format. But it is fair to ask: what can SVG do that other formats cannot? After all, if it doesn't provide any new benefits, what's the point of using it?

Scalability

Vector graphics are inherently scalable.

Zoom in or out to almost any degree and vector graphics remain just as sharp. This is an inherent advantage of vector graphics.
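
One way to see this is through the `viewBox` attribute, which fixes the drawing's internal coordinate system independently of the rendered size. The sketch below is only an illustration: the function name `scaled_svg`, the shapes, and the optional scale argument are my own assumptions, not code from the talk.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Wrap the same drawing in an <svg> element of any rendered size.
# The viewBox keeps the internal coordinates at 200x150, so the viewer
# redraws the shapes at the new size instead of stretching pixels.
sub scaled_svg {
    my ($width, $height) = @_;
    return <<"SVG";
<svg xmlns="http://www.w3.org/2000/svg"
     width="$width" height="$height" viewBox="0 0 200 150">
  <path d="M50,20 V130 H150" fill="none" stroke="black"/>
  <circle cx="100" cy="75" r="20" fill="steelblue"/>
</svg>
SVG
}

# Optional first argument: a scale factor (defaults to 1).
my $scale = shift(@ARGV) // 1;
print scaled_svg( 200 * $scale, 150 * $scale );
```

Running it with a scale of 1 and a scale of 8 gives two files whose drawing instructions are identical; only the `width` and `height` attributes differ.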

Interactivity

What if you could interact with the data?

The definition of each object is not a matter of which pixels are which color. Objects are defined explicitly and a new representation can be generated, by the viewer, after any scaling.

Because the graphical objects remain in the representation of the image and SVG supports scripting, we have the interesting possibility of providing graphics that the user can interact with.

While this capability is often used more for games or user interface components, it can allow combining multiple instances of a single graphic into one. Given the appropriate clues, a user can then interact with the graphic to explore the data in new ways.
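
As a rough sketch of what that can look like, the Perl below writes one circle per data point and gives each circle its own tooltip and mouse handlers. The function name `interactive_points_svg`, the layout numbers, and the sample data are hypothetical, and the labels are assumed to contain no characters that need XML escaping.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Emit one circle per data point. Each circle keeps its identity in the
# rendered image, so it can carry its own tooltip and event handlers.
sub interactive_points_svg {
    my @points = @_;    # list of [ label, value ] pairs
    my $svg = qq{<svg xmlns="http://www.w3.org/2000/svg" width="220" height="120">\n};
    my $x = 20;
    for my $point (@points) {
        my ($label, $value) = @$point;
        my $y = 110 - $value;    # flip: SVG's y axis grows downward
        $svg .= qq{  <circle cx="$x" cy="$y" r="4" fill="navy"\n}
              . qq{          onmouseover="evt.target.setAttribute('r', 8)"\n}
              . qq{          onmouseout="evt.target.setAttribute('r', 4)">\n}
              . qq{    <title>$label: $value</title>\n}
              . qq{  </circle>\n};
        $x += 40;
    }
    return $svg . "</svg>\n";
}

# Made-up sample data: the point under the mouse grows, and most viewers
# show the <title> text as a tooltip.
print interactive_points_svg( [ Mon => 30 ], [ Tue => 55 ], [ Wed => 42 ] );
```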

A Line Graph...Plus

A couple of years ago, I began doing some profiling work on the speed of the script engines in various browsers and SVG viewers. When I wanted to display the result, I naturally turned to SVG as my data visualization tool of choice.

Unfortunately, none of the libraries that I found out there quite displayed the data the way I wanted. A little bit of Perl later, I had this output with a small amount of interactivity.

Data Exploration

Jeff Schiller's Web Statistics.

Jeff Schiller built a wonderfully interactive example showing the percentage of users accessing his site from each of the major browsers. By using the interactive features of SVG, Jeff was able to provide a way to explore the data, not just display it.

Map-based Data Visualization

Cartographers display data on maps.

It turns out that one group that has really embraced SVG is the cartography community. Displaying data on maps is something that SVG does quite well.

Mappetizer

Ruth Lang provided an example generated with the Mappetizer tool.

This example was posted on the SVG Developers' Mailing List as part of a discussion of interactive controls. This shows one of the directions that many cartographers have followed in displaying map-based data with SVG.

Time

With SVG we can also display changes in time.

Use scripting or SMIL for graphs that change over time. This can be used either for real-time display or as a tactic for displaying yet another changing variable on a single graphic.
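
For the SMIL route, a minimal sketch might look like the following: Perl writes a bar whose width steps through a series of values by way of an `<animate>` element. The function name `animated_bar_svg`, the sizes, and the sample values are illustrative assumptions, and not every viewer supports SMIL animation.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build a horizontal bar whose width cycles through the given values.
# The viewer runs the animation; no script is needed in the SVG.
sub animated_bar_svg {
    my ($seconds, @widths) = @_;
    my $values = join ';', @widths;    # SMIL wants a ';'-separated list
    return <<"SVG";
<svg xmlns="http://www.w3.org/2000/svg" width="140" height="40">
  <rect x="10" y="10" height="20" width="$widths[0]" fill="darkgreen">
    <animate attributeName="width" values="$values"
             dur="${seconds}s" repeatCount="indefinite"/>
  </rect>
</svg>
SVG
}

# Made-up readings, played back over six seconds and then repeated.
print animated_bar_svg( 6, 20, 80, 50, 120 );
```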

Dynamic Data Displays

The instruments demo was one of my first serious uses of SVG. This is a variation of a tool I used to show data streaming in from an external server.

What About Perl?

So SVG is cool, but what has this got to do with Perl?

Although we normally think only of the graphics that result from data visualization, generating the graphics can be a separate issue from displaying them.

Generating SVG with Perl

SVG is just XML.

Unlike many other graphics formats, SVG is just XML. XML is just (Unicode) text. Perl excels at parsing and generating text. There are many ways to write XML from Perl.

XML-Specific Modules

Since SVG is just XML, libraries that write XML can write SVG. (Assuming they support namespaces.)

SVG is Just (Unicode) Text

Although there is less of a safety net, anything that you can use to output raw text can also be used to write SVG.
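
As a minimal sketch of that approach, the script below builds a small line chart with nothing but `sprintf`, `join`-style interpolation, and a here-document. The function name `line_chart_svg` and the sample values are my own; nothing checks that the output is well-formed, which is the missing safety net.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build a small line chart as an SVG string using only text tools.
sub line_chart_svg {
    my ($width, $height, @values) = @_;

    my ($min, $max) = ( sort { $a <=> $b } @values )[ 0, -1 ];
    my $range = ( $max - $min ) || 1;    # avoid dividing by zero for flat data
    my $step  = @values > 1 ? $width / ( @values - 1 ) : 0;

    # Map each value to an "x,y" pair; SVG's y axis grows downward, so flip it.
    my @points = map {
        my $x = $_ * $step;
        my $y = $height - ( $values[$_] - $min ) / $range * $height;
        sprintf '%.1f,%.1f', $x, $y;
    } 0 .. $#values;

    # "@points" interpolates the pairs separated by spaces.
    return <<"SVG";
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" width="$width" height="$height">
  <polyline fill="none" stroke="black" points="@points"/>
</svg>
SVG
}

# Made-up sample data.
print line_chart_svg( 200, 100, 3, 7, 4, 9, 6 );
```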

SVG-specific Modules

There are a number of modules that provide a Perl interface for writing SVG.

SVG Example


#!/usr/bin/env perl

use strict;
use warnings;
use SVG;

# Create a 200x150 drawing.
my $svg = SVG->new( width => 200, height => 150 );

# Vertical and horizontal axes.
$svg->line( class=>'axes',
    x1=>50, y1=> 20, x2=>50, y2=>130 );
$svg->line( class=>'axes',
    x1=>50, y1=> 130, x2=>150, y2=>130 );

# The data line: relative segments starting where the axes meet.
$svg->path( class=>'data',
    d=>'M50,130l10,-50l10,-30l10,40'
      . 'l10,-50l10,20l10,30l10,-80'
);

print $svg->xmlify;

SVG::TT::Graph Example


#!/usr/bin/env perl

use SVG::TT::Graph::Pie;
my $graph = SVG::TT::Graph::Pie->new({
    'height' => 200, 'width' => 350,
    'compress' => 0, 'expand_greatest' => 1,
    'fields' => [ qw/Jan Feb Mar Apr/ ],
});

$graph->add_data({ 'data' => [ 50, 60, 53, 58 ], });

print $graph->burn;

Here's a quick example of how to generate a pie chart with the SVG::TT::Graph module.

SVG::Sparkline Example


#!/usr/bin/env perl

use SVG::Sparkline;

my @temps = (61,67,67,77,83,84,82,84,84,86,78,50,44,47,
    54,76,72,78,78,80,80,82,81,77,82,74,77,64,72,75,71);
my $svg = SVG::Sparkline->new( Line => { values=>\@temps } );

print $svg;

Quick example using the SVG::Sparkline module to create a simple line graph.

Backend Feed Data to Visualize

JavaScript in SVG can call out to a server.

Remember Jeff Schiller's web browser application and the instruments demo. Both of these were designed to retrieve data from a server for display in the browser.

Perl Serving Data

Need I say more?

Server-Driven SVG

This version of the instruments demo is driven from a server. Depending on the venue, this is either a CGI script or an HTTP::Daemon standalone server.
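
For the CGI case, the server side only has to print a header block and some data. The sketch below is not the demo's actual script: the function name `instrument_readings`, the instrument names, the `name=value` line format, and the random values are all stand-ins I made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Produce one reading per instrument as a line of "name=value" text.
# Random values between 0 and 100 stand in for real measurements.
sub instrument_readings {
    my @names = @_;
    return join '', map {
        my $value = int rand 101;
        "$_=$value\n";
    } @names;
}

# A CGI script prints its headers, a blank line, and then the body.
# "no-cache" asks the client to fetch fresh data on every request.
print "Content-Type: text/plain; charset=UTF-8\r\n",
      "Cache-Control: no-cache\r\n\r\n",
      instrument_readings( qw( speed rpm fuel ) );
```

Script in the SVG document would then request this URL on a timer and update the instruments from the values it gets back.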

The Server

A standalone server can be pretty simple.

References: Web

This is a subset of the resources I use for information on SVG. Some are general information, and others give specific advice.

References: Print

The first book is a bit out of date, but it is the one I learned SVG from. The other two represent what I'm trying to learn about data visualization. Both provide examples that are far beyond my current abilities.

Questions?
