Simple Examples of Database Manipulation in Processing: Part 2

Continuing from where we left off in Part 1, let’s start with a bit of data from the cars.tsv file referenced in the Processing LoadFile 2 example. BTW, if you are on a Mac, go to your Processing Preferences and check “Place File Menu inside of Navigation” to avoid a Mac Java bug; that way you can get to these examples very easily right from File>Examples in Processing. Or upgrade to Processing 1.0.9. Make a new text file and paste in this data, which I took from the cars.tsv file referenced in the program:

chevrolet chevelle malibu,18,8,307,130,3504,12,70,1
buick skylark 320,15,8,350,165,3693,11.5,70,1
plymouth satellite,18,8,318,150,3436,11,70,1

Do a Find and Replace, where you find the ‘,’ and replace it with a tab. This is starting to get into some of the quirks of editing databases and working with Processing, so bear with me. I did not paste tab-separated text because HTML ignores tabs and you would not be able to copy it properly. I personally use tabs because very often my data entry might include a comma, and extra commas here and there would throw off our separation of fields. So, HTML ignoring tabs is actually good for helping to get clean data from web forms. If your text editor does not allow you to make tabs visible, get one that does; it’s very helpful when copying a ‘tab’ to be able to see it.
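If you would rather not do the Find and Replace by hand, the same substitution can be sketched in a few lines of plain Java (the language Processing is built on). This is only an illustration: the class name CommaToTab and the method toTabSeparated are made up for this example, and a blind replace like this is only safe because none of the three sample rows has a comma inside a field.

```java
// Hypothetical helper, not part of the original sketch.
class CommaToTab {
    // Replace every comma in one line of text with a tab character.
    static String toTabSeparated(String line) {
        return line.replace(',', '\t');
    }

    public static void main(String[] args) {
        // The three sample rows from this post.
        String[] rows = {
            "chevrolet chevelle malibu,18,8,307,130,3504,12,70,1",
            "buick skylark 320,15,8,350,165,3693,11.5,70,1",
            "plymouth satellite,18,8,318,150,3436,11,70,1"
        };
        for (String row : rows) {
            String converted = toTabSeparated(row);
            // Count the fields to confirm each row splits into nine pieces on tabs.
            System.out.println(converted.split("\t").length + " fields: " + converted);
        }
    }
}
```

Counting the fields after the conversion is a cheap check that every row will pass the sketch's pieces.length == 9 test.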

Now, save the text file as “cars2.tsv” after you have replaced all commas with Tabs and put it in the same folder as a new Sketch.

Copy and paste the example code below into the new Sketch:

/**
* LoadFile 2
*
* This example loads a data file about cars. Each element is separated
* with a tab and corresponds to a different aspect of each car. The file stores
* the miles per gallon, cylinders, displacement, etc., for more than 400 different
* makes and models. Press a mouse button to advance to the next group of entries.
*/

Record[] records;
String[] lines;
int recordCount;
PFont body;
int num = 9; // Display this many entries on each screen.
int startingEntry = 0;  // Display from this entry number

void setup()
{
size(200, 200);
fill(255);
noLoop();

body = loadFont("TheSans-Plain-12.vlw");
textFont(body);

lines = loadStrings("cars2.tsv");
records = new Record[lines.length];
for (int i = 0; i < lines.length; i++) {
String[] pieces = split(lines[i], '\t'); // Load data into array
if (pieces.length == 9) {
records[recordCount] = new Record(pieces);
recordCount++;
}
}
}

void draw() {
background(0);
for (int i = 0; i < num; i++) {
int thisEntry = startingEntry + i;
text(thisEntry + " > " + records[thisEntry].name, 20, 20 + i*20); // Draw entry number and name in the window
}
}

void mousePressed() {
startingEntry += num;
if (startingEntry + num > records.length) {
startingEntry -= num;
}
redraw();
}

class Record {
String name;
float mpg;
int cylinders;
float displacement;
float horsepower;
float weight;
float acceleration;
int year;
float origin;
public Record(String[] pieces) {
name = pieces[0];
mpg = float(pieces[1]);
cylinders = int(pieces[2]);
displacement = float(pieces[3]);
horsepower = float(pieces[4]);
weight = float(pieces[5]);
acceleration = float(pieces[6]);
year = int(pieces[7]);
origin = float(pieces[8]);
}
}

If you run this you get an error. This is simply because the font file this example asks for has not been created for your sketch yet. A lot of Processing examples use fonts, so don’t let this simple error stop you. Go to Tools>Create Font and select your font and size. Now you have to do two things: copy the exact name of the font from this dialog box (you’ll paste it into the code), then hit OK so that the font is ‘loaded’. Now paste in the name of the font you copied, replacing ‘TheSans-Plain-12.vlw’ with your own font choice. If you look in your sketch folder you’ll see that a file with that name has been added to your folder.

If you get this error: Exception in thread “Animation Thread” java.lang.ArrayIndexOutOfBoundsException: 3, it means the sketch is asking for array entries that are not there. The draw loop asks for num (9) entries per screen, so a cars2.tsv that holds only the three rows above runs out at index 3. Either lower num to 3 for now or use the full cars.tsv data mentioned above. Make sure you have an entry when asking for it!
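The out-of-bounds error comes down to asking for more entries than the array holds. Here is a minimal plain-Java sketch of one way to guard against it, assuming you count how many entries are really available before drawing. The class PageBounds and the method entriesOnPage are hypothetical names for this illustration and are not part of the Processing example.

```java
// Hypothetical helper, not part of the original sketch.
class PageBounds {
    // How many entries can safely be drawn on a page that starts at startingEntry,
    // given num entries per page and recordCount entries in total.
    static int entriesOnPage(int startingEntry, int num, int recordCount) {
        int remaining = recordCount - startingEntry;
        return Math.max(0, Math.min(num, remaining));
    }

    public static void main(String[] args) {
        int num = 9;         // entries per screen, as in the sketch
        int recordCount = 3; // a file holding only the three sample rows

        int count = entriesOnPage(0, num, recordCount);
        // Looping up to count instead of num never touches an index past the data.
        for (int i = 0; i < count; i++) {
            System.out.println("safe to draw entry " + i);
        }
    }
}
```

In the sketch, the same idea would mean looping in draw() up to this count rather than always up to num.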

Once you get the code above running, you’ll see the names of cars and which number they are in the array, starting with 0. Not all that exciting, but you can see in the Record class all of the other data available for our use. We’ll get into that shortly, but now might be a good time to navigate to the full cars2.tsv located in your Processing Examples folder. Again, on the Mac there is a quirk with how Java is implemented, and the only way to get to the actual Example sketches is to Control-click on the Processing application and select Show Package Contents. You will then be able to navigate to the Examples folder:

Applications>Processing>Contents>Resources>Java>examples>Topics>File IO>LoadFile2>data>cars2.tsv

Now let’s replace our cars2.tsv with the full 400-car database. Run it again to make sure it works; when you click, it should cycle through the database. From here on out I am going to take a very visual approach to analyzing this data. I want to explore this data as though it’s a texture that I can use to manipulate form and structure. As an artist, I want to trust my eyes and intuition to create something beautiful, and I believe our nature is to equate truth with beauty. If we can make this data beautiful, some pattern will emerge that I feel will expose what is interesting or important in the data.

Start by changing the first lines of setup so they read:

size(800, 800);
fill(255);
noLoop();
smooth();
noStroke();

and add the ellipse to the draw loop:

void draw() {
background(0);
for (int i = 0; i < num; i++) {
int thisEntry = startingEntry + i;
text(thisEntry + " > " + records[thisEntry].name, 20, 20 + i*50); // Draw entry number and name in the window
ellipse(250,i*50,20,20);
}
}

As you know, an ellipse needs 4 parameters: the first is the x coordinate of the center, then the y coordinate, and then the width and height. The only odd entry is the ‘i*50’, which you can see I got from the name line above. As we discussed, an array is a list of records; we have limited this page to 9 records with the ‘num’ variable, and that means i will go from 0 to 8. Each time through the loop we draw a new name and a circle, and we multiply 50 by the record’s order on the page to move it down the y axis. So, circle 0 is at 0, circle 1 is at 50, circle 2 is at 100, etc. Right now this is a list of white dots and names, sort of how a computer would draw it.

Now let’s get into the art-making part. Simply by adding color, transparency and then using our data to size the ellipse we can start making this pretty:

void draw() {
background(0);
for (int i = 0; i < num; i++) {
int thisEntry = startingEntry + i;
text(thisEntry + " > " + records[thisEntry].name, 20, 20 + i*50); // Draw entry number and name in the window

fill(200,100,100,100);
ellipse(width/2,20+i*50, records[thisEntry].horsepower, records[thisEntry].horsepower);
}
}

Here’s what I changed: the first argument is an expression (width/2) that finds the horizontal center of our canvas; after the comma is the y coordinate, which is now 20 + i*50. The third and fourth arguments are both the horsepower, so the ellipse is now sized to that literal number in pixels. Conveniently, the range of the horsepower numbers is about right for fitting on our canvas. BTW, I found that number by browsing the cars2.tsv file and noting that our Record class converts pieces[4] (the fifth tab-separated field, since counting starts at 0) into a float.

I also left an error in there: why is the first car entry white? See if you can move the fill(200,100,100,100); line someplace else to fix it. Processing keeps using whatever fill color is specified until it encounters a new one, so it helps to understand that when cycling through data.

It’s worth looking at the Record section in detail again to see the order of entries and how I knew where to find horsepower. If you are not familiar with how Processing declares variables and converts Strings into integers or floats, here we go:

class Record {
String name;
float mpg;
int cylinders;
float displacement;
float horsepower;
float weight;
float acceleration;
int year;
float origin;
public Record(String[] pieces) {
name = pieces[0];
mpg = float(pieces[1]);
cylinders = int(pieces[2]);
displacement = float(pieces[3]);
horsepower = float(pieces[4]);
weight = float(pieces[5]);
acceleration = float(pieces[6]);
year = int(pieces[7]);
origin = float(pieces[8]);
}
}

The declarations at the top of the class (originally highlighted in blue) simply declare all of the variables for the Record class. All variables have to be some type or another and they cannot be intermixed. For example, I can’t tell an ellipse to have a dimension that is a String; it is expecting a float or integer value, basically a number. Very briefly, a String is any kind of character data like “Bryan”, but since it’s any character data a String can be “1”. It looks like a number, but if the program was told it is a String, then to Processing it is not a number.

If you were observant you would have noticed that the array passed into Record is a String array. All of our database was read in as String character data, which is common since a complex database will usually have both names like “Bryan” and numbers. So, how could we resize an ellipse if all the data entries were String based? That’s what the code inside the constructor (originally highlighted in red) is doing: it goes through each tab-separated entry in the String array and converts it to either a float or an integer. You’ll notice the first entry is missing any kind of conversion; that’s because it is already a String, the name of the car. Both for declaring variables and for conversions, Processing is very brief in its code. It’s nice, but not explained just by looking at it.

A lot of errors working with databases will be something like “Can’t convert String to Integer”, and that is caused by trying to use a String where something like ellipse() is expecting a number. Even if it appears to be a number, it needs to have been parsed into a number for Processing or any other programming language to know what to do with it.
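To make the String-versus-number difference concrete, here is a small plain-Java sketch. It uses the standard Java calls Integer.parseInt and Float.parseFloat in place of Processing’s int() and float(); the class ParseRecord and its two methods are hypothetical names for this illustration. Note that these plain Java calls throw a NumberFormatException when the text is not a number.

```java
// Hypothetical illustration, not part of the original sketch.
class ParseRecord {
    // pieces[4] is the fifth tab-separated field, which holds the horsepower.
    static float horsepowerOf(String line) {
        String[] pieces = line.split("\t");
        return Float.parseFloat(pieces[4]);
    }

    // pieces[2] is the third tab-separated field, which holds the cylinder count.
    static int cylindersOf(String line) {
        String[] pieces = line.split("\t");
        return Integer.parseInt(pieces[2]);
    }

    public static void main(String[] args) {
        String line = "chevrolet chevelle malibu\t18\t8\t307\t130\t3504\t12\t70\t1";
        String[] pieces = line.split("\t");

        // With a String, + glues characters together.
        System.out.println(pieces[4] + 1);          // prints 1301
        // After parsing, + does arithmetic.
        System.out.println(horsepowerOf(line) + 1); // prints 131.0
        System.out.println(cylindersOf(line) + " cylinders");
    }
}
```

The first two printed lines show why the conversion matters: the same characters give a different answer depending on whether the program treats them as text or as a number.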

All of this was taken care of in the “setup” section of our code as Processing was starting up our program. The cars2.tsv file was read in, then it found the tabs to separate the fields of the data and then it converted the data to floats and integers for us. This is now in memory and allows our draw loop to immediately grab whatever we ask for by cycling through the array and calling up each record. I then can take that data and start to use it to change the way our ellipse displays.

Let’s ‘normalize’ a data source so that we can use it to alter the appearance even more. I think it would be nice to use the ‘weight’ field to alter the alpha channel of our fill. Right now, the fill uses RGBA and for alpha I have it set to 100. The alpha can go from 0 to 255 (an 8-bit value running from fully transparent to fully opaque). The weight seems to range from 1600 lbs to 5000 lbs, so how can I get a smooth gradation where the 1600 lb car’s ellipse is totally transparent and the 5000 lb car’s is opaque? We’ll declare a variable and use the map() function:

/**
* LoadFile 2
*
* This example loads a data file about cars. Each element is separated
* with a tab and corresponds to a different aspect of each car. The file stores
* the miles per gallon, cylinders, displacement, etc., for more than 400 different
* makes and models. Press a mouse button to advance to the next group of entries.
*/

Record[] records;
String[] lines;
int recordCount;
PFont body;
int num = 14; // Display this many entries on each screen.
int startingEntry = 0;  // Display from this entry number

void setup()
{
size(800, 800);
fill(255);
noLoop();
smooth();
noStroke();

body = loadFont("ACaslonPro-Bold-12.vlw");
textFont(body);

lines = loadStrings("cars2.tsv");
records = new Record[lines.length];
for (int i = 0; i < lines.length; i++) {
String[] pieces = split(lines[i], '\t'); // Load data into array
if (pieces.length == 9) {
records[recordCount] = new Record(pieces);
recordCount++;
}
}
}

void draw() {

background(0);
for (int i = 0; i < num; i++) {
//first make the text white
fill(255);

int thisEntry = startingEntry + i;
text(thisEntry + " > " + records[thisEntry].name, 20, 20 + i*50); // Draw entry number and name in the window

float alphavalue=records[thisEntry].weight;
alphavalue =map(alphavalue,1613,5140,0,255);//Found this value range by copying into Excel and sorting

//make the shapes below have transparency based on weight
fill(255,100,100,alphavalue);

ellipse(width/2,20+i*50, records[thisEntry].horsepower, records[thisEntry].horsepower);

}
}

void mousePressed() {
startingEntry += num;
if (startingEntry + num > records.length) {
startingEntry -= num;
}
redraw();
}

class Record {
String name;
float mpg;
int cylinders;
float displacement;
float horsepower;
float weight;
float acceleration;
int year;
float origin;
public Record(String[] pieces) {
name = pieces[0];
mpg = float(pieces[1]);
cylinders = int(pieces[2]);
displacement = float(pieces[3]);
horsepower = float(pieces[4]);
weight = float(pieces[5]);
acceleration = float(pieces[6]);
year = int(pieces[7]);
origin = float(pieces[8]);
}
}
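The map() call in the listing above is a straight linear rescale. Here is a minimal plain-Java sketch of that idea, along with one way to find the low and high weights in code instead of sorting in Excel. The class MapRange and the helpers minOf and maxOf are hypothetical names for this illustration; the map method here is my own version of the linear formula, which I believe matches what Processing’s map() does.

```java
// Hypothetical illustration, not part of the original sketch.
class MapRange {
    // Linearly rescale value from the range [start1, stop1] to [start2, stop2].
    static float map(float value, float start1, float stop1, float start2, float stop2) {
        return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1));
    }

    // Smallest value in a non-empty array.
    static float minOf(float[] values) {
        float min = values[0];
        for (float v : values) {
            min = Math.min(min, v);
        }
        return min;
    }

    // Largest value in a non-empty array.
    static float maxOf(float[] values) {
        float max = values[0];
        for (float v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    public static void main(String[] args) {
        // Weights of the three sample cars from the top of this post.
        float[] weights = {3504, 3693, 3436};
        System.out.println("lightest: " + minOf(weights) + ", heaviest: " + maxOf(weights));

        for (float w : weights) {
            // Same range as the sketch: 1613..5140 lbs becomes alpha 0..255.
            System.out.println(w + " lbs -> alpha " + map(w, 1613, 5140, 0, 255));
        }
    }
}
```

Run over the weights of the full file, minOf and maxOf would give the two numbers to pass to map() as the input range.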

Normalizing data is a subject in its own right. Ben Fry describes excellent techniques to automate this task in his book Visualizing Data, and he also talks about the validity of this technique: it can be used to minimize important differences or to exaggerate them. Here’s a screenshot of what this has done to our data:

[Screenshot: Picture 29]

Visually, I don’t like how this looks. The big circles are making it hard to see the smaller ones and it feels washed out. I could reverse the mapping with map(alphavalue,1613,5140,255,0); but that does not make the heavier cars feel “heavier”, so I think it would distort the data. I think it is better to narrow the alpha range so that even the heaviest car is still somewhat transparent: map(alphavalue,1613,5140,25,100);. With a lower bound of 25, the small circles don’t completely disappear either.

[Screenshot: Picture 30]

I think that gives a very simple overview of exactly how to get data into Processing and how to start using the data to affect things visually. For my final post in this series I am going to refine the display of data into something I think is visually interesting and useful.

Back to Part 1
Finished Example
