Visualizing Movie Data | VMD_09 | The height of performers

While searching for data on the 200 films that we saw in 2015, I found that a reasonable number of actors have published their physical height. On the IMDb site I also found that each movie always lists three performers as main actors. However, the data for those 600 actors and actresses is not complete. Still, it is quite strange that actors and actresses make their height public as if it were interesting information. So I spent some time researching what height exactly is (or what it means). Human height or stature is the distance from the bottom of the feet to the top of the head in a human body, standing erect. During this research, I also found a list of average heights of people in the countries of the world. That information is also broken down into heights for men and women. Eventually I decided to interpret this information in three variations. But maybe I’ll find more variations during the work itself.

To begin with, I must first create a list in a spreadsheet that contains only the actors and actresses who have given their height and country of birth. Birth year and birthplace are irrelevant because I only want to compare physical heights. Of those 600 actors and actresses, only 302 gave me complete data. But as with any statistical data, the accuracy of this data may be questionable for various reasons. The workflow to get all this information into Processing is now ready. All 302 performers are now displayed in Processing’s console and message area. This is useful for checking. The first version I made uses five columns. The first column contains the name of the actor or actress. The second column shows the gender. The third column indicates the height. The fourth column contains the country of birth. The fifth column shows the average height of the population of that country. For the moment the columns are sorted by country of birth (from Argentina to Wales).
VMD_09_01

The next step is to find a layout in which I can compare the heights of actors and actresses with the average heights in their countries of birth. When I look at the average height of people from the countries of birth, then the height ranges from 154 to 181 centimeters. But the heights of the actors and actresses vary from 150 to 199 centimeters. That seems a nice job for Processing’s map function. The map function re-maps a number from one range to another. For example, it converts the number 25 from a value in the range of 0 to 100 into a value that ranges from the left edge of the window (0) to the right edge of the window (width). Using the map function I drew horizontal lines representing the heights of the actors and actresses. The average heights of people from the countries of birth are also shown.
VMD_09_02
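To make the re-mapping concrete, here is a minimal sketch in plain Java (the language Processing sketches are written in) of the linear re-mapping that the map function performs. The class name HeightRemap, the method name remap and the window width of 1000 pixels are assumptions for illustration. Inside a sketch you would simply call map().

```java
// Minimal sketch of a linear re-mapping like Processing's map().
// HeightRemap, remap and the 1000 pixel width are illustrative assumptions.
class HeightRemap {

    // Re-map value from the range [start1, stop1] to the range [start2, stop2].
    static float remap(float value, float start1, float stop1, float start2, float stop2) {
        return start2 + (stop2 - start2) * ((value - start1) / (stop1 - start1));
    }

    public static void main(String[] args) {
        float width = 1000f; // assumed display window width

        // The example from the text: 25 in the range 0..100.
        System.out.println(remap(25f, 0f, 100f, 0f, width));

        // Performer heights range from 150 to 199 centimeters.
        System.out.println(remap(150f, 150f, 199f, 0f, width));
        System.out.println(remap(199f, 150f, 199f, 0f, width));
    }
}
```

With these ranges the shortest performer lands on the left edge and the tallest on the right edge, so every other height falls somewhere in between.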

The drawback of the previous approach is that it is difficult to compare the average heights of the countries of birth with the heights of actors and actresses. I have changed that in this version. The heights of the actors and actresses stand right next to the average heights of the countries of origin. Color is now used only to help distinguish the data better. But this can be done better.
VMD_09_03

In this version, the height of the actors and actresses is represented by the red numbers. The horizontal red bar represents the height too, but in a graphical way. It’s a bit strange to show vertical height as horizontal bars. This has to do with the text, which is easier to read horizontally than vertically. The average height of the country of birth now has the same blue color as its horizontal bar. I have also put the lines directly below each other so it’s easier to compare the heights.
VMD_09_04

The font does not have to be monospaced anymore. We do not have to compare the lengths of names in this graphic. We compare heights of actors and actresses. In the previous version, everything was sorted by country. The list started with Argentina and ended with Wales. It seems better to me to sort all data by the heights of the actors and actresses. And then we see that Chris O’Dowd, at 199 centimeters, is the tallest actor. The tallest actress is Nina Hoss at 180 centimeters. She is 19 centimeters shorter than Chris O’Dowd. But she is still 20 centimeters taller than the actor Denis Lavant, who is 160 centimeters tall. The shortest actress is Vanessa Martinez. She is 150 centimeters tall. That is 49 centimeters shorter than Chris O’Dowd.
VMD_09_05
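The sorting step can be sketched in plain Java as follows. This is only an illustration, not the code behind the graphic: the Performer class is a hypothetical stand-in for one spreadsheet row, and the four heights are the ones quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Sketch: sort performers by height, tallest first.
// Performer is a hypothetical stand-in for one spreadsheet row.
class HeightSort {

    static class Performer {
        final String name;
        final int heightCm;

        Performer(String name, int heightCm) {
            this.name = name;
            this.heightCm = heightCm;
        }
    }

    // Return a new list sorted from tallest to shortest.
    static List<Performer> sortByHeightDescending(List<Performer> performers) {
        List<Performer> sorted = new ArrayList<>(performers);
        sorted.sort(Comparator.comparingInt((Performer p) -> p.heightCm).reversed());
        return sorted;
    }

    // The four heights mentioned in the text, in arbitrary order.
    static List<Performer> sample() {
        return Arrays.asList(
            new Performer("Nina Hoss", 180),
            new Performer("Vanessa Martinez", 150),
            new Performer("Chris O'Dowd", 199),
            new Performer("Denis Lavant", 160));
    }

    public static void main(String[] args) {
        List<Performer> sorted = sortByHeightDescending(sample());
        int tallest = sorted.get(0).heightCm;
        for (Performer p : sorted) {
            // Print each performer with the difference to the tallest one.
            System.out.println(p.name + ": " + p.heightCm + " cm, "
                + (tallest - p.heightCm) + " cm below the tallest");
        }
    }
}
```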

Will the graphic become different if we sort by the average height of a country as a starting point? The average height of a country of birth is set against the actors and actresses. To show that, I gave the countries thicker lines. It appears that Rutger Hauer, with 181 centimeters, ends up at the top. It is also remarkable that Nadine Labaki is ranked in 28th place with her 167 centimeters. Many men seem to average around 175 centimeters, with Buster Keaton clearly below the national average. Also interesting is Olga Kurylenko (174 centimeters) compared to the national average of 160 centimeters. Or Regina Torné (173 centimeters) with the national average of 154 centimeters. Also notable is Joaquin Phoenix (173 centimeters), while the national average of his country of birth is 154 centimeters.
VMD_09_06

Visualizing Movie Data | VMD_08 | Performers names and full names

Actors and actresses are almost always involved in the films that we have seen in 2015 (and a few years before). The question I asked myself was: ‘Where do these actors and actresses come from?’ To answer this simple question I started by importing a world map. I expected to find the birthplaces of the actors all over the world, so a map seemed a good place to start. Probably the color and size of the map are wrong, but I can always change that later. Gradually, I learned that it might be better to mention only the directors of films instead of the actors and actresses. This is easier because a film usually has just one director. It is certainly easier than, say, 25 actors in one movie. But then I found out that there are also films which are directed by several directors. That made me decide to name all directors. Mentioning the films is not very relevant because a director may have directed multiple movies. The downside is that you have to figure out this information yourself. Drop the data in a spreadsheet and check for typos. Another problem is that you can continue to add more and more columns of data because there is a lot of inconsistent information available about actors, actresses and directors. For example, the name of the actors. And the real name of the actors. And their parents. Plus the place where they live. And their birthplace. And the year of birth. In short, there is plenty of data to be found. Finally I ended up compiling a list of actors and actresses who appear in our list of 200 films from 2015. My original question, ‘Where do these actors and actresses come from?’, changed into comparing the name of the actor with his or her real name. The full name, that is.

I started with a test where I import some actors with their known name and their real name into Processing. I especially made sure to import the longest name of the total list. I plan to set every name in a monospaced font because then you can compare the lengths more easily. An i is not as wide as a w. And an o is narrower than an m. A monospaced font consists of characters which are all of the same width. And that goes for uppercase, lowercase, numbers and punctuation alike. This first test demonstrates that the workflow is functioning. The list of actors with their name and full name is typed in a spreadsheet. This list is exported to a csv (comma-separated values) text file. That file is read into Processing. Through Processing I can create the layout. Furthermore, the number of characters is calculated from the length of each name.
VMD_08_01
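As a rough illustration of the counting step, here is a small plain Java sketch. It assumes a two-column csv row of the form name,full name without quoted commas, and the row itself is a made-up placeholder, not a performer from the list. In a Processing sketch the rows would come from something like loadStrings().

```java
// Sketch: split one csv row into name and full name, then compare the lengths.
// The two-column layout and the sample row are assumptions for illustration.
class NameLengths {

    // Split a simple csv row (no quoted commas) into its two trimmed columns.
    static String[] splitRow(String row) {
        String[] cols = row.split(",", 2);
        return new String[] { cols[0].trim(), cols[1].trim() };
    }

    // How many characters the full name is longer than the common name.
    static int extraCharacters(String name, String fullName) {
        return fullName.length() - name.length();
    }

    public static void main(String[] args) {
        String[] cols = splitRow("Jane Doe, Jane Alexandra Doe"); // hypothetical row
        System.out.println(cols[0].length() + " " + cols[0]);
        System.out.println(cols[1].length() + " " + cols[1]);
        System.out.println("+" + extraCharacters(cols[0], cols[1]));
    }
}
```

In a monospaced font these character counts translate directly into line lengths, which is what makes the names comparable by eye.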

Now it’s about time to think about the layout of the page. First I place multiple names in two columns. But how many lines fit on a page? Maybe it’s a good idea to give the names in the right column a different color than the names in the left column? However, an urgent problem is that both columns are too long. In a display window of 1000 × 1000 pixels, up to 47 names fit in the height. But my complete list of actors includes 225 names. That’s almost five times as many. So I have to divide this text file in some way or another.
VMD_08_02

If I scale the list in the program to 21%, the total file with names fits on one page. Perhaps there is a way in which the mouse is able to detect a name, and that name is then enlarged in the layout. But then the problem is still not solved because you lose the possibility to compare the names with each other. So I think a scrollbar is a better option.
VMD_08_03
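The numbers above follow from simple arithmetic, sketched here in plain Java. The class and method names are my own for this illustration. Only the values 47 and 225 come from the text.

```java
// Sketch of the page arithmetic: 47 names fit on a page, the list has 225 names.
class PageFit {

    // Number of pages needed, rounded up.
    static int pagesNeeded(int totalNames, int namesPerPage) {
        return (totalNames + namesPerPage - 1) / namesPerPage;
    }

    // Scale factor at which the whole list fits on one page.
    static double scaleToFit(int totalNames, int namesPerPage) {
        return (double) namesPerPage / totalNames;
    }

    public static void main(String[] args) {
        System.out.println(pagesNeeded(225, 47));                   // pages at full size
        System.out.println(Math.round(scaleToFit(225, 47) * 100));  // scale in percent
    }
}
```

So the list needs five pages at full size, and 47 divided by 225 is roughly 0.21, which is where the 21% comes from.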

Just to be sure I checked which is the longest name in the list of actors. The longest name is: ‘Isabella Orsini Princesse de Ligne de La Trémoïlle’. She is an Italian actress who in 2009 married His Highness Prince Édouard Lamoral Rodolphe de Ligne de La Trémoïlle. An even longer name for a man who cannot act. And he does not have to in order to survive. Here are some screen dumps where all names are displayed. Each name has a number in front that indicates the number of characters in the name. On the right side of the right column are yellow numbers that indicate how many characters longer the real name is than the common name.
VMD_08_04

In fact, I could leave it at that. But I would like to make a version with a scrollbar. I had never programmed a scrollbar, so that’s a good reason to make one, although it took a lot of time. It would have been worse if it had taken a lot of time and I had ultimately failed to make a scrollbar. In this case, I succeeded. These are two test files. To use the scrollbar I imported an image in which all names are displayed.
VMD_08_05

In this setup I also used a title and subtitles. The names scroll underneath them. Actors by name length, actors by name and actors by full name. But perhaps the word performers is better because it covers both male and female actors. But I have not yet reached the ultimate goal: ‘Where are the actors and actresses coming from?’ That’s for later.
VMD_08_06

Visualizing Movie Data | VMD_07 | Movies by words

All films that we have seen in 2015 (and earlier) make use of a film script. What are the most frequently used words in such a script? To find out, I need to look for movie scripts. I found them in the ‘IMSDb’ (Internet Movie Script Database). The first script that I found was the script of ‘Tamara Drewe’. Certainly not the best movie we’ve seen but that doesn’t matter much in this case. I loaded the script into TextEdit. Then I imported the complete script of ‘Tamara Drewe’ into Processing. In Processing I used Ben Fry’s Treemap Library. In this version I have inverted everything. And I have added a sixth layout algorithm: PivotBySize. The words ‘the, a, is, you, to, tamara, nicholas, beth, and, in, i’ appear most often in this script.
VMD_07_01
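The counting behind such a word treemap can be sketched in plain Java. This is not the Treemap library’s own code, just an assumed minimal version of the idea: lowercase the text, split it into words, count them and list the most frequent ones. The sample sentence is a placeholder, not a line from any script, and the simple split only handles unaccented letters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: count how often each word occurs and list the most frequent ones.
// This is an assumed minimal version, not the Treemap library's code.
class WordCount {

    // Lowercase the text, split on anything that is not a-z or an apostrophe, count.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("[^a-z']+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty first element
            }
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // The n most frequent words, ties broken alphabetically.
    static List<String> mostFrequent(Map<String, Integer> counts, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("The cat and the dog. The end."); // placeholder text
        System.out.println(mostFrequent(counts, 3));
    }
}
```

The size of each rectangle in the treemap would then be driven by these counts.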

Here I used the script of ‘The Master’. The starting point is now to use a gradient in the background. I do that by placing horizontal lines. This is a less advanced version. In this script ‘the, freddie, to, you, and, a, master’ are the most frequent words.
VMD_07_02

In this version, I’ve refined the gradient. I also changed the font to Avenir Light. And I turned off the frames. In this script of ‘Interstellar’, ‘the, cooper, a, to, or, and, it, is, in’ are the most common words.
VMD_07_03

This variation is only a variation in the code. I have programmed a few things in a slightly more functional way. I used the script of the film ‘Blow’. The words ‘the, and, a, you, george, to, is, i, of’ are most frequently used in the script.
VMD_07_04

I started again by studying the Treemap library. I have added two new classes: the BinaryTreeLayout and the PivotByMiddle class. And I used a less harsh color scheme. I used the script from the film ‘Foxcatcher’. The most frequently used words are ‘the, to, mark, du pont, a, of, dave’.
VMD_07_05

A very colorful variation. The largest number that w (width) can reach in this version is 190 (pixels). Higher numbers have no influence because w can never be higher. Suppose I want to use six colors, then 190 divided by 6 (rounded up) is 32. So I can choose 6 shades of color in HSB color mode, one every 32 steps. As a script I used ‘Fargo’ here. The most frequent words are ‘the, a, and, to, margin, or, you, his’.
VMD_07_06
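The color arithmetic above can be written down as a small plain Java sketch. Reading the text as one color per fixed hue step is my assumption, and the names HueSteps, stepSize and hues are made up for this illustration.

```java
import java.util.Arrays;

// Sketch of the hue arithmetic: divide the available range into equal steps.
// Treating "every 32" as a fixed hue step is an assumption.
class HueSteps {

    // Step size between two colors, rounded up as in the text.
    static int stepSize(int range, int colorCount) {
        return (int) Math.ceil((double) range / colorCount);
    }

    // Hue values 0, step, 2 * step, ... for the requested number of colors.
    static int[] hues(int range, int colorCount) {
        int step = stepSize(range, colorCount);
        int[] result = new int[colorCount];
        for (int i = 0; i < colorCount; i++) {
            result[i] = i * step;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stepSize(190, 6));
        System.out.println(Arrays.toString(hues(190, 6)));
    }
}
```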

There are no more scripts available from movies that we have seen in 2015. So I have to search for an alternative: films that we have seen, but not in 2015. The range of colors also has to be more advanced. If I want to use the entire 360 HSB range, then I have 203 pixels available for w (width). I have to divide 203 by 12 in this release. That’s 16. In the end this calculation doesn’t work. So I made increments of 40 in the HSB color mode. The script is from ‘12 Years a Slave’. Most common words are ‘the, a, to, and, solomon, or, is, in’.
VMD_07_07

In this final version I have used a better color scheme. I also used the script of ‘Inglourious Basterds’. Most common words are ‘the, a, to, and, in, or, you, i, his’.
VMD_07_08

Visualizing Movie Data | VMD_06 | Films by flags

The idea is very simple. From which countries did we see films during 2015? Represent the number of movies with the flags of those countries.

The size in which I work is 1000 x 1000 pixels. There are exactly 50 countries that have made films. And the highest score of 69 films is achieved by the USA. Looking at the display size, in width I have 1000 : 50 = 20 pixels available. In height I have 1000 : 69 = 14 pixels available. So one flag has a size of 20 x 14 pixels. That is very small. I have made a Processing test file. The first column is made up of svg files which are reduced in Processing. Processing or Adobe Illustrator is doing crazy things with this flag file. The stars of the American flag are randomly placed outside the boundaries of the flag. But because the flag is reduced you cannot see the stars, or the errors, anymore. The second column is a png file which blurs the stars and lines rather randomly together. The third column is a bmp file, manually corrected in Adobe Photoshop. The fourth column is the same bmp file but now it has a black line between each separate flag. I choose the last one, although it’s a lot of manual editing work.
VMD_06.01

The next step I’m going to make is to find the positions of the flags. There, too, I’m going to use the American flag. It’s just a placeholder. I need to make a minor adjustment to the image file. When the flags are 20 pixels in width I must sacrifice one pixel in width for a black line. Otherwise, the flags will visually not be separated from each other. And if you look at this example, you will see that it’s not a good visualization. We are dealing with very big differences in the rather small data set. The minimum score is 1 and the largest score is 69. In addition, the flags are not looking good. And I have no space to add extra information such as headers, country names or scores.
VMD_06.02

I have to solve this differently. Let’s start all over again. I have 50 countries, so 50 flags. That makes two rows of 25 flags. Or two columns of 25 flags. Or ten rows of five flags. That seems most appropriate. Flags are usually wider than tall, so that would eventually produce a square. I pick a random image of a flag. In this case it’s the United Kingdom’s flag. Five flags in width and ten in height.
VMD_06.03

I also need margins on all four sides of the display window. I take 100 pixels margin. This produces flag sizes of 160 x 80 pixels. And that results in an exact square of 800 x 800 pixels.
VMD_06.04
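The grid arithmetic can be checked with a short plain Java sketch. The class and method names are hypothetical, and numbering the flags row by row is an assumption. The window size, margin and grid dimensions are the ones from the text.

```java
// Sketch of the flag grid: a 1000 px window, 100 px margins, 5 columns and 10 rows.
// Names and the row-by-row numbering are assumptions for illustration.
class FlagGrid {

    // Width and height of one flag cell inside the margins.
    static int[] cellSize(int windowSize, int margin, int columns, int rows) {
        int inner = windowSize - 2 * margin;
        return new int[] { inner / columns, inner / rows };
    }

    // Top-left corner of the flag with the given index, counted row by row.
    static int[] cellPosition(int index, int margin, int columns, int cellW, int cellH) {
        int col = index % columns;
        int row = index / columns;
        return new int[] { margin + col * cellW, margin + row * cellH };
    }

    public static void main(String[] args) {
        int[] size = cellSize(1000, 100, 5, 10);
        System.out.println(size[0] + " x " + size[1]);

        // Position of the last of the 50 flags.
        int[] last = cellPosition(49, 100, 5, size[0], size[1]);
        System.out.println(last[0] + ", " + last[1]);
    }
}
```

The last flag starts at 740, 820, so it ends exactly at 900, 900, which leaves the same 100 pixel margin on the right and at the bottom.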

How can you visualize the number of films that we saw in 2015 using these flags? One possibility is Processing’s tint function. But because this tint function does not work the way I want it to work, I go for a black rectangle with a little transparency. All flags are 160 pixels wide. A 100% score is 69 films seen (the American ones). The width of a flag, 160, divided by the score, 69, is about 2.3. Thus the multiplication factor is 2.3 for all the scores. In practice the result is visually extremely disappointing, although it represents the exact data.
VMD_06.05

The disadvantage of this representation is that the differences are too large. I would like to keep these differences, but the proportions need to be adjusted. I think I need to use Processing’s map function. It re-maps a number from one range to another, which provides a more interesting image.
VMD_06.06
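The difference between the exact factor and a re-mapped range can be sketched in plain Java. The lower bound of 40 pixels is purely a hypothetical value chosen for this illustration, as are the class and method names. Only the flag width of 160 and the scores 1 and 69 come from the text.

```java
// Sketch: exact linear scaling versus a re-mapped range with a minimum width.
// The 40 px minimum is a hypothetical value for illustration.
class FlagOverlay {

    // Exact scaling: the top score fills the whole flag width.
    static float linearWidth(int score, int maxScore, float flagWidth) {
        return score * flagWidth / maxScore;
    }

    // Re-mapped scaling: scores from minScore..maxScore map onto minWidth..flagWidth.
    static float remappedWidth(int score, int minScore, int maxScore, float minWidth, float flagWidth) {
        float t = (float) (score - minScore) / (maxScore - minScore);
        return minWidth + (flagWidth - minWidth) * t;
    }

    public static void main(String[] args) {
        // A country with one film gets only about 2.3 pixels with the exact factor.
        System.out.println(linearWidth(1, 69, 160f));
        System.out.println(linearWidth(69, 69, 160f));

        // With the re-mapped range the same country stays visible.
        System.out.println(remappedWidth(1, 1, 69, 40f, 160f));
        System.out.println(remappedWidth(69, 1, 69, 40f, 160f));
    }
}
```

The order of the countries stays the same, but the smallest scores no longer disappear.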

I have applied the map function in an exaggerated way. Otherwise, the countries from which we have seen only one film stay almost invisible. Furthermore, I found little refinement in the image. I solved that with a small transparent black gradient. Finally, I have rotated the total image 90 degrees. This gives a better picture because it seems that the light comes from above, which is much more natural. And why not? Flags on their side remain equally recognizable.
VMD_06.07

Visualizing Movie Data | VMD_05 | Small comparisons

At this stage of our Visualizing Movie Data project, I want to see how far you can go with a reduced view of our movie data. How far can you reduce the decoration in exchange for the functional display of data? However, the readability must remain intact. What kind of problems can we expect? And perhaps more importantly … what are the solutions to those problems? In fact, the charts I post every week on Facebook are my starting point. This is just a smaller variation of them.

What determines the smallest size of a single graph? I think that’s the longest movie title! In this case, the longest movie title is: ‘What happened Miss Simone?’. I now use a point size of 10 pixels. As a typeface I use Futura Bold. I reserve a square of 100 x 100 pixels for every movie. So for a hundred movies I need a display window of at least 1000 x 1000 pixels. But I also need some margin. And I have to position a legend somewhere. The lines of the bars could be formed by dots. Dots are easy to count. Ten dots stand for 10 points. Five dots stand for five points. But now some interesting effects pop up. I can now draw 13 dots in width. And 13 dots in height. But this layout seems to form vertical lines of dots instead of horizontal lines. A side effect which is not what I was looking for. Furthermore, it seems that the text at the top of the square belongs to the graphic which is above the text. Also not a desirable effect.
VMD_05_01

13 dots in width would be just right for several categories. But three dots disappear in height because my scores range from 0 to 10. So the solution could be to display the movie titles under the graphics. And I have to stack the points vertically instead of horizontally. Because the columns reach 10 points in only a few cases, the movie titles will visually belong to the dots that are displayed above them.
VMD_05_02

I have changed the generation of the movie data into a function. The positioning is resolved and becomes easier than it was in the previous chapter of this project. There are still some parts that need more attention. The height of the column of dots. The color and the placing of the movie title. When the dots have to display a 10 then the topmost dot slightly overlaps the rectangle of the graphic.
VMD_05_03

All data is used. A hundred movies as a dot graph. Right now you cannot distinguish the categories. But what you can see is that a number of films are top-rated. If the rectangle were completely filled with dots, then the film would have scored only tens. Unfortunately, there is not one movie that has made it. There are now five films which scored very high: Locke, Mr. Turner, From What Is Before, ’71 and Lilting. Amour Fou is slightly below that score.
VMD_05_04

At the moment, the program displays a point for each dot. I’m going to turn it into a circle. In the longer term that gives me more possibilities to make variations.
VMD_05_05

I have used a number of colors for the different columns. It’s a multiple of 30 in the 360 hue scale. But I find it looks cheap. In addition, some colors are too close together. The green series for example. And the dark blue disappears into the background color. And, some colors are hard to distinguish from others.
VMD_05_06

I have made a few variations of the existing color scheme. A number of these variations are not acceptable. The number of colors must be reduced. But perhaps I can do something with the brightness of the colors. The 13 columns are very difficult to distribute in different colors. The red column pops out. It appears to me that it means something special. But that is not the case. That is also the problem with the 3 degrees of brightness in the columns. Those seem to belong together. But that is also not true.
VMD_05_07

Now I am starting to look for a range of colors where each color stands on its own. But the range should still be acceptable as a fine color range. A color may not have a relationship with the other colors. I replaced the circles with squares.
VMD_05_08

The squares can also be replaced by rectangles. I’ve also added a legend. Otherwise, the addition of color is absurd. I give the category texts in the legend the same color as the bars to which they belong in the graphic.
VMD_05_09

Just a few variations with broken lines, and rectangles, and transparency.
VMD_05_10

And the last variation. Totally non-functional. But it does make an interesting picture.
VMD_05_11

Visualizing Movie Data | VMD_04 | Reviews by categories

As a next step, I find it interesting to see what our reviews are telling us when I show each category of a movie. I can imagine that the titles of the films are on the left. Suppose we start with the first 100 movies we have watched since the beginning of 2015. What does it look like? And what conclusions can we draw? I’ll try programming this version slightly smarter than the earlier version.

I start by creating a grid of numbers. There are 13 categories (13 columns) with numbers decreasing from 10 at the top to 0 at the bottom. The size of the display window is a bit of guesswork. I now work at a size of 800 by 800 pixels. On the left side of the display window the film titles have yet to be placed. And all 13 category labels should still come on top. I expect that I need much more space than 800 pixels in width and height. In the program I have added an empty draw block. Otherwise functions such as keyReleased and timeStamp do not work.
VMD_04_01

Placing the film titles is a matter of creating a text file with 100 titles of films that we have seen since the beginning of 2015, then reading this text file into Processing and displaying it in the display window. The order (from top to bottom) corresponds to the viewing order. The list starts with the film ‘Boyhood’, which is the first film that we saw in 2015. The list ends with the film ‘Restless’. And that’s the hundredth film we’ve seen. However, only about one-third of the list is visible. This is up to the film ‘Calvary’. And that is film number 38. Putting another 62 films in this display height makes no sense because the point size would become too small to read.
VMD_04_02

To get all the movie titles on the left in the picture, I have a few options. Reduce the line spacing. Reduce the point size of the font. Or I can increase the size of the display window. In this case I have used all three possibilities. I end up with 1500 x 1300 pixels. I also added the names of categories.
VMD_04_03

Another stage where I further optimize the distances. The category names (the labels of the columns) are still too far from the category columns. I’m going to put them closer and place them at an angle of 45°. The category numbers are now placed on an imaginary square. The display window is now 1460 x 1228 pixels. And the grid is built with squares of 90 x 90 pixels. I am testing a first line, drawn through the numbers with which we rated the film ‘Boyhood’. That does not look good. The lines are too stiff. It should be more fluid.
VMD_04_04

In order to make more fluid lines I made one attempt with the curveVertex function. The problem here is that the curveVertex function uses Catmull-Rom splines. It does not make beautiful curves. In the end I opted for bezier curves. For the quality of the curve that is the best solution, but it requires more data to describe the curve. Four anchor coordinates and four control coordinates per curve. That means 13 x 8 numbers per film. That is 104 numbers for the first movie. Thus, in total 10,400 numbers must be calculated to make the final visualization.
VMD_04_05
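To see what those eight numbers per curve do, here is a plain Java sketch of the cubic bezier formula for one coordinate, together with the count from the text. It mirrors what I understand Processing’s bezierPoint function to compute. The class name and the sample values (one 90 pixel grid step) are assumptions for illustration.

```java
// Sketch: evaluate one coordinate of a cubic bezier curve and count the numbers needed.
// Class name and sample values are assumptions for illustration.
class BezierSketch {

    // Cubic bezier for one coordinate: a and d are anchors, b and c are control values.
    static float bezierPoint(float a, float b, float c, float d, float t) {
        float u = 1 - t;
        return u * u * u * a + 3 * u * u * t * b + 3 * u * t * t * c + t * t * t * d;
    }

    // Total numbers to specify: categories x numbers per curve x films.
    static int numbersNeeded(int categories, int numbersPerCurve, int films) {
        return categories * numbersPerCurve * films;
    }

    public static void main(String[] args) {
        // A curve that leaves flat at 0 and arrives flat at 90 (one grid step).
        for (int i = 0; i <= 4; i++) {
            float t = i / 4f;
            System.out.println(t + " -> " + bezierPoint(0f, 0f, 90f, 90f, t));
        }
        System.out.println(numbersNeeded(13, 8, 100));
    }
}
```

Placing both control values at the same level as their anchor is what gives the curve its soft start and end, which is the fluidity that straight lines lack.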

The first six films drawn using bezier curves.
VMD_04_06

I have now drawn 26 films with bezier curves. And it shows directly the weakness of this visualization method. Since all lines have the same color and thickness it is difficult to see which movie has scored which number in which category. At a later stage I will do something about that. But the problem is not completely solvable.
VMD_04_07

About halfway with the positioning of bezier curves. I place the curves in a very straightforward way. I know that this can be done with more intelligence but I will not have enough time to solve this problem now. I think it requires an additional study which I might do at a later stage.
VMD_04_08

And about a quarter of the bezier curves still to place.
VMD_04_09

All bezier curves are now positioned. On the left, it has become a pretty organized chaos. Looking at the line patterns you can conclude that most movies got a 6, 7 or 8 from us. That might also say something about our rating. Is our rating mediocre?
VMD_04_10

With all the lines in their place, it is now the time to bring in the Futura font. I have changed the background color to black. Font color is white. The color of the lines is gray with 50% transparency.
VMD_04_11

Time for a number of tests with line widths. Some are absolutely exaggerated. Others are functional. These variations also show that the number columns have to be drawn last. Otherwise they will be overwritten by the bezier lines. And I shifted the column with movie titles slightly to create some space between the start of the bezier lines and the end of the movie titles.
VMD_04_12

Trying to solve a problem that popped up in VMD_04_07: to what extent is it possible to get more distinction between the bezier curves themselves? I start with two colors. Red and green. A strange effect seems to occur. When a certain number of red and green lines overlap, an additional color appears. It looks like orange. At least it seems to be orange, but if you make the lines thicker it seems to be a light version of something brownish.
VMD_04_13

Added a blue color. Now it seems that there are many more shades of additional color variations possible.
VMD_04_14

What happens if I make an ascending color scale from 0 to 360? I switch to color mode HSB. HSB is easier to work with (as a human).
VMD_04_15

Which movies have been honored at least once with the highest possible value of 10 points?
VMD_04_16

Which movies have been awarded a value of 9 points or higher at least once?
VMD_04_17

And finally: which movies have been rewarded with a value of 8 points or more at least once?
VMD_04_18

A quick conclusion. I am tempted to say that if a film did not score a single 8, 9 or 10 in the assessment, it is not a good movie. That means it is of a lower level than films that scored at least one 8. Or one 9. Or one 10. This visualization shows the worst films of all 100 films we have seen since the beginning of 2015. In total these are only 27 movies. So a little over a quarter. That means that about three-quarters of the 100 films that we have seen had something of good quality in them. And that’s very reassuring. For the filmmakers, the film industry and for us.
VMD_04_19
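The selection behind these last pictures is a simple filter, sketched here in plain Java. The three rows of 13 category scores are invented placeholders, not our real ratings, and the names are hypothetical.

```java
// Sketch: find films that never reached a given score in any of the 13 categories.
// The score rows are invented placeholders, not the real ratings.
class TopScores {

    // True if at least one category score reaches the threshold.
    static boolean hasScoreAtLeast(int[] scores, int threshold) {
        for (int score : scores) {
            if (score >= threshold) {
                return true;
            }
        }
        return false;
    }

    // Number of films without a single score at or above the threshold.
    static int countWithout(int[][] films, int threshold) {
        int count = 0;
        for (int[] scores : films) {
            if (!hasScoreAtLeast(scores, threshold)) {
                count++;
            }
        }
        return count;
    }

    // Three hypothetical films with 13 category scores each.
    static int[][] sample() {
        return new int[][] {
            { 7, 8, 6, 7, 9, 7, 6, 7, 8, 7, 6, 7, 7 },
            { 6, 6, 7, 5, 6, 7, 6, 6, 7, 6, 5, 6, 6 },
            { 10, 9, 8, 8, 9, 10, 8, 9, 8, 9, 8, 9, 9 }
        };
    }

    public static void main(String[] args) {
        System.out.println(countWithout(sample(), 8));
        System.out.println(countWithout(sample(), 10));
    }
}
```

Applied to the real list with a threshold of 8, this count is the 27 films mentioned above, which leaves 73 films with at least one 8 or higher.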

Visualizing Movie Data | VMD_03 | Waltzing with Bezier

When I started this assignment I was interested in how much money is actually going around in the film industry. What does a movie cost? What is the budget? How much money does it bring in? And how do these figures compare with our ratings? I thought it would be easy to check the data on the IMDb site. But unfortunately all I found was very incomplete data. I checked all 150 films that we have seen since January 2015. And guess what. There are only 56 films that show you both the budget and the profits. In addition, the amounts are listed in different currencies. So I have to convert them to dollars or another currency unit. Additionally, in the descriptions of movies that are not from the United States, there is almost no sign of costs and benefits to be found. So I have to check other websites for additional information.

That extensive check resulted in 69 films with complete financial information. I think I should leave out the series. These often run over several years and have varying budgets, while a film only runs once and receives just one budget. Another thing is that these figures represent only the periods when movies are played. Some play for longer periods than others. Because they are more popular they bring in more money. But that says nothing about the quality. Our list shows that there are only three films that cost less than one million dollars. However, there are 14 films that grossed less than 1 million. I made two text files of them. One with the highest budget at the top. The other list has the highest gross at the top.

Then it is important to read the text file into Processing and display it in the display window. A simple task. But that turned out to be more complicated than I thought. It comes down to this: the tutorials pay a lot of attention to getting a text file into Processing’s console. But how to get the data into the display window I could not find anywhere. I got half of my question answered through the Processing Forum. And I solved the rest myself. I was busy with it for one afternoon. And this is the first result. Not very impressive but all data that is in the text file is displayed in my Processing display window. And that was the first goal I had in mind.
VMD_03_01

The next step I need to take is to get the data lists separated. It should be possible to reposition the movie-titles, budget and revenue. If I cannot do that I cannot deal with the layout. Incidentally, at this moment the sort and reverse functions are quite handy. And I have changed the font to Futura Book.
VMD_03_02

How does the program know which budget and income are associated with a movie? That is a question for me too, because the two columns of numbers are mixed up. The budget and the income lists are both sorted from large to small amounts. The budget list thus does not have the same order as the income list. So I have added film titles to both the budget and the income list. That way it is easy for me to check whether the line from a budget is drawn to the right amount in the income list.
VMD_03_03

Changed the background colour to a very dark grey. Furthermore, the budget and income lists are now connected to one another by lines. Everything looks pretty cluttered. But that will change in the next design. What’s striking is that the biggest blockbuster has a horizontal line: ‘Interstellar’ with a budget of 165,000,000 dollars and a total income of 675,020,017 dollars.
VMD_03_04

I’ve started checking the film titles: whether they are written correctly and without mistakes. I translated all non-English film titles. Les Petits Mouchoirs is Little White Lies. Loin des Hommes is Far from Men. Relatos Salvajes: Wild Tales. Marie Heurtin: Marie’s Story. Eldfjall: Volcano. And that is one side of data visualization. You must be an administrator, Sherlock Holmes, graphic designer, translator, animation designer and programmer at the same time. I have given the chart some more space. And the distance to the lists of numbers is increased. Which suddenly brings me to a new idea.
VMD_03_05

I now work in Processing 2. It’s time to download the new Processing 3 and fund the Processing Foundation. That is the least I can do because I work daily with Processing. In Processing 3 you can use the Table class. It’s easier to work with because everything is now in one text file.
VMD_03_06

The $ sign was added but I do not find it successful. Maybe find another solution. Right now you do not see what the amounts of the lists are. I know that the left-hand amounts are for the budget. The right column represents the amount of income.
VMD_03_07

Because ‘Interstellar’ is misrepresented I have thrown this film out. I think the columns should have proper labels. And I need room for doing that. I have also added the sequence of 0-10 to the right. The numbers 0-10 represent the ratings we have given to the films. The idea is that I’m once again going to draw the lines but now from the income-list to our ratings.
VMD_03_08

I have adapted the total graph a bit. Lines start and stop now slightly closer to the lists of numbers. I have added the vertical text ‘Amounts in American Dollars’. The overall chart remains somewhat chaotic but I think the result is not disappointing.
VMD_03_09

Added colour. I chose green for the films that cost less than their revenue. And I chose red for the films that cost more than their revenue. Now the graph begins to show a disadvantage. Because the lines are thicker it is difficult to see to which amounts they belong.
VMD_03_10

I replaced the line function with the bezier function. Now it is easier to see which amount belongs to which line. And the overall chart looks slightly smoother. Of the 64 films, 26 have made a loss and 38 have made a profit. Mr. Turner eventually made a loss but was still at the top of our rating. Locke is a movie made for 2,000,000 dollars. It made a profit of 5,000,000 dollars and received a 10 in our rating. The Salvation cost 11,524,796 dollars. To our knowledge it has brought in 5000 dollars (which I strongly doubt). But it still gets a 7 in our rating. In short, data visualization is very interesting, very time-consuming and a precise puzzle. Actually I should have coded the program much smarter. But that would have cost even more time.
VMD_03_11
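The colour rule for the lines can be sketched in plain Java. The class and method names are hypothetical, and how to treat a film whose income exactly equals its budget is my assumption (here it counts as red). The amounts are the ones quoted in the text.

```java
// Sketch of the colour rule: green when a film earned more than it cost, red otherwise.
// Treating budget == income as red is an assumption; the text does not say.
class ProfitColor {

    static String lineColor(long budget, long income) {
        return income > budget ? "green" : "red";
    }

    public static void main(String[] args) {
        // Amounts in dollars as quoted in the text.
        System.out.println("Interstellar: " + lineColor(165_000_000L, 675_020_017L));
        System.out.println("The Salvation: " + lineColor(11_524_796L, 5_000L));
    }
}
```

In the sketch itself the returned label would be replaced by a stroke colour before the bezier curve between budget and income is drawn.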