Visualizing Movie Data | VMD_02 | Time Series

A dozen years ago I heard the word ‘ubiquitous’ for the first time. I wondered what it meant, so I looked it up: ‘ubiquitous’ means present, appearing, or found everywhere. Time series graphs are a type of graph you can find anywhere. Because this project is about visualizing our movie data, I need three or more data sets. The idea is to compare these data sets with our own. I hope to find out how our ratings relate to, for instance, the IMDb (Internet Movie Database), Metacritic and/or Rotten Tomatoes. If I compare our first one hundred films with the results from these websites, I should be able to draw the necessary conclusions. This is going to be a lot of manual work. But that’s okay, because I’m in a learning process.

I could imagine having the numbers 1 to 10 on the left side of the graphic, with all the film titles along the bottom. That seems logical, but it is not. Movie titles can be very long, for example ‘A Pigeon Sat on a Branch Reflecting on Existence’. So you would expect the film titles on the left side of the graph and the numbers 1 to 10 at the bottom. At this moment I think the best solution would be to display the movie title when you place your mouse cursor on a data point. But maybe I am getting ahead of myself. For now I have read the original data from Ben Fry’s Time Series chapter into the program and changed the display format.
VMD_02_01

Let me concentrate on the data. The first thing you notice about the IMDb, Metacritic and Rotten Tomatoes reviews is that they work with floats. Our own movie data also works with floats, but the end result is an int. So I actually have to run all 100 film programs again and see what the end result is when using floats. When I have those results, I have to type them into a text file. Then I do the same with the results from IMDb, Metacritic and Rotten Tomatoes. I left out Metacritic in the end. It sometimes happens that we have seen a film that cannot be found on IMDb or Rotten Tomatoes. In that case, the film gets a zero. The first thing I noticed in our chart, which uses our own data, is that it looks quite messy. There is no real logic to be found in the positioning of the points. The reason is that our films are chosen randomly, which results in random positions for the points. The order, though, is the actual order of the first 100 films we saw in 2015. Furthermore, the points sit at the bottom of the chart. This is caused by the largest value in the other data series: our data set ranges from 0.0 to 10.0, while the other two data sets range from 5.1 to 46.4. So these other two sets still have to be adjusted. But I do not have the right data for them yet.
VDM_02_02
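Why the points end up at the bottom can be shown with a small sketch. It is written in plain Python rather than Processing, and the pixel rows and sample values are made up for illustration; only the ranges 0.0 to 10.0 and 5.1 to 46.4 come from the text above. The helper does the same linear rescaling that Processing’s map() performs.

```python
def map_value(value, in_min, in_max, out_min, out_max):
    """Linearly rescale value from one range to another."""
    return out_min + (value - in_min) * (out_max - out_min) / (in_max - in_min)

# Hypothetical pixel rows of the plot area (y grows downwards on screen).
PLOT_TOP, PLOT_BOTTOM = 60.0, 360.0

our_scores = [0.0, 4.5, 10.0]      # sample values in our 0.0-10.0 range
other_series = [5.1, 20.0, 46.4]   # sample values in the 5.1-46.4 range

shared_max = max(our_scores + other_series)  # 46.4, dominated by the other series
own_max = max(our_scores)                    # 10.0

# Our best possible score, drawn against the shared maximum and against our own.
y_shared = map_value(10.0, 0.0, shared_max, PLOT_BOTTOM, PLOT_TOP)
y_own = map_value(10.0, 0.0, own_max, PLOT_BOTTOM, PLOT_TOP)

print(round(y_shared, 1), y_own)
```

With the shared maximum, even a score of 10.0 stays in the lower quarter of the plot. Once every series uses the same 0.0 to 10.0 scale, the same score reaches the top.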

I have now added all the scores from IMDb and Rotten Tomatoes. I can hit the “]” key and the “[” key to step through the three different graphs. It all looks a bit sparse. But you do get an impression of how the scores are distributed. I’ve also added titles as placeholders (We, IMDb and RT (Rotten Tomatoes)).
VDM_02_03

I have increased the number of films to 150. It now looks somewhat less sparse. Eleven films have no rating on Rotten Tomatoes, so they sit at zero, at the bottom of the chart. However, these films are rated on IMDb and by us.
VDM_02_04

At the bottom, I added the number of films we have seen. I also reduced the white background space slightly, so that everything looks less cramped in the display window. It would be even better if you could read the titles of the movies instead of our numbering. Perhaps I can add that at a later stage, and perhaps not at all, because after all these graphs are only about comparing our voting behavior with IMDb and RT. A quick conclusion is that our differences are slightly wider spread. Our scores range from 3.3 to 5.9 points. IMDb ranges from 4.1 to 9.3. The Rotten Tomatoes series goes from 4.5 to 9.8 (if you do not count the 0.0).
VDM_02_05
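Reading off such a range while skipping the zeros of unrated films could look like the sketch below. It is plain Python, the function name is hypothetical and the sample list is invented; it only assumes, as described earlier, that a missing rating is stored as 0.0.

```python
def score_range(scores, missing=0.0):
    """Return (lowest, highest) score, ignoring the placeholder for unrated films."""
    rated = [s for s in scores if s != missing]
    if not rated:
        raise ValueError("no rated films in this series")
    return min(rated), max(rated)

# Invented sample: two unrated films (0.0) among three rated ones.
rt_sample = [0.0, 4.5, 7.2, 9.8, 0.0]

low, high = score_range(rt_sample)
print(low, high)
```

Without the filter the minimum would always be 0.0 as soon as a single film is missing from a website.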

I have added horizontal and vertical grid lines, which may help to compare the data points. The scores from 0.0 to 10.0 are now displayed on the left side of the graph. As a result, there is no need for additional tick marks; the horizontal and vertical lines do that work instead. I think the score numbers are displayed too long. I now have four digits after the decimal point because we are working with floats. The function ceil does not help in this case, because it rounds everything up. Floor rounds everything down. The function I’ve used now is nf, which means that just one digit is shown after the decimal point. I use two weights of Futura: Futura Medium and Futura Bold. Furthermore, I also labeled the numbers. That makes the chart clearer.
VDM_02_06
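The difference between rounding and formatting is easy to see in a few lines. This is a plain Python sketch, not the Processing code itself; in Processing the equivalent call would presumably be something like nf(score, 0, 1), which is an assumption about how the function was used here. The sample scores are invented.

```python
import math

def format_score(score):
    """Format a score with exactly one digit after the decimal point."""
    return f"{score:.1f}"

score = 7.3333  # invented example with four digits after the point

print(math.ceil(score))     # rounds up to a whole number
print(math.floor(score))    # rounds down to a whole number
print(format_score(score))  # keeps one decimal for display
```

ceil and floor change the value to a whole number, while formatting only changes how the value is shown on screen.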

I now go ahead and replace the points with a line. Actually this is a bit rubbish. The scores of the films have nothing to do with each other. Each film’s score is a value on its own, so there is no need to connect them with a line. But as a variation it is perhaps interesting. I also changed the colors. The white field is replaced with a dark gray, because the colored lines stand out better against it.
VDM_02_07

In this version all scores are displayed on top of each other to see where the differences are. The title of the data set should change along with it when you choose another data set. But I don’t like it anyway. It is a poor and chaotic whole. So this does not seem to be a good option.
VDM_02_08

I have now retrieved some items from one of the earlier sessions. The connecting lines remained blue and the points themselves are white. The points are the most important, so they are allowed to stand out. I’ve made them a little smaller. As a result, points that are close to each other overlap less.
VDM_02_09

This proposal introduces rollovers. I now get the feedback that I can already read from the x and y axes, but much more precisely. What you would actually like to see, though, is the movie title when your cursor is at a data point. I think I’m going to do that at a later stage. But I am unsure about it. I think it’s more important that I get some sense of what you can do with the data.
VDM_02_10
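A rollover boils down to finding the data point closest to the cursor, within some proximity. Below is a minimal plain Python sketch of that idea, not the code used in this session. The function name, the radius, the coordinates and the second and third titles are all placeholders.

```python
import math

def find_hover(points, mouse_x, mouse_y, radius=5.0):
    """Return the index of the closest point within radius of the cursor, or None."""
    best_index = None
    best_dist = radius
    for i, (x, y) in enumerate(points):
        d = math.hypot(x - mouse_x, y - mouse_y)
        if d <= best_dist:
            best_index, best_dist = i, d
    return best_index

# Placeholder screen positions and titles for three films.
points = [(100.0, 200.0), (104.0, 200.0), (300.0, 50.0)]
titles = ["A Pigeon Sat on a Branch Reflecting on Existence", "Film B", "Film C"]

hit = find_hover(points, 103.0, 201.0)
if hit is not None:
    print(titles[hit])
```

Because the lookup returns an index, showing the movie title instead of the score only needs a list of titles in the same order as the scores.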

I do have the feeling that the lines have become too dominant, especially now that you get direct feedback at the cursor. The lines are no longer functional. I will also try to make the middle block more square. The drawback is that the smaller rectangles are no longer square. However, it does create more room in the width. I also reduced the proximity of the cursor and increased the point size from 10 to 12. And Futura Bold is used for the values under the cursor.
VDM_02_11

Replacing vertex in drawDataLine with curveVertex does not actually make much sense. Most of the time the data points are so close together that no fluid line can be drawn between them. But if you close the shape with the lower right and lower left corners of the plot, so that it becomes a filled plane, it makes more sense and gives a different picture. The question then is whether the horizontal and vertical lines are still functional. So I have removed them. I think this looks better than all the previous versions. And together with the feedback you get when your cursor is on a data point, it looks just fine.
VDM_02_12
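Turning the data line into a filled plane only requires two extra corner points at the end of the outline. The plain Python sketch below shows how such a vertex list could be built; the function name, the plot edges and the sample points are hypothetical and not taken from the actual program.

```python
def area_outline(points, plot_left, plot_right, plot_bottom):
    """Return the data points followed by the lower right and lower left corners."""
    return list(points) + [(plot_right, plot_bottom), (plot_left, plot_bottom)]

# Hypothetical plot edges and three data points in screen coordinates.
data_points = [(50.0, 220.0), (350.0, 140.0), (650.0, 180.0)]
outline = area_outline(data_points, plot_left=50.0, plot_right=650.0, plot_bottom=360.0)

for x, y in outline:
    print(x, y)
```

Drawing these vertices as one closed, filled shape gives the plane; without the two corners the fill would cut straight across from the last data point to the first.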

I have made the background of the chart the same color as the background of the display window. That gives a completely different picture. I had initially accentuated the vertical lines. But I think it is better to accentuate the horizontal lines. They lead you to much more meaningful data. At first I gave the horizontal lines 50% transparency. But afterwards I got a better result by decreasing the line width to 0.5 pixels, which is basically logically impossible.
VDM_02_13

It seems silly to transform this graph into a bar graph. I must then let the program draw rectangles instead of one flat plane. But then I have a problem, because I have 150 bars in a width of 600 pixels. This means that one bar can be at most 3 pixels wide. At 4 pixels, the whole lower area is filled in again by bars that touch or overlap. But with 3 pixels I think it’s just about acceptable, and it even has some form of sophistication.
VDM_02_14
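The arithmetic behind the 3 pixel limit can be written out in a few lines. This is a plain Python sketch with a hypothetical helper name; the numbers 150 and 600 come from the text above, and it assumes the bars are spread evenly over the full width.

```python
def bar_layout(plot_width, bar_count, bar_width):
    """Return (pitch, gap): the space per bar and what is left between two bars."""
    pitch = plot_width / bar_count
    gap = pitch - bar_width
    return pitch, gap

print(bar_layout(600, 150, 3))  # 4 pixels per bar, 1 pixel left between bars
print(bar_layout(600, 150, 4))  # 4 pixels per bar, nothing left between bars
```

Each bar has 4 pixels available, so a 3 pixel bar leaves a 1 pixel gap and a 4 pixel bar leaves none.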

As a last proposal I introduced tabs for the three different data sets. But I found Futura Bold far too heavy in these white tabs. So I opted for Futura Medium.
VDM_02_15

Now I have to do a few more things. The white area behind the title is way too loud and is visually almost independent of the graph. Plus, the bar chart layout is not the best I’ve seen so far. As a final detail I go back to the design of VDM_02_12. I now use only Futura Medium. I also adjusted the colors. I chose red and green, two distinctly different colors. The strong contrast between them makes the dividing line between the two planes stand out even more. And thus it seems to me that this session is finished. But there is one more thing.
VDM_02_16

I have made a very simple animation of the three data sets. The data sets of us, the IMDb and Rotten Tomatoes interpolate their points. Unfortunately, the interactive version is not available. I captured the animation so that there is at least something to see.
VDM_02_17
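One simple way to let the points of one data set move towards another is linear interpolation per film. The plain Python sketch below illustrates that idea only; it is not necessarily how the animation above was made, and the function names and sample scores are invented.

```python
def lerp(a, b, t):
    """Blend from a (t = 0.0) to b (t = 1.0)."""
    return a + (b - a) * t

def blend_series(series_a, series_b, t):
    """Interpolate two equally long score lists, film by film."""
    return [lerp(a, b, t) for a, b in zip(series_a, series_b)]

# Invented scores for the same two films in two data sets.
our_sample = [4.0, 6.0]
imdb_sample = [8.0, 7.0]

halfway = blend_series(our_sample, imdb_sample, 0.5)
print(halfway)
```

Stepping t from 0.0 to 1.0 over a number of frames and redrawing the blended list each frame makes the points glide from one data set to the next.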

 
