Good Vibes: London, Paris, New York...
About this Dataroom
Please note that my submission to the City Contest is composed of "this visualisation" (i.e. a graphviz diagram) and the document "EUI_AA_ModDisMet_FINAL.pdf".
The rest of the text in this overview was written during the experimentation period for this project and is not to be considered as material submitted to the contest.
Thank you for hosting this contest; preparing for it has been a really enjoyable process.
The "Lifestyles Apart" series of visualisations was motivated by the simple idea of finding out "how far apart" the city of Athens - Greece is from the other cities described by the EIU dataset. It was eventually generalised to "How far apart is [any city] from the rest?".
The title hints at the story within the EIU livability dataset. The set of 39 indexes for each city essentially describes a lifestyle. Getting to downtown Vancouver - Canada for work or pleasure is a completely different story from getting to downtown Karachi - Pakistan on any given day. These two cities are not picked at random, by the way!
Comparing cities in this way is possible because each of the 39 indexes is "normalised" to a scale of 5 values with coherent semantics. In other words, the 39-dimensional space described by these indexes is not skewed by a subset of indexes with dominant values, and all indexes are monotonic, meaning that the "negative->positive" direction is the same across all of them.
It is therefore possible to treat each city's set of indexes as a vector with 39 components. It then becomes possible to answer questions like:
1) What is the "distance in lifestyles" between a city and all the other cities in the dataset?
2) What does this multidimensional space look like? Is it possible to "get a feel" for it? In other words, is there a way to position the cities, preferably in a plane, given their "distances"?
3) Which two cities have the greatest distance? Which city is about in the middle?
Question number one is answered with a small set of visualisations and a quantitative dataset. Question number two is answered through the use of Multidimensional Scaling, and question number three is answered through simple processing of the distance matrix.
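As a rough illustration of questions one and three, the sketch below builds a Euclidean distance matrix from index vectors and picks the pair of cities that are furthest apart. The city labels and scores are made-up toy values (the real dataset has 140 cities with 39 indexes each), and the function names are hypothetical, not the code used for the visualisations.

```python
import math
from itertools import combinations

# Hypothetical toy data: 4 made-up cities with 3 indexes each.
# The real input would be 140 cities with 39 indexes on a 5-value scale.
scores = {
    "A": [1, 1, 2],
    "B": [1, 2, 2],
    "C": [4, 5, 5],
    "D": [3, 3, 3],
}

def euclidean(u, v):
    """Euclidean distance between two index vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(scores):
    """Nested dict: dist[a][b] is the distance between cities a and b."""
    cities = sorted(scores)
    return {a: {b: euclidean(scores[a], scores[b]) for b in cities} for a in cities}

def farthest_pair(dist):
    """The pair of cities with the largest distance between them."""
    return max(combinations(sorted(dist), 2), key=lambda p: dist[p[0]][p[1]])

dist = distance_matrix(scores)
print(dist["A"])            # question 1: distances from city A to every city
print(farthest_pair(dist))  # question 3: the two cities furthest apart
```

Each row of the matrix is the "distance vector" of one city, which is what the results further down are derived from.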
Some Results:
The two cities that appear to be the furthest apart are Vancouver - Canada and Karachi - Pakistan. It is worth noting here that there are other cities that are comparable to Karachi and Vancouver, but these two are the ones with the absolute longest distance between them.
The "city in the middle" is taken as the one whose distance vector (that is, its distances to the other 139 cities) has the lowest standard deviation. This means that its distance vector is the "smoothest" of all. This city is Sofia - Bulgaria. (Of course, a point far, far away from the point cloud (a freak outlier) would throw this quick method off, but there is no such case in this dataset.)
The multidimensional scaling visualisation places Sofia - Bulgaria a little to the North-East of the centre of the ellipsoid that would enclose the cities, while the closest city to that centre would most likely be Manila - Philippines. (Please note that the sum of Sofia's indexes is 97, while Manila's is 90.) It is also worth reminding viewers at this point that the EIU city indexes are derived relative to New York - USA.
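The "lowest standard deviation" rule can be sketched as follows. The points are hypothetical 2-D stand-ins for the 39-component vectors, and the use of the population standard deviation is an assumption, since the text does not say which variant was used.

```python
import math
import statistics

# Hypothetical 2-D points standing in for the 39-component city vectors.
points = {"A": (0, 0), "B": (4, 0), "C": (0, 4), "D": (4, 4), "M": (2, 2)}

def middle_city(points):
    """City whose distances to all other cities vary the least."""
    def spread(name):
        others = [
            math.dist(points[name], p)
            for other, p in points.items()
            if other != name
        ]
        # Assumption: population standard deviation of the distance vector.
        return statistics.pstdev(others)
    return min(points, key=spread)

print(middle_city(points))
```

In this toy layout "M" sits at equal distance from the four corners, so its distance vector has zero spread and it is selected.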
Just as a note, the (non-classical) Multidimensional Scaling stress factor for squeezing 39 dimensions down to 2 was ~0.087, while for 3 dimensions it was ~0.053.
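For readers unfamiliar with the stress figure, here is a minimal sketch of one common definition, Kruskal's stress-1, which compares the original pairwise distances with those in the reduced layout. This is an assumption: the text does not say which stress formula or software produced the ~0.087 and ~0.053 values, and the function name and toy coordinates are hypothetical.

```python
import math
from itertools import combinations

def stress1(high, low):
    """Kruskal's stress-1 between original points and their low-dim layout.

    high and low are lists of coordinate tuples in the same order.
    Returns 0.0 when every pairwise distance is preserved exactly.
    """
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(high)), 2):
        d_high = math.dist(high[i], high[j])  # distance in the original space
        d_low = math.dist(low[i], low[j])     # distance in the reduced layout
        num += (d_high - d_low) ** 2
        den += d_low ** 2
    return math.sqrt(num / den)

# Toy example: three collinear 3-D points fit a plane without any loss.
flat = [(0, 0, 0), (3, 4, 0), (6, 8, 0)]
print(stress1(flat, [(0, 0), (3, 4), (6, 8)]))

# Dropping the third coordinate of a genuinely 3-D shape loses information.
corner = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(stress1(corner, [(0, 0), (1, 0), (0, 1), (0, 0)]))
```

Lower values mean the flat picture distorts the original distances less, which is consistent with the 3-dimensional layout reporting a lower stress than the 2-dimensional one.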
Linked Datarooms
3D - Trials: Perspective View of The Shortest Branches
3D - Trials: Perspective View of Paris Branch
3D - Trials: Perspective View of Auckland Branch
3D - Trials: Top view of ordering - Subset of cities
City Ranking - Modified Distance Metric
Also — if you plan to submit to the contest and would like to make your work private again, you can always switch back under the 'Admin' tab of your submission ...
Right, thanks for that. I was going to create a new dataset, but I guess I could continue with this one as well.
Heyo! Glad to hear you're considering submitting an entry! I have been looking into the Cost of Living datasets (not to compete, just for fun) and hope to hear back about the numbers. I suspect I'm not interpreting them correctly because some of the costs per given city are quite frankly astounding!
Hi momoko... It seems that the prices do not correspond to exactly the same product across the world but to its category. (There is a brief explanation regarding "bread" here: http://www.worldwidecostofliving.com/asp/wcol_WCOLHome.asp). I am sure you can find 1kg of something that qualifies as "bread" in a New York supermarket for less than $6 :-)
(It may also be that certain "bread" products are specially processed and are therefore "pulling" the price tag upwards.)
This is very cool ... are you still working on it?
I ask because I'd love to tweet it :)
Hello momoko, thanks for checking back at this!
I did an alternative color scheme, available through this flickr set (http://www.flickr.com/photos/42973403@N07/6838254469/), and there are a few more things that have emerged from this work but are not yet documented anywhere. Also, I might be going for a "proper" contest entry after all :-)
Thank you very much John :-)
Cities Apart - Multidimensional Scaling on the Livability Dataset
Just a short addition to the previous visualisations that shows the "shortest" path through the cities described by the EIU liveability data. The graph is the multidimensional scaling output (with large errors, I am afraid :-/ ). Distance is taken as the Euclidean distance between the index vectors.
Basically, the graph shows what would be the smallest steps to be taken by Karachi so that it "progresses" as a city towards the markers of Vancouver.
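One possible reading of "smallest steps" is the route between two cities along a minimum spanning tree of the distance graph, since that route keeps the largest single step as small as possible. The text does not say how the path was actually computed, so the sketch below is only an assumption, with hypothetical point names and 2-D coordinates in place of the real 39-index vectors.

```python
import math

# Hypothetical coordinates; real input would be the 39-index city vectors.
points = {
    "start": (0.0, 0.0),
    "p": (1.0, 0.0),
    "q": (2.0, 0.5),
    "r": (0.0, 3.0),
    "goal": (3.0, 0.0),
}

def minimum_spanning_tree(points):
    """Prim's algorithm; returns the tree as an adjacency dict."""
    names = list(points)
    in_tree = {names[0]}
    tree = {name: [] for name in names}
    while len(in_tree) < len(names):
        # Cheapest edge that connects the tree to a city not yet in it.
        a, b = min(
            ((u, v) for u in in_tree for v in names if v not in in_tree),
            key=lambda e: math.dist(points[e[0]], points[e[1]]),
        )
        tree[a].append(b)
        tree[b].append(a)
        in_tree.add(b)
    return tree

def tree_path(tree, source, target):
    """Depth-first search for the unique path between two tree nodes."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            return path
        for nxt in tree[node]:
            if nxt not in path:
                stack.append((nxt, path + [nxt]))
    return None

tree = minimum_spanning_tree(points)
print(tree_path(tree, "start", "goal"))
```

Here the route passes through "p" and "q" in short hops rather than jumping straight from "start" to "goal", which is the kind of step-by-step progression described above.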