Parking in Toronto, II

Apart from when parking tickets are issued, another very important aspect is where. The dataset provided by the City of Toronto contains approximately 400,000 addresses, which would mean about 25 tickets per location over the past 4 years. However, only 100,000 addresses account for 98% of all offences, and just 7,000 account for around 60% of the total! Moreover, the top ten locations alone account for about 1.5% of all the tickets in the city. In the future I will talk about this kind of distribution, where a few elements carry most of the weight of the population; it is a very common kind of distribution. Below, you can see the 7,000 most ticketed locations in Toronto over the last 4 years.

[Map of the 7,000 most ticketed locations in Toronto]

At over 6,000 of those locations, the average is one ticket or less per day. In general, downtown shows a higher density of hot spots, but surprisingly the 10 most ticketed addresses are not concentrated in the city centre; in fact, they are quite scattered around the map.

I checked the most ticketed addresses and, with a couple of exceptions, most of them correspond to hospitals, university campuses and shopping centres or malls. In first place, by a wide margin, is Sunnybrook Hospital. This hospital, one of the largest in Canada, is very busy during the day. I can imagine that people generally underestimate the time they will spend in the hospital, and perhaps how strict the parking enforcement is. The hospital has its own parking control staff, and they take their work very seriously, considering that the average is about 25 tickets a day. Like Sunnybrook Hospital, the universities have their own staff to control parking, so no offence goes unpunished on campus.

Ranking  Address            Total Tickets  Ratio (%)  Description
1        2075 BAYVIEW AVE   37399          0.3512     Sunnybrook Hospital
2        20 EDWARD ST       25711          0.2414     World’s Biggest Book Store
3        1750 FINCH AVE E   19036          0.1787     Seneca College
4        JAMES ST           13108          0.1231     Eaton Centre
5        941 PROGRESS AVE   12816          0.1203     Centennial College
6        700 LAWRENCE AV W  11497          0.1079     Lawrence Square Shopping Centre
7        1265 MILITARY TR   11129          0.1045     University of Toronto Scarborough Campus
8        225 KING ST W      10926          0.1026     Ticket King
9        60 BLOOR ST W      10300          0.0967     GAP
10       3401 DUFFERIN ST   9814           0.0921     Yorkdale Shopping Centre
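
As a rough sketch of how a "Ratio (%)" column and the concentration figures quoted earlier could be computed, the snippet below counts tickets per address and asks what share of the total the top k addresses hold. The function name `top_k_share` and the toy addresses are made up for illustration; only the ten ratio values are copied from the table above. This is not necessarily how the original analysis was done.

```python
from collections import Counter


def top_k_share(counts, k):
    """Return the percentage of all tickets issued at the k most ticketed addresses."""
    total = sum(counts.values())
    top = sum(n for _, n in counts.most_common(k))
    return 100.0 * top / total


# Toy counts (made up for illustration, NOT taken from the dataset).
toy_counts = Counter({"1 MAIN ST": 50, "2 SIDE ST": 30, "3 BACK LN": 15, "4 FAR RD": 5})

# "Ratio (%)" column of the top-10 table, copied from the post.
top10_ratios = [0.3512, 0.2414, 0.1787, 0.1231, 0.1203,
                0.1079, 0.1045, 0.1026, 0.0967, 0.0921]

if __name__ == "__main__":
    print(top_k_share(toy_counts, 1))      # share held by the single busiest address
    print(top_k_share(toy_counts, 2))      # share held by the two busiest addresses
    print(round(sum(top10_ratios), 4))     # combined share of the real top ten, in %
```

With the real data, `counts` would be a `Counter` built from the address column of the ticket file, and calling `top_k_share` with k = 7,000 or k = 100,000 would give the kind of concentration figures discussed in this post.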

Among all the top positions, one caught my attention: 20 Edward Street. The number of tickets at this location is no joke, yet currently there is only a condo under construction at this address, that's it. At first I thought the development might be the source of all those tickets, but a quick internet search of the address gave me a better answer. Until 2014 (the last year of the dataset), 20 Edward Street was home to the World’s Biggest Book Store. Apparently it was quite popular, and judging by the type of fines accumulated (most were for being illegally parked, ~50%, followed by parking without payment, ~20%), I can imagine the behaviour of the drivers. They preferred to risk getting a ticket rather than search for a parking spot; or why pay, if it is only going to take a minute?

Lately, driven by my small analysis of the Toronto parking dataset, I have been paying more attention to parking behaviour, and I got the feeling that only a few people pay at the parking meter. This impression is corroborated by the dataset, since more than 30% of all fines are for parking in the city without paying.

UPDATE: Talking with a co-worker, I realized why Lawrence Square Shopping Centre is on the list. This mall is next to a subway station and Highway 401, so many drivers commuting to Toronto park here and take the subway downtown. Apparently parking there is free, but with some limitations.


Stop the presses!

In science, especially in biomedicine, the relevance of a scientist is measured by their publication record. Moreover, it is not only the number of publications that matters but also the journals in which they have been published. In fact, journals are ranked by importance based on the number of citations they receive per year: the impact factor. Also, some journals are open access, while for others you have to pay a subscription or buy the papers. That is funny when you consider that the authors often have to pay to publish as well... But before we get into that: a scientist usually follows only a few journals regularly, often the most important in their field. However, those scientists will frequently use a search engine to find, in a comprehensive way, what has been done on a topic or methodology, no matter the journal.

The classic search engine in biomedicine is PubMed. It is a free search engine maintained by the United States National Library of Medicine and has been accessible since 1996. PubMed has indexed papers since 1966; prior to that date it holds only a selection of the most relevant papers. Not all journals are in PubMed, either: only journals that meet PubMed's scientific standards are indexed. Despite these limitations, nearly 25 million papers are indexed in PubMed today, and over the last few years PubMed has grown by about 1 million new papers per year.


This growth, like many other phenomena, can be explained by a combination of factors:

i) Money: Although I think it is decreasing right now, investment in science has been increasing over the last decades, as has the number of working scientists. In fact, I would guess there is a high correlation between the number of papers and the number of scientists. On the other hand, there are two subtle changes in the growth: one after the 1970s (the all-out war against cancer started in the early 70s) and a second one after 2000, the human genome era.

ii) Motivation: This is more complex, but it can be summarized as "publish or perish". Most biomedical research is maintained at public expense; therefore, those dollars should be given to the best projects and to the leading people who can develop these promising projects. As I mentioned earlier in this post, the literary production of a scientist is the most important yardstick for doing so. Not only that: a good publication record in your early stages as a scientist determines where you can go to work in your next stage, and a better centre usually comes with more money, and thus better chances of publishing more, and more relevant, papers. Papers you will need to apply for more money... Do you see the feedback loop?

Journal                                                                          Total papers (mid 2014)  First paper indexed (year)
The Journal of biological chemistry                                              167622                   1946
Science (New York, N.Y.)                                                         164612                   1946
Lancet                                                                           128337                   1946
Proceedings of the National Academy of Sciences of the United States of America  117711                   1946
Nature                                                                           101682                   1946
British medical journal                                                          97098                    1946
Biochimica et biophysica acta                                                    93736                    1948
PloS one                                                                         85750                    2007
Biochemical and biophysical research communications                              76442                    1960
The New England journal of medicine                                              70721                    1946
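
To put the totals above on a comparable footing, here is a small sketch that turns each row into an average number of indexed papers per year. The figures are copied from the table; treating "mid 2014" as the year 2014 and dividing by the years elapsed since the first indexed paper are simplifying assumptions of mine, and the helper name `papers_per_year` is made up for this example.

```python
# (journal, total papers as of mid 2014, year of first indexed paper), copied from the table.
JOURNALS = [
    ("The Journal of biological chemistry", 167622, 1946),
    ("Science (New York, N.Y.)", 164612, 1946),
    ("Lancet", 128337, 1946),
    ("Proceedings of the National Academy of Sciences of the United States of America", 117711, 1946),
    ("Nature", 101682, 1946),
    ("British medical journal", 97098, 1946),
    ("Biochimica et biophysica acta", 93736, 1948),
    ("PloS one", 85750, 2007),
    ("Biochemical and biophysical research communications", 76442, 1960),
    ("The New England journal of medicine", 70721, 1946),
]

SNAPSHOT_YEAR = 2014  # assumption: treat "mid 2014" as the end of the counting window


def papers_per_year(total, first_year, snapshot_year=SNAPSHOT_YEAR):
    """Average number of indexed papers per year between first_year and snapshot_year."""
    return total / (snapshot_year - first_year)


rates = {name: papers_per_year(total, first) for name, total, first in JOURNALS}

if __name__ == "__main__":
    # Print the journals from the highest to the lowest average yearly output.
    for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
        print(f"{rate:8.0f}  {name}")
```

An average over almost 70 years hides the fact that every journal publishes more today than it did decades ago, so this is only a crude comparison, not a measure of current output.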


Well, let’s rank PubMed’s journals by total number of publications. In the top 10 we find what we expected: very old and prestigious journals with thousands of papers. In fact, the first issues of “Lancet”, “The New England Journal of Medicine” and “Science” were published in 1823, 1812 and 1880 respectively (long before their first indexed papers). However, among all these journals, one catches our attention: “PLoS One”. This journal, founded in 2006, has published more papers than a journal founded almost two centuries ago. Well, it is clear that the rate of publication 80 years ago was not the same as nowadays, but how do you publish such a number of papers in just a few years? What is special about “PLoS One”?

Firstly, it’s completely open: its articles are entirely online and accessible to everyone without paying any subscription. This is a smart move, because those papers can be cited by more people. However, the main reason for this high publication rate is its criterion for accepting or rejecting papers. Unlike most traditional journals, where research has to prove a certain novelty, impact and scientific rigour, PLoS ONE only verifies whether the experiments and data analysis were conducted rigorously. To provide a frame of reference: PLoS ONE accepts around 70% of submitted papers, while “Nature” accepts only 8%.

There is a big controversy over this model. I personally think that there should be a space to publish work that might be less relevant, for instance work that has already been scooped or that turned out less sexy than the author thought before the experiments. However, when I see papers like “Fellatio by fruit bats prolongs copulation time” I feel pissed off. Not only has that research been funded by public money, but the journal also gets paid for publishing often dispensable research. So, is PLoS ONE taking advantage of the need to publish? Is this journal a new business model more than an academic model? In addition, I am afraid this model can be exploited to artificially inflate the number of papers of some authors, who send in cancelled projects or one-week projects with no aim at all: just low-hanging fruit that pads their publication record and increases their chances of getting a new grant. Recently, the impact factor of PLoS ONE has been decreasing, so it looks like this may be a widespread belief. However, there is no doubt that this model is very profitable, and other journals are planning to apply the same approach.

Spurious Correlations

Plotting two variables against each other to measure the degree of correlation between them is probably one of the most used statistical tools. And as with any other analysis, we must be very careful when drawing conclusions, because a correlation does not always reflect reality. Two variables can be strongly correlated in many ways. For example, the number of libraries in a city is strongly correlated with the absolute number of crimes. However, that does not mean that libraries encourage crime; both simply grow with the size of the city. This example is very clear, and it is surely easy to find more examples of this kind, but I would like to mention spurious correlations. These correlations occur when two variables with no logical connection have a strong correlation coefficient. For instance, the annual number of PhDs in computer science awarded in the United States and the annual revenue of American arcades. …… On Tyler Vigen’s website you can find more of these correlations and, if you are really interested, you can buy his book on the subject, Spurious Correlations.
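
The libraries-and-crime example can be reproduced with a small simulation. The sketch below uses entirely synthetic cities: the population range, the per-capita rates and the helper names `pearson` and `simulate_cities` are all made up for illustration. Libraries and crimes are each generated as population times an independent per-capita rate, so the only thing linking them is city size. Note that this illustrates the hidden-common-cause case, not the purely coincidental correlations collected by Tyler Vigen.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_cities(n_cities=500, seed=0):
    """Synthetic cities: libraries and crimes both scale with population,
    but their per-capita rates are drawn independently of each other."""
    rng = random.Random(seed)
    population, libraries, crimes = [], [], []
    for _ in range(n_cities):
        pop = rng.uniform(10_000, 1_000_000)
        population.append(pop)
        libraries.append(pop * 1e-5 * rng.uniform(0.8, 1.2))  # made-up rate
        crimes.append(pop * 5e-2 * rng.uniform(0.8, 1.2))     # made-up rate
    return population, libraries, crimes


population, libraries, crimes = simulate_cities()

# Correlation between absolute counts: dominated by city size.
r_absolute = pearson(libraries, crimes)

# Correlation between per-capita rates: the common cause is divided out.
r_per_capita = pearson(
    [lib / pop for lib, pop in zip(libraries, population)],
    [crime / pop for crime, pop in zip(crimes, population)],
)

if __name__ == "__main__":
    print(f"absolute counts:  r = {r_absolute:.2f}")
    print(f"per-capita rates: r = {r_per_capita:.2f}")
```

In this toy setup the absolute counts come out strongly correlated, while the per-capita rates are essentially uncorrelated, which is the sense in which the first correlation says nothing about libraries causing crime.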


Parking Tickets in Toronto, I

Toronto is the largest city in Canada and the fifth largest in North America. According to a census carried out in 2011, Toronto has 1.3 cars per household. Meanwhile, this rate is around 0.6 and 1.1 for New York and Chicago, respectively. These are cities with comparable characteristics, but the fact that Toronto is the most car-oriented of them won’t surprise anyone who has ever been in Toronto and used its public transport.


So, according to this census, there are approximately 1.1 million cars in Toronto, and on average 2.8 million parking fines are issued in the city each year. In other words, between 2 and 3 tickets per car per year (2.8 / 1.1 ≈ 2.5) for parking after hours or in places where parking is prohibited. A fairly high average, but if you have a car in Toronto, or you know someone who does, that average will not surprise you either. In a way, the system is broken: for instance, there was recently a general amnesty cancelling nearly one million violations, mainly because the number of claims was impossible to handle. That also says a lot.


Fortunately, the City of Toronto provides the raw data for all of those parking violations, a dataset I plan to analyse in order to learn more about Toronto. First, I plotted the total number of tickets per day since 2011. What stands out in the plot is how spiky it is. Furthermore, almost all of these spikes are drops. If we sort the days by number of fines, we can see that the minimums correspond to holidays, especially Christmas, New Year's Day and Thanksgiving.

date_of_infraction      N
2013-12-25             235
2011-12-25             322
2014-12-25             425
2014-01-01             536
2013-01-01             577
2013-12-26             587
2012-12-25             673
2013-12-23             943
2012-01-01            1059
2013-12-24            1126
2013-12-22            1225
2014-12-26            1307
2012-12-26            1488
2011-01-01            1527
2013-02-08            1695
2013-10-14            1966
2011-12-26            1996
2014-10-13            2028
2012-10-08            2076
2011-10-10            2172

Actually, in the top 20 there is only one day that breaks this rule: 2013-02-08. That day, a major snowstorm hit Toronto. We can also notice that the smaller minimums repeat regularly; those minimums correspond to weekends. Since parking prohibitions are generally not affected by holidays, this behaviour can be explained by a combination of two factors: i) many people do not move their cars, or they leave the city, during weekends and holidays; ii) there are fewer parking agents enforcing the rules. I will think about it, because I would like to figure out which of these factors carries more weight.
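
For readers who want to reproduce this kind of table, here is a minimal sketch of the steps: count tickets per day, sort the days from quietest to busiest, and average by weekday to see the weekend dips. It assumes one date string per ticket in the `YYYY-MM-DD` form shown in the table above (the raw file may use a different format, hence the `fmt` parameter); the function names and the toy input are made up for illustration and are not the original analysis code.

```python
from collections import Counter, defaultdict
from datetime import datetime


def tickets_per_day(dates, fmt="%Y-%m-%d"):
    """Count tickets per calendar day from an iterable of date strings."""
    return Counter(datetime.strptime(d, fmt).date() for d in dates)


def quietest_days(counts, n=20):
    """Return the n days with the fewest tickets as (date, count), ascending."""
    return sorted(counts.items(), key=lambda item: item[1])[:n]


def mean_by_weekday(counts):
    """Average daily ticket count per weekday (0 = Monday ... 6 = Sunday).

    Days with no tickets at all never appear in counts, so they are not averaged in.
    """
    totals, days = defaultdict(int), defaultdict(int)
    for day, n in counts.items():
        totals[day.weekday()] += n
        days[day.weekday()] += 1
    return {wd: totals[wd] / days[wd] for wd in sorted(totals)}


# Toy input: one string per ticket (made up; the real file has millions of rows).
toy_dates = (["2013-12-23"] * 3 + ["2013-12-24"] * 2
             + ["2013-12-25"] + ["2013-12-30"] * 5)
toy_counts = tickets_per_day(toy_dates)

if __name__ == "__main__":
    for day, n in quietest_days(toy_counts, 2):
        print(day, n)
    print(mean_by_weekday(toy_counts))
```

This only measures the weekend and holiday dips; it cannot by itself separate the two explanations above, since that would need extra information such as the number of officers on duty each day.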

On the other hand, in summer many people get around by bike, so it caught my attention how small the difference between summer and winter is. You can see this in the plot of 2010, the year where this difference is most pronounced.


Let’s see what else we can learn from the parking tickets of Toronto. This dataset is provided by the City of Toronto itself, under the Open Government licence.