Top Github Languages of 2014 (So far)
It’s that time of year again! Today, we’ll look at just over a half-year’s worth of Github data to draw unsubstantiated conclusions about the relative popularity of programming languages. OK, let’s go!
Showing my work
SELECT repository_language, count(repository_language) AS repos_by_lang
FROM [githubarchive:github.timeline]
WHERE repository_fork == "false"
  AND type == "CreateEvent"
  AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('2014-01-01 00:00:00')
  AND PARSE_UTC_USEC(repository_created_at) < PARSE_UTC_USEC('2014-07-31 00:00:00')
GROUP BY repository_language
ORDER BY repos_by_lang DESC
LIMIT 100
I also made the same query for the same period in 2013 and 2012, so that we’d have something to compare our results to.
Then, I did some Python munging, which for your benefit I’ve tossed in a gist. I wanted to pull out the ranks to compare between years, and also the raw numbers, for all languages which have cracked the top 20 since 2012.
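For the curious, here is a minimal sketch of what that kind of munging could look like. To be clear, this is not the code from the gist: the function name `compare_years`, the input shape (a dict mapping each year to a list of `(language, count)` rows from the query), and the toy numbers in the demo are all assumptions made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# One row of query output: (language, number of new repos). Assumed shape.
Row = Tuple[str, int]


def compare_years(
    results: Dict[int, List[Row]], top_n: int = 20
) -> Dict[str, Dict[int, Tuple[Optional[int], Optional[int]]]]:
    """For every language that reached the top_n in any year, return
    {language: {year: (rank, count)}}, with (None, None) for missing years."""
    ranks: Dict[int, Dict[str, int]] = {}
    counts: Dict[int, Dict[str, int]] = {}
    for year, rows in results.items():
        # Sort defensively, in case the rows are not already ordered by count.
        ordered = sorted(rows, key=lambda row: row[1], reverse=True)
        ranks[year] = {lang: i for i, (lang, _) in enumerate(ordered, start=1)}
        counts[year] = dict(ordered)

    # Keep any language that cracked the top_n in at least one year.
    keep = {
        lang
        for by_lang in ranks.values()
        for lang, rank in by_lang.items()
        if rank <= top_n
    }

    return {
        lang: {
            year: (ranks[year].get(lang), counts[year].get(lang))
            for year in results
        }
        for lang in sorted(keep)
    }


if __name__ == "__main__":
    # Toy placeholder data, NOT real query results.
    toy = {
        2013: [("A", 30), ("B", 20), ("C", 10)],
        2014: [("B", 50), ("A", 40), ("D", 5)],
    }
    for lang, by_year in compare_years(toy, top_n=2).items():
        print(lang, by_year)
```

Keeping both the rank and the raw count per year makes it easy to spot a language whose rank barely moved even though its repo count changed a lot, or the other way around.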
So, here are the results.
Keep in mind that these results are surely skewed, since they rely heavily on the quality of Github’s repo language assignment heuristics. In fact, given the huge variation between years, this is probably responsible for several languages’ presence on the list.
|Rank||# New Repos Created|
Some of the more notable points:
The jumps for TeX, and the fall of Prolog, can probably be ascribed to bugs and/or improvements to Github’s heuristics for detecting languages.
Objective-C. Again, highly suspicious.
Since last year, C has seen big jumps, while C++ suffered for it.
C actually gained almost 100,000 extra repositories created this year, although it might have stolen some of those from
Java was the biggest gainer, jumping up by almost 100,000 repos, although it apparently lost
People like to put their Vim and Emacs config files on Github.
The rise of Lua is pretty interesting. I wonder if there’s some major project or product I don’t know about driving that.
Looking just at the numbers for 2014 (and ignoring languages that aren’t really programming languages), you can see some clear “tiers”.
PHP have nearly the same count, around 170k, while Python is close enough to be lumped in with those.
After that is
C#, a trio of C-variants rounding off the top ten at around 60-80k.
(please keep pedantry about C#’s lineage in a single comment thread).
Shell is off by itself, head of the minor-league languages.
Coffeescript are 20k+,
hover lonelily around 16k and 13k respectively, and then
Haskell occupy a
continuum of 7-10k languages.
But that’s enough of my Ouija-board ramblings. What does your confirmation bias tell you about this data?
For those of you making serious decisions on the basis of this analysis, I recommend also checking out RedMonk’s rankings for January 2014. And for something more considered when choosing which language to use, how about Thoughtworks’ Technology Radar?