Top GitHub Languages for 2013 (so far)
I just discovered the GitHub Archive, a dataset of GitHub events queryable using Google BigQuery. What fun! So I decided to count how many repositories have been created this year, broken down by language.
SELECT repository_language, count(repository_language) AS repos_by_lang
FROM [githubarchive:github.timeline]
WHERE repository_fork == "false"
AND type == "CreateEvent"
AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('2013-01-01 00:00:00')
AND PARSE_UTC_USEC(repository_created_at) < PARSE_UTC_USEC('2013-08-30 00:00:00')
GROUP BY repository_language
ORDER BY repos_by_lang DESC
LIMIT 100
The results:
Top 20 Languages for 2013
By # of repositories created on GitHub so far this year:
| Rank | Language | # Repositories Created |
|---|---|---|
| 1 | JavaScript | 264131 |
| 2 | Ruby | 218812 |
| 3 | Java | 157618 |
| 4 | PHP | 114384 |
| 5 | Python | 95002 |
| 6 | C++ | 78327 |
| 7 | C | 67706 |
| 8 | Objective-C | 36344 |
| 9 | C# | 32170 |
| 10 | Shell | 28561 |
| 11 | CSS | 17813 |
| 12 | Perl | 15412 |
| 13 | CoffeeScript | 11133 |
| 14 | VimL | 7857 |
| 15 | Scala | 6918 |
| 16 | Go | 6884 |
| 17 | Prolog | 5829 |
| 18 | Clojure | 4904 |
| 19 | Haskell | 4681 |
| 20 | Lua | 4048 |
Commentary
Hey, Clojure cracked the top 20! It’s neck-and-neck with Haskell, too.
The top 10 are no surprise at all, although the list clearly reflects GitHub’s early popularity with the Ruby crowd and a general skew towards web languages.
The high positions of Shell and VimL are pretty odd, but can be explained by people putting their dotfiles on GitHub.
Prolog is a big surprise here. If anyone can explain that, I’d be interested.
Maybe we could learn more if we had the 2012 rankings for the same period (Jan. 1 - Aug. 30). So here are those:
SELECT repository_language, count(repository_language) AS repos_by_lang
FROM [githubarchive:github.timeline]
WHERE repository_fork == "false"
AND type == "CreateEvent"
AND PARSE_UTC_USEC(repository_created_at) >= PARSE_UTC_USEC('2012-01-01 00:00:00')
AND PARSE_UTC_USEC(repository_created_at) < PARSE_UTC_USEC('2012-08-30 00:00:00')
GROUP BY repository_language
ORDER BY repos_by_lang DESC
LIMIT 100
Top 20 in 2012
By # of repositories created on GitHub from Jan. 1 through Aug. 30, 2012:
| Rank | Language | # Repositories Created |
|---|---|---|
| 1 | Ruby | 344825 |
| 2 | JavaScript | 296564 |
| 3 | Java | 265223 |
| 4 | C | 212393 |
| 5 | PHP | 173938 |
| 6 | Python | 173727 |
| 7 | C++ | 93764 |
| 8 | Shell | 72006 |
| 9 | Perl | 48620 |
| 10 | C# | 43665 |
| 11 | Objective-C | 41536 |
| 12 | VimL | 18077 |
| 13 | Go | 16224 |
| 14 | CoffeeScript | 15722 |
| 15 | Scala | 14262 |
| 16 | Haskell | 10402 |
| 17 | Clojure | 9748 |
| 18 | Tcl | 9633 |
| 19 | Emacs Lisp | 8567 |
| 20 | Groovy | 6973 |
I’m not sure I trust the raw numbers here, since they’re so much higher than in 2013, but the rankings are hopefully accurate.
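To see how far apart the raw numbers are, here is a quick Python sketch that divides each 2013 count by its 2012 counterpart. The counts are copied by hand from the two top-20 tables above for six of the biggest languages; the dictionary and function names are my own, nothing that comes from the GitHub Archive.

```python
# Counts copied by hand from the two top-20 tables above
# (repositories created Jan. 1 through Aug. 30 of each year).
counts_2012 = {
    "Ruby": 344825,
    "JavaScript": 296564,
    "Java": 265223,
    "C": 212393,
    "PHP": 173938,
    "Python": 173727,
}
counts_2013 = {
    "Ruby": 218812,
    "JavaScript": 264131,
    "Java": 157618,
    "C": 67706,
    "PHP": 114384,
    "Python": 95002,
}


def ratio_2013_to_2012(language):
    """Return the 2013 count divided by the 2012 count for one language."""
    return counts_2013[language] / counts_2012[language]


# List the languages from the smallest relative drop to the largest.
for language in sorted(counts_2012, key=ratio_2013_to_2012, reverse=True):
    print(f"{language:<12}{ratio_2013_to_2012(language):.2f}")
```

Every ratio comes out below 1, so each of these languages has a lower count in the 2013 table than in the 2012 one. That says nothing about why, only that the gap is across the board rather than limited to one or two languages.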
Some highlights:
- Perl appears to have suffered a drop in 2013 compared to 2012 (from 9th to 12th)
- Tcl appears out of nowhere in 2012. Maybe a quirk of the language recognition GitHub applies?
- Groovy went away in 2013 (actually, dropped to 22)
- Go was more popular than Scala in 2012, but less in 2013. I compare those two because I think people are using them to solve similar problems.
- CSS was nowhere near the top 20 in 2012 (it ranked 46th)
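The rank movements behind these highlights can be read off the full results at the end of this post. Here is a small Python sketch that tabulates them, with Prolog thrown in since it was the big surprise earlier. The ranks are copied by hand from the full 2012 and 2013 tables, and `places_gained` is just a name I made up for the difference.

```python
# Ranks copied by hand from the full 2012 and 2013 tables in this post.
# Each entry is (rank in 2012, rank in 2013).
ranks = {
    "Perl": (9, 12),
    "Tcl": (18, 53),
    "Groovy": (20, 22),
    "Go": (13, 16),
    "Scala": (15, 15),
    "CSS": (46, 11),
    "Prolog": (42, 17),
}


def places_gained(language):
    """Positive means the language climbed from 2012 to 2013, negative means it fell."""
    rank_2012, rank_2013 = ranks[language]
    return rank_2012 - rank_2013


# Print the biggest climbers first and the biggest fallers last.
for language in sorted(ranks, key=places_gained, reverse=True):
    rank_2012, rank_2013 = ranks[language]
    print(f"{language:<8}{rank_2012:>3} -> {rank_2013:<3}({places_gained(language):+d})")
```

Sorting by places gained puts CSS and Prolog at the top and Tcl at the bottom, with Scala holding its position. That makes it easier to see which changes are big swings and which are a shuffle of a few places.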
Well, that’s all the analysis I care to do today, but I submit this data for discussion. Who else has opinions?
Oh, before I go:
The Full Results (i.e. the top 100)
2013
| Rank | Language | # Repositories Created |
|---|---|---|
| 1 | JavaScript | 264131 |
| 2 | Ruby | 218812 |
| 3 | Java | 157618 |
| 4 | PHP | 114384 |
| 5 | Python | 95002 |
| 6 | C++ | 78327 |
| 7 | C | 67706 |
| 8 | Objective-C | 36344 |
| 9 | C# | 32170 |
| 10 | Shell | 28561 |
| 11 | CSS | 17813 |
| 12 | Perl | 15412 |
| 13 | CoffeeScript | 11133 |
| 14 | VimL | 7857 |
| 15 | Scala | 6918 |
| 16 | Go | 6884 |
| 17 | Prolog | 5829 |
| 18 | Clojure | 4904 |
| 19 | Haskell | 4681 |
| 20 | Lua | 4048 |
| 21 | Puppet | 3437 |
| 22 | Groovy | 3372 |
| 23 | R | 2980 |
| 24 | Emacs Lisp | 2919 |
| 25 | ActionScript | 2413 |
| 26 | Matlab | 2395 |
| 27 | Arduino | 2238 |
| 28 | Erlang | 2061 |
| 29 | OCaml | 2049 |
| 30 | Visual Basic | 1854 |
| 31 | ASP | 1268 |
| 32 | Processing | 1207 |
| 33 | Common Lisp | 1153 |
| 34 | Assembly | 1051 |
| 35 | Logos | 1027 |
| 36 | TypeScript | 972 |
| 37 | Dart | 950 |
| 38 | D | 936 |
| 39 | Delphi | 901 |
| 40 | Scheme | 882 |
| 41 | FORTRAN | 794 |
| 42 | PowerShell | 771 |
| 43 | XML | 632 |
| 44 | Racket | 610 |
| 45 | Elixir | 573 |
| 46 | ColdFusion | 507 |
| 47 | XSLT | 496 |
| 48 | Apex | 484 |
| 49 | F# | 473 |
| 50 | Haxe | 455 |
| 51 | Verilog | 444 |
| 52 | Julia | 387 |
| 53 | Tcl | 338 |
| 54 | AutoHotkey | 338 |
| 55 | Vala | 321 |
| 56 | VHDL | 313 |
| 57 | Rust | 282 |
| 58 | LiveScript | 192 |
| 59 | SuperCollider | 151 |
| 60 | Standard ML | 139 |
| 61 | AppleScript | 121 |
| 62 | DOT | 118 |
| 63 | Ada | 109 |
| 64 | Coq | 99 |
| 65 | OpenEdge ABL | 86 |
| 66 | Gosu | 76 |
| 67 | Pure Data | 73 |
| 68 | Smalltalk | 63 |
| 69 | Kotlin | 61 |
| 70 | Lasso | 57 |
| 71 | Eiffel | 55 |
| 72 | Io | 53 |
| 73 | M | 53 |
| 74 | XQuery | 52 |
| 75 | Nemerle | 49 |
| 76 | Scilab | 44 |
| 77 | Objective-J | 43 |
| 78 | Awk | 42 |
| 79 | Slash | 38 |
| 80 | XProc | 35 |
| 81 | Xtend | 33 |
| 82 | Nimrod | 31 |
| 83 | CLIPS | 24 |
| 84 | Boo | 24 |
| 85 | Ceylon | 23 |
| 86 | ooc | 22 |
| 87 | MoonScript | 22 |
| 88 | DCPU-16 ASM | 19 |
| 89 | Rebol | 17 |
| 90 | Factor | 17 |
| 91 | Ragel in Ruby Host | 15 |
| 92 | Bro | 14 |
| 93 | Dylan | 13 |
| 94 | Monkey | 12 |
| 95 | Nu | 11 |
| 96 | Arc | 10 |
| 97 | Augeas | 9 |
| 98 | PogoScript | 8 |
| 99 | Turing | 6 |
| 100 | XC | 5 |
2012
| Rank | Language | # Repositories Created |
|---|---|---|
| 1 | Ruby | 344825 |
| 2 | JavaScript | 296564 |
| 3 | Java | 265223 |
| 4 | C | 212393 |
| 5 | PHP | 173938 |
| 6 | Python | 173727 |
| 7 | C++ | 93764 |
| 8 | Shell | 72006 |
| 9 | Perl | 48620 |
| 10 | C# | 43665 |
| 11 | Objective-C | 41536 |
| 12 | VimL | 18077 |
| 13 | Go | 16224 |
| 14 | CoffeeScript | 15722 |
| 15 | Scala | 14262 |
| 16 | Haskell | 10402 |
| 17 | Clojure | 9748 |
| 18 | Tcl | 9633 |
| 19 | Emacs Lisp | 8567 |
| 20 | Groovy | 6973 |
| 21 | Lua | 6474 |
| 22 | Erlang | 5784 |
| 23 | ActionScript | 4777 |
| 24 | Puppet | 3926 |
| 25 | R | 3386 |
| 26 | Matlab | 2828 |
| 27 | D | 2740 |
| 28 | Common Lisp | 2529 |
| 29 | Arduino | 2459 |
| 30 | Assembly | 1882 |
| 31 | Visual Basic | 1821 |
| 32 | Vala | 1614 |
| 33 | Scheme | 1565 |
| 34 | Delphi | 1370 |
| 35 | OCaml | 1330 |
| 36 | Smalltalk | 1313 |
| 37 | FORTRAN | 1269 |
| 38 | Dart | 1174 |
| 39 | ASP | 1042 |
| 40 | HaXe | 983 |
| 41 | ColdFusion | 966 |
| 42 | Prolog | 956 |
| 43 | F# | 670 |
| 44 | PowerShell | 652 |
| 45 | Racket | 614 |
| 46 | CSS | 530 |
| 47 | Verilog | 523 |
| 48 | VHDL | 473 |
| 49 | Eiffel | 406 |
| 50 | Parrot | 270 |
| 51 | Apex | 265 |
| 52 | AutoHotkey | 258 |
| 53 | Rust | 234 |
| 54 | Scilab | 230 |
| 55 | DCPU-16 ASM | 229 |
| 56 | XML | 206 |
| 57 | Elixir | 189 |
| 58 | Ada | 182 |
| 59 | Coq | 174 |
| 60 | XQuery | 155 |
| 61 | Julia | 151 |
| 62 | Pure Data | 147 |
| 63 | SuperCollider | 131 |
| 64 | Standard ML | 127 |
| 65 | XSLT | 102 |
| 66 | Kotlin | 98 |
| 67 | Powershell | 93 |
| 68 | Io | 92 |
| 69 | Objective-J | 84 |
| 70 | TypeScript | 81 |
| 71 | OpenEdge ABL | 76 |
| 72 | Nemerle | 61 |
| 73 | AppleScript | 57 |
| 74 | Haxe | 54 |
| 75 | Gosu | 47 |
| 76 | Factor | 44 |
| 77 | Logos | 43 |
| 78 | Processing | 40 |
| 79 | Logtalk | 34 |
| 80 | Dylan | 34 |
| 81 | Nimrod | 32 |
| 82 | Ceylon | 32 |
| 83 | ooc | 30 |
| 84 | Opa | 30 |
| 85 | Boo | 27 |
| 86 | Fancy | 26 |
| 87 | Turing | 26 |
| 88 | Mirah | 22 |
| 89 | Max/MSP | 21 |
| 90 | Bro | 17 |
| 91 | Xtend | 14 |
| 92 | Rebol | 13 |
| 93 | LiveScript | 12 |
| 94 | Lasso | 11 |
| 95 | Arc | 11 |
| 96 | Augeas | 8 |
| 97 | DOT | 6 |
| 98 | Fantom | 5 |
| 99 | Awk | 5 |
| 100 | Max | 4 |
Disclaimer
There are a lot of reasons why analysing GitHub data might not be accurate:
- Who knows how accurate the GitHub Archive is?
- GitHub users and open source projects are not a representative sample of all programmers or programming everywhere
- Maybe I screwed up the query