What programming language would I use for the back end of a big, new project in a startup which wants to offer a web service? Sure, on the client side there is pretty much only JavaScript (including variants like CoffeeScript and TypeScript) in combination with HTML and CSS. I've used MySQL and Redis databases and I'm quite happy with that. But the choice for the server side is not that easy. I've been using PHP for quite a while now, because it was the cheapest and easiest choice when I started programming. But things have changed (and I have more money, so I don't have to take the super cheap hosting services). Although my experience with web projects is very limited, I want to share a few thoughts.
Definitions: Back End and Security
Just for clarification: I am only talking about the back end. A back end is the data access layer which manages requests coming to the server. It needs to serve many requests (> 100 requests/second) fast (< 300 ms on average). It should not execute computationally heavy jobs which can be pre-computed or do not need to be displayed instantly to the client. This can be done by another system which does not need to be programmed in the same language. The back end also does not deal with presentation to the user. This is what the front end does. However, you should have more than a good idea of the form in which the front end gets the data. The cleanest approach I've seen so far is a pure RESTful API for all interactions between front end and back end.
The back end language should also make it easy to validate / sanitize input data, connect to databases, and store / read files on the file system.
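To make "validate / sanitize input data" a bit more concrete, here is a minimal sketch in Python using only the standard library. The endpoint (POST /users), the field names and the limits are assumptions made up for this illustration; a real project would probably use the validation tools of its framework instead.

```python
import json


def validate_new_user(raw_body: bytes) -> dict:
    """Parse and validate the JSON body of a hypothetical POST /users request.

    Returns a cleaned dict or raises ValueError with a readable message.
    """
    try:
        data = json.loads(raw_body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("body is not valid UTF-8 encoded JSON") from exc

    if not isinstance(data, dict):
        raise ValueError("body must be a JSON object")

    name = data.get("name")
    age = data.get("age")

    if not isinstance(name, str) or not name.strip():
        raise ValueError("'name' must be a non-empty string")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(age, bool) or not isinstance(age, int) or not 0 <= age <= 150:
        raise ValueError("'age' must be an integer between 0 and 150")

    # Only return the fields we know about; everything else is dropped.
    return {"name": name.strip(), "age": age}


print(validate_new_user(b'{"name": " Ada ", "age": 36}'))
```

The point of the sketch is that the back end never trusts the shape of the incoming data: unknown fields are dropped and every accepted field has a checked type and range.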
In the following, I will write that some languages are "secure" or "not secure". This does not mean that you can / cannot write code which is secure. It means that the compiler (or another widespread tool) gives you guarantees about bugs in your code. For example, C is a very insecure language as the compiler does no bounds checking. The types of errors which can be detected by automatic tools (without further testing) are:
- Syntax errors,
- Out of bounds (reading),
- buffer overflow (not checked in C/C++, but not possible in Java (source)),
- unused variables (which might indicate other problems; at least code smell),
- type problems: This is a bit fuzzy, as you can write stringly typed code (see New Programming Jargon) in probably every language, but in some languages it is more common than in others. Some languages also make it easier to use the type system to detect errors. For example, PHP is very insecure in this sense, as 123 == "123ab" evaluates to true. Python is a bit more secure, but a function can still return whatever you want. Java is much more secure. Haskell is even more secure in this sense, as it has real functions (without side effects, checked by the compiler). See What can Haskell's type system do that Java's can't and vice versa? for more.
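The Python part of that last point can be sketched in a few lines. The function name item_count is made up for this illustration. The comparison that is true in PHP is false in Python and mixing int and str in an addition fails, but a function annotated to return an int can still return a string at runtime. Only an external checker (mypy, for example) would complain about that.

```python
def item_count(raw: str) -> int:
    """Annotated to return an int, but one branch returns a string."""
    if raw.isdigit():
        return int(raw)
    # The interpreter ignores the annotation, so this runs without complaint.
    # A static type checker would flag this line.
    return "unknown"


print(123 == "123ab")  # False: Python does not coerce the string to a number

try:
    123 + "1"
except TypeError as exc:
    print("TypeError:", exc)

print(repr(item_count("42")), repr(item_count("n/a")))
```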
There are also some errors which can be detected at runtime. The handling of those runtime errors differs from language to language. For example, C and C++ fail silently (e.g. this question). This is bad. For example, there are some silent out-of-bounds errors in C / C++ where Rust would fail loudly (I think Heartbleed is one example; see Would Rust have prevented Heartbleed? Another look if you're interested in that specific example).
Of course, all of those problems can be detected with good testing. But the more is done automatically, the less can go wrong when you don't write (good) tests.
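What "failing loudly" looks like can be sketched in Python, which I use here only for illustration (the examples above are about C / C++ and Rust). The helper read_byte is a made-up name. Reading past the end of a buffer raises an IndexError instead of returning whatever happens to be in memory. Note that negative indices are legal in Python and count from the end, so the sketch rejects them explicitly.

```python
def read_byte(buffer: bytes, index: int) -> int:
    """Return the byte at position index, refusing anything out of bounds."""
    if index < 0:
        # buffer[-1] would silently return the last byte, so reject it here.
        raise IndexError("negative index")
    # Raises IndexError on its own when index >= len(buffer).
    return buffer[index]


payload = b"abc"
print(read_byte(payload, 1))

try:
    read_byte(payload, 64)
except IndexError as exc:
    print("refused:", exc)
```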
Java
Java is an object-oriented language which runs on the Java Virtual Machine. Java is probably the most used language for big business websites. Why is that the case?
- Java is old: It first appeared in 1995.
- Java is taught at many universities and many people know at least a little bit of Java. So companies don't have problems finding developers. At least that might be the impression of people who don't realize that there is a big difference between people saying they know Java and developers who can actually work with it.
- I guess the Java ecosystem is pretty mature:
  - Eclipse, IntelliJ IDEA and NetBeans as IDEs,
  - Jenkins for continuous integration,
  - GlassFish,
  - Apache Ant / Apache Maven or Gradle for automatic building,
  - JUnit, Mockito, PowerMock for automatic unit tests,
  - Log4j and Log4j 2 for logging,
  - Apache JMeter for load testing,
  - Jersey for RESTful Web services,
  - Apache Tomcat / WildFly (formerly JBoss): application server / web server / servlet container,
  - Grizzly / Jetty: Web server,
  - FindBugs, SonarQube for code quality / static code analysis,
  - Hibernate for ORM,
  - OSGi: Apache Felix / Equinox - see 10min clip for a high-level explanation of OSGi,
  - Frameworks like Spring, JSF, JSP, Apache Struts 2, Apache Wicket
- Java is developed by Oracle. Hence you can make contracts with Oracle to get support when things don't work.
That was what we have on the positive side. What is not so good about Java?
- VERY clumsy syntax. This is more than just an inconvenience. You have to type a lot to get things done, which makes you slow. Of course, you can (and need to) use autocompletion, but it is still a lot to read. That makes maintaining the code a mess.
- Tools are hard to get to work.
- Unnecessary super-abstract constructs used for eventually never happening future extensions (see Geek-and-poke.com).
- A bit more secure than C/C++, as you cannot access arrays out of bounds and you don't have pointers. So buffer overflows are almost impossible in Java (see SO for more details). However, you pay for this security with a much heavier syntax, and you don't get as much security as would be possible with just a bit more effort. See Rust for more details.
- Speed and memory usage: Again, Java might be better in speed than many other languages, but not as good as some others are. And Java seems to need A LOT of memory. However, I am not too sure if that is really a problem. (see Surprise! Java is fastest for server-side Web apps)
See also:
- Is Java a Compiled or an interpreted programming language?: The short answer is no. But Java guys don't like to hear that ☺
- Why do I hear about so many Java insecurities? Are other languages more secure?: The short answer is no, thinking about C/C++.
- Security of JVM for Server
- C++ performance vs. Java/C#
JavaScript: Node.js
Node.js is a runtime environment which was initially released in 2009 and has become quite popular since then. Node.js is asynchronous, event-driven and scalable. Node.js applications are written in JavaScript and hence have all the advantages of JavaScript:
- They profit from heavy development in JavaScript engines / JIT compilers like V8.
- The syntax is flexible and light-weight.
- Just like Java, JavaScript first appeared in 1995. So the language itself is old and stable.
- Lots of easy tutorials
What else is there to say?
- Node is FAST and scalable! (see Performance Comparison Between Node.js and Java EE)
- JavaScript is very insecure. Even simple errors (a typo in a variable name, for example) are only revealed when the affected code is actually executed. So unit testing is very important.
- Node.js is used by LinkedIn, Yahoo!, Uber, PayPal (source)
- There are quite a few people moving from Node.js to Go (1, 2, 3, 4)
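The "only revealed when executed" problem is not specific to JavaScript; it applies to dynamically typed languages in general. As a sketch, here is the same effect in Python (not JavaScript, and the function is made up for this illustration): the misspelled name sits in a branch that is rarely taken, so the function loads and works fine until a test or a customer hits that branch.

```python
def shipping_cost(weight_kg: float) -> float:
    """Flat rate up to 20 kg, surcharge per additional kg above that."""
    if weight_kg <= 20:
        return 4.99
    # Bug on purpose: 'weigth_kg' is misspelled. The function is accepted
    # without complaint; the NameError only appears when this line runs.
    return 4.99 + 0.5 * (weigth_kg - 20)


print(shipping_cost(5))  # the common case works

try:
    shipping_cost(25)  # the rare case hits the misspelled name
except NameError as exc:
    print("only found at runtime:", exc)
```

A unit test which covers the heavy-parcel branch would have caught this immediately, which is why tests matter so much in these languages.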
Go
Go is a statically-typed, compiled language developed by Google. It first appeared in 2009, so it is very young.
- Go offers the basic tools you need for web development.
- Good tutorial and also some material for web development
- Some tasks are much more complicated than they should be. Sorting, to name one example (see SO).
- Go is different from some other languages. For example, if you want a method to be public, the first character of the method name has to be capitalized. Also, unused variables result in a compiler error.
C#
C# is a compiled, statically typed language (with dynamic features, see Understanding the Dynamic Keyword in C# 4) developed by Microsoft. It was publicly announced in 2000. The initial release of its web application framework ASP.NET was in 2002.
The ecosystem seems to include:
- nuget.org
- IIS: Web server
- Entity Framework: ORM
- LINQ: SQL queries
- Visual Studio: IDE
- ASP.NET MVC Framework
But I don't know enough about C# / ASP.NET to write something meaningful about it.
Coding Horror described why they use ASP.NET for Stack Overflow and why he doesn't recommend it for open source projects (source). Stack Exchange also describes what they use (1, 2).
I see a big problem in the Microsoft-centric technology stack. You have to use
everything from them. (Almost) everything is closed source. If they discontinue
the development or if they don't fix stuff which might be relevant for you,
you're fucked.
This seems to be changing. Microsoft moved some important parts of their stack to GitHub (see dotnet.github.io). Most important seems to be that the compiler Roslyn is licensed under an Apache License. But there is also ASP.NET, the Entity Framework, and the .NET runtime. The Visual Studio Community Edition is now available for free (but only for Windows).
Python
Python is one of the oldest programming languages which are still in use. It first appeared in 1991. Python is dynamically typed, interpreted, object-oriented and includes functional programming features.
Although I use Python for many projects, I haven't used it for a web project so far. So I might not know the important tools / frameworks. Please keep that in mind.
- Ecosystem:
  - pypi.python.org and pip: Package hosting and package management
  - Sphinx: (Semi) automatic code documentation, e.g. the scipy docs are generated with Sphinx from Python code. This is some of the best documentation I have ever seen.
  - Django / Flask as frameworks
  - pytest / nose for testing
  - gevent: a coroutine-based Python networking library
  - Tornado: Web server
- Some Python people switch to Go (1, 2)
- Many tutorials and often very good documentation.
- Flask and Django work with PyPy (source). That might make them much faster.
- Used by big players.
I think one of the main advantages of Python is that it is really easy to write code which is easy to read (because of docstrings, Python's weird indentation semantics and very nice syntax) and quite hard to write unreadable code. I am sure I have a biased view regarding Python, but I am also sure a lot of people share this subjective impression.
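To show what I mean by docstrings and indentation, here is a small sketch. The function and its name are made up for this illustration; it computes the kind of average response time mentioned in the definitions at the beginning.

```python
def average_response_time(durations_ms):
    """Return the mean of the given response times in milliseconds.

    Raises ValueError if durations_ms is empty.
    """
    if not durations_ms:
        raise ValueError("need at least one measurement")
    return sum(durations_ms) / len(durations_ms)


# The docstring is available at runtime, e.g. for help() or for Sphinx.
print(average_response_time.__doc__.splitlines()[0])
print(average_response_time([120, 250, 310]))
```

The block structure is visible from the indentation alone, and the docstring travels with the function, so tools can pick it up without any extra annotations.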
PHP
PHP is a server-side scripting language which appeared first in 1995. It is dynamically typed.
- Language inconsistencies are really bad with PHP - see also PHP: a fractal of bad design
- The ecosystem is ok:
  - PHPCI for continuous integration,
  - Zend Framework / Symfony,
  - Smaller frameworks like CakePHP and CodeIgniter,
  - Drupal / Joomla / TYPO3 / WordPress,
  - PHPUnit for unit testing,
  - Composer for package management and packagist.org to find packages,
  - CruiseControl for continuous integration
A big advantage of PHP is that it is easy to learn. You can run PHP everywhere and hosting is cheap. Wikipedia makes use of PHP, so it is obviously possible to create systems which have HUGE numbers of requests and still work fine.
Hack
Hack is a programming language introduced in 2014 by Facebook. It is a PHP dialect. Key differences from PHP are:
- Function arguments and return values can be annotated with types.
- Hack does not support some language features which PHP supports (source), which is good. Examples are goto, variable variables and string incrementing.
Rust
Rust is a very safe language, but does not seem to be ready for production use.
I am a big fan of Rust, but as it aims to be a better C++, it is probably a better fit for OS development, game engines, embedded systems, databases, complex desktop applications (Photoshop/Word/Chrome), etc. While Rust is quite expressive for a systems programming language, its banner features are the borrow checker (+ lifetimes, etc) and powerful static type system. Rust emphasizes zero-cost abstractions with compiler-enforced memory & thread safety. The popular web development languages are dynamically typed and interpreted, with an emphasis on rapid development, which is a very different niche than Rust claims to fill.
Source: news.ycombinator.com
Others
- Ruby with Rails: I know it is quite well-known and used by many people. But I don't know Ruby well enough to write anything meaningful. The Ruby syntax is similar to Python.
- Scala seems to be noteworthy
See also
- Web Framework Benchmarks
- Usage of server-side programming languages for websites
- todobackend.com: A lot of different back end technology stacks
- bento.io: Seems to offer many tutorials
- The RedMonk Programming Language Rankings: January 2015
- Comparison of programming languages
Conclusion
Having thought about it this carefully, I see three languages which seem suitable for back ends to me:
- Go: Fast and compiled
- Node.js: Good scalability
- Python: It is the language I know best and whose syntax I like best. Besides that, it has good community-developed coding style standards, is very easy to read and is well-documented.
Not suitable seem to be:
- PHP: Because of the language inconsistencies which seem to make it pretty hard to make a reliable back end
- C#: The technology stack is too Microsoft centered.
- Java: Too clumsy syntax, too hard to get it to work.
The other programming languages could be very good choices. I simply don't know. I am very curious whether Rust will be used for back ends. Hack is very young, let's see if it will spread in a few years.
Credits
As I don't have much experience with web development, I asked a few friends to have a look at the different parts of the article. They looked especially for plainly wrong statements and checked whether I named "all" the important frameworks / tools. They might not completely agree with the comparison to other languages (after all, I wrote the article), but they helped me a lot to get things not too wrong:
- Sören Liebich (@liebsoer) has several years of experience with Java web development and helped me to name the important tools / technologies used in the Java stack.
- Henning Dieterichs helped me to fix some of the mistakes in the C# part and reminded me of Hack and the positive sides of PHP.
- Stefan had a look at the PHP section.
Thank you!