The Incredible Growth of Python
In this post, we’ll explore the extraordinary growth of the Python programming language in the last five years, as seen by Stack Overflow traffic within high-income countries. The term “fastest-growing” can be hard to define precisely, but we make the case that Python has a solid claim to being the fastest-growing major programming language.
[…]
June 2017 was the first month that Python was the most visited tag on Stack Overflow within high-income nations. This included being the most visited tag within the US and the UK, and in the top 2 in almost all other high-income nations (next to either Java or JavaScript). This is especially impressive because in 2012 it was visited less often than any of the other five languages, and it has grown 2.5-fold in that time.
[…]
With a 27% year-over-year growth rate, Python stands alone as a tag that is both large and growing rapidly; the next-largest tag showing similar growth is R.
[…]
Outside of high-income countries, Python is still the fastest-growing major programming language; it simply started at a lower level, and the growth began two years later (in 2014 rather than 2012). In fact, the year-over-year growth rate of Python in non-high-income countries is slightly higher than it is in high-income countries.
These analyses suggest two conclusions. First, the fastest-growing use of Python is for data science, machine learning, and academic research. This is particularly visible in the growth of the pandas package, which is the fastest-growing Python-related tag on the site. Second, as for which industries use Python, we found that it is visited more heavily from a few, such as electronics, manufacturing, software, government, and especially universities; overall, however, Python’s growth is spread fairly evenly across industries. In combination, this tells a story of data science and machine learning becoming more common in many types of companies, and of Python becoming a common choice for that purpose.
Update (2017-10-12): Jeff Knupp:
The buffer protocol was (and still is) an extremely low-level API for direct manipulation of memory buffers by other libraries. These are buffers created and used by the interpreter to store certain types of data (initially, primarily “array-like” structures where the type and size of data was known ahead of time) in contiguous memory.
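The buffer protocol is also reachable from pure Python through the built-in memoryview, which gives a feel for what “direct manipulation of memory buffers” means in practice. The sketch below is illustrative only (it is not from Knupp’s post): it wraps a bytearray in a view and shows that writes through the view land in the original buffer, with no copy in between.

```python
# Illustrative sketch: the buffer protocol from pure Python via memoryview.
buf = bytearray(b"hello world")   # a mutable, contiguous block of bytes
view = memoryview(buf)            # a view onto the same memory; nothing is copied

view[0:5] = b"HELLO"              # writing through the view...
print(buf)                        # ...mutates the original: bytearray(b'HELLO world')

# The protocol also exposes layout information about the underlying buffer.
print(view.contiguous, view.itemsize, view.nbytes)  # True 1 11
```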
The primary motivations for providing such an API are to eliminate the need to copy data when it is only being read, to clarify the ownership semantics of the buffer, and to store the data in contiguous memory (even in the case of multi-dimensional data structures), where read access is extremely fast. Those “other libraries” that would make use of the API would almost certainly be written in C and be highly performance-sensitive. The new protocol meant that if I create a NumPy array of ints, other libraries can directly access the underlying memory buffer rather than requiring indirection or, worse, copying that data before it can be used.
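To make the zero-copy point concrete, here is a minimal sketch, assuming NumPy is installed (the variable names are mine, not Knupp’s). A second array is constructed directly on top of the first one’s buffer, so a change made through one name is visible through the other with no copying involved.

```python
import numpy as np

arr = np.arange(10, dtype=np.int64)        # a NumPy array of ints in one contiguous buffer
view = memoryview(arr)                     # NumPy exposes that buffer via the buffer protocol

# Another consumer can wrap the very same memory instead of copying it.
alias = np.frombuffer(view, dtype=np.int64)
print(np.shares_memory(arr, alias))        # True: both names refer to one buffer

arr[0] = 99
print(alias[0])                            # 99: the change is visible through the alias
```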
And now to bring this extended trip down memory lane full-circle, a question: what type of programmer would greatly benefit from fast, zero-copy memory access to large amounts of data?
Why, a Data Scientist, of course.