NumPy: Awesome Architecting for Amazing Arrays

This is the second installment of our four-essay series about the NumPy project. For the first essay, about the stakeholders and the project in general, please visit this page. In this second essay, we will take a look at NumPy from different architectural perspectives that are based on literature. These views aim to give the reader insight into how NumPy implements its key properties.

NOTE: The formatting in this essay has been optimised for the online version.

Architectural Views

Kruchten

In his 1995 IEEE paper [1], Kruchten describes four views on software architecture:

  • The logical view, which describes the design’s object model when an object-oriented design method is used. To design an application that is very data-driven, you can use an alternative approach to develop some other form of logical view, such as an entity-relationship diagram.
  • The process view, which describes the design’s concurrency and synchronization aspects.
  • The physical view, which describes the mapping of the software onto the hardware and reflects its distributed aspect.
  • The development view, which describes the software’s static organization in its development environment.

Since NumPy is a library for numerical computations and not a standalone program, a view concerned with dynamic system components and the communication between them is not very applicable. The process view is therefore not relevant to the architecture of NumPy. For the same reason, the physical view is irrelevant: you cannot discuss the system in terms of a physical topology, since it is only a library consisting of code modules.

Considering a development view makes more sense for this project, as the development view focuses on modules in the development environment. NumPy contains multiple different modules in different layers. Besides this, reasoning about reuse and portability is inherent to a library like NumPy. The development view on NumPy is discussed here.

Furthermore, a logical view could also be of use. On the one hand, a logical view is concerned with the system’s functional requirements, which are very clear for a system like NumPy. On the other hand, a logical view describes a system in terms of objects and their relations. Since NumPy’s core functionality is written in C, which is a procedural rather than an object-oriented language, defining objects does not really make sense. As the functional requirements have already been discussed in our previous essay, we will not discuss the logical view here.

Rozanski and Woods

Rozanski and Woods present some other views in their Software Systems Architecture book [2]. We will consider the views that don’t overlap with Kruchten’s views:

  • The deployment view, which describes the environment in which the system will be deployed.
  • The operational view, which describes how the system will be operated once it is deployed.

The deployment view has some relevance as the library is used with Python, which is available on many different platforms. Therefore, it will need to be able to run on these different platforms without causing the users trouble. The deployment view on NumPy is discussed here.

The operational view is less appropriate, as NumPy is not a service that needs to be monitored or administered. It is simply a library that provides data structures (arrays) and tools to work with them. Therefore, we will not look at the operational view.

arc42 Documentation

The arc42 documentation provides views similar to those mentioned above. Included is the runtime view [3], which is also relevant for NumPy. This view is relevant because, contrary to popular belief, the performance of NumPy varies a lot depending on the runtime dependencies used. The runtime view on NumPy is discussed here.

Development View

As described by Kruchten, the development view focuses on the organization of the actual software modules in the software-development environment [1]. This view is also relevant to the architectural style of NumPy, which is component-based. Therefore, we will also discuss the architectural style of the library in this section.

Architectural Style

At the start of the project, we analysed the NumPy repository and created an overview of the different components. This overview was then sent to SIG, who analysed the codebase further. We noticed that the architecture of the NumPy library is fairly simple: it is component-based and there are not many dependencies between the different components. In the section below, you can find an overview of the components as created by SIG.

System decomposition

The following diagram was created by SIG using our analysis of the component structure:

Component structure, provided by SIG. ('-new' is an artifact from the analysis)

As can be seen from the diagram, the core-src component is the largest. This module contains most of the functionality in NumPy and has been written in C for speed. The core module contains Python wrappers that allow the library to be used as a Python library. This structure can be seen in most of the modules: source code in C in combination with Python wrappers.
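The split between compiled code and Python-level access can be observed from the interpreter. The following is a minimal sketch, assuming a standard CPython installation of NumPy; the sample values are arbitrary:

```python
import inspect

import numpy as np

# np.array is implemented in C, so Python reports it as a built-in
# function rather than a function defined in a .py file.
print(inspect.isbuiltin(np.array))

# Element-wise operations such as np.add are "ufunc" objects, which
# also live in the compiled core.
print(isinstance(np.add, np.ufunc))

# From the user's side, both are called like any other Python function.
result = np.add(np.array([1, 2, 3]), np.array([10, 20, 30]))
print(result.tolist())
```

Both checks print `True`, and the final line prints `[11, 22, 33]`.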

The f2py component provides functionality for calling Fortran code from Python: it generates wrapper modules around Fortran routines rather than translating them. Fortran is an older language for scientific computing, and as NumPy deems itself “the fundamental package for scientific computing with Python” [4], it is not strange that it provides compatibility with Fortran.

The other components provide tools for specific mathematical applications, such as random for generating random numbers, linalg for linear algebra and fft for (fast) Fourier transforms.
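As a brief illustration of these three components, the sketch below calls one routine from each. The seed, the matrix and the signal are arbitrary values chosen for the example:

```python
import numpy as np

# random: draw reproducible samples from a normal distribution
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=4)

# linalg: solve the linear system A x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)

# fft: transform a signal to the frequency domain and back again
signal = np.array([0.0, 1.0, 0.0, -1.0])
spectrum = np.fft.fft(signal)
recovered = np.fft.ifft(spectrum).real

print(samples.shape)                   # four samples were drawn
print(x)                               # solution of the 2x2 system
print(np.allclose(recovered, signal))  # the round trip restores the signal
```

Each component is reached through its own namespace (`np.random`, `np.linalg`, `np.fft`), which mirrors the component-based structure described in this section.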

The last noteworthy module is the distutils module, which handles build overhead and platform compatibility. It provides support for different compilers; more about these can be read here.

Relationships

Using the component analysis, SIG also created an overview of the relationships between the components in NumPy, which can be found below:

Component relationships of NumPy, provided by SIG

As can be seen, the core-src and random-src are decoupled from the rest of the system. The other components are slightly coupled and the distutils component has an important place in the structure. This is most likely due to the fact that it manages overhead and compilation.

The component-based structure can clearly be seen in the diagrams presented above. This structure allows contributors to contribute more easily to the system, as functionality is split into different modules.

Runtime View

As was mentioned in the section about architectural views, NumPy’s performance can vary widely, even on the same system and OS. The reason behind this discrepancy lies in the underlying BLAS/LAPACK implementations used by NumPy. BLAS (Basic Linear Algebra Subprograms) is a specification for vector and matrix operations such as multiplications and dot products, among others [5]. The reference implementation is written in Fortran and available from Netlib. LAPACK (Linear Algebra Package) is both a specification and a reference implementation of routines for least-squares solutions, eigenvalue problems, matrix factorization and singular value decomposition, among others [6]. LAPACK tries to move most of the computational effort to BLAS routines in order to gain performance. There exist other implementations of BLAS and LAPACK that are tuned for specific systems or processors [7]. An example is the Intel MKL library, which, as you might have guessed, is tuned specifically for Intel processors.

NumPy can use different implementations of BLAS/LAPACK. The order of preference for each is [8]:

| Preference rank (1 = highest preference) | BLAS | LAPACK |
| --- | --- | --- |
| 1 | MKL | MKL |
| 2 | BLIS | OpenBLAS |
| 3 | OpenBLAS | libFLAME |
| 4 | ATLAS | ATLAS |
| 5 | Accelerate (macOS) | Accelerate (macOS) |
| 6 | BLAS (Netlib) | LAPACK (Netlib) |
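To find out which of these libraries a particular installation was built against, NumPy can print its own build configuration. This is a minimal sketch; the layout of the output differs between NumPy versions, so no sample output is shown:

```python
import numpy as np

# Print the build configuration, including the BLAS/LAPACK libraries
# that this NumPy build was linked against.
np.show_config()
```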

Benchmarks that compare these different libraries show that there is a large difference in performance. Below are the results of a benchmark run by Markus Beuckelmann for different implementations of BLAS/LAPACK on an Intel i5 processor [9].

| Benchmark | Default BLAS & LAPACK | ATLAS | OpenBLAS | Intel MKL |
| --- | --- | --- | --- | --- |
| Dot product of two 4096x4096 matrices | 64.22 s | 3.46 s | 3.97 s | 2.44 s |
| Dot product of two 524288 vectors | 0.80 ms | 0.73 ms | 0.74 ms | 0.75 ms |
| SVD of a 2048x2048 matrix | 10.31 s | 2.02 s | 1.96 s | 1.34 s |
| Cholesky decomposition of a 2048x2048 matrix | 6.74 s | 0.51 s | 0.46 s | 0.40 s |
| Eigendecomposition of a 2048x2048 matrix | 53.77 s | 29.90 s | 32.95 s | 10.07 s |

As can be seen, the results of this benchmark vary wildly, with speedups of more than 25x for the matrix product (64.22 s versus 2.44 s). This shows that the choice of runtime dependencies makes a significant difference. One might think that this difference only shows up if you actively change the order of preference, but this is not the case. The Anaconda distribution, for example, installs the Intel MKL library by default, while NumPy installed via pip uses OpenBLAS. This subtle difference in runtime dependencies can therefore have major implications for the runtime of your NumPy program.
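The speedups can be recomputed directly from the benchmark table. The sketch below copies the timings reported in seconds (the vector dot product, reported in milliseconds, is left out because all four libraries are practically tied there) and divides the default timing by each alternative. The dictionary layout and the `speedups` helper are our own illustration, not part of the benchmark:

```python
# Timings in seconds, copied from the benchmark table above.
timings = {
    "Matrix product, 4096x4096": {
        "Default": 64.22, "ATLAS": 3.46, "OpenBLAS": 3.97, "Intel MKL": 2.44,
    },
    "SVD, 2048x2048": {
        "Default": 10.31, "ATLAS": 2.02, "OpenBLAS": 1.96, "Intel MKL": 1.34,
    },
    "Cholesky, 2048x2048": {
        "Default": 6.74, "ATLAS": 0.51, "OpenBLAS": 0.46, "Intel MKL": 0.40,
    },
    "Eigendecomposition, 2048x2048": {
        "Default": 53.77, "ATLAS": 29.90, "OpenBLAS": 32.95, "Intel MKL": 10.07,
    },
}


def speedups(row):
    """Return how many times faster each library is than the default."""
    baseline = row["Default"]
    return {lib: baseline / t for lib, t in row.items() if lib != "Default"}


for name, row in timings.items():
    formatted = ", ".join(f"{lib}: {s:.1f}x" for lib, s in speedups(row).items())
    print(f"{name}: {formatted}")
```

The largest ratio is the matrix product with Intel MKL, while the eigendecomposition benefits least from ATLAS and OpenBLAS.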

Deployment View

The deployment view describes the environment into which the system will be deployed and the dependencies of the deployment process. In the GitHub repository, multiple files related to CI (Continuous Integration) are present. These will be discussed in more detail in our next blog post.

As for the CD (Continuous Deployment) part of the pipeline, there does not seem to be a central mechanism that automatically pushes new versions of NumPy to PyPI (the Python Package Index), from which users can download the package using pip. Releases on PyPI seem to be deployed manually, and the same goes for the releases on GitHub [10]: the versions are tagged, but they are posted manually by maintainers.

Non-functional properties

Finally, this last section is not so much about another architectural view (we already had quite a lot of those); instead, we will discuss the non-functional properties of NumPy. Non-functional properties are properties regarding the operation of a system rather than its functionality [11]. These are also important to highlight, as they heavily influence the usability and effectiveness of the system. In the case of NumPy, we focus on three non-functional properties: performance, test coverage and compatibility.

Performance

Performance is undoubtedly one of the most important non-functional requirements for NumPy, as it is a library focused on the efficient representation of n-dimensional data structures and mathematical operations on them. NumPy satisfies this requirement by implementing its core in C and by using efficient implementations of the BLAS/LAPACK specifications mentioned previously in the runtime view.
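The effect of moving work into compiled code can be made tangible with a small timing sketch. It compares a dot product written as an explicit Python loop with `np.dot`. The vector length, the seed and the helper name `dot_python` are arbitrary choices for this example, and the measured times depend on the machine and on the BLAS library in use:

```python
import timeit

import numpy as np


def dot_python(xs, ys):
    """Dot product computed with an explicit Python loop."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


n = 100_000  # arbitrary vector length for illustration
rng = np.random.default_rng(seed=0)
a = rng.random(n)
b = rng.random(n)
a_list, b_list = a.tolist(), b.tolist()

# Both approaches compute the same value, up to floating-point rounding.
same = np.isclose(dot_python(a_list, b_list), np.dot(a, b))
print(same)

# Time ten repetitions of each variant.
loop_time = timeit.timeit(lambda: dot_python(a_list, b_list), number=10)
numpy_time = timeit.timeit(lambda: np.dot(a, b), number=10)
print(f"Python loop: {loop_time:.4f} s, np.dot: {numpy_time:.4f} s")
```

On a typical installation the `np.dot` call should be considerably faster, but the exact ratio is not something this sketch can promise.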

Test coverage

The SIG analysis points out that the test code ratio of NumPy is 55.1%, meaning that 55.1% of the total code consists of test code. This illustrates how important testing is for the NumPy library. As described earlier, multiple CI pipelines are run to check the code and its new additions.

Compatibility

One of NumPy’s non-functional requirements is that it should be compatible with many software projects. Running with as few dependencies as possible helps in achieving this requirement, and it is realised because NumPy only requires Python. It also offers compatibility with Fortran, as described earlier. Another aspect is that it should be relatively intuitive to use. While this is a subjective matter, we think that NumPy is indeed quite intuitive: if you can work with Python, you’ll be able to work with NumPy. NumPy is, however, no longer compatible with Python 2.7, as elaborated upon in essay 1. The reason for this is that the team “has found that supporting Python 2 is an increasing burden on our limited resource.”[^python2.7]

Conclusion

In this essay we analysed NumPy from different architectural points of view. This approach can give new insights that one would not gain from just using NumPy, or any piece of software in general. For example, most NumPy users will never look deep into the runtime dependencies of NumPy. However, when explicitly viewing it from a runtime perspective, one discovers that NumPy is more intricate than just a combination of Python and C, and can actually make use of high-speed linear algebra libraries to satisfy non-functional properties such as performance. This is of course only one of the views. Taken together, the architectural views discussed here indicate that the functional and non-functional key capabilities of NumPy are implemented.

  1. P. B. Kruchten, “The 4+1 View Model of architecture,” in IEEE Software, vol. 12, no. 6, pp. 42-50, Nov. 1995. DOI: 10.1109/52.469759

  2. N. Rozanski and E. Woods. 2005. Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives. Addison-Wesley Professional. 

  3. arc42 Documentation Runtime view. https://docs.arc42.org/section-6/ 

  4. NumPy homepage, retrieved on 2020-03-19. https://numpy.org/ 

  5. C. L. Lawson, R. J. Hanson, D. R. Kincaid, and F. T. Krogh. 1979. Basic Linear Algebra Subprograms for Fortran Usage. ACM Trans. Math. Softw. 5, 3 (September 1979), 308–323. DOI:https://doi.org/10.1145/355841.355847 

  6. LAPACK website. http://www.netlib.org/lapack/ 

  7. LAPACK vendor implementations. http://www.netlib.org/lapack/faq.html#_what_and_where_are_the_lapack_vendors_implementations 

  8. NumPy build instructions. https://numpy.org/devdocs/user/building.html#accelerated-blas-lapack-libraries 

  9. Boosting NumPy: Why BLAS matters. https://markus-beuckelmann.de/blog/boosting-numpy-blas.html 

  10. Releases on GitHub. https://github.com/numpy/numpy/releases 

  11. L. Chen, M. Ali Babar and B. Nuseibeh, “Characterizing Architecturally Significant Requirements,” in IEEE Software, vol. 30, no. 2, pp. 38-45, March-April 2013. 10.1109/MS.2012.174 

NumPy
Authors
Robbert Koning
Pravesh Moelchand
Erwin van Thiel
Jim Verheijde