NumPy Arrays¶
NumPy's numpy.ndarray
s are similar to arrays
(please read Buffers and Arrays before reading this)
but with the following new conditions.
NumPy has it's own type specifiers. Use
numpy.asarray(arr, dtype=dtype)
to enforce data types.Arrays are not generally C contiguous. Either C code should respect
numpy.ndarray.strides
or wrapping Python code should guard against non contiguous arrays usingnumpy.ascontiguousarray()
ornumpy.asarray(arr, order="C")
.Arrays may be multi-dimensional - something which C's general support for is pretty poor. Pointer arithmetic is needed to navigate a multidimensional array.
Basic 1D Examples¶
Let's wrap the sum()
function from totals.c but use NumPy instead of
array
.
The C code remains the same but the type checking/normalisation in Python
differs.
The following is a correct way to wrap with NumPy.
import numpy as np
def sum_(arr):
# Ensure `arr` is an array, is of the correct dtype and is C contiguous.
arr = np.asarray(arr, dtype=np.double, order="C")
# Find its sum.
return slug.dll.sum(ptr(arr), len(arr))
The
dtype=np.double
enforces correct data types (see Data types).The
order="C"
enforces contiguity (see Contiguity).
And for completeness, the cumsum()
example for writing an
array.
def cumsum(x):
# Normalise `x`.
x = np.asarray(x, dtype=np.int32, order="C")
# Create an empty array of the correct size and dtype.
# Note that ``order="C"`` is already the default - no need to set it.
out = np.empty(len(x), dtype=np.int32)
# Call cumsum() C function.
slug.dll.cumsum(ptr(x), ptr(out), len(x))
return out
Contiguity¶
Contiguity describes how an array's elements are laid out in memory. In C this is always just one after the other in order with no gaps in between. Multi dimensional arrays are flattened column major so a \(2 \times 3\) array is really:
a := [ a[0, 0], a[0, 1], a[0, 2], a[1, 0], a[1, 1], a[1, 2] ]
This is the opposite of Fortran which uses column minor arrays. NumPy, which is written in both C and Fortran, not only bounces between the two formats but invents all kinds of other formats in between.
The effect of contiguity will be easier to explain when you see it go wrong.
Let's start by bypassing the contiguity enforcement in the sum example
above.
The following creates an array of the correct dtype
but without setting order
and passes it to our C function.
>>> arr = np.arange(10, dtype=np.double)
>>> slug.dll.sum(ptr(arr), len(arr))
45.0
# Compare to the correct answer.
>>> np.sum(arr)
45.0
Hmm, everything seems to be working ok without the order="C"
. But…
that is only because NumPy writes C contiguous arrays by default so arr
is already contiguous.
>>> arr.flags.c_contiguous
True
NumPy only creates non-contiguous arrays if it thinks it can avoid a redundant copy operation by using a strided array instead. Slicing with a step is one example of when this may happen.
>>> arr[::2].flags.c_contiguous
False
Instead of copying out every second element of arr
into a new array,
NumPy has simply said "use the same raw data as arr
but skip every second
element".
>>> np.shares_memory(arr, arr[::2])
True
Whilst NumPy is smart enough to work with strided arrays:
>>> arr = np.arange(10, dtype=np.double)[::2]
>>> arr
array([0., 2., 4., 6., 8.])
>>> arr.sum()
20.0 # <-- Yes, that is the correct answer.
Our C code isn't.
cslug.ptr()
actively prohibits us from making this mistake.
It raises a ValueError
error if its given a non-contiguity buffer:
>>> ptr(arr)
ValueError: ndarray is not C-contiguous and, by using `ptr()`,
you have specified that a contiguous array is required. See
the bottom of `help(cslug.ptr)` for how to resolve this.
To remove this safety net and see why arr
should be contiguous, use
cslug.nc_ptr()
:
>>> from cslug import nc_ptr
>>> slug.dll.sum(nc_ptr(arr), len(arr))
10.0 # <-- This is the wrong answer.
Because arr
was created using arr = np.arange(10, dtype=np.double)[::2]
,
under the hood arr
is just np.arange(10, dtype=np.double)
but with a
flag set to say, "skip every second element".
The above code is therefore really running:
>>> slug.dll.sum(ptr(np.arange(10, dtype=np.double)), 5)
10.0
Another way you can see that a sliced array is the same array as the one it was
sliced from is to compare pointers.
Again, because arr[::2]
is non-contiguous, you will have to use
nc_ptr()
.
>>> a = np.arange(10, dtype=np.double)
>>> ptr(a)
<Void Pointer 94425197003296>
>>> nc_ptr(a[::2])
<Void Pointer 94425197003296>
>>> nc_ptr(a[::2]) == nc_ptr(a)
True
Data types¶
Data types are controlled in NumPy with the dtype
property.
The mapping of C types to numpy.dtype
s is relatively intuitive par a
few special cases.
Again there are no safety checks. If you get it wrong, anything can happen.
Note
NumPy's default data types are platform dependent.
For example, np.arange(10).dtype
returns int32
on Windows but int64
on Linux.
If your code works perfectly on one OS but spits out nonsense on another
then the cause is likely a forgotten explicit dtype.
Integers¶
The basic
int
in C isnumpy.intc
. Notnumpy.int
which is just the builtin Pythonint
type.Inexact integer types such as
short
,long
andlong long
are named the same (minus any spaces):numpy.short
,numpy.long
,numpy.longlong
.Exact types such as
int32_t
have the_t
stripped, givingnumpy.int32
.Unsigned integers have a
u
prefix:numpy.ushort
,numpy.uint16
.ssize_t
andsize_t
arenumpy.intp
andnumpy.uintp
respectively.
Floats¶
float
isnumpy.single
.double
isnumpy.double
.
String-like types¶
char
isnumpy.void
.wchar_t
isnumpy.unicode_
.
Multidimensional arrays¶
Assuming C contiguity, multidimensional arrays are really just 1D arrays. Each row is laid out end to end in memory. For example the array:
[[0, 1, 2], [3, 4, 5]]
Is in fact just:
[0, 1, 2, 3, 4, 5]
With the metadata specifying that the array is shaped (2, 3)
stored
elsewhere (in numpy.ndarray.shape
).
Some basic arithmetic is needed to convert multidimensional indices into
one dimensional ones.
Note
C does allow arrays of arrays but they're not fully featured. Most importantly, all but the outermost dimension must be known at compile time. This rules out almost all practical usages.
Contiguous Multidimensional Arrays¶
For the less masochistically mathematically inclined,
below are the formula you will need to convert multidimensional indices
into flat ones.
In these formula, shape
is the numpy.ndarray.shape
attribute
which may be passed directly to C as an size_t *
using
ptr(arr.ctypes.shape)
.
NumPy syntax |
Convert dimensional indices (i, j, k) to flat index (n). |
---|---|
|
|
|
|
|
|
And, in the less likely case that you want to convert flat indices back to
multidimensional ones,
the formula below will undo the formula above.
Note that, in Python, you may simply use numpy.ravel_multi_index()
and
numpy.unravel_index()
which perform the same calculation as these
formula.
NumPy syntax |
Convert flat index (n) to dimensional indices (i, j, k). |
---|---|
|
|
|
|
|
|
The following trivial reimplementation of numpy.ravel()
demonstrates the above for a 3D array.
#include <stddef.h>
void flatten_3D(double * in, double * out, size_t * shape) {
/* Copy the contents of a contiguous 3D array into a flat array. */
size_t n = 0;
for (size_t i = 0; i < shape[0]; i++) {
for (size_t j = 0; j < shape[1]; j++) {
for (size_t k = 0; k < shape[2]; k++) {
// Convert from 3D indices to flat index.
size_t flat_index = (i * shape[1] + j) * shape[2] + k;
// Copy the value.
out[n++] = in[flat_index];
}
}
}
}
def flatten_3D(arr):
"""Wrapper for ``flatten_3D()``."""
# Normalise `arr`.
arr = np.asarray(arr, dtype=np.double, order="C")
# Sanity check that `arr` is 3D.
assert arr.ndim == 3
# Create a flat empty output array to populate.
out = np.empty(arr.size, arr.dtype)
# Pass `arr`, `out` and the shape of `arr` to the C function.
# Note the use of `arr.ctypes.shape` which is a ctypes size_t array,
# instead of `arr.shape` which is a tuple.
slug.dll.flatten_3D(ptr(arr), ptr(out), ptr(arr.ctypes.shape))
return out
The more mathematically inclined may observe that n
is always equal to
flat_index
in the above example and that the whole triple for loop is
therefore redundant in this case.
Strided Arrays¶
For those wanting to squeeze out every last clock cycle,
strided arrays can be supported directly (without conversion)
relatively easily if the number of dimensions is fixed.
Use the numpy.ndarray.strides
attribute and the formula below
to locate elements.
In these formula,
itemsize
is the arr.itemsize
attribute.
NumPy syntax |
Convert dimensional indices (i, j, k) to flat index (n). |
---|---|
|
|
|
|
|
|
To demonstrate, the code below is functionally equivalent to the flatten example above except that it is faster for non contiguous input.
void flatten_strided_3D(double * in, double * out, size_t * shape, size_t * strides) {
/* Copy the contents of a non-contiguous 3D array into a flat array. */
size_t n = 0;
for (size_t i = 0; i < shape[0]; i++) {
for (size_t j = 0; j < shape[1]; j++) {
for (size_t k = 0; k < shape[2]; k++) {
// Convert from 3D strided indices to flat.
size_t flat_index = (i * strides[0] + j * strides[1] + k * strides[2]);
// NumPy's stride offsets are in bytes rather than items.
flat_index /= sizeof(double);
// Copy the value.
out[n++] = in[flat_index];
}
}
}
}
def flatten_strided_3D(arr):
"""Wrapper for ``flatten_strided_3D()``."""
# Normalise array dtype but no need to enforce contiguity.
arr = np.asarray(arr, dtype=np.double)
assert arr.ndim == 3
# As before, create a flat empty output array to populate.
out = np.empty(arr.size, arr.dtype)
# Pass the arrays, the shape and strides to C.
# Note, you must use `nc_ptr(arr)` instead of `ptr()` because `arr` is
# not necessarily contiguous.
slug.dll.flatten_strided_3D(nc_ptr(arr), ptr(out), ptr(arr.ctypes.shape),
ptr(arr.ctypes.strides))
return out
Vectorization¶
Often in NumPy, parallelizable functions are applied to arrays simply for speed, rather than because they expect an array of inputs. A simplest example of this is:
arr + 1
which will work if arr
is scalar or implicitly use a for loop if arr
is
an array of any dimension.
Getting this to work for arbitrary dimensions and strided arrays efficiently
is far harder than it looks.
Travis E. Oliphant (the original author of NumPy) wrote his PhD on it.
There is cheat however that's almost as efficient.
Ravel arbitrary dimensional arrays into 1D arrays.
Write your C function to use a single for loop to traverse a 1D array.
Unravel the output back to the original shape.
So to vectorize the add_1()
hello cslug example
add a single for loop to the C function:
#include <stddef.h>
void add_1(int * in, int * out, size_t len) {
/* Add 1 to every element in the 1D array `in`.
Write to the 1D array `out`. */
for (size_t i = 0; i < len; i++) {
out[i] = in[i] + 1;
}
}
Then to wrap it with:
def add_1(arr):
# Ensure `arr` is an array, is of the correct dtype and is C contiguous.
arr = np.asarray(arr, dtype=np.intc, order="C")
# Store the original shape, then flatten `arr` to make it 1D.
old_shape = arr.shape
arr = arr.ravel()
# Set up an empty output array with the same type and shape as `arr`.
out = np.empty_like(arr)
# Call the C function on our 1D arrays.
slug.dll.add_1(ptr(arr), ptr(out), len(arr))
# Return the output after restoring the original shape.
return out.reshape(old_shape)
Ravelling and reshaping have no effect on the underlying contents of an array provided that the array was C contiguous before being reshaped or ravelled.
In the above Python code, order="C"
was explicitly specified,
meaning that arr
and arr.ravel()
use the same memory
and therefore ptr(arr) == ptr(arr.ravel())
is guaranteed to be true.
A consequence of this is that the use of arr.ravel()
and
out.reshape(old_shape)
is redundant and the above wrapper can be simplified
to:
def add_1_(arr):
# Ensure `arr` is an array, is of the correct dtype and is C contiguous.
arr = np.asarray(arr, dtype=np.intc, order="C")
# Create an empty output array with the same type and shape as `arr`.
out = np.empty_like(arr)
# Call the C function on our arrays.
# It doesn't matter that they're not 1D - they look the same to C.
# Note the length parameter ``arr.size`` instead of ``len(arr)``.
slug.dll.add_1(ptr(arr), ptr(out), arr.size)
return out