Unfortunately, this does require some type-casting (between "pointer sized int" and pointers) which is a good way of hiding logic errors, so I'm not 100% happy with it.
def my_func(list list_of_arrays):
    cdef int n_arrays = len(list_of_arrays)
    # one pointer-sized slot per input array
    cdef np.uintp_t[::1] data = np.empty(n_arrays, dtype=np.uintp)
    cdef np.uintp_t[::1] shape = np.empty(n_arrays, dtype=np.uintp)
    cdef double x
    cdef np.ndarray[double, ndim=3, mode="c"] temp

    for i in range(n_arrays):
        temp = list_of_arrays[i]
        data[i] = <np.uintp_t>&temp[0,0,0]
        shape[i] = <np.uintp_t>&(temp.shape[0])

    x = my_func_c(<double**>(&data[0]), <np.intp_t**>&shape[0], n_arrays)
    return x
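The "pointer-sized int" idea can be tried from plain Python before involving Cython. The sketch below is only an illustration and not part of the answer above: `collect_pointers` is a hypothetical helper that stores each array's data address in a `np.uintp` array through NumPy's `ndarray.ctypes.data` attribute. It also shows that the buffer has to be allocated with a length, because `np.array((n,), ...)` builds a one-element array holding the value `n`.

```python
import numpy as np

def collect_pointers(list_of_arrays):
    """Return a uintp array holding the data address of each input array."""
    n_arrays = len(list_of_arrays)
    # np.empty(n) allocates n slots; np.array((n,)) would give ONE slot containing n.
    data = np.empty(n_arrays, dtype=np.uintp)
    for i, arr in enumerate(list_of_arrays):
        if not arr.flags["C_CONTIGUOUS"]:
            raise ValueError("array %d is not C-contiguous" % i)
        data[i] = arr.ctypes.data  # address of the first element, as an integer
    return data

arrays = [np.zeros((2, 3, 4)), np.ones((1, 2, 3))]
pointers = collect_pointers(arrays)
print(pointers.shape)                                   # one slot per array
print(np.array((len(arrays),), dtype=np.uintp).shape)   # a single slot
```

The input arrays must stay alive for as long as the stored addresses are used, since the integers do not hold references to them.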
The way you've done it is probably a pretty sensible way. One slight simplification to your original code that should work is:
shape[i] = <np.uintp_t>&(temp.shape[0])
To solve your problems you could use std::vector:
import numpy as np
cimport numpy as np
from libcpp.vector cimport vector

cdef extern from "my_func.c":
    double my_func_c(double ** data, int ** shape, int n_arrays)

def my_func(list list_of_arrays):
    cdef int n_arrays = len(list_of_arrays)
    cdef vector[double *] data
    cdef vector[vector[int]] shape_mem  # for storing casted shapes
    cdef vector[int *] shape            # pointers to stored shapes
    cdef double x
    cdef np.ndarray[double, ndim=3, mode="c"] temp

    shape_mem.resize(n_arrays)
    for i in range(n_arrays):
        print("i:", i)
        temp = list_of_arrays[i]
        data.push_back(&temp[0, 0, 0])
        for j in range(3):
            shape_mem[i].push_back(temp.shape[j])
        shape.push_back(shape_mem[i].data())

    x = my_func_c(data.data(), shape.data(), n_arrays)
    return x
Also your setup would need a modification:
# setup.py
from distutils.core import setup, Extension
from Cython.Build import cythonize
import numpy as np

setup(ext_modules=cythonize(Extension(
    name='my_func_c',
    language='c++',
    extra_compile_args=['-std=c++11'],
    sources=["my_func_c.pyx", "my_func.c"],
    include_dirs=[np.get_include()]
)))
This tutorial is aimed at NumPy users who have no experience with Cython at all. If you have some knowledge of Cython you may want to skip to the "Efficient indexing" section.
Very few Python constructs are not yet supported; making Cython compile all Python code is a stated goal, and the remaining differences from Python are listed under limitations.
This creates yourmod.c, which is the C source for a Python extension module. A useful additional switch is -a, which generates a document (yourmod.html) that shows which Cython code translates to which C code, line by line.
A version of pyximport is shipped with Cython, so that you can import pyx-files dynamically into Python and have them compiled automatically (see Compiling with pyximport).
pip install Cython
$ cd path/to/cython-distro
$ path-to-sage/sage -python setup.py install
$ cython yourmod.pyx
$ gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.7 -o yourmod.so yourmod.c
cimport numpy
import numpy as np

def compute_np(array_1, array_2, a, b, c):
    return np.clip(array_1, 2, 10) * a + array_2 * b + c
The code below does the equivalent of this function in NumPy. We'll say that array_1 and array_2 are 2D NumPy arrays of integer type and a, b and c are three Python integers. This function uses NumPy and is already really fast, so it might be a bit overkill to do it again with Cython.
However, the Cython/NumPy interface API seems to have changed a bit, in particular with ensuring the passing of memory-contiguous arrays.
By comparing types in if-conditions, it is also possible to execute entirely different code paths depending on the specific data type. In our example, since we no longer have access to the NumPy dtype of our input arrays, we use those if-else statements to know what NumPy data type we should use for our output array.
Apr 04, 2017: So once again, NumPy 2-d array --> C++ --> C++ 2-d iterator/array --> new Python NumPy array. Note that the returned information is an entirely new array or iterator, and not the original NumPy array. I was reading over Kurt Smith's book on Cython, and just wanted to make sure I was doing this correctly. So to pass the numpy array to C++ I ...
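To make the comparison concrete, here is a small pure-Python sketch of the same computation written with explicit loops, which is the shape of code one would then hand to Cython. The names `clip` and `compute_loops` and the sample inputs are illustrative assumptions, not taken from the text above.

```python
import numpy as np

def compute_np(array_1, array_2, a, b, c):
    """Vectorised NumPy version, as described above."""
    return np.clip(array_1, 2, 10) * a + array_2 * b + c

def clip(value, lo, hi):
    """Scalar clip: limit value to the closed interval [lo, hi]."""
    return min(max(value, lo), hi)

def compute_loops(array_1, array_2, a, b, c):
    """Element-by-element version of compute_np (hypothetical helper)."""
    x_max, y_max = array_1.shape
    result = np.zeros((x_max, y_max), dtype=array_1.dtype)
    for x in range(x_max):
        for y in range(y_max):
            tmp = clip(array_1[x, y], 2, 10)
            result[x, y] = tmp * a + array_2[x, y] * b + c
    return result

array_1 = np.array([[1, 5], [12, 3]], dtype=np.intc)
array_2 = np.ones((2, 2), dtype=np.intc)
expected = compute_np(array_1, array_2, 4, 3, 9)
result = compute_loops(array_1, array_2, 4, 3, 9)
print(result)  # same values as compute_np
```

In plain Python the loop version is much slower than the NumPy one; it only becomes competitive once the loop variables and arrays are given C types in Cython.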
cimport numpy as np
import numpy as np  # as suggested by jorgeca

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    cdef int N, D
    N = X.shape[0]
    D = X.shape[1]

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] X_c
    X_c = np.ascontiguousarray(X, dtype=np.double)

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] Y_c
    Y_c = np.ascontiguousarray(np.zeros((N*D,)), dtype=np.double)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(<double*> X_c.data, N, D, <double*> Y_c.data)

    return Y_c.reshape(N, 2)
import numpy as np
import mywrapper

mywrapper.run(np.array([[1, 2], [3, 4]], dtype=np.double))
# NameError: name 'np' is not defined [at mywrapper.pyx":X_c = ...]
# fixed!
cimport numpy as np
import numpy as np

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double * X, int N, int D, double * Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    X = np.ascontiguousarray(X)
    cdef np.ndarray[np.double_t, ndim=2, mode="c"] Y = np.zeros_like(X)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(&X[0, 0], X.shape[0], X.shape[1], &Y[0, 0])

    return Y
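The reason for calling `np.ascontiguousarray` before taking `&X[0, 0]` can be checked in plain NumPy. The following sketch uses made-up sample data and shows that a transposed view is not C-contiguous, while the result of `np.ascontiguousarray` is, and that an already contiguous input is passed through without copying.

```python
import numpy as np

X = np.arange(6, dtype=np.double).reshape(2, 3)
Xt = X.T                        # transposed view: same memory, not C-ordered
Xc = np.ascontiguousarray(Xt)   # C-ordered copy, safe to pass to C as double*
Y = np.zeros_like(Xc)           # output buffer with matching shape and dtype

print(Xt.flags["C_CONTIGUOUS"], Xc.flags["C_CONTIGUOUS"])
print(Y.shape, Y.dtype)

# For an input that is already C-contiguous no copy is made.
same = np.ascontiguousarray(X)
print(np.shares_memory(same, X))
```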
Here we see how to speed up NumPy array processing using Cython. By explicitly declaring the "ndarray" data type, your array processing can be 1250x faster.
This tutorial will show you how to speed up the processing of NumPy arrays using Cython. By explicitly specifying the data types of variables in Python, Cython can give drastic speed increases at runtime.
This tutorial discussed using Cython to manipulate NumPy arrays more than 1000x faster than Python processing alone. The key to reducing the computational time is to specify the data types of the variables, and to index the array rather than iterate through it.
The datatype of the array elements is int and is defined according to the line below. The numpy imported using cimport has a type corresponding to each type in NumPy but with _t at the end. For example, int in regular NumPy corresponds to int_t in Cython.
We'll start with the same code as in the previous tutorial, except here we'll iterate through a NumPy array rather than a list. The NumPy array is created in the arr variable using the arange() function, which returns one billion numbers starting from 0 with a step of 1.
import time
import numpy
total = 0
arr = numpy.arange(1000000000)
t1 = time.time()
for k in arr:
    total = total + k
print("Total = ", total)
t2 = time.time()
t = t2 - t1
print("%.20f" % t)
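A scaled-down variant of this timing script can be run quickly to see the cost of iterating over a NumPy array element by element in pure Python. The array length of 100,000, the explicit `int64` dtype, and the comparison against `arr.sum()` are assumptions added for illustration; absolute timings depend on the machine.

```python
import time
import numpy

# Scaled down from one billion elements so the loop finishes quickly.
arr = numpy.arange(100000, dtype=numpy.int64)

t1 = time.time()
total = 0
for k in arr:            # each k is boxed as a NumPy scalar object
    total = total + k
loop_seconds = time.time() - t1

t1 = time.time()
vector_total = arr.sum() # the same reduction done inside NumPy
sum_seconds = time.time() - t1

print("Total =", int(total))
print("loop: %.6f s, arr.sum(): %.6f s" % (loop_seconds, sum_seconds))
```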
Let's see how much time it takes to complete after editing the Cython script created in the previous tutorial, as given below. The only change is the inclusion of the NumPy array in the for loop. Note that you have to rebuild the Cython script using the command below before using it.
python setup.py build_ext --inplace
The maxval variable is set equal to the length of the NumPy array. We can start by creating an array of length 10,000 and increase this number later to compare how Cython improves compared to Python.
import time
import numpy
cimport numpy
cdef unsigned long long int maxval
cdef unsigned long long int total
cdef int k
cdef double t1, t2, t
cdef numpy.ndarray arr
maxval = 10000
arr = numpy.arange(maxval)
t1 = time.time()
for k in arr:
    total = total + k
print("Total =", total)
t2 = time.time()
t = t2 - t1
print("%.20f" % t)
The argument is ndim, which specifies the number of dimensions in the array. It is set to 1 here. Note that its default value is also 1, and thus can be omitted from our example. If more dimensions are being used, we must specify it.
cdef numpy.ndarray[numpy.int_t, ndim = 1] arr
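Note that we defined the type of the variable arr to be numpy.ndarray, but do not forget that this is the type of the container. This container has elements, and these elements are treated as objects if nothing else is specified. To force these elements to be integers, the dtype argument is set to numpy.int according to the next line. (numpy.int was only an alias for Python's built-in int; it was deprecated in NumPy 1.20 and removed in NumPy 1.24, so on newer releases pass a concrete type such as numpy.int_ instead.)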
arr = numpy.arange(maxval, dtype = numpy.int)
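The effect of the `dtype` and `ndim` settings can be inspected from plain Python. This is a minimal sketch that assumes `numpy.int_` as the element type, which is available on both older and newer NumPy releases.

```python
import numpy

maxval = 10000
arr = numpy.arange(maxval, dtype=numpy.int_)

print(arr.ndim)        # number of dimensions, matches ndim=1 in the declaration
print(arr.dtype.kind)  # 'i' means a signed integer element type
print(int(arr.sum()))  # sum of 0..maxval-1
```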
The meat of the example is that the data is allocated in C, but exposed in Python without a copy using the PyArray_SimpleNewFromData NumPy function in the Cython file cython_wrapper.pyx.
The goal of this example is to show how an existing C codebase for numerical computing (here c_code.c) can be wrapped in Cython to be exposed in Python.
Independent of that code-copying issue, I think it is easiest to just set PyArray_ENABLEFLAGS(arr, np.NPY_OWNDATA) to free the memory, as @syrte asked, but this example demonstrates how to implement a custom delete function. I need to call fftw_free instead of free, and this is the only way I found that achieves this.
To build the C extension in-place run:
$ python setup.py build_ext -i
def as_ndarray(self):
    cdef np.npy_intp shape[1]
    cdef np.ndarray ndarray
    shape[0] = <np.npy_intp> self.size
    ndarray = np.PyArray_SimpleNewFromData(1, shape, np.NPY_INT, self.data_ptr)
    ndarray.base = <PyObject*> self
    Py_INCREF(self)
    return ndarray
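The "expose C memory without a copy" idea can be illustrated without compiling anything. The sketch below is an analogy rather than the gist's method: it uses a `ctypes` array as a stand-in for memory allocated in C and `np.frombuffer` in place of `PyArray_SimpleNewFromData`. Writes through the NumPy view show up in the underlying buffer, and the view reports that it does not own its data.

```python
import ctypes
import numpy as np

# Stand-in for a buffer allocated by C code (assumption for illustration).
c_buffer = (ctypes.c_int * 5)(1, 2, 3, 4, 5)

# Wrap the existing memory; no data is copied.
view = np.frombuffer(c_buffer, dtype=np.intc)

view[0] = 42                      # write through the NumPy view
print(c_buffer[0])                # the C-side buffer sees the change
print(view.flags["OWNDATA"])      # the array does not own the memory
```

As in the Cython version, the owner of the memory has to outlive the array; here NumPy keeps a reference to the buffer, while in the gist this is what setting `base` and calling `Py_INCREF` achieves.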
Thanks for this useful complete example. When I copied over portions into my code, I ran into the issue that the assignment
ndarray.base = <PyObject*> array_wrapper
didn't work. It's because I forgot to declare the ndarray as a Cython variable, which is done properly in the gist:
cdef np.ndarray ndarray
cdef class ArrayWrapper:
    """Wrap an array allocated in C that has to be deleted by `galario_free`"""
    cdef void* data_ptr
    cdef int nx, ny

    cdef set_data(self, int nx, int ny, void* data_ptr):
        """ Set the data of the array

        This cannot be done in the constructor as it must receive C-level
        arguments.

        Parameters:
        -----------
        nx: int
            Number of image rows
        data_ptr: void*
            Pointer to the data
        """
        self.data_ptr = data_ptr
        self.nx = nx
        self.ny = ny

    cdef as_ndarray(self, int nx, int ny, void* data_ptr):
        """Create an `ndarray` that doesn't own the memory, we do."""
        cdef np.npy_intp shape[2]
        cdef np.ndarray ndarray

        self.set_data(nx, ny, data_ptr)

        shape[:] = (self.nx, int(self.ny/2)+1)

        # Create a 2D array, of length `nx*ny/2+1`
        ndarray = np.PyArray_SimpleNewFromData(2, shape, complex_typenum, self.data_ptr)
        ndarray.base = <PyObject*> self

        # without this, data would be cleaned up right away
        Py_INCREF(self)
        return ndarray

    def __dealloc__(self):
        """ Frees the array. This is called by Python when all the
        references to the object are gone. """
        print("Deallocating array")
        my_custom_free(self.data_ptr)

...

def fftshift(double[:,::1] data):
    nx, ny = data.shape[0], data.shape[1]

    # operate on (nx, ny) array and return a (nx, ny/2+1) array that we have to deallocate
    cdef void* res = my_C_function(nx, ny, <void*>&data[0,0])

    return ArrayWrapper().as_ndarray(nx, ny, res)
return np.array(array_wrapper)
Should probably be
return np.array(array_wrapper, copy = False)
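The difference the comment points at is whether the returned array shares memory with its source. A small sketch with an ordinary array as the source makes this visible. It uses `np.asarray` for the no-copy case, which is an assumption chosen here because on NumPy 2 `copy=False` raises an error whenever a copy cannot be avoided.

```python
import numpy as np

original = np.arange(4)

copied = np.array(original)    # default behaviour: always makes a copy
shared = np.asarray(original)  # avoids the copy when the input is already an ndarray

copied[0] = 100                # does not affect the original
shared[1] = 200                # writes through to the original

print(original)
print(np.shares_memory(copied, original), np.shares_memory(shared, original))
```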
Passing shape=None to functions with a non-optional shape argument is deprecated.
These aliases have been deprecated. The table below shows the full list of deprecated aliases, along with their exact meaning. Replacing uses of items in the first column with the contents of the second column will work identically and silence the deprecation warning.
This header contains all utilities that are required for the whole CPU dispatching process; it can also be considered a bridge linking the new infrastructure work with NumPy CPU runtime detection.
The keyword argument option norm=backward is added as an alias for None and acts as the default option; using it has the direct transforms unscaled and the inverse transforms scaled by 1/n.
>>> np.broadcast_shapes((1, 2), (3, 1))
(3, 2)
>>> np.broadcast_shapes(2, (3, 1))
(3, 2)
>>> np.broadcast_shapes((6, 7), (5, 6, 1), (7,), (5, 1, 7))
(5, 6, 7)
np.float(123)
arr1 = np.zeros((5, 0))
arr1[[20]]
arr2 = np.zeros((5, 5))
arr2[[20], :0]
import numpy as np
arr = np.array([[3, 6, 6], [4, 5, 1]])
# mode: inexact match
np.ravel_multi_index(arr, (7, 6), mode="clap")  # should be "clip"
# searchside: inexact match
np.searchsorted(arr[0], 4, side='random')  # should be "right"
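For comparison, the same two calls with the exact, fully spelled option strings behave as follows. This is a small sketch using the array from the snippet above.

```python
import numpy as np

arr = np.array([[3, 6, 6], [4, 5, 1]])

# Exact mode string: out-of-range indices would be clipped to the valid range.
flat = np.ravel_multi_index(arr, (7, 6), mode="clip")
print(flat)  # flat index = row * 6 + column for each (row, column) pair

# Exact side string: insertion point to the right of any equal entries.
pos = np.searchsorted(arr[0], 4, side="right")
print(pos)
```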
np.array([np.array(array_like)])
arr = np.empty(3, dtype=object)
arr[:] = [array_like1, array_like2, array_like3]