Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.
Project description
Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy_. Theano features:
* **tight integration with NumPy:** an interface similar to NumPy's; numpy.ndarray is also used internally in Theano-compiled functions.
* **transparent use of a GPU:** perform data-intensive computations up to 140x faster than on a CPU (support for float32 only).
* **efficient symbolic differentiation:** Theano can compute derivatives for functions of one or many inputs.
* **speed and stability optimizations:** avoid nasty bugs when computing expressions such as log(1 + exp(x)) for large values of x (a short example follows below).
* **dynamic C code generation:** evaluate expressions faster.
* **extensive unit-testing and self-verification:** includes tools for detecting and diagnosing bugs and/or potential problems.
Theano has been powering large-scale computationally intensive scientific
research since 2007, but it is also approachable enough to be used in the
classroom (IFT6266 at the University of Montreal).
.. _NumPy: http://numpy.scipy.org/
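The following is a minimal, hedged sketch of what the points above look like in
practice: a symbolic expression is defined, Theano is asked to differentiate it,
and both are compiled into a callable function. The variable names are
illustrative only; the calls used (theano.function, T.grad, T.dvector) are
standard Theano API::

    import theano
    import theano.tensor as T

    # Symbolic input: a vector of doubles.
    x = T.dvector('x')

    # sum(log(1 + exp(x))): Theano's stability optimizations rewrite this
    # expression so it does not overflow for large values of x.
    y = T.sum(T.log(1 + T.exp(x)))

    # Symbolic differentiation: dy/dx is derived from the graph.
    gy = T.grad(y, x)

    # Compile both expressions into a single callable function.
    f = theano.function([x], [y, gy])

    print(f([0.0, 1.0, 100.0]))

With a supported GPU and the appropriate flags (e.g. device=gpu,
floatX=float32), the same kind of graph can run on the GPU, provided float32
types such as T.fvector are used instead of T.dvector.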
Modifications in the trunk since the last release
--------------------------------------------------
This is a partial list of what is in the trunk since the last release.
Deprecation:
* tag.shape attribute deprecated (#633)
* FAST_RUN_NOGC mode deprecated
* CudaNdarray_new_null is deprecated in favour of CudaNdarray_New
Bugs fixed:
* Bugfix in CudaNdarray.__iadd__: when the operation is not implemented, the error is now returned.
* Typo fixed in tensor/opt.py
* THEANO_FLAGS='optimizer=None' now works as expected (see the sketch after this list)
* Fixed memory leak in error handling on GPU-to-host copy
* Fix relating specifically to Python 2.7 on Mac OS X
* infer_shape can now handle Python longs
* Fixed behaviour of pydotprint's max_label_size option
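A brief, hedged illustration of the flag mentioned above (not part of the
release notes): THEANO_FLAGS is read from the environment when theano is first
imported, so it must be set beforehand; the corresponding config attribute is
theano.config.optimizer::

    import os

    # Must be set before theano is imported; 'optimizer=None' disables the
    # graph optimizer, which is useful for debugging.
    os.environ['THEANO_FLAGS'] = 'optimizer=None'

    import theano
    print(theano.config.optimizer)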
Crash fixed:
* Worked around a bug in gcc 4.3.0 that made the compilation of the 2d convolution crash.
Optimization:
* Optimized 4 patterns of a subtensor followed by a subtensor.
* Gemm inplace optimization on the GPU re-enabled
GPU:
* Fused elemwise operations that contain dtypes other than float32 (except
  float64) are now moved to the GPU if their inputs and outputs are float32.
* This allows elemwise comparisons to be moved to the GPU when their result is
  cast to float32 afterwards (see the sketch after this list).
* Implemented CudaNdarray.ndim so that the interface matches numpy.ndarray.
* Fixed slowdown caused by multiple chained views on CudaNdarray objects
* CudaNdarray_alloc_contiguous changed so as to never try to free
memory on a view: new "base" property
* Safer decref behaviour in CudaNdarray in case of failed allocations
* New GPU implementation of tensor.basic.outer
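A hedged sketch of the comparison point above (the variable names are ours;
only T.fmatrix, T.cast and theano.function are assumed from the Theano API).
The comparison itself produces an int8 tensor; casting the result back to
float32 keeps the whole fused expression in float32, which is what allows it
to be moved to the GPU when Theano is configured with device=gpu and
floatX=float32::

    import numpy
    import theano
    import theano.tensor as T

    x = T.fmatrix('x')
    y = T.fmatrix('y')

    # Elemwise comparison (int8 result) cast back to float32.
    z = T.cast(x > y, 'float32')

    f = theano.function([x, y], z)
    print(f(numpy.ones((2, 2), dtype='float32'),
            numpy.zeros((2, 2), dtype='float32')))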
New features:
* ProfileMode
   * profiles the scan overhead
   * simple hook system for adding profilers
   * output reordered from more general to more specific
* var[vector of indices] now works (the gradient works recursively, the direct
  gradient works inplace, and it works on the GPU); see the first sketch after
  this list.
   * Limitation: it works only on the outermost dimension.
* test_value implementation to allow quick debugging at graph creation time;
  see the second sketch after this list.
* cuda.root inferred if nvcc is on the path, otherwise defaults to
/usr/local/cuda
* Better graph printing for graphs involving a scan subgraph
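Two hedged sketches of the features mentioned above follow; they are
illustrations only, with variable names of our choosing.

Indexing a variable with a vector of integer indices, and taking the gradient
through that indexing::

    import numpy
    import theano
    import theano.tensor as T

    x = T.dvector('x')
    idx = T.ivector('idx')            # vector of integer indices

    y = x[idx]                        # var[vector of indices]
    cost = T.sum(y ** 2)
    gx = T.grad(cost, x)              # the gradient works through the indexing

    f = theano.function([x, idx], [y, gx])
    print(f([1.0, 2.0, 3.0, 4.0], numpy.asarray([0, 2], dtype='int32')))

Using test values to catch shape and type errors while the graph is being
built rather than at run time (assuming the compute_test_value flag and the
tag.test_value attribute behave as documented for this feature)::

    import numpy
    import theano
    import theano.tensor as T

    # Propagate test values through each op as the graph is constructed.
    theano.config.compute_test_value = 'warn'

    a = T.dmatrix('a')
    a.tag.test_value = numpy.zeros((3, 4))

    b = T.dot(a, a.T)                 # shapes are checked here, at creation time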
Documentation:
* Better commenting of cuda_ndarray.cu
* Fixes in the scan documentation: add missing declarations/print statements
* Better error message on failed __getitem__
* Updated documentation on profile mode
Unit tests:
* Stricter float comparison by default
* Tensor subtensor tests are reused for the GPU tensor (more GPU tests)
* Tests that check for aliased function inputs and ensure appropriate copying
  (#374)
* Better test of copies in CudaNdarray
* New tests relating to the new base pointer requirements
Other:
* The broadcast flag is now correctly set to True in the output variable of a
  Reshape op when an int 1 is given in the new shape.
* pydotprint: high contrast mode is now the default (see the sketch after this list)
* More compact printing (ignore leading "Composite" in op names)
(To be continued...)
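A hedged sketch of pydotprint usage (the function lives in theano.printing and
requires pydot/graphviz to be installed; max_label_size is the option referred
to in the bug-fix list above)::

    import theano
    import theano.tensor as T
    from theano.printing import pydotprint

    x = T.dmatrix('x')
    y = T.tanh(T.dot(x, x.T))
    f = theano.function([x], y)

    # Writes an image of the optimized graph; high-contrast colours are now
    # the default, and max_label_size truncates long node labels.
    pydotprint(f, outfile='theano_graph.png', max_label_size=50)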
Download files
Source distributions:

* Theano-0.4.0rc2.zip (1.1 MB)
* Theano-0.4.0rc2.tar.gz (972.6 kB)
File details
Details for the file Theano-0.4.0rc2.zip:

* Size: 1.1 MB
* Tags: Source
* SHA256: 52c620ff34314089407f0530a4edb1749ec5cae812924df3ce1fa28d2b6182e9
* MD5: 053ef9f598c8f66e45812d746d50d617
* BLAKE2b-256: ef8fbbb457597d21ca8bfc3a66c5d05665a7a0436c2970151dcc935583a1d0e9
File details
Details for the file Theano-0.4.0rc2.tar.gz:

* Size: 972.6 kB
* Tags: Source
* SHA256: c1d67866d96ca96f0dc515c91263a7a24674665c58a0e9f58e2049d6543fbb2d
* MD5: e8328ca07b1e7b641896c3ca5a57331d
* BLAKE2b-256: 94d138895a146ef06ecc132bda0ee5ef9666a4bc65881117bb4bc8a0f1b49b55