Optimizing compiler for evaluating mathematical expressions on CPUs and GPUs.
Project description
Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features:
tight integration with NumPy: an interface similar to NumPy's, with numpy.ndarray used internally in Theano-compiled functions.
transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only).
efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs.
speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1 + exp(x)) for large values of x (see the sketch below).
dynamic C code generation: evaluate expressions faster.
extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems.
Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal).
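A minimal sketch of the define/compile/evaluate workflow using the core theano.tensor API; the expression mirrors the log(1 + exp(x)) example above, which Theano's stability optimizations rewrite into a numerically safe softplus form:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.dvector('x')
    y = T.log(1 + T.exp(x)).sum()  # rewritten to a stable softplus form by the optimizer
    gy = theano.grad(y, x)         # symbolic differentiation

    f = theano.function([x], [y, gy])  # compiles the graph, generating C code where possible
    value, gradient = f(np.array([0.0, 1.0, 1000.0]))  # no overflow for the large input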
Release Notes
Theano 1.0.0 (15th of November, 2017)
This is the final release of Theano 1.0.0, with many new features, interface changes, improvements and bug fixes.
We recommend that everyone update to this version.
- Highlights (since 0.9.0):
Announcing that MILA will stop developing Theano
conda packages are now available and updated in our own conda channel mila-udem. To install: conda install -c mila-udem theano pygpu
Support NumPy 1.13
Support pygpu 0.7
Raised the minimum supported Python 3 version from 3.3 to 3.4
Added conda recipe
Replaced the deprecated package nose-parameterized with the up-to-date package parameterized in Theano's requirements
Theano now internally uses sha256 instead of md5, to work on systems that forbid md5 for security reasons
Removed old GPU backend theano.sandbox.cuda. New backend theano.gpuarray is now the official GPU backend
Make sure MKL uses GNU OpenMP
NB: the matrix dot product (gemm) with MKL from conda could return wrong results in some cases. We have reported the problem upstream, and we have a workaround that raises an error with information about how to fix it.
Improved elemwise operations
Sped up elemwise ops based on SciPy
Fixed memory leaks related to elemwise ops on GPU
Scan improvements
Sped up Theano scan compilation and gradient computation
Added a meaningful error message when inputs to scan are missing
Sped up the graph toposort algorithm
Faster C compilation through extensive use of a new interface for op params
Faster optimization step, with new optional destroy handler
Documentation updated and more complete
Added documentation for RNNBlock
Updated conv documentation
Support more debuggers for PdbBreakpoint
Many bug fixes, crash fixes and warning improvements
A total of 71 people contributed to this release since 0.9.0; see the list below.
- Interface changes:
Merged duplicated diagonal functions into two ops: ExtractDiag (extract a diagonal to a vector) and AllocDiag (set a vector as a diagonal of an empty array)
Removed op ExtractDiag from theano.tensor.nlinalg, now only in theano.tensor.basic
Generalized AllocDiag for any non-scalar input
Added new parameter target for MRG functions
Renamed MultinomialWOReplacementFromUniform to ChoiceFromUniform
Changed the grad() method to L_op() in ops that need the outputs to compute the gradient (see the sketch after this list)
Removed or deprecated Theano flags:
cublas.lib
cuda.enabled
enable_initial_driver_test
gpuarray.sync
home
lib.cnmem
nvcc.* flags
pycuda.init
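To illustrate the grad()-to-L_op() change: L_op() receives the op's computed outputs in addition to its inputs and the output gradients, so the gradient can reuse the outputs instead of recomputing them. A minimal sketch of a custom op; SquareOp is a hypothetical example, not part of Theano:

    import theano.tensor as T
    from theano.gof import Op, Apply

    class SquareOp(Op):  # hypothetical op, for illustration only
        __props__ = ()

        def make_node(self, x):
            x = T.as_tensor_variable(x)
            return Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            output_storage[0][0] = inputs[0] ** 2

        # L_op() replaces grad(): unlike grad(), it also receives
        # the op's outputs, which the gradient may reuse.
        def L_op(self, inputs, outputs, output_grads):
            (x,) = inputs
            (gz,) = output_grads
            return [2 * x * gz]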
- Convolution updates:
Implemented separable convolutions for 2D and 3D
Implemented grouped convolutions for 2D and 3D (see the sketch after this list)
Added dilated causal convolutions for 2D
Added unshared convolutions
Implemented fractional bilinear upsampling
Removed old conv3d interface
Deprecated old conv2d interface
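A hedged sketch of these options through the theano.tensor.nnet.conv2d interface; the num_groups and filter_dilation arguments are assumed here to be how the grouped and dilated convolutions above are exposed:

    import theano
    import theano.tensor as T
    from theano.tensor.nnet import conv2d

    images = T.tensor4('images')    # (batch, in_channels, height, width)
    filters = T.tensor4('filters')  # (out_channels, in_channels // groups, kh, kw)

    # Grouped, dilated 2D convolution (argument names are assumptions).
    out = conv2d(images, filters, border_mode='valid',
                 num_groups=2, filter_dilation=(2, 2))

    f = theano.function([images, filters], out)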
- GPU:
Added a meta-optimizer to select the fastest GPU implementations for convolutions
Prevent GPU initialization when not required
Added disk caching option for kernels
Added method my_theano_function.sync_shared() to help synchronize GPU Theano functions
Added useful stats for GPU in profile mode
Added Cholesky op based on cusolver backend
Added GPU ops based on magma library: SVD, matrix inverse, QR, cholesky and eigh
Added GpuCublasTriangularSolve
Added atomic addition and exchange for long long values in GpuAdvancedIncSubtensor1_dev20
Support log gamma function for all non-complex types
Support GPU SoftMax in both OpenCL and CUDA
Support offset parameter k for GpuEye
CrossentropyCategorical1Hot and its gradient are now lifted to GPU
cuDNN:
Official support for v6.* and v7.*
Added spatial transformation operation based on cuDNN
Updated and improved caching system for runtime-chosen cuDNN convolution algorithms
Support cuDNN v7 tensor core operations for convolutions with runtime timed algorithms
Better support and loading on Windows and Mac
Support cuDNN v6 dilated convolutions
Support cuDNN v6 reductions for contiguous inputs
Optimized SUM(x^2), SUM(ABS(X)) and MAX(ABS(X)) operations with cuDNN reductions
Added new Theano flags cuda.include_path, dnn.base_path and dnn.bin_path to help configure Theano when CUDA and cuDNN cannot be found automatically (see the configuration sketch after this section)
Extended Theano flag dnn.enabled with new option no_check to help speed up cuDNN import
Disallowed float16 precision for convolution gradients
Fixed memory alignment detection
Added profiling in C debug mode (with theano flag cmodule.debug=True)
Added Python scripts to help test cuDNN convolutions
Automatic addition of cuDNN DLL path to PATH environment variable on Windows
Updated float16 support
Added documentation for GPU float16 ops
Support float16 for GpuGemmBatch
Started to use float32 precision for computations that don’t support float16 on GPU
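As a configuration sketch, the GPU backend and the cuDNN flags above are typically set through the THEANO_FLAGS environment variable before theano is imported; the paths below are placeholders:

    import os

    # Must be set before `import theano`; paths are placeholders.
    os.environ['THEANO_FLAGS'] = ','.join([
        'device=cuda',           # new gpuarray backend
        'floatX=float32',
        'dnn.enabled=no_check',  # skip the cuDNN check to speed up import
        'dnn.base_path=/usr/local/cudnn',
    ])

    import theano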
- New features:
Implemented truncated normal distribution with the Box-Muller transform
Added L_op() overriding option for OpFromGraph
Added NumPy C-API based fallback implementation for [sd]gemv_ and [sd]dot_
Implemented topk and argtopk on CPU and GPU (see the sketch after this list)
Implemented max() and min() functions for boolean and unsigned integer types
Added tensor6() and tensor7() in theano.tensor module
Added boolean indexing for sub-tensors
Added covariance matrix function theano.tensor.cov
Added a wrapper for Baidu’s CTC cost and gradient functions
Added scalar and elemwise CPU ops for the modified Bessel functions of orders 0 and 1 from scipy.special
Added Scaled Exponential Linear Unit (SELU) activation
Added sigmoid_binary_crossentropy function
Added trigamma function
Added unravel_index and ravel_multi_index functions on CPU
Added modes half and full for Images2Neibs ops
Implemented gradient for AbstractBatchNormTrainGrad
Implemented gradient for matrix pseudoinverse op
Added new prop replace for ChoiceFromUniform op
Added new prop on_error for CPU Cholesky op
Added new Theano flag deterministic to help control how Theano optimizes certain ops that have deterministic versions. Currently used for subtensor ops only.
Added new Theano flag cycle_detection to speed up the optimization step by reducing time spent in inplace optimizations
Added new Theano flag check_stack_trace to help check the stack trace during the optimization process
Added new Theano flag cmodule.debug to allow a debug mode for Theano C code. Currently used for cuDNN convolutions only.
Added new Theano flag pickle_test_value to help disable pickling test values
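A short sketch exercising a few of the additions above; topk is assumed to return the k largest values along the last axis, and unravel_index is assumed to mirror numpy.unravel_index:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.dvector('x')

    # Boolean indexing for sub-tensors: keep only the positive entries.
    positive = x[x > 0]

    # topk: assumed to return the k largest values along the last axis.
    top3 = T.topk(x, 3)

    f = theano.function([x], [positive, top3])
    print(f(np.array([3.0, -1.0, 2.0, 5.0, -4.0])))

    # unravel_index, assumed to mirror numpy.unravel_index.
    idx = T.iscalar('idx')
    g = theano.function([idx], list(T.unravel_index(idx, (4, 5))))
    print(g(7))  # flat index 7 in a 4x5 array -> row 1, column 2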
- Others:
Kept stack trace for optimizations in new GPU backend
Added deprecation warning for the softmax and logsoftmax vector case
Added a warning announcing that a C++ compiler will become mandatory in the next Theano release (0.11)
Added R_op() for ZeroGrad
Added description for rnnblock
- Other more detailed changes:
Fixed invalid casts and index overflows in theano.tensor.signal.pool
Fixed gradient error for elemwise minimum and maximum when compared values are the same
Fixed gradient for ARange
Removed ViewOp subclass during optimization
Removed useless warning when profile is manually disabled
Added tests for abstract conv
Added options for disconnected_outputs to Rop
Removed theano/compat/six.py
Removed COp.get_op_params()
Support for lists of strings in Op.c_support_code(), to help avoid duplicating support code
Macro names provided for array properties are now standardized in both CPU and GPU C codes
Moved all C code files into separate folder c_code in every Theano module
Many improvements for Travis CI tests (with better splitting for faster testing)
Many improvements for Jenkins CI tests: daily testing on Mac and Windows in addition to Linux
- Committers since 0.9.0:
Frederic Bastien
Steven Bocco
João Victor Tozatti Risso
Arnaud Bergeron
Mohammed Affan
amrithasuresh
Pascal Lamblin
Reyhane Askari
Alexander Matyasko
Shawn Tan
Simon Lefrancois
Adam Becker
Vikram
Gijs van Tulder
Faruk Ahmed
Thomas George
erakra
Andrei Costinescu
Boris Fomitchev
Zhouhan LIN
Aleksandar Botev
jhelie
xiaoqie
Tegan Maharaj
Matt Graham
Cesar Laurent
Gabe Schwartz
Juan Camilo Gamboa Higuera
Tim Cooijmans
Anirudh Goyal
Saizheng Zhang
Yikang Shen
vipulraheja
Florian Bordes
Sina Honari
Chiheb Trabelsi
Shubh Vachher
Daren Eiri
Joseph Paul Cohen
Laurent Dinh
Mohamed Ishmael Diwan Belghazi
Jeff Donahue
Ramana Subramanyam
Bogdan Budescu
Dzmitry Bahdanau
Ghislain Antony Vaillant
Jan Schlüter
Nan Jiang
Xavier Bouthillier
fo40225
mrTsjolder
wyjw
Aarni Koskela
Adam Geitgey
Adrian Keet
Adrian Seyboldt
Anmol Sahoo
Chong Wu
Holger Kohr
Jayanth Koushik
Lilian Besson
Lv Tao
Michael Manukyan
Murugesh Marvel
NALEPA
Rebecca N. Palmer
Zotov Yuriy
dareneiri
lrast
morrme
naitonium