Converts MS Excel formulas to Python and evaluates them.
Project description
Excel Calculator
xlcalculator is a Python library that reads MS Excel files and, to the extent of supported functions, can translate the Excel functions into Python code and subsequently evaluate the generated Python code. Essentially doing the Excel calculations without the need for Excel.
xlcalculator is a modernization of the koala2 library.
xlcalculator currently supports:
Loading an Excel file into a Python compatible state
Saving Python compatible state
Loading Python compatible state
Ignore worksheets
Extracting sub-portions of a model. “focussing” on provided cell addresses or defined names
Evaluating
Individual cells
Defined Names (a “named cell” or range)
Ranges
Shared formulas not an Array Formula
Operands (+, -, /, *, ==, <>, <=, >=)
on cells only
Set cell value
Get cell value
Parsing a dict into the Model object
Code is in examples\third_party_datastructure
Functions are at the bottom of this README
- LN
Python Math.log() differs from Excel LN. Currently returning Math.log()
VLOOKUP - Exact match only
YEARFRAC - Basis 1, Actual/actual, is only within 3 decimal places
Not currently supported:
Array Formulas or CSE Formulas (not a shared formula): https://stackoverflow.com/questions/1256359/what-is-the-difference-between-a-shared-formula-and-an-array-formula or https://support.office.com/en-us/article/guidelines-and-examples-of-array-formulas-7d94a64e-3ff3-4686-9372-ecfd5caa57c7#ID0EAAEAAA=Office_2013_-_Office_2019)
Functions required to complete testing as per Microsoft Office Help website for SQRT and LN
EXP
DB
Run tests
Setup your environment:
virtualenv -p 3.10 ve ve/bin/pip install -e .[test]
From the root xlcalculator directory:
ve/bin/py.test -rw -s --tb=native
Or simply use tox:
tox
Run Example
From the examples/common_use_case directory:
python use_case_01.py
Adding/Registering Excel Functions
Excel function support can be easily added.
Fundamental function support is found in the xlfunctions directory. The functions are thematically organised in modules.
Excel functions can be added by any code using the xlfunctions.xl.register() decorator. Here is a simple example:
from xlcalculator.xlfunctions import xl
@xl.register()
@xl.validate_args
def ADDONE(num: xl.Number):
return num + 1
The @xl.validate_args decorator will ensure that the annotated arguments are converted and validated. For example, even if you pass in a string, it is converted to a number (in typical Excel fashion):
>>> ADDONE(1):
2
>>> ADDONE('1'):
2
If you would like to contribute functions, please create a pull request. All new functions should be accompanied by sufficient tests to cover the functionality. Tests need to be written for both the Python implementation of the function (tests/xlfunctions) and a comparison with Excel (tests/xlfunctions_vs_excel).
Excel number precision
Excel number precision is a complex discussion.
It has been discussed in a Wikipedia page.
The fundamentals come down to floating point numbers and a contention between how they are represented in memory Vs how they are stored on disk Vs how they are presented on screen. A Microsoft article explains the contention.
This project is attempting to take care while reading numbers from the Excel file to try and remove a variety of representation errors.
Further work will be required to keep numbers in-line with Excel throughout different transformations.
From what I can determine this requires a low-level implementation of a numeric datatype (C or C++, Cython??) to replicate its behaviour. Python built-in numeric types don’t replicate behaviours appropriately.
Unit testing Excel formulas directly from the workbook.
If you are interested in unit testing formulas in your workbook, you can use FlyingKoala. An example on how can be found here.
TODO
Do not treat ranges as a granular AST node it instead as an operation “:” of two cell references to create the range. That will make implementing features like A1:OFFSET(...) easy to implement.
Support for alternative range evaluation: by ref (pointer), by expr (lazy eval) and current eval mode.
Pointers would allow easy implementations of functions like OFFSET().
Lazy evals will allow efficient implementation of IF() since execution of true and false expressions can be delayed until it is decided which expression is needed.
Implement array functions. It is really not that hard once a proper RangeData class has been implemented on which one can easily act with scalar functions.
Improve testing
Refactor model and evaluator to use pass-by-object-reference for values of cells which then get “used”/referenced by ranges, defined names and formulas
Handle multi-file addresses
Improve integration with pyopenxl for reading and writing files example of problem space
Supported Functions
Compatibility
Function
xlcalculator
PyCel
formulas
Koala
FLOOR
Date and Time
Function
xlcalculator
PyCel
formulas
Koala
DATE
DATEDIF
DATEVALUE
DAY
DAYS
EDATE
EOMONTH
HOUR
ISOWEEKNUM
MINUTE
MONTH
NOW
SECOND
TIME
TIMEVALUE
TODAY
WEEKDAY
YEAR
YEARFRAC
Engineering
Function
xlcalculator
PyCel
formulas
Koala
BIN2DEC
BIN2HEX
BIN2OCT
DEC2BIN
DEC2HEX
DEC2OCT
HEX2BIN
HEX2DEC
HEX2OCT
OCT2BIN
OCT2DEC
OCT2HEX
Financial
Function
xlcalculator
PyCel
formulas
Koala
IRR
NPV
PMT
PV
SLN
VDB
XIRR
XNPV
Information
Function
xlcalculator
PyCel
formulas
Koala
ISBLANK
ISERR
ISERROR
ISEVEN
ISNA
ISNUMBER
ISODD
ISTEXT
NA
Logical
Function
xlcalculator
PyCel
formulas
Koala
AND
FALSE
IF
IFERROR
IFS
NOT
OR
SWITCH
TRUE
XOR
Lookup and reference
Function
xlcalculator
PyCel
formulas
Koala
CHOOSE
COLUMN
COLUMNS
HLOOKUP
INDEX
INDIRECT
LOOKUP
MATCH
OFFSET
ROW
ROWS
VLOOKUP
Math and Trigonometry
Function
xlcalculator
PyCel
formulas
Koala
ABS
ACOS
ACOSH
ACOT
ACOTH
ARABIC
ASIN
ASINH
ATAN
ATAN2
ATANH
CEILING
CEILING.MATH
CEILING.PRECISE
COS
COSH
COT
COTH
CSC
CSCH
DECIMAL
DEGREES
EVEN
EXP
FACT
FACTDOUBLE
FLOOR.MATH
FLOOR.PRECISE
GCD
INT
ISO.CEILING
LCM
LN
LOG
LOG10
MOD
MROUND
ODD
PI
POWER
RADIANS
RAND
RANDBETWEEN
ROMAN
ROUND
ROUNDDOWN
ROUNDUP
SEC
SECH
SIGN
SIN
SINH
SQRT
SQRTPI
SUM
SUMIF
SUMIFS
SUMPRODUCT
TAN
TANH
TRUNC
Statistical
Function
xlcalculator
PyCel
formulas
Koala
AVERAGE
AVERAGEA
AVERAGEIF
AVERAGEIFS
COUNT
COUNTA
COUNTBLANK
COUNTIF
COUNTIFS
LARGE
LINEST
MAX
MAXA
MAXIFS
MIN
MINA
MINIFS
SMALL
Text
Function
xlcalculator
PyCel
formulas
Koala
CONCAT
CONCATENATE
EXACT
FIND
LEFT
LEN
LOWER
MID
REPLACE
RIGHT
TRIM
UPPER
VALUE
CHANGES
0.5.0 (2023-02-06)
Added support for Python 3.10, dropped 3.8.
Upgraded requirements.txt to latest versions.
yearfrac==0.4.4 was incompatible with latest setuptools.
openpyxl had API changes that were addressed and tests fixed.
0.4.2 (2021-05-17)
Make sure that decimal rounding is only set in context and not system wide.
0.4.1 (2021-05-14)
Fixed cross-sheet references.
0.4.0 (2021-05-13)
Pass ignore_hidden from read_and_parse_archive() to parse_archive()
Add Excel tests for IF().
Add NOT() function.
Implemented BIN2OCT(), BIN2DEC(), BIN2HEX(), OCT2BIN(), OCT2DEC(), OCT2HEX(), DEC2BIN(), DEC2OCT(), DEC2HEX(), HEX2BIN(), HEX2OCT(), HEX2DEC().
Drop Python 3.7 support.
0.3.0 (2021-05-13)
Add support for cross-sheet references.
Make *IF() functions case insensitive to properly adhere to Excel specs.
Support for Python 3.9.
0.2.13 (2020-12-02)
Add functions: FALSE(), TRUE(), ATAN2(), ACOS(), DEGREES(), ARCCOSH(), ASIN(), ASINH(), ATAN(), CEILING(), COS(), RADIANS(), COSH(), EXP(), EVEN(), FACT(), FACTDOUBLE(), INT(), LOG(), LOG10(). RAND(), RANDBETWRRN(), SIGN(), SIN(), SQRTPI(), TAN()
0.2.12 (2020-11-28)
Add functions: PV(), XIRR(), ISEVEN(), ISODD(), ISNUMBER(), ISERROR(), FLOOR(), ISERR()
Bugfix unary operator needed to be right associated to handle cases of double use eg; double-negative.. –4 == 4
0.2.11 (2020-11-16)
Add functions: DAY(), YEAR(), MONTH(), NOW(), WEEKDAY() EDATE(), EOMONTH(), DAYS(), ISOWEEKNUM(), DATEDIF() FIND(), LEFT(), LEN(), LOWER(), REPLACE(), TRIM() UPPER(), EXACT()
0.2.10 (2020-10-30)
Support CONCATENATE
Update setup.py classifiers, licence and keywords
0.2.9 (2020-09-26)
Bugfix ModelCompiler.read_and_parse_dict() where a dict being parsed into a Model through ModelCompiler was triggering AttributeError on calling xlcalculator.xlfunctions.xl. It’s a leftover from moving xlfunctions into xlcalculator. There has been a test included.
0.2.8 (2020-09-22)
Fix implementation of ISNA() and NA().
Impement MATCH().
0.2.7 (2020-09-22)
Add functions: ISBLANK(), ISNA(), ISTEXT(), NA()
0.2.6 (2020-09-21)
Add COUNTIIF() and COUNTIFS() function support.
0.2.5 (2020-09-21)
Add SUMIFS() support.
0.2.4 (2020-09-09)
Updated README with supported functions.
Fix bug in ModelCompiler extract method where a defined name cell was being overwritten with the cell from one of the terms contained within the formula. Added a test for this.
Move version of yearfrac to 0.4.4. That project has removed a dependency on the package six.
0.2.3 (2020-08-18)
In-boarded xlfunctions.
Bugfix COUNTA.
Now supports 256 arguments.
Updated README. Includes words on xlfunction.
Changed licence from GPL-3 style to MIT Style.
0.2.2 (2020-05-28)
Make dependency resolution part of the execution.
AST eval’ing takes care of depedency resolution.
Provide cycle detection with reporting.
Implemented a specific evaluation context. That makes cache control, namespace customization and data encapsulation much easier.
Add more tokenizer tests to increase coverage.
0.2.1 (2020-05-28)
Use a less intrusive way to patch openpyxl. Instead of permanently patching the reader to support cached formula values, mock is used to only patch the reader while reading the workbook.
This way the patches do not interfere with other packages not expecting these new classes.
0.2.0 (2020-05-28)
Support for delayed node evaluation by wrapping them into expressions. The function will eval the expression when needed.
Support for native Excel data types.
Enable and update Excel file based function tests that are now working properly.
Flake8 source code.
0.1.0 (2020-05-25)
Refactored xlcalculator types to be more compact.
Reimplemented evaluation engine to not generate Python code anymore, but build a proper AST from the AST nodes. Each AST node supports an eval() function that knows how to compute a result.
This removes a lot of complexities around trying to determine the evaluation context at code creation time and encoding the context as part of the generated code.
Removal of all special function handling.
Use of new xlfunctions implementation.
Use Openpyxl to load the Excel files. This provides shared formula support for free.
0.0.1b (2020-05-03)
Initial release.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.