Excel in VIKTOR: some tips and tricks (featuring openpyxl)

mslootweg · 22 May 2024 09:39

Hi all,

Since the inception of the Excel app generator, there have been a surge of questions on other ways to use Excel within VIKTOR, most notably how to use Python to make changes to Excel sheets, but still use VIKTOR to do the calculations within Excel.

For this, I would like to dedicate today’s Snippet Wednesday to showing an example of using openpyxl, a Python library to read/write Excel files, to help you advance your Excel manipulation skills with Python, and leveraging the result in a VIKTOR app.

Snippet

Let’s start by presenting an example problem: I want to manipulate an existing Excel sheet, changing values of certain cells, that can be used later on in the process to calculate the results.

For this, I will be using openpyxl, so make sure to add this to your requirements.txt.

Fill spreadsheet

In the example, I will be changing two input: a single cell, and a table. Here is the snippet on how one can do this:

from io import BytesIO
from pathlib import Path
from typing import List, Dict

from openpyxl.reader.excel import load_workbook


def fill_spreadsheet(my_number: float, my_table: List[Dict]) -> BytesIO:
    # Load your workbook
    print("Load workbook...")
    wb_path = Path(__file__).parent / 'spreadsheet.xlsx'  # make sure to have the spreadsheet name changed
    wb = load_workbook(wb_path)

    # Select a sheet by name
    sheet_name = "my_first_sheet"
    print(f"Select sheet {sheet_name} and fill values....")
    sheet = wb[sheet_name]

    # Select a cell and assign a new value
    sheet['B3'] = my_number  # assign a value to the cell

    # An example how to add values to a table
    for row_nr, row in enumerate(my_table):
        row_id = 10 + row_nr  # the row that is filled is the location of the first row + the row number
        sheet[f'A{row_id}'] = row['column_1']
        sheet[f'B{row_id}'] = row['column_2']
        sheet[f'C{row_id}'] = row['column_3']
        sheet[f'D{row_id}'] = row['column_4']
        sheet[f'E{row_id}'] = row['column_5']
        sheet[f'F{row_id}'] = row['column_6']
        sheet[f'G{row_id}'] = row['column_7']
        sheet[f'H{row_id}'] = row['column_8']

    # Save the workbook
    print("Save workbook...")
    excel_file = BytesIO()
    wb.save(excel_file)
    return excel_file

To summarize the snippet, here are the steps explained:

A function is created with the two arguments: my_number (a float) and my_table (a list of dictionaries, with keys “column_1” to “column_8”)
The Excel sheet that will be manipulated is loaded using openpyxl.reader.excel.load_workbook (in the example “spreadsheet.xlsx” is used)
The sheet which needs to be manipulated is then selected (in the example “my_first_sheet”)
The cell “B3” is then filled with the my_value variable.
The table, that starts in row 10, is filled with the my_table variable.
The changed workbook is saved to a BytesIO object, and returned.

Compute results

Once you have values filled, you want the Excel sheet to be evaluated. For this VIKTOR’s SpreadsheetCalculation can be used. Here is the snippet:

from viktor.external.spreadsheet import SpreadsheetCalculation


def calculate_excel(excel_file: BytesIO):
    sheet = SpreadsheetCalculation(file=excel_file, inputs=[])
    print("Do calculation...")
    sheet.evaluate(include_filled_file=True)

    return sheet.result.file_content

In summary the following is done: a file is passed in the function as a BytesIO object, loaded into the SpreadsheetCalculation object, evaluated, and finally the results are returned.

Note: The spreadsheet should meet the requirements of the SpreadsheetCalculation, which includes not having Macros, and having the spreadsheets “viktor-input-sheet” and “viktor-output-sheet” included in the Excel sheet.

Application

This can be incorporated in a VIKTOR application, where you can download the results, visualize the results as a PDF, or post-process it to incorporate in other processes. Here is a snippet of a VIKTOR application that:

Downloads the evaluated Excel sheet through a DownloadButton
Renders a PDF of the evaluated Excel sheet
Visualizes the result of the table as a DataView

from io import BytesIO
from pathlib import Path
from typing import List, Dict

from openpyxl.reader.excel import load_workbook
from viktor import progress_message
from viktor.core import ViktorController
from viktor.external.spreadsheet import SpreadsheetCalculation
from viktor.parametrization import ViktorParametrization, NumberField, Table, TextField, DownloadButton
from viktor.result import DownloadResult
from viktor.utils import convert_excel_to_pdf
from viktor.views import DataItem, DataGroup, DataResult, PDFResult, PDFView, DataView


def fill_spreadsheet(my_number: float, my_table: List[Dict]) -> BytesIO:
    # Load your workbook
    progress_message("Load workbook...")
    wb_path = Path(__file__).parent / 'spreadsheet.xlsx'  # make sure to have the spreadsheet name changed
    wb = load_workbook(wb_path)

    # Select a sheet by name
    sheet_name = "my_first_sheet"
    progress_message(f"Select sheet {sheet_name} and fill values....")
    sheet = wb[sheet_name]

    # Select a cell and assign a new value
    sheet['B3'] = my_number  # assign a value to the cell

    # An example how to add values to a table
    for row_nr, row in enumerate(my_table):
        row_id = 10 + row_nr  # the row that is filled is the location of the first row + the row number
        sheet[f'A{row_id}'] = row['column_1']
        sheet[f'B{row_id}'] = row['column_2']
        sheet[f'C{row_id}'] = row['column_3']
        sheet[f'D{row_id}'] = row['column_4']
        sheet[f'E{row_id}'] = row['column_5']
        sheet[f'F{row_id}'] = row['column_6']
        sheet[f'G{row_id}'] = row['column_7']
        sheet[f'H{row_id}'] = row['column_8']

    # Save the workbook
    progress_message("Save workbook...")
    excel_file = BytesIO()
    wb.save(excel_file)
    return excel_file


def calculate_excel(excel_file: BytesIO)
    sheet = SpreadsheetCalculation(file=excel_file, inputs=[])
    progress_message("Do calculation...")
    sheet.evaluate(include_filled_file=True)

    return sheet.result.file_content


class Parametrization(ViktorParametrization):
    my_number = NumberField('My number', default=40)

    table = Table('My table')
    table.column_1 = NumberField('Column 1')
    table.column_2 = TextField('Column 2')
    table.column_3 = TextField('Column 3')
    table.column_4 = NumberField('Column 4')
    table.column_5 = NumberField('Column 5')
    table.column_6 = NumberField('Column 6')
    table.column_7 = NumberField('Column 7')
    table.column_8 = NumberField('Column 8')

    download_report = DownloadButton('Download report', method='download_report', longpoll=True)


class Controller(ViktorController):
    label = 'My Entity Type'
    parametrization = Parametrization

    def download_report(self, params, **kwargs):
        """The method that is triggered when clicking on the Download button with the method name `download_report`"""
        memory_file = fill_spreadsheet(my_number=params.my_number, my_table=params.table)
        excel_file = calculate_excel(memory_file)
        return DownloadResult(file_content=excel_file, file_name="spreadsheet.xlsx")

    @PDFView('Report', duration_guess=10)
    def pdf_view(self, params, **kwargs):
        """Renders the PDF view."""
        memory_file = fill_spreadsheet(my_number=params.my_number, my_table=params.table)
        excel_file = calculate_excel(memory_file)
        with BytesIO(excel_file) as f1:
            pdf = convert_excel_to_pdf(f1)
        return PDFResult(file=pdf)

    @DataView("OUTPUT", duration_guess=10)
    def visualize_data(self, params, **kwargs):
        memory_file = fill_spreadsheet(my_number=params.my_number, my_table=params.table)
        excel_file = calculate_excel(memory_file)

        progress_message("Extract results...")
        wb = load_workbook(filename=BytesIO(excel_file), data_only=True)

        sheet_name = "my_first_sheet"
        sheet = wb[sheet_name]

        # An example how get values from table
        data_item_list = []
        for row_nr, row in enumerate(params.table):
            row_id = 10 + row_nr  # the row that is filled is the location of the first row + the row number
            value = sheet[f'I{row_id}'].value
            data_item_list.append(DataItem(f'Data item {row_nr + 1}', str(value)))
        data = DataGroup(
            *data_item_list
        )
        return DataResult(data)

matthijs · 22 May 2024 09:47

@mslootweg: super nice that you share this here! Can image some are wondering how this relates to the spreadsheet templater that VIKTOR ships, so let me add this here.

When would i use openpyxl and when would i use the spreadsheet templater?

In short: the spreadsheet templater makes sure that you are protected against injection attacks by end-users. The spreadsheet templater is a thin shell around openpyxl, so it’s not fundamentally different (and in general more limited). The difference is that it is run in a separate protected environment, such that any template/injection attacks by end-users can not affect the app. Read more about this in the docs: Documents & spreadsheets - render_jinja_template | VIKTOR Documentation