Access the filetype from a viktor File object

Puijterwaal · 8 October 2021 10:07

I have an entity that can be created based on several different filetypes (.csv, .xls and .xlsx), and I want to be able to know the filetype as the processing of the file is different in each case.

Is it possible to access the filename/extension of the uploaded file during ParamsFromFile? In that case I could do the appropriate processing straight away, or save the file type in a HiddenField for later use. However, I don’t think I can access the file type from the viktor File object directly?

Or do I need to wait until after upload? In that case, what would be the most robust way to detect the file type? I am currently just using params.name, as that includes the extension, but the user can rename the entity after upload, so that does not seem robust.

kvangiessen · 8 October 2021 10:41

Hi Paulien,

Thanks for posting on the forum.

The process_file method can have the entity_id argument. This allows you to retrieve information of the entity through the API.

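@ParamsFromFile(...)
def process_file(file: File, entity_id: int, **kwargs) -> dict:
    # Look up the newly created entity through the API; its name is the uploaded filename
    file_name = API().get_entity(entity_id).name
    # Everything after the last dot, e.g. "xlsx" for "data.xlsx"
    extension = file_name.split(".")[-1]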
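
A small caveat on the split above: for a name without a dot, split(".")[-1] returns the whole name, and the result keeps whatever capitalisation the user typed. A minimal sketch of a more defensive variant, using only the standard library (extension_of is a hypothetical helper name, not part of the VIKTOR SDK):

```python
from pathlib import PurePath


def extension_of(name: str) -> str:
    """Return the lower-cased extension including the dot, or '' if there is none."""
    return PurePath(name).suffix.lower()


# Only the last suffix is returned, so "archive.tar.gz" gives ".gz"
for example_name in ("borehole.XLSX", "borehole", "archive.tar.gz"):
    print(repr(example_name), "->", repr(extension_of(example_name)))
```

Comparing against a fixed set such as {".csv", ".xls", ".xlsx"} then makes it easy to reject anything unexpected.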

However, you cannot always trust the extension in the provided filename. Any user can change the file extension to something arbitrary before uploading the file, which would break your logic.

This should either be made very clear to the end user, or you could base your file-parsing workflow on the file content rather than the file name.

For example, when deciding whether to parse a .GEF CPT file or an .XML CPT file, one could check whether the file content contains particular <xml tags> and choose the parsing strategy accordingly.
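
For the .csv / .xls / .xlsx case from the question, the same idea could be sketched roughly as below. This is only an illustration under a few assumptions: detect_spreadsheet_type is a hypothetical helper, the workbook part is assumed to sit at the conventional path xl/workbook.xml, and anything without NUL bytes is treated as CSV text. The OLE2 signature is also shared by other legacy Office formats, so a match means "probably .xls", not "certainly .xls".

```python
import io
import zipfile

# Signature of the OLE2 compound file container used by legacy .xls workbooks
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def detect_spreadsheet_type(data: bytes) -> str:
    """Guess 'xls', 'xlsx', 'csv' or 'unknown' from raw file bytes."""
    if not data:
        return "unknown"
    if data.startswith(OLE2_MAGIC):
        return "xls"
    if data.startswith(b"PK"):
        try:
            with zipfile.ZipFile(io.BytesIO(data)) as archive:
                # An .xlsx file is a zip archive that contains a workbook part
                if "xl/workbook.xml" in archive.namelist():
                    return "xlsx"
        except zipfile.BadZipFile:
            pass  # not a real zip archive; fall through to the text check
    # Heuristic: plain-text formats such as CSV normally contain no NUL bytes
    if b"\x00" not in data:
        return "csv"
    return "unknown"
```

The raw bytes of a viktor File can be obtained with getvalue_binary(), so the check can run on the content regardless of what the file was named.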

Puijterwaal · 11 November 2021 10:48

I’m hitting this problem again. This time I would like to upload multiple files at once and be able to select one of them through its name / extension.

In this case, the workaround is for the user to upload the files as a single zip, which is then unzipped upon upload. That means I can’t use the entity name to distinguish the files. Fortunately, the zipfile package gives you access to the filename of each member.

The code then looks as follows:

    import zipfile
    from io import BytesIO

    uploaded_files = API().get_entity(entity_id).get_file()

    # Open the uploaded zip in memory and keep each member together with its filename
    with zipfile.ZipFile(BytesIO(uploaded_files.getvalue_binary())) as zipped_files:
        files = []
        for uploaded_file in zipped_files.filelist:
            files.append({"file": File.from_data(zipped_files.read(uploaded_file.filename)), "name": uploaded_file.filename})

You now have a list of dicts, each containing both the file and its filename, and you can process your files accordingly.
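
As a rough sketch of the selection step, the entries can be filtered on the extension of their "name" key. Here select_by_extension is a hypothetical helper, and plain bytes stand in for the File objects so the snippet runs without the VIKTOR SDK. Zip member names always use forward slashes, hence PurePosixPath.

```python
from pathlib import PurePosixPath


def select_by_extension(files, extension):
    """Return the entries whose name has the given extension (case-insensitive)."""
    wanted = extension.lower()
    return [
        entry for entry in files
        if PurePosixPath(entry["name"]).suffix.lower() == wanted
    ]


# Stand-in for the list built from the zip; directory entries end with "/"
files = [
    {"file": b"a,b\n1,2\n", "name": "soil/layers.CSV"},
    {"file": b"placeholder", "name": "report.xlsx"},
    {"file": b"", "name": "soil/"},
]
csv_files = select_by_extension(files, ".csv")
```

Directory entries have no suffix, so they are skipped automatically.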