For multiple apps we use workers on our server.
Over the past months we've been adjusting settings in response to multiple scenarios where the workers don't (properly) start on server restarts.
It has now reached the point where we can't trust the workers to run properly, so one of us checks the server daily.
We reinstalled the workers, allowing them to set up new tasks (after first deleting the existing workers and tasks).
We have used the trigger on system start and on user logon, and we've tried starting the workers with and without a prompt, but every time (after weeks) a restart of the server suddenly messes this up.
Daniel Sommers has been working with us for a while, and when I mentioned this to him, he seemed to remember more companies having trouble with this very issue. I would like to find a robust and reliable solution.
One solution I see is using a third-party application, which might offer more options.
One thing to keep in mind is that our ICT currently doesn't allow a server account to be logged on automatically on a server restart. Luckily Task Scheduler has an option ("Run whether user is logged on or not"), with which the worker is started without a prompt. The VIKTOR UI shows green icons for "Status of integrations", but when the workers are actually used they do not respond at all.
I'd like to hear what solutions you know of for this.
Johan and I are in contact about internal actions to take at his organization in order to solve these issues. I will keep this thread updated if any findings are potentially applicable to other users.
Could it also be possible that the connection between a worker and the platform is lost without either of them knowing?
We noticed some strange behavior with local workers after a short internet outage. The platform still thought the worker was connected, but when a job was sent to it, nothing showed in the worker console. We didn't investigate further at the time, as a restart of the worker quickly solved the issue.
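To illustrate how a connection can look fine on one side while being dead on the other, here is a minimal sketch of heartbeat-based liveness checking. This is not how the platform or the worker actually works (I don't know their internals); `HeartbeatMonitor`, `beat` and `is_alive` are hypothetical names, and the 30-second timeout is an arbitrary assumption.

```python
import time
from typing import Callable, Optional


class HeartbeatMonitor:
    """Flags a peer as stale when no heartbeat arrived within timeout_s.

    Hypothetical illustration only; not part of any worker or platform API.
    """

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen: Optional[float] = None

    def beat(self) -> None:
        """Record that the peer was just heard from."""
        self._last_seen = self._clock()

    def is_alive(self) -> bool:
        """True only if a heartbeat was received within the timeout."""
        if self._last_seen is None:
            return False
        return (self._clock() - self._last_seen) <= self.timeout_s
```

Without a check like this on both ends, a short outage can leave one side believing the connection is still open, which would match what we saw: the platform reports the worker as connected while jobs never arrive.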
All workers are online on the server, but sometimes a single worker just turns off.
I wonder what makes this unreliable, because it is a big issue in production.
I would like to be kept in the loop on this issue as well, as we're encountering the same problem with the status of integrations. The workers aren't showing any errors, but the status on the platform is red.
Restarting the workers resolves the issue for a few days (normally). I've set up an auto-reboot of the server each morning, which has improved the situation a bit, but the problem still occurs from time to time.
The server and its workers performed very stably for more than a year, but since December 2022 (or so) we've encountered this issue.
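As a stopgap for workers that exit on their own, something like the supervisor loop below could replace a manual or daily restart. It is only a sketch under assumptions: `supervise` is a hypothetical helper, `DEMO_CMD` is a stand-in command that exits immediately (the real worker command line would go there), and the restart limit and delay are arbitrary. It only helps when the worker process actually exits; it does nothing for a worker that keeps running but has silently lost its connection.

```python
import subprocess
import sys
import time
from typing import List

# Stand-in command that fails immediately; replace with the real worker command.
DEMO_CMD: List[str] = [sys.executable, "-c", "raise SystemExit(1)"]


def supervise(cmd: List[str], max_restarts: int = 3, delay_s: float = 1.0) -> int:
    """Run cmd and restart it after a non-zero exit, up to max_restarts times.

    Returns the number of times the process was started.
    """
    starts = 0
    while starts <= max_restarts:
        proc = subprocess.Popen(cmd)
        starts += 1
        exit_code = proc.wait()      # blocks until the process exits
        print(f"process exited with code {exit_code} (start {starts})")
        if exit_code == 0:
            break                    # clean exit: do not restart
        time.sleep(delay_s)          # brief pause before the next attempt
    return starts
```

Calling `supervise(DEMO_CMD, max_restarts=2, delay_s=0)` starts the stand-in process three times (one initial start plus two restarts) and then gives up.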
We are currently in the process of rolling out a solution that should solve this issue. Once everything is up and running I will post an update here again.
We have rolled out a solution that should solve the problems related to the workers seemingly losing connection to the platform. If problems still persist, please let us know (either here or through email).
Just to add that I am facing the same reliability issues with workers (generic). I cannot reliably connect to it, and sometimes it takes two presses of "send" to connect.