Optimizing Operating Rooms and Care Services using Deep Reinforcement Learning (OPERATE)

Today, surgical appointments are scheduled according to the availability of the physicians, the care providers, the patients, and the required resources (rooms, equipment, etc.) at a particular time. Once everything is available, the surgery reservation is confirmed. However, experience shows that the closer the date of the intervention, the higher the risk that someone or something becomes unavailable or that another event disrupts the schedule. These sources of uncertainty are currently handled by human experts: relying solely on their experience, they manage exceptions as they occur, and the best of them even anticipate them.

The complexity of the surgical scheduling problem has often led researchers to focus on one aspect of the problem at a time. The main advantage of this project is that we have access to real data, so we will be able to study these records and estimate the variability in time or cost of each task. We will build on the current state of the art and extend it with recent research on deep learning. We aim to create a dynamic model that takes uncertainty into account. The first step is to select the features that matter most for maximizing operating room (OR) occupancy. According to the existing literature, only a few studies currently use real data to optimize and adjust scheduling techniques.
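As an illustration of what estimating per-task variability from records could look like, here is a minimal sketch. The record layout, the procedure names, and the function `duration_stats` are assumptions made for this example only; they do not describe the project's actual data.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (procedure type, observed duration in minutes).
records = [
    ("hip_replacement", 95), ("hip_replacement", 110), ("hip_replacement", 125),
    ("cataract", 20), ("cataract", 25), ("cataract", 30),
]

def duration_stats(records):
    """Return {procedure: (mean, sample standard deviation)} in minutes."""
    by_procedure = defaultdict(list)
    for procedure, minutes in records:
        by_procedure[procedure].append(minutes)
    # A sample standard deviation needs at least two observations.
    return {
        procedure: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for procedure, values in by_procedure.items()
    }

stats = duration_stats(records)
```

Estimates of this kind (a mean and a spread per task) are the sort of input a stochastic scheduling model would consume; cost variability could be summarized the same way.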

Dynamic learning techniques have been used before, but the resulting models address the booking of patients into ORs rather than (also) the scheduling of the ORs themselves. Other authors have developed a batch scheduling framework that books a set of surgeries into an ordered set of available ORs. OR booking is mainly concerned with the balance between OR utilization and OR overtime. These authors approximate the sum of procedure durations by a normal distribution, provide near-optimal solutions for stochastic scheduling, and show that batch scheduling performs better than open (sequential) booking. Open booking assigns each surgery case, in order of arrival, to the first available and appropriate slot (in terms of specialty, available time, etc.). What we propose here is to optimize not only statically but also dynamically, based on the data at our disposal, and to set up a schedule with inputs and outputs, as in an inventory system, the aim being to minimize unused or wasted minutes.
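The two ideas above, the normal approximation of total procedure time and first-fit open booking, can be sketched as follows. This is a simplified illustration under stated assumptions, not the cited authors' method: durations are assumed independent (so variances add), and the room and case layouts and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def overtime_probability(cases, session_minutes):
    """Probability that the total duration exceeds the session length.

    cases: list of (mean, std) in minutes. Durations are assumed
    independent, so the sum is approximated by a normal distribution
    whose mean and variance are the sums of the individual ones.
    """
    total_mean = sum(m for m, _ in cases)
    total_std = sqrt(sum(s * s for _, s in cases))
    if total_std == 0:
        return 1.0 if total_mean > session_minutes else 0.0
    return 1.0 - NormalDist(total_mean, total_std).cdf(session_minutes)

def open_booking(cases, rooms):
    """First-fit booking: each case, in arrival order, goes to the first
    room with a matching specialty and enough free minutes.

    cases: list of (case_id, specialty, expected_minutes)
    rooms: list of dicts with keys "name", "specialty", "free"
    Returns {case_id: room name, or None if no room fits}.
    """
    free = {room["name"]: room["free"] for room in rooms}  # do not mutate input
    assignment = {}
    for case_id, specialty, minutes in cases:
        assignment[case_id] = None
        for room in rooms:
            if room["specialty"] == specialty and free[room["name"]] >= minutes:
                free[room["name"]] -= minutes
                assignment[case_id] = room["name"]
                break
    return assignment

# Hypothetical example data.
rooms = [
    {"name": "OR1", "specialty": "ortho", "free": 120},
    {"name": "OR2", "specialty": "ortho", "free": 240},
]
arrivals = [("a", "ortho", 100), ("b", "ortho", 60), ("c", "eye", 30)]
booking = open_booking(arrivals, rooms)
p_overtime = overtime_probability([(60, 10), (90, 20), (45, 5)], 240)
```

A batch scheduler would instead consider the whole set of cases at once and could use a quantity such as `overtime_probability` to trade utilization against overtime when choosing among candidate assignments.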

Concretely, we will create a new scheduler modelled as a directed acyclic graph (DAG), which is a modern way to represent scheduling problems and which can include complex dependencies and heterogeneous demands, in addition to being flexible and efficient. Note that even relaxing one or the other of the constraints (i.e. scheduling an independent set of heterogeneous tasks) already leads to NP-complete problems. Furthermore, the process can be automated, for example when a new constraint is added, which would require a “re-design” with standard methods. Algorithms over graphs are generally designed by human experts, but achieving very strong performance is becoming more and more challenging because the algorithmic literature is limited.

What we aim to do here is to apply deep learning methods to challenging graph-based optimization problems. The first problem is that graph-structured data needs to be “transformed” into a Euclidean space before deep learning methods can be used, since these methods usually operate on vectors. There is a family of representations called graph convolutional neural networks, but they are quite specific to particular graphs and are essentially inspired by images. Further understanding is needed in this area, along with an adaptation to our scheduling problem (DAG). Once this step is done, we would like to apply deep neural network techniques or reinforcement learning methods to the problem of obtaining optimal schedules. The novelty here is to combine neural network architectures and reinforcement learning methods for downstream graph optimization, with the aim of reaching new state-of-the-art performance for scheduling. The second step concerns the learning part, from the sequence to the schedule. The “time series” format is very well suited to neural networks, which is why this intermediate step is necessary. An advantage of this representation is that it is not limited. The learning part has to be investigated as well. We propose to compare recurrent neural networks with reinforcement learning techniques for maximizing the reward over time. Applying neural-network-based learning to graphs can provide much more flexibility in optimizing complex schedules over time.
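To make the DAG view of a schedule concrete, here is a minimal sketch that encodes task dependencies as a graph and derives earliest start times from a topological order. The task names and durations are invented for illustration, and the sketch deliberately ignores resource limits and uncertainty, which are the hard parts the project targets. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical tasks for one surgery: expected durations in minutes.
durations = {
    "prepare_room": 15,
    "anesthesia": 20,
    "surgery": 90,
    "recovery": 45,
    "clean_room": 20,
}

# DAG edges: each task maps to the set of tasks that must finish first.
predecessors = {
    "prepare_room": set(),
    "anesthesia": {"prepare_room"},
    "surgery": {"anesthesia"},
    "recovery": {"surgery"},
    "clean_room": {"surgery"},
}

def earliest_starts(durations, predecessors):
    """Earliest start time of each task, assuming unlimited resources.

    Tasks are visited in topological order, so every predecessor's start
    time is known before a task is processed. TopologicalSorter raises
    CycleError if the dependencies are not acyclic.
    """
    start = {}
    for task in TopologicalSorter(predecessors).static_order():
        start[task] = max(
            (start[p] + durations[p] for p in predecessors.get(task, ())),
            default=0,
        )
    return start

starts = earliest_starts(durations, predecessors)
makespan = max(starts[t] + durations[t] for t in durations)
```

In a learning-based scheduler, a graph like `predecessors` (with per-node features such as durations) would be the structured input that has to be embedded before a neural network or a reinforcement learning agent can act on it; the topological pass above only provides a simple baseline.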

Collaboration: Calyps SA