Plot the time distribution for each queue and activity and produce a professional visualisation with each stage queue as a subplot.

TASK – Process optimisation
Background
Police forces across the UK use Crime Scene Investigators (CSIs) to collect forensic evidence from crime scenes. For non-priority crimes (e.g. burglary) matching DNA to people on existing crime databases usually takes around a week or two with current technologies and processes. A key bottleneck is the speed of the DNA sequencing machine, which takes 48 hours to produce results.
Newer digital and DNA technologies mean results can be obtained from forensic evidence faster than ever before, with Rapid DNA machines that can sequence a sample in 2 hours. However, improving turnaround times requires speeding up processes as well as investing in new technology, as much of the week-long wait for results is due to queues that form in the process (from CSI collection to lab prep and eventual validation of the results).
A trial of Rapid DNA forensics was undertaken by West Yorkshire Police (WYP) in 2017¹. The aim was to produce results in less than 24 hours, as this is considered the longest detectives can wait to get useful operational intelligence that increases the chances of conviction. You will replicate the analysis of this trial.
Optimising the Rapid DNA operational process
The current DNA matching process
The current DNA matching process is as follows. First, a petty crime is committed. One of a group of CSIs attends the scene and, if any blood or saliva evidence is available, it is swabbed and taken back to the station by the CSI at the end of their shift (see Figure 1 for an overview of the process)². The CSI must input all the information regarding the forensic evidence they have collected during the day into the forensics system. The DNA samples are then transported to the lab. The samples are first prepared, before being put in a queue for the DNA sequencer machine. Current DNA extraction machines are slow but very accurate, and can process up to 100 samples in one sequencing process. It takes 2 days to run the DNA sequencer, so the machine is run twice a week, once on a Wednesday morning and once on a Friday afternoon, to maximise the number of samples that can be run at one time whilst not unduly delaying results.
Each sample run through the sequencer must be validated by a senior lab researcher before it is sent to the ID database for matching. Only 2% of samples are matched to profiles on the database.
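The effect of the twice-weekly sequencer schedule on queueing time can be illustrated with a minimal sketch. The exact start times (Wednesday 09:00 and Friday 13:00) and the convention that the clock starts at Monday 00:00 are assumptions made for illustration; the Simul8 file defines the real schedule.

```python
MINUTES_PER_DAY = 24 * 60
MINUTES_PER_WEEK = 7 * MINUTES_PER_DAY

# Assumed run start times, in minutes after Monday 00:00 (illustrative only).
RUN_STARTS = [
    2 * MINUTES_PER_DAY + 9 * 60,   # Wednesday 09:00
    4 * MINUTES_PER_DAY + 13 * 60,  # Friday 13:00
]

def wait_for_next_run(arrival_minute):
    """Minutes a sample waits in the sequencer queue until the next scheduled run."""
    minute_of_week = arrival_minute % MINUTES_PER_WEEK
    return min((start - minute_of_week) % MINUTES_PER_WEEK for start in RUN_STARTS)

# A sample arriving just after the Friday run waits until the following Wednesday.
just_missed_friday = 4 * MINUTES_PER_DAY + 13 * 60 + 1
print(f"Wait after missing the Friday run: {wait_for_next_run(just_missed_friday)} minutes")
```

Under these assumptions, a sample that narrowly misses a run waits several days before sequencing even begins, which is why the queue rather than the 48-hour run can dominate the time in system.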
A Simul8 file of the current DNA matching process is included in the coursework material³. More detail of the process is given in Table 2 below.
Figure 1: Current DNA matching process, from CSI visit to identifying a match in the national DNA database.

Task – Sensitivity Analysis of current DNA matching process
1. Using the accompanying Simul8 file, produce a table of the average total time in system, and the operational characteristics for each stage of the process (as listed in Table 2). You MUST state the times in minutes and the relevant low and high 95% range.
Set Random Number Seed to 393 before running the simulation.
Referencing other units such as hours, or not referencing units at all, will attract a zero mark for this task.
Not referencing the low and high 95% range will also attract a zero mark for the task.

Run the model for 500 trials to get low and high 95% ranges.
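Simul8 reports the low and high 95% range directly, but the calculation behind it can be sketched as follows. This is an illustrative sketch only: the trial values are made up, and a normal approximation (z of about 1.96) is assumed in place of the Student t value, which is a close match at 500 trials.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_range(trial_values, level=0.95):
    """Return (low, average, high) for the mean of per-trial results."""
    n = len(trial_values)
    if n < 2:
        raise ValueError("need at least two trials")
    avg = mean(trial_values)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% range
    half_width = z * stdev(trial_values) / sqrt(n)
    return avg - half_width, avg, avg + half_width

# Hypothetical per-trial average times in system, in minutes (not real Simul8 output).
example_trials = [10120.0, 9980.0, 10240.0, 10055.0, 9890.0, 10310.0]
low, avg, high = confidence_range(example_trials)
print(f"Average time in system: {avg:.1f} min (95% range {low:.1f} to {high:.1f} min)")
```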

How many trials are needed to converge the model if all relevant parameters are monitored (i.e. total time in system, average number in queue, average time in queue for each stage)? Are 500 trials enough?
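One common way to judge convergence is to ask how many trials are needed for the 95% range to fall within a chosen percentage of the mean, for every monitored parameter. The sketch below assumes a 5% precision target, a normal approximation, and made-up pilot values; the metric names are hypothetical.

```python
from math import ceil
from statistics import NormalDist, mean, stdev

def trials_needed(pilot_values, precision=0.05, level=0.95):
    """Estimate the trials needed for the half-width to be within precision * mean."""
    avg = mean(pilot_values)
    if avg == 0:
        raise ValueError("relative precision is undefined for a zero mean")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    required = (z * stdev(pilot_values) / (precision * abs(avg))) ** 2
    return max(2, ceil(required))

def trials_for_all(metrics, precision=0.05):
    """Return the largest requirement across metrics, plus the per-metric values."""
    per_metric = {name: trials_needed(values, precision) for name, values in metrics.items()}
    return max(per_metric.values()), per_metric

# Hypothetical pilot results from a handful of trials (minutes or counts).
pilot = {
    "Total time in system": [100, 110, 90, 105, 95],
    "Average time in sequencer queue": [40, 70, 35, 60, 45],
}
overall, detail = trials_for_all(pilot)
print(f"Trials needed so that every metric is within 5%: {overall}")
print(detail)
```

The most variable parameter sets the requirement, so a model that looks converged on total time in system may still need more trials for an individual queue.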

According to the average times in queues and activities, identify the main stages of the system that delay the samples from being matched.
Run ONE trial (using the play button), using the Random Number Seed 393.

Extract the data from the Simul8 spreadsheet “Times” into the CSV file “Time_spreadsheet.csv”, making sure to preserve, and line up with, the headings.

For all samples that have passed through the complete forensics process, create a dataset of the time spent in each queue and activity for each sample.
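A minimal sketch of building this dataset is shown below. It assumes that each row of the CSV is one sample and that each column holds the simulation clock time (in minutes) at which the sample entered a stage, with blanks for stages not yet reached. The column headings are hypothetical; the real "Times" spreadsheet may be laid out differently, in which case the differencing step needs adapting.

```python
import csv
import io

# Hypothetical headings: clock time (minutes) at which a sample entered each stage.
STAGE_COLUMNS = ["Queue for CSI", "CSI visit", "Queue for lab", "Sequencing", "Exit"]

def stage_durations(csv_file, columns=STAGE_COLUMNS):
    """Return one dict of stage -> minutes spent, for completed samples only."""
    rows = []
    for record in csv.DictReader(csv_file):
        raw = [(record.get(name) or "").strip() for name in columns]
        if any(value == "" for value in raw):
            continue  # sample had not passed through every stage when the run ended
        stamps = [float(value) for value in raw]
        durations = {columns[i]: stamps[i + 1] - stamps[i] for i in range(len(columns) - 1)}
        durations["Total"] = stamps[-1] - stamps[0]
        rows.append(durations)
    return rows

# Made-up data standing in for Time_spreadsheet.csv; the third sample is incomplete.
sample_csv = """Queue for CSI,CSI visit,Queue for lab,Sequencing,Exit
0,120,180,900,3780
30,200,260,900,3780
45,300,,,
"""
rows = stage_durations(io.StringIO(sample_csv))
for row in rows:
    print(row)
```

With the real file, the same function could be called as `stage_durations(open("Time_spreadsheet.csv", newline=""))` once the column list matches the actual headings.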

Plot the time distribution for each queue and activity. Produce a professional visualisation with each stage queue as a subplot.
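One possible layout, sketched with matplotlib (a third-party package, assumed to be installed), draws one histogram per stage on a shared grid. The function and argument names are illustrative, and the input is assumed to be a mapping from stage name to a list of times in minutes.

```python
from math import ceil

def subplot_grid(n_panels, max_cols=3):
    """Return (rows, cols) for a grid with at most max_cols columns."""
    cols = min(max_cols, max(1, n_panels))
    rows = max(1, ceil(n_panels / cols))
    return rows, cols

def plot_stage_distributions(durations_by_stage, out_path, bins=30):
    """Save a figure with one histogram per stage (stage name -> list of minutes)."""
    # Imported here so the layout helper above works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    rows, cols = subplot_grid(len(durations_by_stage))
    fig, axes = plt.subplots(rows, cols, figsize=(5 * cols, 3.5 * rows), squeeze=False)
    flat_axes = axes.ravel()
    for ax, (stage, minutes) in zip(flat_axes, durations_by_stage.items()):
        ax.hist(minutes, bins=bins, color="steelblue", edgecolor="white")
        ax.set_title(stage)
        ax.set_xlabel("Time (minutes)")
        ax.set_ylabel("Number of samples")
    for ax in flat_axes[len(durations_by_stage):]:
        ax.set_visible(False)  # hide unused panels in the grid
    fig.suptitle("Time distribution by queue and activity")
    fig.tight_layout()
    fig.savefig(out_path, dpi=200)
    plt.close(fig)
```

A call such as `plot_stage_distributions(times_by_stage, "stage_distributions.png")` would write the figure to disk. Labelling every axis in minutes keeps the plot consistent with the units required in the results table.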

Making reference to the plotted distributions, describe the impact of the variance in the queues on the total time in the system.

Conduct an appropriate sensitivity analysis of the total time in the system. Which stages have the most effect on the variation of the total time in system?
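One possible approach, among several, uses the fact that the total time is the sum of the stage times, so the variance of the total splits into the covariance of each stage with the total. The sketch below computes each stage's share of that variance. The stage names and values are made up, and other methods (for example regression or rank correlation) may be equally appropriate.

```python
from statistics import mean

def covariance(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def variance_shares(stage_times):
    """Map stage -> share of the variance of total time (the shares sum to 1).

    stage_times maps stage name -> list of minutes, one entry per sample,
    with samples in the same order in every list.
    """
    totals = [sum(values) for values in zip(*stage_times.values())]
    total_variance = covariance(totals, totals)
    if total_variance == 0:
        raise ValueError("total time does not vary, so shares are undefined")
    return {stage: covariance(values, totals) / total_variance
            for stage, values in stage_times.items()}

# Hypothetical stage times in minutes for four samples.
example = {
    "CSI visit": [10, 20, 30, 40],
    "Data entry": [5, 5, 5, 5],
    "Queue for sequencer": [100, 300, 200, 400],
}
for stage, share in variance_shares(example).items():
    print(f"{stage}: {share:.1%} of the variance in total time")
```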

Finally, add a column to your results table indicating the 5th and 95th percentiles of the distribution for each queue and activity.
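The percentiles can be read off the per-sample data with the standard library, as in the sketch below. The stage names and values are hypothetical, and the "inclusive" interpolation method is an assumption; other percentile conventions give slightly different values on small samples.

```python
from statistics import quantiles

def percentile_range(minutes):
    """Return the (5th, 95th) percentiles of a list of times in minutes."""
    # n=20 gives 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(minutes, n=20, method="inclusive")
    return cuts[0], cuts[-1]

# Hypothetical per-sample times in minutes for two stages.
stage_times = {
    "Queue for sequencer": [120, 480, 960, 1500, 2200, 2900, 3400, 4100],
    "Validation": [15, 18, 20, 22, 25, 27, 30, 45],
}
for stage, minutes in stage_times.items():
    p5, p95 = percentile_range(minutes)
    print(f"{stage}: 5th percentile {p5:.1f} min, 95th percentile {p95:.1f} min")
```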

Scenario 1 – simple replacement of old sequencers with new Rapid DNA sequencers
In this scenario the older DNA sequencer machines are replaced with newer Rapid DNA machines (see Table 3 for differences in specifications). As the Rapid DNA machine can only hold up to 8 samples at a time, the lab has made the decision to simply run the machine when there are enough samples to completely fill the cartridge. This means that there is less waiting time for samples to be run, but only eight can be processed at one time.
To monitor the Rapid DNA machine and prevent processed sequences waiting a long time for validation, the lab has employed extra staff to ensure that samples can be processed quickly. Therefore, validation now takes place as and when samples are ready throughout the day, rather than on Friday and Monday mornings. Validation also takes longer as staff are unfamiliar with the results from the Rapid DNA machine.
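The batching rule in Scenario 1 can be illustrated with a small sketch: a run starts only when eight samples are waiting and the machine is free, so early arrivals in a batch wait for the later ones. The arrival times are made up, and the single-machine assumption and the fixed 2-hour run time are simplifications.

```python
def batch_waits(arrivals, batch_size=8, run_minutes=120):
    """Return queue waits (minutes) for samples that make it into a full batch."""
    arrivals = sorted(arrivals)
    waits = []
    machine_free_at = 0.0
    full_batches = len(arrivals) // batch_size
    for b in range(full_batches):
        batch = arrivals[b * batch_size:(b + 1) * batch_size]
        # The run starts once the last sample of the batch has arrived
        # and the previous run has finished.
        start = max(batch[-1], machine_free_at)
        waits.extend(start - arrival for arrival in batch)
        machine_free_at = start + run_minutes
    return waits  # samples left in a part-filled cartridge are not included

# Hypothetical arrivals: one sample per hour, in minutes.
hourly = list(range(0, 16 * 60, 60))
waits = batch_waits(hourly)
print(f"Longest wait for a full cartridge: {max(waits)} minutes")
```

Under these assumptions, the first sample in each batch waits seven hours for the cartridge to fill, so a faster machine does not remove the queue on its own.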

Scenario 2 – addition of courier to the Rapid DNA process
Scenario 2 is the same as Scenario 1, except now a courier picks up the samples as they are taken by the CSI (see Figure 2 for process map). To make this work, several things must happen. A courier is on standby during CSI working hours, and the CSI calls the courier once they have a DNA sample to collect. The courier will make a tour of CSI locations throughout the day, picking up samples and delivering them back to the lab as soon as possible.
To facilitate this, CSIs are now required to log DNA samples on a networked tablet at the crime scene, rather than back at the station at the end of the day. This adds an average of 15 minutes to each crime scene visit, but means the CSI can stay out longer as they no longer need to come back to the station to log evidence. The changes to the parameter estimates for the Simul8 model are given in Table 4.
Figure 2: Rapid DNA process map with courier transport.

Scenario 3 – 24 hour CSI visits
This scenario is the same as Scenario 2 (see Figure 2), except now CSIs provide 24 hour coverage, 7 days per week. We want to model this because this is an obvious way to speed up the process, where waiting for the CSI visit can be a major delay (especially over the weekend).
The changes to the parameter estimates for the Simul8 model are given in Table 5.
CSI availability is now spread over a much longer time period, so a fair utilisation of CSIs can be met with only one CSI operating at any one time.

Scenario 4 – 24 hour lab
This scenario is the same as Scenario 2 (see Figure 2), except now the lab provides 24 hour coverage, 7 days per week. The CSIs work their traditional Monday to Friday shifts. The changes to the parameter estimates for the Simul8 model are given in Table 6.

Scenario 5 – Complete 24 hour operation
This scenario combines Scenarios 3 and 4, so that both the CSI visits and the lab provide 24-hour coverage, 7 days per week.

Tasks – Scenario Analysis
Modify the current DNA matching process to model Scenarios 1-5.

Run each scenario model for 500 trials, using a Random Number Seed of 393.

Estimate the minimum, average and maximum times in system for each scenario.

Produce a table listing these values in minutes, including the low and high 95% ranges.
a. Again, incorrect usage of units and/or missing ranges will attract a zero mark.
8. Run ONE trial of each scenario using a Random Number Seed of 393.
Extract the waiting time distributions for each activity.

Investigate how the changes in each scenario change the relevant waiting time distributions (for example, how does changing the CSI schedule from Mon-Fri to 24/7 influence the distribution of the time to wait for a CSI to arrive and collect a sample?).

Therefore, explain how each scenario works to reduce the overall time in system by reducing the waiting time distributions.
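When comparing waiting time distributions between scenarios, it can help to summarise each one with the same few statistics, including the share of samples that wait less than the 24-hour target from the trial. The sketch below uses made-up waiting times for a Mon-Fri and a 24/7 CSI schedule; the labels and values are purely illustrative.

```python
from statistics import mean, median

def summarise_waits(waits_minutes, target_minutes=24 * 60):
    """Summarise a list of waiting times (minutes) against a target."""
    within = sum(1 for wait in waits_minutes if wait <= target_minutes)
    return {
        "mean": mean(waits_minutes),
        "median": median(waits_minutes),
        "max": max(waits_minutes),
        "share_within_target": within / len(waits_minutes),
    }

# Hypothetical waits for a CSI visit, in minutes (not Simul8 output).
scenarios = {
    "CSI Mon-Fri": [30, 90, 240, 2880, 3900],
    "CSI 24/7": [20, 45, 60, 120, 300],
}
for name, waits in scenarios.items():
    summary = summarise_waits(waits)
    print(f"{name}: median {summary['median']} min, max {summary['max']} min, "
          f"{summary['share_within_target']:.0%} within 24 hours")
```

In this illustrative data the medians are not far apart, while the long weekend waits are what separate the two schedules, which is the kind of change in the tail of the distribution the task asks you to explain.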

Return on Investment (ROI) analysis
The objective of the new Rapid DNA process is to provide evidence to catch criminals faster. However, the move to new machines, networked working and 24 hour processes will incur significant cost increases. In this current age any expenditure must be justified, so you are asked to conduct an ROI analysis to show how the new process may improve matters.
A popular theory of policing is the “golden hour”, where the quicker a suspect can be identified, the greater the chance of successful arrest. Thus, much of the aim of policing is to be present as quickly as possible. This applies to the collection of forensic evidence.
Forensic evidence typically isn’t considered enough to convict a suspect, so in the case of petty crimes such as burglary and theft, investigators want to identify suspects quickly so that they can catch the offender with the stolen goods. A rough rule of thumb is that if a suspect can be identified within 2 days, there is a much better chance of arresting the suspect.
After extracting historical arrest data, you have constructed a reasonable objective function for the probability of arrest, which is shown in Figure 3.

Figure 3: Probability of arrest as a function of days since crime was committed. This is the objective function to be used in the ROI analysis.
You have also estimated the costs associated with each scenario, which are listed in Table 7.

Tasks – ROI analysis
A standard “ROI” metric for policing is arrests per £m spent. Assuming a constant rate of 5000 DNA samples per year and that the average time in system for each scenario is an appropriate metric to assess the ROI, estimate the ROI for the traditional DNA forensics process and the proposed Rapid DNA scenarios. Present your findings in a suitable graph.
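The arithmetic of the ROI metric can be sketched as follows. Everything numeric here is a placeholder: the arrest probability curve stands in for the objective function in Figure 3, the costs stand in for Table 7, and the average times are invented. Applying the 2% match rate to the 5000 samples is also a modelling assumption that you should justify or change.

```python
import math

MINUTES_PER_DAY = 24 * 60

def arrest_probability(days):
    """Placeholder curve only; replace with the objective function in Figure 3."""
    return 0.8 * math.exp(-days / 3.0)

def arrests_per_million(avg_minutes, annual_cost_gbp, samples_per_year=5000, match_rate=0.02):
    """Estimated arrests per million pounds spent, using the average time in system."""
    days = avg_minutes / MINUTES_PER_DAY
    arrests = samples_per_year * match_rate * arrest_probability(days)
    return arrests / (annual_cost_gbp / 1_000_000)

# Hypothetical (average time in system in minutes, annual cost in pounds) per scenario.
scenarios = {
    "Current process": (10080, 1_000_000),
    "Scenario 1": (4320, 1_500_000),
}
for name, (avg_minutes, cost) in scenarios.items():
    print(f"{name}: {arrests_per_million(avg_minutes, cost):.2f} arrests per £m")
```

Keeping the time in minutes until the final conversion to days avoids mixing units between the Simul8 results and the objective function.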
Discuss the ROI findings. What is your recommendation? Do you recommend moving to a Rapid DNA process? If so, which scenario do you recommend?
Consider the distribution (not just the average) of times in system for each scenario. What effect do you think the distribution of the times in system for each scenario will have on the estimated ROI? Can you incorporate the distribution of time in system into your ROI estimate? Does this change your recommendations?
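One way to bring the distribution into the estimate is to average the arrest probability over the individual sample times, instead of evaluating it once at the average time. The sketch below again uses a placeholder curve in place of Figure 3 and invented times; whether the distribution-based estimate comes out higher or lower than the average-based one depends on the shape of the real curve.

```python
import math
from statistics import mean

MINUTES_PER_DAY = 24 * 60

def arrest_probability(days):
    """Placeholder curve only; replace with the objective function in Figure 3."""
    return 0.8 * math.exp(-days / 3.0)

def probability_at_average(times_minutes):
    """Arrest probability evaluated once, at the average time in system."""
    return arrest_probability(mean(times_minutes) / MINUTES_PER_DAY)

def expected_probability(times_minutes):
    """Arrest probability averaged over every sample's own time in system."""
    return mean(arrest_probability(t / MINUTES_PER_DAY) for t in times_minutes)

# Hypothetical times in system (minutes): some fast results and some very slow ones.
times = [1 * MINUTES_PER_DAY, 2 * MINUTES_PER_DAY, 13 * MINUTES_PER_DAY, 12 * MINUTES_PER_DAY]
print(f"Using the average time:       {probability_at_average(times):.3f}")
print(f"Using the whole distribution: {expected_probability(times):.3f}")
```

The two estimates agree only when every sample takes the same time; the wider the spread of times in system, the more the average-based ROI can mislead.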
