Parallel Discrete Event Simulation
Department of computer Science
Virtual University of Pakistan
Abstract: Execution of simulating application on particular computer system, figure of simulation performance should be shortest period of time. Exploitation of parallelism is the best technique to achieve the minimums time of simulation. But parallelizing creates new problems in simulation. As we know that events are generated distributed fashion so the occurrence of errors can result the sequence in which to process the events compulsory indeterminate. In this research paper we try to present a model for analyzing inherent parallelism of simulation, with survey of existing techniques for parallel simulation perform. Some innovations in these model are described, results in reliable evaluation of these techniques effectiveness.
In this paper we are interested to study the performance of applications on the class of computing systems. We try to differentiate the following levels which are related to performance evaluation such as general abstract machine, application, language of simulation and simulator of discrete event. Each lower level supports the upper level in simulation of discrete events. Efficiency of each level is determined partially by imposing some constraints to the simulator in this way.
Particularly if performance figures are used iteratively for the application optimization, effectiveness of simulator play important role. Huge discrete events simulation required a lot of time in case of sequential machines. Exploitation of parallelism is best technique to minimize the needed time of simulation. Main disadvantage of this type of simulation is the inherent complexity since the global notion cannot map easily on parallel computers.
To ensure the cause and effect relationships sophisticate clock synchronization algorithms needed which are correctly reproduced by simulator. The concept of parallel simulation was firstly suggest by K.M Chandy and Bryant  consist concepts of primary parallel simulation, deadlock problem deadlock resolution techniques and deadlock recovery. Other techniques were suggested by D.R Jefferson  which is based on virtual time concepts. Remaining research paper is organized as followings. In section II we will introduce some discrete event simulation concepts. In section III we will discuss discrete event simulation from sequential to partial with their effectiveness and suggested parallel simulation methods. Finally we evaluate these methods and present some suggestions for future research work.
- Discrete event simulation concepts
Simulation and modeling can be divided as the complex of activities which are associated to constructing model for real word system and simulate them on computer system. It is necessary for each model should be on time based events. Simulation models can categorize according to their temporal behavior . A models is called continuous time model when flow of time smoothly and continuously. If time flow jumps after some particular time unit it is called discrete model. Second type of model is based on sets range of models descriptive variables. The model is called continuous state mode if range of descriptive variable can be shown by real numbers. If model accepts only discrete values it is called discrete model. Continuous time base model can be further classified in discrete events and differential equations. If changes in state occur continuously and smoothly in time this model is call differential equation model is a continuous time continuous state model. If state change occur at only finite points in time it is called discrete event model even if time flows continuously such as jump of time from one event to other and these events can be occur independently from each other.
- Discrete event simulation
Idea of system and model of system, we already have discussed in classification of simulations. These ideas required particular order for framework development of discrete event model of the system. Main ideas of simulation modeling are as follows.
- System: Set of entities which can interact each other to achieve some goals.
- Model: Abstraction of system representation under some constraints commonly having logical or mathematical relations that represents the behavior of the system.
- System State: It is a set of variable which have all information required to describe the system at any time.
- Entity: Any component or object in the system which needed explicit representation in the system model.
- Attributes: Properties of the given entity are called attributes.
- Event: An event is change in the state of system which may occur instantaneously.
- Activity: If during particular interval of time entities engage some operation called activity.
- Process: A process is a sequence of events that occur order in time. Process should be logically connected and involved with same entity. To understand this idea we should take analogy of bank. In perspective of banking customer is an entity which can interact with other entity called bank. Balance of the customer in his account might be one of the attribute. Making deposit and withdrawal are activities performed by the customer. Number of busy tellers, number of customers waiting for their turn, next customer arrival time is the variables of the possible state. Possible events are arrival time of the new customer and time require serving the customer waiting for his turn in the queue. To model the flow of time each discrete event simulation has a state variable called simulation clock. Simulated time is advanced from the time of the present event to the time of following scheduled event; therefore skipping periods of inactivity. More events are stored in a calendar which consists on time and kind of all events scheduled, commonly in historical order. The nature of the routine dependent of the world view used in the system model. So we try to understand different views of the world related to discrete event simulation.
- Word views
- All simulations have executive routine for calendar and clock management such as events sequence and simulation driving. Next scheduled events are fetched by executive routine, for transfer of control to specific routine and clock of simulation. Operation routines are dependant of word view, events, processes and activities. A point of view which modeler enables to view the word or modeled system is called world view. Commonly discrete event simulation uses any three of the following perspective  scanning activity, interaction of processes or event scheduling. Every kind of events has its corresponding event routine in event scheduling. Routine of executive process a time ordered calendar events select an event for execution. Events notifications contain time stamp and reference to an event routine. Execution of events can be scheduled new events by creating event notice and placing it at the best position in the calendar. At the time of next event, the one at the top of calendar clock is always updated. In approach of scanning activity simulation consist of activates list each of these are defined by two events such as start of the event and accomplishment of event. Every activity have actions and test conditions. Satisfaction time activities, test conditions and actions executes for first selectable activity is scanned by executive routine. Scan start again one completion of execution activity. Focus of process interaction world view on the entities flow through a model. This technique views system as sets of concurrent interacting processes. Each class entity behavior during its lifetime is explained by the process class. Process classes main contain multiple entities and exits at which process interacts with its environments. To keep tracks forthcoming tasks executive routines uses a calendar. Process identity and recording activation time executive routine must maintain the state in which last process was suspended. By using of these three worlds view technique put high computation demand on sequential computer in large discrete event simulations. The interaction of process world view just like as attractive as a starting point in our effort of simulation parallelization. Concurrent objects interacting with each other well defined communication perceives by modeler simulation. Parallel simulation is interesting because it describes the domain of problem and commonly consist substantial amount of inherent parallelism . In this part a parallel view for sequential execution will be presented for analyzing simulation inherent parallelism.
- Discrete event simulation from sequential to partial
- Measurement of average parallelism
If we want to do the simulation in parallel, there are some basic questions which must be answered. How much advantage we can expect from doing things in parallel fashion? What is the inherent simulation parallelism? How well did we perform this once the job is done? Average parallelism is very interesting characterization of the simulation which can be used to answer these questions. Parallelism can be defined in two equivalent ways 1. Total service time required ratio, critical path length through simulation execution. 2. The speedup figures, if a hypothetical machine have an unbounded number of available processors and zero synchronization overhead.
According to second definition average parallelism figure must be regarded as an upper bound to speed up which can be achieved. We are required to implement a tool for analyzing sequential simulation to reveal the average parallelism and extraction of average parallelism . For expressing the parallelism explicitly and contain hardware component and software components are defined in system model. Sequential simulation execution is represent by software component is the form of graph. On the other hand our system model hardware components show our focus on simulation parallelism inherent and make ideal hardware assumptions.
Figure 1: A program activity g
Simulation execution are represented above by using an acyclic directed graph.(Figure.1).Each graph vertex shows an event that is occurring in simulation. Process constraints exist among events, modeling and chronological order of events. The precedence of constraints is modeled by using graph arcs: An arc from vertex EA to vertex EC shows that event EC cannot occur before process of EA event. Two types of arc can be categorized as inter-process arcs and intra-process arcs. In intra-process arcs precedence of event constraints which occur within same process such as arc between EC and EA in figure 1. Intra-process arcs are denoted by sequential work independent unit inside the process. We can understand inter-process arcs as constraints of precedence between events which occur in different processes. Such as arc between vertex EC and EB. Synchronization needs are achieved by some communication primitive in inter-process. Hardware parts of system are modeled as unlimited number of identical processors, each of unit speed. Whole computer system is assigned to one single task and processors have zero synchronization overhead. A serial run of simulation generates an acyclic directed graph of events with constraints precedence. When each process in simulation is allocated to different processors, every intra-process dependent event occurs at same exclusive processor and each inter-process dependent event occurs at different processors. On the other hand intra-process arcs are denoted by independent sequential unit work on a processor, while inter-process arc shows the requirements of synchronization between processors. Time of execution independent units of work, during sequential run operation is measured and allocated to intra-process arcs and zero cost of synchronization to inter-process arcs. Simulation on a hypothetical machine graph is reduced to a representation. The sum of time needed to process the event is equal to the total of all costs in the graph and critical path in execution of the simulation shows the longest path in the graph. The average parallelism measurement for expression of lower bounds on speed up and efficiency, incremental benefit and additional allocating cost of processors used by eager . According to our proposal average parallelism can be used for the measurement of effectiveness evaluation of different methods in parallel simulation. In simple words, how much parallelism that is inherent actually exploited in simulation?
- Basic problems in parallel discrete event simulation
We are particularly interested in simulation of asynchronous system parallelization where synchronization of events is not done by global clock but these are occurring at irregular intervals of time. If some events occur at any single point in time of simulation parallelization methods that are based of synchronous execution using global clock simulation perform poorly. Execution of events at different points concurrently in required simulated tome but it creates interesting synchronization problems. If we examine the sequential discrete event simulator operation these problems become clear. Sequential simulator use three type of data structures: an event list, state variable and global simulation clock commonly. For execution routine least time stamp event is chosen from event list to be processed next. If we ignore the rule and choose another event having high time stamp, there is a possible for execution to change the state variable. This type of error is called causality error. Let we try to understand the above scenario of simulation based parallelization. Commonly simulation of parallel discrete event techniques adhere to a interaction world view process which strictly forbids processes to direct allocated to shared state variables. To support the parallel execution in simulation some extension has be made . The system which is being modeled is considered as some number of physical processes which can interact at various simulation points of time. Simulation is constructed in the form of logical process LP0, LP1 and so on, one per physical process. Interactions among physical process are modeled event time stamp messages sent among corresponding logical processes. Every logical process have state portion corresponding to physical process and local clock which denotes the process progress.
Local causality constraint: A discrete event simulation is combination of logical processes which interact exclusively by sharing time stamp message, follow the local causality constraint if and only if every logical process executes events in order of non decreasing time stamp. Let suppose two events E1 at logical process LP1 have time stamp 10, and E2 have time stamp 20 according to the figure 2. In this case if event E1 schedule a new event E3 for logical process LP2 having time stamp less than 20, the E3 will effect E2, necessarily all three sequential execution. If one had no information regarding scheduled events by other events, then it will enforce to process only saved event, the one having shortest time stamp, results sequential execution. We should decide whether E1 can be executed with E2 concurrently during the simulation or not but problems here is that how do we know that E1 will not affect E2 without performing actual simulation for E1? To answer this question we should address parallel discrete event simulation techniques. We classified parallel discrete event simulation techniques in two classes called conservative and optimistic. Conservative techniques strictly avoid any causality error ever occurring.
Figure.2 Error of causality
These approaches depend on some strategy to decide whether it is safe to process an event. Optimistic approaches use recovery and detection techniques when casualty errors are detected a rollback mechanism to recovery is invoked. We elaborate few concepts regarding conservative and optimistic simulation machines.
- Conservative Methods
These are first shared simulation mechanisms. Primary problems of conservative mechanisms should be address before determination of event is saving to process. If a process have event E1 with time stamp T1 and process is able to decide that it is not possible to receive another event with time stamp less than T1, the process can safely process event E1 without local causality constraints violation. In case of no safe events it should be blocks so that it cannot lead to deadlock situations. Chandy and Misra [1 ] and Bryant [ 2] design algorithm of parallel discrete event simulation where on statically determine the link that shows which process can communicate with other processes. In case of safe to process a message it is needed that messages from any process to any other processes are transmitted chronological order according to time stamps. Every link has linked with clock which is equal to time stamp of the message at the front of link, if the queue is empty the time of last received message. This process is repeated again and again if message found in the link queue it updates its local clock to link’s clock and process the message. Event processing order will be correct because all latest messages have latest time stamps than local clock, so they will arrive in chronological order with each link. The process will be blocked if the selected queue is empty. There is possible that message received over link may have less time stamp to all other input time stamps. The process is forced to wait for a message updating clock on the link before local clock updating. This protocol ensures that every process will in non decreasing order events. Dead lock occur when a process is waiting for other process to be complete a cycle of block is created. See figure 3 for better understanding. Every process is waiting for incoming link having shortest clock value due to corresponding empty queue. All of these processes are blocked and waiting to be process.
Figure 3: An example of deadlock. (The numbers indicate time stamp)
For dead lock avoidance messages are used. This technique needed strictly positive lower bound on the look ahead for least one process in every cycle. Look ahead is amount of time that a process can look into the future. In simple words if local clock of the process is any time T and process will be able to predict all messages it will send with time stamps less than T+L while L is the look ahead. A strictly lower positive bound for time of service for some stations would be needed. Process keeps output links ahead of their local clocks by sending null messages. Null message having null time stamp or process LPA to LPB shows that there will be no more messages having less than null time stamp for process LPA. When process complete event processing it sends message on all output ports showing lower bound on the time stamp of the next outgoing message. Receiver computes new bounds on outgoing links and sends information to nearest link and so on. Chand and Misra  introduce double phase scheme where simulation proceeds till dead lock, then dead lock is detected and resolved. In this technique no null messages are created. This technique have process controller for dead lock monitoring and dead lock recovery control. Dead lock can be resolved by the messages having less time stamp is always save to process with shared computation, get lower bound to enlarge the safe messages set. This technique can detect and recover only global deadlocks. Other schemes for detection and recovery of dead lock can be found in .Conservative performance is determined critically by degree to which a process can look ahead and make future events prediction. It also ensures that no other than ones message generated up to time clock +L.
- Optimistic methods:
In this methods processors clock may run ahead of incoming link clock and if error occur recovery procedure is invoked. This method does not need to determine when it is safe to proceed. Benefits of these methods are larger speedup potentially and possible interaction of topology between processes not required to be known. Events can be executed by processes and proceed in local simulated time as long as they have input at all. Recovery is performed by undoing the effects of all events processed permanently by the process receives straggler. Execution of premature events results in two things that are needed to roll back: logical process state and event message to other processes. This task is done by saving process state periodically and restoring old vector sate on roll back. Positive messages are those messages which are sent during process is propagating forward in simulated time. If a method receives associate anti-message that corresponds to a positive message that’s still within the input queue, then the 2 can eradicate one another and also the method can proceed. If AN anti-message arrives that correspond to a positive message that’s already processed, then the method has created miscalculation and should conjointly roll back. Current state is updated with last vector state saved with simulated time earlier than message time stamp. Direct result of boll back mechanism is anti-messages may be sent to other processes. GVT is minimum of LVTs for all messages time stamps and processes sent but unprocessed. Anti message are not sent after roll back immediately in lazy cancelation. If same message is created than there is no required cancelation. An anti-message created at simulated time T is simplest sent after the procedure’s clock sweeps past time T without regenerating the equal message. Successor process roll back may be avoided in lazy cancellation. In other case if messages are not regenerated than successor process roll backs is required in both methods and aggressive cancellation will occur. Lazy cancellation can improve or degrade performance. At greater over head expense roll back overhead states may save less frequently. As result lazy cancellation needed more memory than aggressive cancellation.
- Discussion and conclusions
Performance evaluation is critical for improvement of complex applications and design implementation on parallel computers. Commonly analytical techniques are not enough for performance evaluation that are based on unrealistic assumptions and needed many approximations. So simulation is the best tool for correct performance measurement. Parallel simulation like as promising technique for speed up simulation but more improvement is required to increase the effectiveness of existing methods. Conservative methods have ability to handle certain classes of problems. Main disadvantage of this technique is that it cannot exploit the parallelism available in simulation. As result conservative algorithms depends on look ahead to get good performance. Optimistic methods have potential for general purpose simulations. Disadvantage of optimistic methods are incorrect execution computation, roll back And correction cost. Likewise, the further the erroneous calculation spreads the further it moves into the reenacted time future, along these lines bringing down its need for execution. Performance is given to only small time stamps. Incorrect computation decreases the speed; allow error detection and correction before damage has been done. Problem with optimistic method is that it required periodically save the state of every logical process. Limitations of optimistic methods are amount of computation needed to process an event is significantly higher than the cost of saving vector state. For consideration of appropriate method for shared simulation class of application is important. In conservative methods irregular interactions and time wrap techniques are performed. Conservative algorithm will be able to exploit special structure within a fixed topology system if application has good properties of look ahead. An object based technique can be employed if processors are forced to remember private variables. Class must encapsulate entity attributes, actions and life cycle. Objects communication is done by using well defined interfaces. Optimistic methods are required hierarchical knowledge for error detection and correction to stop spread of erroneous computations. As result we have interface having load balance and scheduling strategies that obscure the PDES strategy effectiveness. To obtain the effectiveness of the strategy exploited parallelism can be compared to the average parallelism.
- Chandy, K. M., & Misra, J. (1979). Distributed simulation: A case study in design and verification of distributed programs. IEEE Transactions on software engineering, (5), 440-452.
- Bryant, R. E. (1977). Simulation of Packet Communication Architecture Computer Systems(No. MIT/LCS/TR-188). MASSACHUSETTS INST OF TECH CAMBRIDGE LAB FOR COMPUTER SCIENCE.
- Jefferson, D. R. (1985). Virtual time. ACM Transactions on Programming Languages and Systems (TOPLAS), 7(3), 404-425.
- Zeigler, B. P. (1976). Theory of modelling and simulation John Wiley. New York.
- Hooper, J. W. (1986). Strategy-related characteristics of discrete-event languages and models. Simulation, 46(4), 153-159.
- Livny, M. (1985). A study of parallelism in distributed simulation. In Proceeding of the SCS Multiconference on Distributed Simulation, January 1985(Vol. 15, No. 2, pp. 94-98).
- Lin, Y. B., & Lazowska, E. D. (1990). Exploiting lookahead in parallel simulation. IEEE Transactions on Parallel and Distributed Systems, 1(4), 457-469.
- Chandy, K. M., & Misra, J. (1981). Asynchronous distributed simulation via a sequence of parallel computations. Communications of the ACM, 24(4), 198-206.
- Misra, J. (1986). Distributed discrete-event simulation. ACM Computing Surveys (CSUR), 18(1), 39-65.