# pm4py.algo.simulation.playout.dfg.variants package

## pm4py.algo.simulation.playout.dfg.variants.classic module

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

class pm4py.algo.simulation.playout.dfg.variants.classic.Parameters(value)[source]

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'
INTERRUPT_SIMULATION_WHEN_DFG_COMPLETE = 'interrupt_simulation_when_dfg_complete'
MAX_EXECUTION_TIME = 'max_execution_time'
MAX_NO_OCC_PER_ACTIVITY = 'max_no_occ_per_activitiy'
MAX_NO_VARIANTS = 'max_no_variants'
MIN_WEIGHTED_PROBABILITY = 'min_weighted_probability'
RETURN_VARIANTS = 'return_variants'
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
pm4py.algo.simulation.playout.dfg.variants.classic.apply(dfg: Dict[Tuple[str, str], int], start_activities: Dict[str, int], end_activities: Dict[str, int], parameters: Optional[Dict[Union[str, pm4py.algo.simulation.playout.dfg.variants.classic.Parameters], Any]] = None) → Union[pm4py.objects.log.obj.EventLog, Dict[Tuple[str, str], int]][source]

Applies the playout algorithm on a DFG, extracting the most likely traces according to the DFG

Parameters
• dfg – Complete DFG

• start_activities – Start activities

• end_activities – End activities

• parameters – Parameters of the algorithm, including:

  - Parameters.ACTIVITY_KEY => the activity key of the simulated log
  - Parameters.TIMESTAMP_KEY => the timestamp key of the simulated log
  - Parameters.MAX_NO_VARIANTS => the maximum number of variants generated by the method (default: 3000)
  - Parameters.MIN_WEIGHTED_PROBABILITY => the minimum overall weighted probability that makes the method stop (default: 1)
  - Parameters.MAX_NO_OCC_PER_ACTIVITY => the maximum number of occurrences per activity in the traces of the log (default: 2)
  - Parameters.INTERRUPT_SIMULATION_WHEN_DFG_COMPLETE => interrupts the simulation when the DFG of the simulated log has the same keys as the DFG of the original log (all behavior is contained) (default: False)
  - … elements to the simulated DFG, e.g., it adds behavior; skip insertion otherwise (default: False)
  - Parameters.RETURN_VARIANTS => returns the traces as variants with a likely number of occurrences

Returns

Simulated log

Return type

simulated_log
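The two stopping parameters described above can be illustrated with a small, library-independent sketch. It assumes the traces arrive already ranked by probability and reads MAX_NO_VARIANTS and MIN_WEIGHTED_PROBABILITY as "stop once this many variants are kept, or once the kept variants cover this much probability mass". That reading is an assumption based on the parameter descriptions, and `collect_variants_sketch` is a hypothetical name, not a PM4Py function.

```python
from typing import Dict, Iterable, Tuple

Variant = Tuple[str, ...]


def collect_variants_sketch(
    traces_with_probability: Iterable[Tuple[Variant, float]],
    max_no_variants: int = 3000,
    min_weighted_probability: float = 1.0,
) -> Tuple[Dict[Variant, float], float]:
    """Illustrative only: keep ranked variants until a stop condition holds."""
    variants: Dict[Variant, float] = {}
    covered = 0.0  # probability mass covered by the variants kept so far
    for trace, prob in traces_with_probability:
        # Assumed stop conditions: enough variants, or enough probability mass.
        if len(variants) >= max_no_variants or covered >= min_weighted_probability:
            break
        variants[trace] = prob
        covered += prob
    return variants, covered


# Hypothetical ranked input: (variant, probability) pairs, most probable first.
ranked = [(("A", "B", "C"), 0.75), (("A", "C"), 0.25)]

all_variants, all_mass = collect_variants_sketch(ranked)
top_variants, top_mass = collect_variants_sketch(ranked, min_weighted_probability=0.7)
```

With the defaults both variants are kept; lowering the probability threshold to 0.7 keeps only the first one, because it already covers 0.75 of the mass.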

pm4py.algo.simulation.playout.dfg.variants.classic.get_node_tr_probabilities(dfg, start_activities, end_activities)[source]

Gets the transition probabilities between the nodes of a DFG

Parameters
• dfg – DFG

• start_activities – Start activities

• end_activities – End activities

Returns

• weighted_start_activities – Start activities, with a relative weight going from 0 to 1

• node_transition_probabilities – The transition probabilities between the nodes of the DFG (the end node is None)
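As a rough illustration of what "transition probabilities between the nodes" means, the sketch below normalizes the counts of a small DFG so that the outgoing weights of each node sum to 1, using `None` as the end node as described above. It is a minimal stand-alone sketch, not the library's code: the function name is hypothetical, and the library may store these values in a different form.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


def node_transition_probabilities_sketch(
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
):
    """Illustrative only: normalize DFG counts into probabilities."""
    # Relative weight of each start activity, in the range 0 to 1.
    total_start = sum(start_activities.values())
    weighted_start = {a: c / total_start for a, c in start_activities.items()}

    # Outgoing counts per node; the end of a trace is modelled as target None.
    outgoing: Dict[str, Dict[Optional[str], int]] = defaultdict(dict)
    for (source, target), count in dfg.items():
        outgoing[source][target] = count
    for activity, count in end_activities.items():
        outgoing[activity][None] = count

    # Normalize each node's outgoing counts so that they sum to 1.
    probabilities = {}
    for source, targets in outgoing.items():
        total = sum(targets.values())
        probabilities[source] = {t: c / total for t, c in targets.items()}
    return weighted_start, probabilities


example_dfg = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 3}
weighted_start, probabilities = node_transition_probabilities_sketch(
    example_dfg, {"A": 4}, {"C": 4}
)
# weighted_start   -> {'A': 1.0}
# probabilities    -> {'A': {'B': 0.75, 'C': 0.25}, 'B': {'C': 1.0}, 'C': {None: 1.0}}
```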

pm4py.algo.simulation.playout.dfg.variants.classic.get_trace_probability(trace, dfg, start_activities, end_activities, parameters=None)[source]

Given a trace of a log, gets its probability given the complete DFG

Parameters
• trace – Trace of a log

• dfg – Complete DFG

• start_activities – Start activities of the model

• end_activities – End activities of the model

• parameters – Parameters of the algorithm: - Parameters.ACTIVITY_KEY => activity key

Returns

The probability of the trace according to the DFG

Return type

prob
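One way to picture this computation: multiply the weight of the first activity among the start activities, the probability of each directly-follows step, and the probability of ending after the last activity. The sketch below does this for a plain list of activity labels. It is an assumption-laden illustration: the real function takes a trace object and an activity key, and `trace_probability_sketch` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def trace_probability_sketch(
    activities: List[str],
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
) -> float:
    """Illustrative only: probability of an activity sequence under a DFG."""
    if not activities:
        return 0.0

    def out_total(node: str) -> int:
        # All counts leaving `node`: DFG edges plus traces that end there.
        edges = sum(c for (s, _), c in dfg.items() if s == node)
        return edges + end_activities.get(node, 0)

    # Probability of starting with the first activity.
    prob = start_activities.get(activities[0], 0) / sum(start_activities.values())

    # Probability of each directly-follows step.
    for source, target in zip(activities, activities[1:]):
        total = out_total(source)
        prob *= dfg.get((source, target), 0) / total if total else 0.0

    # Probability of stopping after the last activity.
    last_total = out_total(activities[-1])
    prob *= end_activities.get(activities[-1], 0) / last_total if last_total else 0.0
    return prob


example_dfg = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 3}
p_abc = trace_probability_sketch(["A", "B", "C"], example_dfg, {"A": 4}, {"C": 4})
p_ac = trace_probability_sketch(["A", "C"], example_dfg, {"A": 4}, {"C": 4})
p_ab = trace_probability_sketch(["A", "B"], example_dfg, {"A": 4}, {"C": 4})
# p_abc -> 0.75, p_ac -> 0.25, p_ab -> 0.0 (B is not an end activity)
```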

pm4py.algo.simulation.playout.dfg.variants.classic.get_traces(dfg, start_activities, end_activities, parameters=None)[source]

Gets the most probable traces from the DFG, one-by-one (iterator), until the least probable

Parameters
• dfg – Complete DFG

• start_activities – Start activities

• end_activities – End activities

• parameters – Parameters of the algorithm, including: - Parameters.MAX_NO_OCC_PER_ACTIVITY => the maximum number of occurrences per activity in the traces of the log (default: 2)

Returns

Trace of the simulation

Return type

yielded_trace
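Yielding traces from most to least probable can be sketched as a best-first search over trace prefixes with a priority queue. Extending a prefix never increases its probability, so the first completed trace popped from the queue is the most probable one remaining. The generator below is a minimal sketch under that idea and is not taken from the library; the function name and the `max_occ_per_activity` argument (mirroring MAX_NO_OCC_PER_ACTIVITY) are illustrative.

```python
import heapq
from itertools import count
from typing import Dict, Iterator, Tuple


def most_probable_traces_sketch(
    dfg: Dict[Tuple[str, str], int],
    start_activities: Dict[str, int],
    end_activities: Dict[str, int],
    max_occ_per_activity: int = 2,
) -> Iterator[Tuple[Tuple[str, ...], float]]:
    """Illustrative only: yield (trace, probability), most probable first."""
    # Outgoing probabilities per node; target None marks the end of a trace.
    outgoing: Dict[str, dict] = {}
    for (source, target), c in dfg.items():
        outgoing.setdefault(source, {})[target] = c
    for activity, c in end_activities.items():
        outgoing.setdefault(activity, {})[None] = c
    for targets in outgoing.values():
        total = sum(targets.values())
        for t in targets:
            targets[t] /= total

    tie = count()  # unique tie-breaker so heap entries never compare prefixes
    heap = []
    total_start = sum(start_activities.values())
    for activity, c in start_activities.items():
        heapq.heappush(heap, (-c / total_start, next(tie), (activity,), False))

    while heap:
        neg_prob, _, prefix, finished = heapq.heappop(heap)
        if finished:
            yield prefix, -neg_prob
            continue
        for target, p in outgoing.get(prefix[-1], {}).items():
            if target is None:
                # Closing the trace here is one possible continuation.
                heapq.heappush(heap, (neg_prob * p, next(tie), prefix, True))
            elif prefix.count(target) < max_occ_per_activity:
                heapq.heappush(heap, (neg_prob * p, next(tie), prefix + (target,), False))


example_dfg = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 3}
ranked_traces = list(most_probable_traces_sketch(example_dfg, {"A": 4}, {"C": 4}))
# ranked_traces -> [(('A', 'B', 'C'), 0.75), (('A', 'C'), 0.25)]
```

The occurrence cap bounds the length of every prefix, which is what guarantees termination on DFGs that contain loops.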

## pm4py.algo.simulation.playout.dfg.variants.performance module

class pm4py.algo.simulation.playout.dfg.variants.performance.Parameters(value)[source]

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'
CASE_ARRIVAL_RATE = 'case_arrival_rate'
CASE_ID_KEY = 'pm4py:param:case_id_key'
NUM_TRACES = 'num_traces'
PARAM_ARTIFICIAL_END_ACTIVITY = 'pm4py:param:art_end_act'
PARAM_ARTIFICIAL_START_ACTIVITY = 'pm4py:param:art_start_act'
PERFORMANCE_DFG = 'performance_dfg'
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
pm4py.algo.simulation.playout.dfg.variants.performance.apply(frequency_dfg: Dict[Tuple[str, str], int], start_activities: Dict[str, int], end_activities: Dict[str, int], parameters: Optional[Dict[Any, Any]] = None) [source]

Simulates a log using the transition probabilities provided by the frequency DFG and the time deltas provided by the performance DFG

Parameters
• frequency_dfg – Frequency DFG

• start_activities – Start activities

• end_activities – End activities

• parameters – Parameters of the algorithm, including:

  - Parameters.NUM_TRACES: the number of traces of the simulated log
  - Parameters.ACTIVITY_KEY: the activity key to be used in the simulated log
  - Parameters.TIMESTAMP_KEY: the timestamp key to be used in the simulated log
  - Parameters.CASE_ID_KEY: the case identifier key to be used in the simulated log
  - Parameters.CASE_ARRIVAL_RATE: the average distance (in seconds) between the start of two cases (default: 1)
  - Parameters.PERFORMANCE_DFG: (mandatory) the performance DFG that is used for the time deltas

Returns

Simulated log

Return type

simulated_log
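The idea of combining a frequency DFG (for choosing the next activity) with a performance DFG (for timing) can be sketched with the standard library alone. The sketch below makes several assumptions that the text above does not confirm: the performance DFG is taken to map each edge to a mean duration in seconds, both the gaps between case starts and the time deltas are drawn from exponential distributions, and the function name, the `seed` argument, and the fixed start date are illustrative.

```python
import random
from datetime import datetime, timedelta


def simulate_with_times_sketch(
    frequency_dfg, performance_dfg, start_activities, end_activities,
    num_traces=3, case_arrival_rate=1.0, seed=0,
):
    """Illustrative only: random walk on the frequency DFG with sampled delays."""
    rng = random.Random(seed)

    def weighted_choice(weights):
        keys = list(weights)
        return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

    # Outgoing counts per node; target None marks the end of a trace.
    outgoing = {}
    for (source, target), c in frequency_dfg.items():
        outgoing.setdefault(source, {})[target] = c
    for activity, c in end_activities.items():
        outgoing.setdefault(activity, {})[None] = c

    log = []
    case_start = datetime(2020, 1, 1)  # arbitrary origin for the simulated clock
    for case_id in range(num_traces):
        # Average gap between case starts is case_arrival_rate seconds.
        case_start += timedelta(seconds=rng.expovariate(1.0 / case_arrival_rate))
        current, now = weighted_choice(start_activities), case_start
        trace = [(current, now)]
        while outgoing.get(current):
            nxt = weighted_choice(outgoing[current])
            if nxt is None:
                break
            mean_delta = performance_dfg[(current, nxt)]  # assumed: mean seconds
            if mean_delta > 0:
                now += timedelta(seconds=rng.expovariate(1.0 / mean_delta))
            trace.append((nxt, now))
            current = nxt
        log.append((str(case_id), trace))
    return log


simulated = simulate_with_times_sketch(
    {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 3},
    {("A", "B"): 60.0, ("A", "C"): 300.0, ("B", "C"): 120.0},
    {"A": 4}, {"C": 4},
)
```

Each entry of `simulated` is a `(case_id, [(activity, timestamp), ...])` pair; a real event log would attach these values under the configured case identifier, activity, and timestamp keys.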

pm4py.algo.simulation.playout.dfg.variants.performance.choice(a, size=None, replace=True, p=None)

Generates a random sample from a given 1-D array

New in version 1.7.0.

Note

New code should use the choice method of a default_rng() instance instead; please see the NumPy random quick start.

Parameters
• a (1-D array-like or int) – If an ndarray, a random sample is generated from its elements. If an int, the random sample is generated as if it were np.arange(a)

• size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. Default is None, in which case a single value is returned.

• replace (boolean, optional) – Whether the sample is with or without replacement. Default is True, meaning that a value of a can be selected multiple times.

• p (1-D array-like, optional) – The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribution over all entries in a.

Returns

samples – The generated random samples

Return type

single item or ndarray

Raises

ValueError – If a is an int and less than zero, if a or p are not 1-dimensional, if a is an array-like of size 0, if p is not a vector of probabilities, if a and p have different lengths, or if replace=False and the sample size is greater than the population size

See also

randint, shuffle, permutation

random.Generator.choice – which should be used in new code

Notes

Setting user-specified probabilities through p uses a more general but less efficient sampler than the default. The general sampler produces a different sample than the optimized sampler even if each element of p is 1 / len(a).

Sampling random rows from a 2-D array is not possible with this function, but is possible with Generator.choice through its axis keyword.

Examples

Generate a uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3)
array([0, 3, 4]) # random
>>> #This is equivalent to np.random.randint(0,5,3)


Generate a non-uniform random sample from np.arange(5) of size 3:

>>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0])
array([3, 3, 0]) # random


Generate a uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False)
array([3,1,0]) # random
>>> #This is equivalent to np.random.permutation(np.arange(5))[:3]


Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:

>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0]) # random


Any of the above can be repeated with an arbitrary array-like instead of just integers. For instance:

>>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
>>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'], # random
dtype='<U11')

pm4py.algo.simulation.playout.dfg.variants.performance.dict_based_choice(dct: Dict[str, float]) → str[source]

Performs a weighted choice, given a dictionary associating a weight to each possible choice

Parameters

dct – Dictionary associating a weight to each choice

Returns

Choice

Return type

choice
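A weighted choice over a dictionary can be sketched with the standard library's `random.choices`. This is a stand-in for illustration and is not the library's implementation; `weighted_choice_sketch` and its optional `rng` argument are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Optional


def weighted_choice_sketch(dct: Dict[str, float], rng: Optional[random.Random] = None) -> str:
    """Illustrative only: pick a key with probability proportional to its weight."""
    rng = rng or random.Random()
    keys = list(dct)
    return rng.choices(keys, weights=[dct[k] for k in keys], k=1)[0]


# With a seeded generator, "A" (weight 3) should be drawn about three times as often as "B".
seeded = random.Random(0)
draws = Counter(weighted_choice_sketch({"A": 3.0, "B": 1.0}, seeded) for _ in range(1000))
```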

pm4py.algo.simulation.playout.dfg.variants.performance.exponential(scale=1.0, size=None)

Draw samples from an exponential distribution.

Its probability density function is

$f(x; \frac{1}{\beta}) = \frac{1}{\beta} \exp(-\frac{x}{\beta}),$

for x > 0 and 0 elsewhere. $\beta$ is the scale parameter, which is the inverse of the rate parameter $\lambda = 1/\beta$. The rate parameter is an alternative, widely used parameterization of the exponential distribution [3].

The exponential distribution is a continuous analogue of the geometric distribution. It describes many common situations, such as the size of raindrops measured over many rainstorms [1], or the time between page requests to Wikipedia [2].

Note

New code should use the exponential method of a default_rng() instance instead; please see the NumPy random quick start.

Parameters
• scale (float or array_like of floats) – The scale parameter, $\beta = 1/\lambda$. Must be non-negative.

• size (int or tuple of ints, optional) – Output shape. If the given shape is, e.g., (m, n, k), then m * n * k samples are drawn. If size is None (default), a single value is returned if scale is a scalar. Otherwise, np.array(scale).size samples are drawn.

Returns

out – Drawn samples from the parameterized exponential distribution.

Return type

ndarray or scalar

See also

random.Generator.exponential – which should be used for new code.
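The relationship between scale and rate is easy to get backwards, so here is a small check using only the standard library. Python's `random.expovariate` is parameterized by the rate $\lambda$, whereas the function documented here takes the scale $\beta = 1/\lambda$; the mean of the distribution equals the scale. The seed and sample size below are arbitrary choices for the sketch.

```python
import random
import statistics

scale = 2.0                      # beta: the mean of the distribution
rate = 1.0 / scale               # lambda: what random.expovariate expects

rng = random.Random(0)
samples = [rng.expovariate(rate) for _ in range(100_000)]

# The sample mean should be close to the scale, not to the rate.
sample_mean = statistics.fmean(samples)
```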

References

[1] Peyton Z. Peebles Jr., “Probability, Random Variables and Random Signal Principles”, 4th ed, 2001, p. 57.

[2] Wikipedia, “Poisson process”, https://en.wikipedia.org/wiki/Poisson_process

[3] Wikipedia, “Exponential distribution”, https://en.wikipedia.org/wiki/Exponential_distribution