# pm4py.algo.discovery.correlation_mining package

## pm4py.algo.discovery.correlation_mining.algorithm module

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

class pm4py.algo.discovery.correlation_mining.algorithm.Variants(value)

Bases: enum.Enum

Enumeration of the available Correlation Miner variants.

CLASSIC = <module 'pm4py.algo.discovery.correlation_mining.variants.classic'>
CLASSIC_SPLIT = <module 'pm4py.algo.discovery.correlation_mining.variants.classic_split'>
TRACE_BASED = <module 'pm4py.algo.discovery.correlation_mining.variants.trace_based'>

pm4py.algo.discovery.correlation_mining.algorithm.apply(log: Union[pm4py.objects.log.obj.EventLog, pm4py.objects.log.obj.EventStream, pandas.core.frame.DataFrame], variant=Variants.CLASSIC, parameters: Optional[Dict[Any, Any]] = None) -> Tuple[Dict[Tuple[str, str], int], Dict[Tuple[str, str], float]]

Applies the Correlation Miner to an event stream. If an event log is passed, it is first converted to a stream.

The approach is described in: Pourmirza, Shaya, Remco Dijkman, and Paul Grefen. “Correlation miner: mining business process models and event correlations without case identifiers.” International Journal of Cooperative Information Systems 26.02 (2017): 1742002.

Parameters
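• log – Log object (EventLog, EventStream, or pandas DataFrame)

• variant – Variant of the algorithm to use: Variants.CLASSIC (default), Variants.CLASSIC_SPLIT, or Variants.TRACE_BASED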

• parameters – Parameters of the algorithm

Returns

• dfg – Directly-follows graph (arcs mapped to their frequencies)

• performance_dfg – Performance DFG (containing the estimated performance for the arcs)
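According to the signature above, both returned dictionaries are keyed by pairs of strings (the arcs). The following is a minimal sketch of reading them side by side. The helper `summarize_arcs` and the sample values are hypothetical: they only mimic the documented return shape and are not output produced by PM4Py.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def summarize_arcs(
    dfg: Dict[Arc, int], performance_dfg: Dict[Arc, float]
) -> List[Tuple[str, str, int, float]]:
    """Join frequency and performance per arc, most frequent arcs first."""
    rows = []
    for (source, target), frequency in dfg.items():
        # Assumption: an arc without a performance estimate falls back to 0.0.
        performance = performance_dfg.get((source, target), 0.0)
        rows.append((source, target, frequency, performance))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hand-written sample values shaped like the documented return types.
sample_dfg = {("register", "check"): 12, ("check", "pay"): 9}
sample_performance_dfg = {("register", "check"): 3600.0, ("check", "pay"): 5400.0}

for source, target, frequency, performance in summarize_arcs(
    sample_dfg, sample_performance_dfg
):
    print(f"{source} -> {target}: frequency={frequency}, performance={performance}")
```

With PM4Py installed, the two dictionaries would instead come from a call to `apply` with the signature documented above.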

## pm4py.algo.discovery.correlation_mining.util module

pm4py.algo.discovery.correlation_mining.util.calculate_time_match_fifo(ai, aj, times0=None)

Associates the timestamps of two lists using a FIFO (first-in, first-out) strategy.

Parameters
• ai – First list of timestamps

• aj – Second list of timestamps

• times0 – Correspondence between execution times

Returns

Correspondence between execution times

Return type

times0
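FIFO matching can be sketched in plain Python as follows. This is an illustration of the general idea, not the library's implementation: the name `fifo_match`, the strict "later than" comparison, and the assumption that both lists are sorted in ascending order are all choices made for this example.

```python
from typing import List, Optional, Tuple


def fifo_match(
    ai: List[float],
    aj: List[float],
    matches: Optional[List[Tuple[float, float]]] = None,
) -> List[Tuple[float, float]]:
    """Pair each timestamp in ai with the earliest unused later timestamp in aj."""
    if matches is None:
        matches = []
    z = 0  # index of the next unused timestamp in aj
    for t_i in ai:
        # Skip timestamps in aj that do not come strictly after t_i.
        while z < len(aj) and aj[z] <= t_i:
            z += 1
        if z == len(aj):
            break  # nothing left in aj to match against
        matches.append((t_i, aj[z]))
        z += 1
    return matches


pairs = fifo_match([1, 4], [2, 3, 5])
print(pairs)
```

Each timestamp in `aj` is used at most once, so a later entry of `ai` can only be paired with timestamps that come after those already consumed.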

pm4py.algo.discovery.correlation_mining.util.calculate_time_match_rlifo(ai, aj, times1=None)

Associates the timestamps of two lists using a LIFO (last-in, first-out) strategy, starting from the end of the lists.

Parameters
• ai – First list of timestamps

• aj – Second list of timestamps

• times1 – Correspondence between execution times

Returns

Correspondence between execution times

Return type

times1
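Matching from the end of the lists can be sketched in the same spirit. Again, this is an illustration rather than the library's code: the name `rlifo_match`, the strict "earlier than" comparison, and the ascending sort order of both inputs are assumptions of this example.

```python
from typing import List, Optional, Tuple


def rlifo_match(
    ai: List[float],
    aj: List[float],
    matches: Optional[List[Tuple[float, float]]] = None,
) -> List[Tuple[float, float]]:
    """Walk aj backwards, pairing each timestamp with the latest unused earlier one in ai."""
    if matches is None:
        matches = []
    k = len(ai) - 1  # index of the latest unused timestamp in ai
    for t_j in reversed(aj):
        # Skip timestamps in ai that do not come strictly before t_j.
        while k >= 0 and ai[k] >= t_j:
            k -= 1
        if k < 0:
            break  # nothing left in ai to match against
        matches.append((ai[k], t_j))
        k -= 1
    return matches


pairs = rlifo_match([1, 4], [2, 3, 5])
print(pairs)
```

On the input `[1, 4]` and `[2, 3, 5]` this pairs 1 with 3, whereas a front-to-back FIFO pass would pair 1 with 2, so the two strategies can yield different correspondences.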

pm4py.algo.discovery.correlation_mining.util.get_c_matrix(PS_matrix, duration_matrix, activities, activities_counter)

Calculates the C matrix from the PS matrix and the duration matrix.

Parameters
• PS_matrix – PS matrix

• duration_matrix – Duration matrix

• activities – Ordered list of activities of the log

• activities_counter – Counter of activities

Returns

C matrix

Return type

c_matrix

pm4py.algo.discovery.correlation_mining.util.greedy_match_return_avg_time(ai, aj)

Matches two lists of times with a greedy method and returns the average.

Parameters
• ai – First list

• aj – Second list

Returns

Mean of times

Return type

times_mean
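A greedy average can be sketched by matching the two lists front-to-back and back-to-front and averaging the time gaps of the matched pairs. The sketch below is self-contained and hypothetical: the helper names are invented, the inputs are assumed to be sorted in ascending order, and taking the smaller of the two averages is an assumption of this example rather than a statement about the library.

```python
from statistics import mean


def fifo_pairs(ai, aj):
    """Front-to-back: pair each ai entry with the earliest unused later aj entry."""
    pairs, z = [], 0
    for t_i in ai:
        while z < len(aj) and aj[z] <= t_i:
            z += 1
        if z == len(aj):
            break
        pairs.append((t_i, aj[z]))
        z += 1
    return pairs


def rlifo_pairs(ai, aj):
    """Back-to-front: pair each aj entry with the latest unused earlier ai entry."""
    pairs, k = [], len(ai) - 1
    for t_j in reversed(aj):
        while k >= 0 and ai[k] >= t_j:
            k -= 1
        if k < 0:
            break
        pairs.append((ai[k], t_j))
        k -= 1
    return pairs


def average_gap(pairs):
    """Mean difference between matched timestamps; 0 when nothing was matched."""
    return mean([later - earlier for earlier, later in pairs]) if pairs else 0


def greedy_average(ai, aj):
    # Assumption: keep the smaller of the two estimates.
    return min(average_gap(fifo_pairs(ai, aj)), average_gap(rlifo_pairs(ai, aj)))


print(greedy_average([1, 4], [2, 3, 5]))
```

Both passes are linear in the length of the lists, which is what makes a greedy estimate cheap compared with an exact matching.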

pm4py.algo.discovery.correlation_mining.util.match_return_avg_time(ai, aj, exact=False)

Matches two lists of times (exact or greedy) and returns the average.

Parameters
• ai – First list

• aj – Second list

• exact – Whether to use the exact matching instead of the greedy one (default: False)

Returns

Mean of times

Return type

times_mean

pm4py.algo.discovery.correlation_mining.util.resolve_LP(C_matrix, duration_matrix, activities, activities_counter)

Formulates and solves the LP (linear programming) problem.

Parameters
• C_matrix – C matrix

• duration_matrix – Duration matrix

• activities – Ordered list of activities of the log

• activities_counter – Counter of activities

Returns

• dfg – Directly-follows graph

• performance_dfg – Performance DFG (containing the estimated performance for the arcs)