pm4py.algo.organizational_mining.network_analysis.variants package#

This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

Submodules#

pm4py.algo.organizational_mining.network_analysis.variants.dataframe module#

class pm4py.algo.organizational_mining.network_analysis.variants.dataframe.Parameters(value)[source]#

Bases: Enum

An enumeration.

SORTING_COLUMN = 'sorting_column'#
INDEX_KEY = 'index_key'#
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'#
IN_COLUMN = 'in_column'#
OUT_COLUMN = 'out_column'#
NODE_COLUMN_SOURCE = 'node_column_source'#
NODE_COLUMN_TARGET = 'node_column_target'#
EDGE_COLUMN = 'edge_column'#
INCLUDE_PERFORMANCE = 'include_performance'#
BUSINESS_HOURS = 'business_hours'#
BUSINESS_HOUR_SLOTS = 'business_hour_slots'#
WORKCALENDAR = 'workcalendar'#
TIMESTAMP_DIFF_COLUMN = 'timestamp_diff_column'#
EDGE_REFERENCE = 'edge_reference'#

Builds the network analysis from the results of the link analysis (internal method)

Parameters#

merged_df

Dataframe obtained from the link analysis

parameters
Parameters of the method, including:
  • Parameters.NODE_COLUMN_SOURCE => the attribute to be used for the node definition of the source event (default: the resource of the log, org:resource)

  • Parameters.NODE_COLUMN_TARGET => the attribute to be used for the node definition of the target event (default: the resource of the log, org:resource)

  • Parameters.EDGE_COLUMN => the attribute to be used for the edge definition (default: the activity of the log, concept:name)

  • Parameters.EDGE_REFERENCE => the event from which the edge attribute should be picked:
    • _out => the source event

    • _in => the target event

  • Parameters.TIMESTAMP_KEY => the timestamp column

  • Parameters.TIMESTAMP_DIFF_COLUMN => the column that holds the difference between the timestamps of the linked events

  • Parameters.INCLUDE_PERFORMANCE => considers the performance of the edge

  • Parameters.BUSINESS_HOURS => boolean value that enables the business hours

  • Parameters.BUSINESS_HOUR_SLOTS => the work schedule of the company, provided as a list of tuples in which each tuple represents one time slot of business hours. Each slot consists of one start time and one end time, both given in seconds since the start of the week, e.g. [(7 * 60 * 60, 17 * 60 * 60), ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60), ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60)], meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00
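
The slot arithmetic above can be wrapped in a small helper. The following is a minimal sketch, assuming the week starts on Monday at 00:00 (as the example in the text implies); the function name week_slot is hypothetical and is not part of PM4Py.

```python
from typing import List, Tuple

SECONDS_PER_HOUR = 60 * 60
HOURS_PER_DAY = 24


def week_slot(day: int, start_hour: int, end_hour: int) -> Tuple[int, int]:
    """Return one business-hour slot as (start, end) in seconds since week start.

    Hypothetical helper (not part of PM4Py). Assumes day 0 is Monday and
    day 6 is Sunday, and that hours are whole hours on a 24-hour clock.
    """
    if not 0 <= day <= 6:
        raise ValueError("day must be between 0 (Monday) and 6 (Sunday)")
    if not 0 <= start_hour < end_hour <= HOURS_PER_DAY:
        raise ValueError("hours must satisfy 0 <= start_hour < end_hour <= 24")
    day_offset = day * HOURS_PER_DAY
    return (
        (day_offset + start_hour) * SECONDS_PER_HOUR,
        (day_offset + end_hour) * SECONDS_PER_HOUR,
    )


# Mondays 07:00 - 17:00, Tuesdays 07:00 - 12:00 and 13:00 - 17:00,
# i.e. the same schedule as the example in the parameter description.
business_hour_slots: List[Tuple[int, int]] = [
    week_slot(0, 7, 17),
    week_slot(1, 7, 12),
    week_slot(1, 13, 17),
]
```

A list built this way would be passed as the value of Parameters.BUSINESS_HOUR_SLOTS, together with Parameters.BUSINESS_HOURS set to True.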

Returns#

network_analysis

Edges of the network analysis (first key: edge; second key: type; value: number of occurrences)

pm4py.algo.organizational_mining.network_analysis.variants.dataframe.apply(dataframe: DataFrame, parameters: Optional[Dict[Any, Any]] = None) Dict[Tuple[str, str], Dict[str, Any]][source]#

Performs the network analysis on the provided dataframe

Parameters#

dataframe

Pandas dataframe containing the event log

parameters

Parameters of the method, including:
  • Parameters.SORTING_COLUMN => the column that should be used to sort the log

  • Parameters.IN_COLUMN => the target column of the link (default: the case identifier; events of the same case are linked)

  • Parameters.OUT_COLUMN => the source column of the link (default: the case identifier; events of the same case are linked)

  • Parameters.INDEX_KEY => the name for the index attribute in the log (inserted during the execution)

  • Parameters.NODE_COLUMN_SOURCE => the attribute to be used for the node definition of the source event (default: the resource of the log, org:resource)

  • Parameters.NODE_COLUMN_TARGET => the attribute to be used for the node definition of the target event (default: the resource of the log, org:resource)

  • Parameters.EDGE_COLUMN => the attribute to be used for the edge definition (default: the activity of the log, concept:name)

  • Parameters.EDGE_REFERENCE => the event from which the edge attribute should be picked:
    • _out => the source event

    • _in => the target event

  • Parameters.TIMESTAMP_KEY => the timestamp column

  • Parameters.TIMESTAMP_DIFF_COLUMN => the column that holds the difference between the timestamps of the linked events

  • Parameters.INCLUDE_PERFORMANCE => considers the performance of the edge

  • Parameters.BUSINESS_HOURS => boolean value that enables the business hours

  • Parameters.BUSINESS_HOUR_SLOTS => the work schedule of the company, provided as a list of tuples in which each tuple represents one time slot of business hours. Each slot consists of one start time and one end time, both given in seconds since the start of the week, e.g. [(7 * 60 * 60, 17 * 60 * 60), ((24 + 7) * 60 * 60, (24 + 12) * 60 * 60), ((24 + 13) * 60 * 60, (24 + 17) * 60 * 60)], meaning that business hours are Mondays 07:00 - 17:00 and Tuesdays 07:00 - 12:00 and 13:00 - 17:00

Returns#

network_analysis

Edges of the network analysis (first key: edge; second key: type; value: number of occurrences)
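
The nested structure of the return value can be walked as in the following minimal sketch. It assumes the default case in which the innermost values are occurrence counts, and it reads the first key as a (source node, target node) tuple, in line with the Dict[Tuple[str, str], Dict[str, Any]] annotation of apply. The dictionary below is hand-written for illustration and was not produced by PM4Py; the resource and activity names, and the helper flatten_edges, are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical, hand-written stand-in for the value returned by apply():
# first key = edge (source node, target node), second key = edge type,
# value = number of occurrences.
network_analysis: Dict[Tuple[str, str], Dict[str, Any]] = {
    ("Alice", "Bob"): {"register request": 3, "check ticket": 1},
    ("Bob", "Carol"): {"decide": 2},
}


def flatten_edges(
    result: Dict[Tuple[str, str], Dict[str, Any]]
) -> List[Tuple[str, str, str, Any]]:
    """Flatten the nested dictionary into (source, target, type, value) rows,
    sorted by value in descending order."""
    rows = []
    for (source, target), values_by_type in result.items():
        for edge_type, value in values_by_type.items():
            rows.append((source, target, edge_type, value))
    return sorted(rows, key=lambda row: row[3], reverse=True)


rows = flatten_edges(network_analysis)

# Total number of occurrences per edge, summed over all edge types.
totals = {edge: sum(values.values()) for edge, values in network_analysis.items()}

for source, target, edge_type, value in rows:
    print(f"{source} -> {target} [{edge_type}]: {value}")
```

Flattening to rows makes it easy to load the result into a table or pass it to a graph library. When Parameters.INCLUDE_PERFORMANCE is enabled the values may no longer be plain counts, so the summation above would need to be adapted.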