pm4py.algo.transformation.log_to_features.variants package

Submodules

pm4py.algo.transformation.log_to_features.variants.event_based module

This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

class pm4py.algo.transformation.log_to_features.variants.event_based.Parameters(value)

Bases: enum.Enum

An enumeration.

FEATURE_NAMES = 'feature_names'
MAX_NUM_DIFF_STR_VALUES = 'max_num_diff_str_values'
MIN_NUM_DIFF_STR_VALUES = 'min_num_diff_str_values'
NUM_EVENT_ATTRIBUTES = 'num_ev_attr'
STR_EVENT_ATTRIBUTES = 'str_ev_attr'

pm4py.algo.transformation.log_to_features.variants.event_based.apply(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Extracts the features for every trace of an event log. Each trace becomes a vector of vectors, with one feature vector per event.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm, including:

    • STR_EVENT_ATTRIBUTES => string event attributes to consider in the feature extraction

    • NUM_EVENT_ATTRIBUTES => numeric event attributes to consider in the feature extraction

    • FEATURE_NAMES => features to consider (in the given order)

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order
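The "vector of vectors" shape can be illustrated with a standalone sketch that does not call PM4Py. Events are plain dictionaries, and the `attr@value` naming for one-hot string features, the helper name `encode_events`, and the `0.0` fallback for missing numeric values are all assumptions made for this illustration, not the library's actual scheme.

```python
from typing import Dict, List, Set, Tuple, Union

Event = Dict[str, Union[str, float]]
Trace = List[Event]


def encode_events(
    log: List[Trace], feature_names: List[str], num_attrs: Set[str]
) -> Tuple[List[List[List[float]]], List[str]]:
    """Build one feature vector per event, grouped by trace (hypothetical helper)."""
    data = []
    for trace in log:
        trace_vectors = []
        for event in trace:
            vector = []
            for name in feature_names:
                # "attr@value" marks a one-hot string feature; a bare name is numeric.
                attr, _, value = name.partition("@")
                if attr in num_attrs:
                    vector.append(float(event.get(attr, 0.0)))
                else:
                    vector.append(1.0 if event.get(attr) == value else 0.0)
            trace_vectors.append(vector)
        data.append(trace_vectors)
    return data, feature_names


log = [
    [{"activity": "register", "cost": 10.0}, {"activity": "pay", "cost": 25.0}],
    [{"activity": "register", "cost": 5.0}],
]
features = ["activity@register", "activity@pay", "cost"]
data, names = encode_events(log, features, num_attrs={"cost"})
```

Here `data[0]` holds two vectors (one per event of the first trace) and `data[1]` holds one, so traces of different lengths yield nested lists of different lengths.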

pm4py.algo.transformation.log_to_features.variants.event_based.extract_all_ev_features_names_from_log(log: pm4py.objects.log.obj.EventLog, str_ev_attr: List[str], num_ev_attr: List[str], parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) -> List[str]

Extracts the feature names from an event log.

Parameters
  • log – Event log

  • str_ev_attr – (if provided) list of string event attributes to consider in extracting the feature names

  • num_ev_attr – (if provided) list of numeric event attributes to consider in extracting the feature names

  • parameters – Parameters, including:

    • MIN_NUM_DIFF_STR_VALUES => minimum number of distinct values an attribute must have to be included as feature(s)

    • MAX_NUM_DIFF_STR_VALUES => maximum number of distinct values an attribute may have to be included as feature(s)

Returns

feature_names – List of feature names
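The role of the two distinct-value thresholds can be sketched as follows. This is a standalone illustration, not PM4Py code: the helper name `feature_names_from_log`, the `attr@value` naming, and the choice to treat both bounds as inclusive are assumptions.

```python
from typing import Dict, List, Set


def feature_names_from_log(
    log: List[List[Dict[str, object]]],
    str_attrs: List[str],
    num_attrs: List[str],
    min_distinct: int,
    max_distinct: int,
) -> List[str]:
    """Keep a string attribute only if its number of distinct values is in range."""
    distinct: Dict[str, Set[str]] = {attr: set() for attr in str_attrs}
    for trace in log:
        for event in trace:
            for attr in str_attrs:
                if attr in event:
                    distinct[attr].add(str(event[attr]))

    names: List[str] = []
    for attr in str_attrs:
        # Assumption: both bounds are inclusive.
        if min_distinct <= len(distinct[attr]) <= max_distinct:
            names.extend(f"{attr}@{value}" for value in sorted(distinct[attr]))
    # Numeric attributes contribute one feature each.
    names.extend(num_attrs)
    return names


log = [
    [{"activity": "a", "resource": "r1", "id": "e1", "cost": 1.0},
     {"activity": "b", "resource": "r1", "id": "e2", "cost": 2.0}],
    [{"activity": "c", "resource": "r1", "id": "e3", "cost": 3.0},
     {"activity": "a", "resource": "r1", "id": "e4", "cost": 4.0}],
]
names = feature_names_from_log(log, ["activity", "resource", "id"], ["cost"], 2, 3)
```

With bounds 2 and 3, `resource` (one distinct value) carries no information and `id` (four distinct values) would produce too many columns, so only `activity` is expanded into features.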

pm4py.algo.transformation.log_to_features.variants.event_based.extract_features(log: pm4py.objects.log.obj.EventLog, feature_names: List[str], parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.event_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Extracts the feature matrix from an event log

Parameters
  • log – Event log

  • feature_names – Features to consider (in the given order)

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

pm4py.algo.transformation.log_to_features.variants.trace_based module

class pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters(value)

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'
ADD_CASE_IDENTIFIER_COLUMN = 'add_case_identifier_column'
CASE_ATTRIBUTE_PREFIX = 'case:'
CASE_ID_KEY = 'pm4py:param:case_id_key'
DEFAULT_NOT_PRESENT = 'default_not_present'
ENABLE_ACTIVITY_DEF_REPRESENTATION = 'enable_activity_def_representation'
ENABLE_ALL_EXTRA_FEATURES = 'enable_all_extra_features'
ENABLE_CASE_DURATION = 'enable_case_duration'
ENABLE_DIRECT_PATHS_TIMES_LAST_OCC = 'enable_direct_paths_times_last_occ'
ENABLE_FIRST_LAST_ACTIVITY_INDEX = 'enable_first_last_activity_index'
ENABLE_INDIRECT_PATHS_TIMES_LAST_OCC = 'enable_indirect_paths_times_last_occ'
ENABLE_MAX_CONCURRENT_EVENTS = 'enable_max_concurrent_events'
ENABLE_MAX_CONCURRENT_EVENTS_PER_ACTIVITY = 'enable_max_concurrent_events_per_activity'
ENABLE_RESOURCE_WORKLOAD = 'enable_resource_workload'
ENABLE_SUCC_DEF_REPRESENTATION = 'enable_succ_def_representation'
ENABLE_TIMES_FROM_FIRST_OCCURRENCE = 'enable_times_from_first_occurrence'
ENABLE_TIMES_FROM_LAST_OCCURRENCE = 'enable_times_from_last_occurrence'
ENABLE_WORK_IN_PROGRESS = 'enable_work_in_progress'
EPSILON = 'epsilon'
FEATURE_NAMES = 'feature_names'
NUM_EVENT_ATTRIBUTES = 'num_ev_attr'
NUM_TRACE_ATTRIBUTES = 'num_tr_attr'
RESOURCE_KEY = 'pm4py:param:resource_key'
START_TIMESTAMP_KEY = 'pm4py:param:start_timestamp_key'
STR_EVENT_ATTRIBUTES = 'str_ev_attr'
STR_EVSUCC_ATTRIBUTES = 'str_evsucc_attr'
STR_TRACE_ATTRIBUTES = 'str_tr_attr'
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'

pm4py.algo.transformation.log_to_features.variants.trace_based.apply(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Extracts the features from an event log (one vector per trace)

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm, including:

    • STR_TRACE_ATTRIBUTES => string trace attributes to consider in the feature extraction

    • STR_EVENT_ATTRIBUTES => string event attributes to consider in the feature extraction

    • NUM_TRACE_ATTRIBUTES => numeric trace attributes to consider in the feature extraction

    • NUM_EVENT_ATTRIBUTES => numeric event attributes to consider in the feature extraction

    • STR_EVSUCC_ATTRIBUTES => event attributes whose successions of values are to be considered in the feature extraction

    • FEATURE_NAMES => features to consider (in the given order)

    • ENABLE_ALL_EXTRA_FEATURES => enables all the extra features

    • ENABLE_CASE_DURATION => enables the case duration as an additional feature

    • ENABLE_TIMES_FROM_FIRST_OCCURRENCE => adds, for each activity, the time from the start of the case to the first occurrence of the activity, and from that occurrence to the end of the case

    • ADD_CASE_IDENTIFIER_COLUMN => adds the case identifier (string) as a column of the feature table (default: False)

    • ENABLE_TIMES_FROM_LAST_OCCURRENCE => adds, for each activity, the time from the start of the case to the last occurrence of the activity, and from that occurrence to the end of the case

    • ENABLE_DIRECT_PATHS_TIMES_LAST_OCC => adds the duration of the last occurrence of a direct (i, i+1) path in the case as a feature

    • ENABLE_INDIRECT_PATHS_TIMES_LAST_OCC => adds the duration of the last occurrence of an indirect (i, j) path in the case as a feature

    • ENABLE_WORK_IN_PROGRESS => enables the work in progress (number of concurrent cases) as a feature

    • ENABLE_RESOURCE_WORKLOAD => enables the resource workload as a feature

    • ENABLE_FIRST_LAST_ACTIVITY_INDEX => enables the first and last indexes of the activities as features

    • ENABLE_MAX_CONCURRENT_EVENTS => enables the maximum number of concurrent events inside a case as a feature

    • ENABLE_MAX_CONCURRENT_EVENTS_PER_ACTIVITY => enables the maximum number of concurrent events per activity as a feature

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order
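As a hedged usage sketch, the parameters can be passed as a dictionary keyed by the string values of the Parameters enumeration listed above (the signature accepts either strings or enumeration members as keys). The attribute name `concept:name` assumes a log that follows the XES naming convention, and the wrapper `extract_trace_features` is a hypothetical helper introduced only for this example.

```python
from typing import Any, Dict, List, Tuple

# Keys are the string values of the Parameters enumeration shown above.
PARAMETERS: Dict[str, Any] = {
    "str_ev_attr": ["concept:name"],  # assumption: the log uses the XES activity attribute
    "str_tr_attr": [],
    "num_ev_attr": [],
    "num_tr_attr": [],
    "enable_case_duration": True,
    "add_case_identifier_column": False,
}


def extract_trace_features(log: Any) -> Tuple[Any, List[str]]:
    """Hypothetical wrapper: one feature vector per trace of `log`."""
    # Imported lazily so the dictionary above can be inspected without PM4Py installed.
    from pm4py.algo.transformation.log_to_features.variants import trace_based

    return trace_based.apply(log, parameters=PARAMETERS)
```

Calling `extract_trace_features(log)` on an EventLog is expected to return the `(data, feature_names)` pair described under Returns; the exact feature names depend on the PM4Py version in use.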

pm4py.algo.transformation.log_to_features.variants.trace_based.case_duration(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates the duration of each case and adds it as a feature

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
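The idea can be sketched without PM4Py, assuming that the case duration is the span between the earliest start timestamp and the latest completion timestamp of a case. The dictionary keys, the feature name `case_duration`, and the value `0.0` for empty cases are illustrative assumptions.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Event = Dict[str, datetime]


def case_durations(
    log: List[List[Event]], timestamp_key: str = "time", start_timestamp_key: str = "time"
) -> Tuple[List[List[float]], List[str]]:
    """Return one single-element row per case: its duration in seconds."""
    data = []
    for trace in log:
        if trace:
            start = min(event[start_timestamp_key] for event in trace)
            end = max(event[timestamp_key] for event in trace)
            data.append([(end - start).total_seconds()])
        else:
            # Assumption: an empty case has duration 0.
            data.append([0.0])
    return data, ["case_duration"]


log = [
    [{"time": datetime(2024, 1, 1, 9, 0)},
     {"time": datetime(2024, 1, 1, 9, 30)},
     {"time": datetime(2024, 1, 1, 10, 15)}],
    [{"time": datetime(2024, 1, 2, 8, 0)}],
]
data, feature_names = case_durations(log)
```

The first case spans 75 minutes (4500 seconds); the second contains a single event and therefore has duration zero.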

pm4py.algo.transformation.log_to_features.variants.trace_based.direct_paths_times_last_occ(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case and for each direct path of the case, the difference between the start timestamp of the later event and the completion timestamp of the earlier event, considering the last occurrence of the path. A default value is used if a path is not present in a case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
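A minimal standalone sketch of this computation follows. Timestamps are plain numbers (seconds), and the event keys, the `a->b` feature naming, and the default of `0.0` are assumptions for illustration rather than PM4Py's actual conventions.

```python
from typing import Dict, List, Tuple, Union

Event = Dict[str, Union[str, float]]


def direct_path_times(
    log: List[List[Event]], default_not_present: float = 0.0
) -> Tuple[List[List[float]], List[str]]:
    """For each case, time between consecutive events, keeping the last occurrence of each path."""
    paths = sorted(
        {(t[i]["activity"], t[i + 1]["activity"]) for t in log for i in range(len(t) - 1)}
    )
    data = []
    for trace in log:
        last_occurrence = {}
        for i in range(len(trace) - 1):
            key = (trace[i]["activity"], trace[i + 1]["activity"])
            # Later occurrences of the same path overwrite earlier ones.
            last_occurrence[key] = trace[i + 1]["start"] - trace[i]["complete"]
        data.append([last_occurrence.get(path, default_not_present) for path in paths])
    names = [f"{a}->{b}" for a, b in paths]
    return data, names


log = [
    [{"activity": "a", "start": 0.0, "complete": 5.0},
     {"activity": "b", "start": 7.0, "complete": 9.0},
     {"activity": "a", "start": 10.0, "complete": 12.0},
     {"activity": "b", "start": 15.0, "complete": 16.0}],
    [{"activity": "a", "start": 0.0, "complete": 1.0},
     {"activity": "c", "start": 4.0, "complete": 6.0}],
]
data, names = direct_path_times(log)
```

In the first case the path from `a` to `b` occurs twice (waiting times 2 and 3); only the last one is kept.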

pm4py.algo.transformation.log_to_features.variants.trace_based.first_last_activity_index_trace(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Considers as features the first and the last index of each activity inside a case

Parameters
  • log – Event log

  • parameters – Parameters, including:

    • Parameters.ACTIVITY_KEY => the attribute to use as activity

    • Parameters.DEFAULT_NOT_PRESENT => the replacement value for activities that are not present in the specific case

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
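The following standalone sketch shows how such first/last index features can be derived. The feature naming (`firstIndex@...`, `lastIndex@...`) and the default of `-1` for absent activities are assumptions chosen for the example.

```python
from typing import Dict, List, Tuple


def first_last_activity_index(
    log: List[List[Dict[str, str]]],
    activity_key: str = "activity",
    default_not_present: int = -1,
) -> Tuple[List[List[int]], List[str]]:
    """Two columns per activity: index of its first and of its last occurrence in the case."""
    activities = sorted({event[activity_key] for trace in log for event in trace})
    names = []
    for activity in activities:
        names += [f"firstIndex@{activity}", f"lastIndex@{activity}"]

    data = []
    for trace in log:
        first, last = {}, {}
        for index, event in enumerate(trace):
            activity = event[activity_key]
            first.setdefault(activity, index)  # keep the earliest index only
            last[activity] = index             # overwrite with the latest index
        row = []
        for activity in activities:
            row += [
                first.get(activity, default_not_present),
                last.get(activity, default_not_present),
            ]
        data.append(row)
    return data, names


log = [
    [{"activity": "a"}, {"activity": "b"}, {"activity": "a"}],
    [{"activity": "b"}, {"activity": "c"}],
]
data, names = first_last_activity_index(log)
```

Activities missing from a case receive the replacement value in both columns, which mirrors the role of DEFAULT_NOT_PRESENT described above.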

pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_event_attribute_values(log: pm4py.objects.log.obj.EventLog, event_attribute: str) -> List[str]

Gets the feature-name representations of all the values that a string event attribute takes across the traces of the log

Parameters
  • log – Event log

  • event_attribute – Event attribute to consider

Returns

values – All feature names present for the given attribute in the given log

pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_event_succession_attribute_values(log: pm4py.objects.log.obj.EventLog, event_attribute: str) -> List[str]

Gets the feature-name representations of all the successions of values that a string event attribute takes across the traces of the log

Parameters
  • log – Event log

  • event_attribute – Event attribute to consider

Returns

values – All feature names present for the given attribute succession in the given log

pm4py.algo.transformation.log_to_features.variants.trace_based.get_all_string_trace_attribute_values(log: pm4py.objects.log.obj.EventLog, trace_attribute: str) -> List[str]

Gets the feature-name representations of all the values that a string trace attribute takes in a log

Parameters
  • log – Event log

  • trace_attribute – Attribute of the trace to consider

Returns

List containing, for each trace, a representation of the feature name associated with the attribute

Return type

list

pm4py.algo.transformation.log_to_features.variants.trace_based.get_default_representation(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None, feature_names: Optional[List[str]] = None) -> Tuple[Any, List[str]]

Gets the default data representation of an event log (for decision tree building)

Parameters
  • log – Event log

  • parameters – Possible parameters of the algorithm

  • feature_names – (If provided) Features to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

pm4py.algo.transformation.log_to_features.variants.trace_based.get_default_representation_with_attribute_names(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None, feature_names: Optional[List[str]] = None) -> Tuple[Any, List[str], List[str], List[str], List[str], List[str]]

Gets the default data representation of an event log (for decision tree building), also returning the attribute names

Parameters
  • log – Event log

  • parameters – Possible parameters of the algorithm

  • feature_names – (If provided) Features to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

  • remaining elements – Four lists of attribute names (each a List[str], per the return signature)

pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_rep(event_attribute: str) -> str

Gets the feature name associated with a numeric event attribute

Parameters

event_attribute – Name of the event attribute

Returns

feature_name – Name of the feature

pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_value(event: pm4py.objects.log.obj.Event, event_attribute: str) -> Union[int, float]

Gets the value of a numeric event attribute from a given event

Parameters
  • event – Event

  • event_attribute – Name of the numeric event attribute

Returns

value – Value of the numeric event attribute for the given event

pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_event_attribute_value_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) -> Union[int, float]

Gets the value of the last occurrence of a numeric event attribute in a given trace

Parameters
  • trace – Trace of the log

  • event_attribute – Name of the numeric event attribute

Returns

value – Value of the last occurrence of the numeric event attribute in the given trace

pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_trace_attribute_rep(trace_attribute: str) -> str

Gets the feature name associated with a numeric trace attribute

Parameters

trace_attribute – Name of the trace attribute

Returns

feature_name – Name of the feature

pm4py.algo.transformation.log_to_features.variants.trace_based.get_numeric_trace_attribute_value(trace: pm4py.objects.log.obj.Trace, trace_attribute: str) -> Union[int, float]

Gets the value of a numeric trace attribute from a given trace

Parameters
  • trace – Trace of the log

  • trace_attribute – Name of the numeric trace attribute

Returns

value – Value of the numeric trace attribute for the given trace

pm4py.algo.transformation.log_to_features.variants.trace_based.get_representation(log: pm4py.objects.log.obj.EventLog, str_tr_attr: List[str], str_ev_attr: List[str], num_tr_attr: List[str], num_ev_attr: List[str], str_evsucc_attr: Optional[List[str]] = None, feature_names: Optional[List[str]] = None) -> Tuple[Any, List[str]]

Gets a representation of the event log that is suited for the data part of decision tree learning

NOTE: this function only encodes the last value seen for each attribute

Parameters
  • log – Event log

  • str_tr_attr – List of string trace attributes to consider in data vector creation

  • str_ev_attr – List of string event attributes to consider in data vector creation

  • num_tr_attr – List of numeric trace attributes to consider in data vector creation

  • num_ev_attr – List of numeric event attributes to consider in data vector creation

  • str_evsucc_attr – List of event attributes whose successions of values are to be considered in data vector creation

  • feature_names – (If provided) Features to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order
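A reduced standalone sketch of a per-trace encoding is given below, restricted to event attributes. String values become presence flags and numeric attributes keep only the last value seen in the trace, as the NOTE above states. The `event:attr@value` and `event:attr` feature names and the `0.0` fallback for a missing numeric attribute are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Event = Dict[str, object]


def represent(
    log: List[List[Event]], str_ev_attr: List[str], num_ev_attr: List[str]
) -> Tuple[List[List[float]], List[str]]:
    """One row per trace: presence flags for string values, last value for numeric attributes."""
    str_names = sorted(
        {f"event:{a}@{e[a]}" for t in log for e in t for a in str_ev_attr if a in e}
    )
    num_names = [f"event:{a}" for a in num_ev_attr]

    data = []
    for trace in log:
        present = {f"event:{a}@{e[a]}" for e in trace for a in str_ev_attr if a in e}
        last_value = {}
        for event in trace:
            for attr in num_ev_attr:
                if attr in event:
                    last_value[attr] = float(event[attr])  # later events overwrite earlier ones
        row = [1.0 if name in present else 0.0 for name in str_names]
        row += [last_value.get(attr, 0.0) for attr in num_ev_attr]
        data.append(row)
    return data, str_names + num_names


log = [
    [{"activity": "a", "cost": 1}, {"activity": "b", "cost": 7}],
    [{"activity": "a", "cost": 3}],
]
data, feature_names = represent(log, ["activity"], ["cost"])
```

The first trace has cost values 1 and 7, but only the last one (7) ends up in its row.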

pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_event_attribute_rep(event: pm4py.objects.log.obj.Event, event_attribute: str) -> str

Gets the feature name associated with the value of a string event attribute

Parameters
  • event – Single event of a trace

  • event_attribute – Event attribute to consider

Returns

rep – Feature name associated with the value of the string event attribute

pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_event_attribute_succession_rep(event1: pm4py.objects.log.obj.Event, event2: pm4py.objects.log.obj.Event, event_attribute: str) -> str

Gets the feature name associated with a succession of two values of a string event attribute

Parameters
  • event1 – First event of the succession

  • event2 – Second event of the succession

  • event_attribute – Event attribute to consider

Returns

rep – Feature name associated with the succession of string event attribute values

pm4py.algo.transformation.log_to_features.variants.trace_based.get_string_trace_attribute_rep(trace: pm4py.objects.log.obj.Trace, trace_attribute: str) -> str

Gets the feature name associated with the value of a string trace attribute

Parameters
  • trace – Trace of the log

  • trace_attribute – Attribute of the trace to consider

Returns

rep – Feature name associated with the value of the string trace attribute

pm4py.algo.transformation.log_to_features.variants.trace_based.get_values_event_attribute_for_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) -> Set[str]

Gets the feature-name representations of all the values that a string event attribute takes in the events of a trace

Parameters
  • trace – Trace of the log

  • event_attribute – Event attribute to consider

Returns

values – All feature names present for the given attribute in the given trace

pm4py.algo.transformation.log_to_features.variants.trace_based.get_values_event_attribute_succession_for_trace(trace: pm4py.objects.log.obj.Trace, event_attribute: str) -> Set[str]

Gets the feature-name representations of all the successions of values that a string event attribute takes in the events of a trace

Parameters
  • trace – Trace of the log

  • event_attribute – Event attribute to consider

Returns

values – All feature names present for the given attribute succession in the given trace

pm4py.algo.transformation.log_to_features.variants.trace_based.indirect_paths_times_last_occ(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case and for each indirect path of the case, the difference between the start timestamp of the later event and the completion timestamp of the earlier event, considering the last occurrence of the path. A default value is used if a path is not present in a case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features

pm4py.algo.transformation.log_to_features.variants.trace_based.max_concurrent_events(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Counts, for every trace, the maximum number of events (of any activity) that happen concurrently (i.e., their time intervals [st1, ct1] and [st2, ct2] have a non-empty intersection).

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
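One way to compute such a maximum is a sweep over the interval endpoints, sketched here without PM4Py. The event keys `start` and `complete`, the feature name, and the treatment of intervals as closed (two events that merely touch count as concurrent) are assumptions for this example.

```python
from typing import Dict, List, Tuple


def max_overlap(intervals: List[Tuple[float, float]]) -> int:
    """Maximum number of closed intervals [start, complete] sharing a common point."""
    points = []
    for start, complete in intervals:
        points.append((start, 0))     # 0 = interval opens
        points.append((complete, 1))  # 1 = interval closes
    # At equal timestamps, openings sort before closings, so touching intervals overlap.
    points.sort()
    best = current = 0
    for _, kind in points:
        if kind == 0:
            current += 1
            best = max(best, current)
        else:
            current -= 1
    return best


def max_concurrent_events(
    log: List[List[Dict[str, float]]]
) -> Tuple[List[List[float]], List[str]]:
    data = [
        [float(max_overlap([(e["start"], e["complete"]) for e in trace]))] for trace in log
    ]
    return data, ["max_concurrent_events"]


log = [
    [{"start": 0, "complete": 5}, {"start": 3, "complete": 8},
     {"start": 5, "complete": 9}, {"start": 10, "complete": 12}],
    [{"start": 0, "complete": 1}, {"start": 2, "complete": 3}],
]
data, feature_names = max_concurrent_events(log)
```

In the first trace, three intervals contain the time point 5; in the second, the two events never overlap.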

pm4py.algo.transformation.log_to_features.variants.trace_based.max_concurrent_events_per_activity(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Counts, for every trace and every activity, the maximum number of events of the given activity that happen concurrently (i.e., their time intervals [st1, ct1] and [st2, ct2] have a non-empty intersection).

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features

pm4py.algo.transformation.log_to_features.variants.trace_based.resource_workload(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case and for each resource of the log, the workload of the resource during the lead time of the case. A default value is used if a resource is not contained in a case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features

pm4py.algo.transformation.log_to_features.variants.trace_based.times_from_first_occurrence_activity_case(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case and for each activity, the time from the start of the case to the first occurrence of the activity in the case, and from that occurrence to the end of the case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
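For a single case, the two values per activity can be sketched as below. The sketch assumes events sorted by time with numeric timestamps, and the event keys, helper name, and default of `0.0` for absent activities are illustrative choices, not PM4Py's.

```python
from typing import Dict, List, Union

Event = Dict[str, Union[str, float]]


def times_from_first_occurrence(
    trace: List[Event], activities: List[str], default_not_present: float = 0.0
) -> List[float]:
    """Per activity: (first occurrence - case start, case end - first occurrence)."""
    case_start = trace[0]["time"]
    case_end = trace[-1]["time"]
    first_seen = {}
    for event in trace:
        first_seen.setdefault(event["activity"], event["time"])  # keep earliest only

    row = []
    for activity in activities:
        if activity in first_seen:
            row += [first_seen[activity] - case_start, case_end - first_seen[activity]]
        else:
            row += [default_not_present, default_not_present]
    return row


trace = [
    {"activity": "a", "time": 0.0},
    {"activity": "b", "time": 4.0},
    {"activity": "a", "time": 6.0},
    {"activity": "c", "time": 10.0},
]
row = times_from_first_occurrence(trace, ["a", "b", "c", "d"])
```

Replacing `setdefault` with a plain assignment would keep the last occurrence instead, which corresponds to the last-occurrence variant of this feature.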

pm4py.algo.transformation.log_to_features.variants.trace_based.times_from_last_occurrence_activity_case(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case and for each activity, the time from the start of the case to the last occurrence of the activity in the case, and from that occurrence to the end of the case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features

pm4py.algo.transformation.log_to_features.variants.trace_based.work_in_progress(log: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[Union[str, pm4py.algo.transformation.log_to_features.variants.trace_based.Parameters], Any]] = None) -> Tuple[Any, List[str]]

Calculates, for each case, the number of cases that are open during the lead time of the case.

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

  • data – Numeric value of the features

  • feature_names – Names of the features
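A simple quadratic sketch of the work-in-progress count follows. Each case is reduced to a `(start, end)` pair of numeric timestamps; whether a case counts itself and whether touching intervals count as overlapping are assumptions made here, not statements about PM4Py's behaviour.

```python
from typing import List, Tuple


def work_in_progress(case_intervals: List[Tuple[float, float]]) -> List[List[float]]:
    """For each case, count the cases (itself included) open at some point during its lead time."""
    data = []
    for start, end in case_intervals:
        open_cases = sum(
            1
            for other_start, other_end in case_intervals
            # Two closed intervals intersect when each starts before the other ends.
            if other_start <= end and other_end >= start
        )
        data.append([float(open_cases)])
    return data


# Three cases: the first two overlap, the third runs alone.
intervals = [(0.0, 10.0), (5.0, 15.0), (20.0, 30.0)]
data = work_in_progress(intervals)
```

For large logs an interval tree or a sorted sweep would avoid the pairwise comparison; the nested loop is kept here for readability.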

Module contents
