pm4py.objects.log.util package

Submodules

pm4py.objects.log.util.basic_filter module

class pm4py.objects.log.util.basic_filter.Parameters(value)[source]

Bases: enum.Enum

An enumeration.

ATTRIBUTE_KEY = 'pm4py:param:attribute_key'
POSITIVE = 'positive'

pm4py.objects.log.util.basic_filter.filter_log_events_attr(log, values, parameters=None)[source]

Filter log by keeping only events with an attribute value that belongs to the provided values list

Parameters
  • log – log

  • values – Allowed attribute values

  • parameters

    Parameters of the algorithm, including:

    activity_key -> Attribute identifying the activity in the log

    positive -> Indicates whether the matching events should be kept (True) or removed (False)

Returns

Filtered log

Return type

filtered_log
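
The event-level filtering described above can be sketched in plain Python, with a log modelled as a list of traces and each trace as a list of event dictionaries. The helper name `filter_events_by_attribute` and the sample data are hypothetical and not part of pm4py; dropping traces that end up without events is an assumption of this sketch.

```python
def filter_events_by_attribute(log, values, attribute_key="concept:name", positive=True):
    """Keep (positive=True) or remove (positive=False) the events whose
    attribute value belongs to `values`. Hypothetical helper, not pm4py API."""
    allowed = set(values)
    filtered_log = []
    for trace in log:
        new_trace = [
            event for event in trace
            if (event.get(attribute_key) in allowed) == positive
        ]
        # Assumption of this sketch: traces left without events are dropped.
        if new_trace:
            filtered_log.append(new_trace)
    return filtered_log


# A log is a list of traces; a trace is a list of event dictionaries.
log = [
    [{"concept:name": "register"}, {"concept:name": "check"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "reject"}],
]

kept = filter_events_by_attribute(log, ["register", "pay"], positive=True)
removed = filter_events_by_attribute(log, ["register", "pay"], positive=False)
```

With `positive=True` only the listed values survive; with `positive=False` the same values are removed instead.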

pm4py.objects.log.util.basic_filter.filter_log_traces_attr(log, values, parameters=None)[source]

Filter log by keeping only traces that have (or do not have) events with an attribute value that belongs to the provided values list

Parameters
  • log – Trace log

  • values – Allowed attribute values

  • parameters

    Parameters of the algorithm, including:

    activity_key -> Attribute identifying the activity in the log

    positive -> Indicates whether the traces containing a matching event should be kept (True) or removed (False)

Returns

Filtered log

Return type

filtered_log
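
A minimal sketch of the trace-level variant, again on plain lists of dictionaries rather than pm4py objects. The helper name `filter_traces_by_attribute` is hypothetical; a trace is kept as a whole when the presence of a matching event agrees with the `positive` flag.

```python
def filter_traces_by_attribute(log, values, attribute_key="concept:name", positive=True):
    """Keep whole traces depending on whether they contain at least one
    event whose attribute value is in `values`. Hypothetical helper."""
    allowed = set(values)
    filtered_log = []
    for trace in log:
        has_match = any(event.get(attribute_key) in allowed for event in trace)
        if has_match == positive:
            filtered_log.append(trace)
    return filtered_log


log = [
    [{"concept:name": "register"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "reject"}],
]

with_pay = filter_traces_by_attribute(log, ["pay"], positive=True)
without_pay = filter_traces_by_attribute(log, ["pay"], positive=False)
```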

pm4py.objects.log.util.dataframe_utils module

class pm4py.objects.log.util.dataframe_utils.Parameters(value)[source]

Bases: enum.Enum

An enumeration.

CASE_ID_KEY = 'case_id_glue'
MANDATORY_ATTRIBUTES = 'mandatory_attributes'
MAX_DIFFERENT_OCC_STR_ATTR = 50
MAX_NO_CASES = 'max_no_cases'
MIN_DIFFERENT_OCC_STR_ATTR = 5
PARTITION_COLUMN = 'partition_column'

pm4py.objects.log.util.dataframe_utils.automatic_feature_extraction_df(df: pandas.core.frame.DataFrame, parameters: Optional[Dict[Any, Any]] = None) → pandas.core.frame.DataFrame[source]

Performs an automatic feature extraction given a dataframe

Parameters
  • df – Dataframe

  • parameters

    Parameters of the algorithm, including:

    - Parameters.CASE_ID_KEY: the case ID

    - Parameters.MIN_DIFFERENT_OCC_STR_ATTR

    - Parameters.MAX_DIFFERENT_OCC_STR_ATTR

Returns

Dataframe with the features

Return type

fea_df

pm4py.objects.log.util.dataframe_utils.automatic_feature_selection_df(df, parameters=None)[source]

Performs an automatic feature selection on dataframes, keeping the features useful for ML purposes

Parameters
  • df – Dataframe

  • parameters – Parameters of the algorithm

Returns

Dataframe with only the features that have been selected

Return type

featured_df

pm4py.objects.log.util.dataframe_utils.convert_timestamp_columns_in_df(df, timest_format=None, timest_columns=None)[source]

Convert all timestamp columns in a dataframe

Parameters
  • df – Dataframe

  • timest_format – (If provided) Format of the timestamp columns in the CSV file

  • timest_columns – Columns of the CSV that shall be converted into timestamp

Returns

Dataframe with timestamp columns converted

Return type

df

pm4py.objects.log.util.dataframe_utils.get_features_df(df: pandas.core.frame.DataFrame, list_columns: List[str], parameters: Optional[Dict[Any, Any]] = None) → pandas.core.frame.DataFrame[source]

Given a dataframe and a list of columns, performs an automatic feature extraction

Parameters
  • df – Dataframe

  • list_columns – List of columns to consider in the feature extraction

  • parameters – Parameters of the algorithm, including: - Parameters.CASE_ID_KEY: the case ID

Returns

Feature dataframe (desired output)

Return type

fea_df

pm4py.objects.log.util.dataframe_utils.insert_partitioning(df, num_partitions, parameters=None)[source]

Insert the partitioning in the specified dataframe

Parameters
  • df – Dataframe

  • num_partitions – Number of partitions

  • parameters – Parameters of the algorithm

Returns

Partitioned dataframe

Return type

df
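
One way to picture the partitioning is to give every case a dense rank and take it modulo the number of partitions, so that all events of a case land in the same partition. The sketch below works on a list of row dictionaries instead of a pandas dataframe; the helper name `add_partition_column`, the ranking scheme and the column name `@@partitioning` are assumptions for illustration, not the library's confirmed behaviour.

```python
def add_partition_column(rows, num_partitions,
                         case_id_key="case:concept:name",
                         partition_column="@@partitioning"):
    """Return copies of the rows with an extra partition column.
    Hypothetical helper: rank the distinct case IDs, then apply modulo."""
    case_rank = {
        case: rank
        for rank, case in enumerate(sorted({row[case_id_key] for row in rows}))
    }
    return [
        {**row, partition_column: case_rank[row[case_id_key]] % num_partitions}
        for row in rows
    ]


rows = [
    {"case:concept:name": "c1", "concept:name": "register"},
    {"case:concept:name": "c2", "concept:name": "register"},
    {"case:concept:name": "c1", "concept:name": "pay"},
    {"case:concept:name": "c3", "concept:name": "register"},
]

partitioned = add_partition_column(rows, num_partitions=2)
partitions = [row["@@partitioning"] for row in partitioned]
```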

pm4py.objects.log.util.dataframe_utils.legacy_parquet_support(df, parameters=None)[source]

For legacy support: Parquet column names could not contain a “:”, so it was arbitrarily replaced by a replacer string. This function substitutes the replacer string back with “:”

Parameters
  • df – Dataframe

  • parameters – Parameters of the algorithm

pm4py.objects.log.util.dataframe_utils.sample_dataframe(df, parameters=None)[source]

Sample a dataframe on a given number of cases

Parameters
  • df – Dataframe

  • parameters – Parameters of the algorithm, including: - Parameters.CASE_ID_KEY - Parameters.CASE_ID_TO_RETAIN

Returns

Sampled dataframe

Return type

sampled_df

pm4py.objects.log.util.dataframe_utils.select_number_column(df: pandas.core.frame.DataFrame, fea_df: pandas.core.frame.DataFrame, col: str, case_id_key='case:concept:name') → pandas.core.frame.DataFrame[source]

Extract a column for the features dataframe for the given numeric attribute

Parameters
  • df – Dataframe

  • fea_df – Feature dataframe

  • col – Numeric column

  • case_id_key – Case ID key

Returns

Feature dataframe (desired output)

Return type

fea_df

pm4py.objects.log.util.dataframe_utils.select_string_column(df: pandas.core.frame.DataFrame, fea_df: pandas.core.frame.DataFrame, col: str, case_id_key='case:concept:name') → pandas.core.frame.DataFrame[source]

Extract N columns (one for each of the N different attribute values; one-hot encoding) for the features dataframe for the given string attribute

Parameters
  • df – Dataframe

  • fea_df – Feature dataframe

  • col – String column

  • case_id_key – Case ID key

Returns

Feature dataframe (desired output)

Return type

fea_df
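
The one-hot encoding mentioned above can be illustrated without pandas: every distinct value of the string column becomes one feature, set to 1 for the cases in which that value occurs. The helper name `one_hot_string_column` and the `<column>_<value>` feature naming are assumptions of this sketch, not the naming used by the library.

```python
def one_hot_string_column(rows, col, case_id_key="case:concept:name"):
    """Build {case_id: {feature_name: 0 or 1}} for a string column.
    Hypothetical helper operating on a list of row dictionaries."""
    values = sorted({row[col] for row in rows if row.get(col) is not None})
    cases = sorted({row[case_id_key] for row in rows})
    # One feature per distinct value, initialised to 0 for every case.
    features = {case: {f"{col}_{value}": 0 for value in values} for case in cases}
    for row in rows:
        value = row.get(col)
        if value is not None:
            features[row[case_id_key]][f"{col}_{value}"] = 1
    return features


rows = [
    {"case:concept:name": "c1", "org:resource": "alice"},
    {"case:concept:name": "c1", "org:resource": "bob"},
    {"case:concept:name": "c2", "org:resource": "alice"},
]

features = one_hot_string_column(rows, "org:resource")
```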

pm4py.objects.log.util.dataframe_utils.table_to_log(table, parameters=None)[source]

Converts a Pyarrow table to an event log

Parameters
  • table – Pyarrow table

  • parameters – Possible parameters of the algorithm

pm4py.objects.log.util.dataframe_utils.table_to_stream(table, parameters=None)[source]

Converts a Pyarrow table to an event stream

Parameters
  • table – Pyarrow table

  • parameters – Possible parameters of the algorithm

pm4py.objects.log.util.filtering_utils module

pm4py.objects.log.util.filtering_utils.keep_one_trace_per_variant(log, parameters=None)[source]

Keeps only one trace per variant (does not matter for basic inductive miner)

Parameters
  • log – Log

  • parameters – Parameters of the algorithm

Returns

Log (with one trace per variant)

Return type

new_log
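
A variant is the sequence of activities of a trace, so keeping one trace per variant amounts to de-duplicating on that sequence. A minimal sketch on plain lists of dictionaries; the helper name `one_trace_per_variant` is hypothetical, and keeping the first trace seen for each variant is an assumption.

```python
def one_trace_per_variant(log, activity_key="concept:name"):
    """Keep the first trace encountered for each distinct activity sequence."""
    seen_variants = set()
    new_log = []
    for trace in log:
        variant = tuple(event[activity_key] for event in trace)
        if variant not in seen_variants:
            seen_variants.add(variant)
            new_log.append(trace)
    return new_log


log = [
    [{"concept:name": "register"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "reject"}],
    [{"concept:name": "register"}, {"concept:name": "pay"}],
]

new_log = one_trace_per_variant(log)
```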

pm4py.objects.log.util.filtering_utils.keep_only_one_attribute_per_event(log, attribute_key)[source]

Keeps only one attribute per event

Parameters
  • log – Event log

  • attribute_key – Attribute key

pm4py.objects.log.util.func module

pm4py.objects.log.util.func.filter_(func, log)[source]

Filters the log according to a given lambda function.

Parameters
  • func

  • log

Deprecated since version 2.1.3.1: This will be removed in 2.4.0. filter_() deprecated, use pm4py.filter_log() or pm4py.filter_trace() instead

pm4py.objects.log.util.func.map_(func, log)[source]

Maps the log according to a given lambda function. The domain and target of the function need to be of the same type (either trace or event); otherwise, the map behaves unexpectedly

Parameters
  • func

  • log

Deprecated since version 2.1.3.1: This will be removed in 2.4.0. map_() deprecated, use pm4py.map_log() or pm4py.map_trace() instead

pm4py.objects.log.util.func.sort_(func, log, reverse=False)[source]

Deprecated since version 2.1.3.1: This will be removed in 2.4.0. sort_() deprecated, use pm4py.sort_log() or pm4py.sort_trace() instead

pm4py.objects.log.util.general module

pm4py.objects.log.util.get_class_representation module

pm4py.objects.log.util.get_class_representation.get_class_representation_by_str_ev_attr_value_presence(log, str_attr_name, str_attr_value)[source]

Get the representation for the target part of the decision tree learning if the focus is on the presence of a given value of a (string) event attribute

Parameters
  • log – Trace log

  • str_attr_name – Attribute name to consider

  • str_attr_value – Attribute value to consider

Returns

  • target – Target part for decision tree learning

  • classes – Name of the classes, in order

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.get_class_representation.get_class_representation_by_str_ev_attr_value_value(log, str_attr_name)[source]

Get the representation for the target part of the decision tree learning if the focus is on all (string) values of an event attribute

Parameters
  • log – Trace log

  • str_attr_name – Attribute name to consider

Returns

  • target – Target part for decision tree learning

  • classes – Name of the classes, in order

pm4py.objects.log.util.get_class_representation.get_class_representation_by_trace_duration(log, target_trace_duration, timestamp_key='time:timestamp', parameters=None)[source]

Get class representation by splitting traces according to trace duration

Parameters
  • log – Trace log

  • target_trace_duration – Target trace duration

  • timestamp_key – Timestamp key

Returns

  • target – Target part for decision tree learning

  • classes – Name of the classes, in order

pm4py.objects.log.util.get_log_encoded module

pm4py.objects.log.util.get_log_encoded.get_log_encoded(event_log, trace_attributes=[], event_attributes=[], concatenate=False)[source]

Get event log encoded into matrix.

Parameters
  • event_log – Trace log

  • trace_attributes – Attributes of the trace to be encoded

  • event_attributes – Attributes of the events to be encoded

  • concatenate – Boolean indicating whether to generate all sub-sequences of events in a trace

Returns

  • dataset – A numpy matrix with the event log

  • columns – The names of the columns in the dataset

pm4py.objects.log.util.get_log_representation module

pm4py.objects.log.util.get_log_representation.get_all_string_event_attribute_values(log, event_attribute)[source]

Get the representations, across all the traces of the log, associated to the values of a string event attribute

Parameters
  • log – Trace log

  • event_attribute – Event attribute to consider

Returns

All feature names present for the given attribute in the given log

Return type

values

pm4py.objects.log.util.get_log_representation.get_all_string_event_succession_attribute_values(log, event_attribute)[source]

Get the representations, across all the traces of the log, associated to the successions of values of a string event attribute

Parameters
  • log – Trace log

  • event_attribute – Event attribute to consider

Returns

All feature names present for the given attribute succession in the given log

Return type

values

pm4py.objects.log.util.get_log_representation.get_all_string_trace_attribute_values(log, trace_attribute)[source]

Get all string trace attribute values representations for a log

Parameters
  • log – Trace log

  • trace_attribute – Attribute of the trace to consider

Returns

List containing for each trace a representation of the feature name associated to the attribute

Return type

list

pm4py.objects.log.util.get_log_representation.get_default_representation(log, parameters=None, feature_names=None)[source]

Gets the default data representation of an event log (for process tree building)

Parameters
  • log – Trace log

  • parameters – Possible parameters of the algorithm

  • feature_names – (If provided) Feature to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

Deprecated since version 2.2.8: This will be removed in 3.0.0. Please use pm4py.algo.transformation.log_to_features instead

pm4py.objects.log.util.get_log_representation.get_default_representation_with_attribute_names(log, parameters=None, feature_names=None)[source]

Gets the default data representation of an event log (for process tree building) returning also the attribute names

Parameters
  • log – Trace log

  • parameters – Possible parameters of the algorithm

  • feature_names – (If provided) Feature to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

Deprecated since version 2.2.8: This will be removed in 3.0.0. Please use pm4py.algo.transformation.log_to_features instead

pm4py.objects.log.util.get_log_representation.get_numeric_event_attribute_rep(event_attribute)[source]

Get the feature name associated to a numeric event attribute

Parameters

event_attribute – Name of the event attribute

Returns

Name of the feature

Return type

feature_name

pm4py.objects.log.util.get_log_representation.get_numeric_event_attribute_value(event, event_attribute)[source]

Get the value of a numeric event attribute from a given event

Parameters
  • event – Event

  • event_attribute – Name of the numeric event attribute

Returns

Value of the numeric event attribute for the given event

Return type

value

pm4py.objects.log.util.get_log_representation.get_numeric_event_attribute_value_trace(trace, event_attribute)[source]

Get the value of the last occurrence of a numeric event attribute given a trace

Parameters
  • trace – Trace of the log

  • event_attribute – Name of the numeric event attribute

Returns

Value of the last occurrence of a numeric event attribute for the given trace

Return type

value

pm4py.objects.log.util.get_log_representation.get_numeric_trace_attribute_rep(trace_attribute)[source]

Get the feature name associated to a numeric trace attribute

Parameters

trace_attribute – Name of the trace attribute

Returns

Name of the feature

Return type

feature_name

pm4py.objects.log.util.get_log_representation.get_numeric_trace_attribute_value(trace, trace_attribute)[source]

Get the value of a numeric trace attribute from a given trace

Parameters
  • trace – Trace of the log

  • trace_attribute – Name of the numeric trace attribute

Returns

Value of the numeric trace attribute for the given trace

Return type

value

pm4py.objects.log.util.get_log_representation.get_representation(log, str_tr_attr, str_ev_attr, num_tr_attr, num_ev_attr, str_evsucc_attr=None, feature_names=None)[source]

Get a representation of the event log that is suited for the data part of the decision tree learning

NOTE: this function only encodes the last value seen for each attribute

Parameters
  • log – Trace log

  • str_tr_attr – List of string trace attributes to consider in data vector creation

  • str_ev_attr – List of string event attributes to consider in data vector creation

  • num_tr_attr – List of numeric trace attributes to consider in data vector creation

  • num_ev_attr – List of numeric event attributes to consider in data vector creation

  • str_evsucc_attr – List of attributes whose successions of values should be considered in data vector creation

  • feature_names – (If provided) Feature to use in the representation of the log

Returns

  • data – Data to provide for decision tree learning

  • feature_names – Names of the features, in order

Deprecated since version 2.2.8: This will be removed in 3.0.0. Please use pm4py.algo.transformation.log_to_features instead

pm4py.objects.log.util.get_log_representation.get_string_event_attribute_rep(event, event_attribute)[source]

Get a representation of the feature name associated to a string event attribute value

Parameters
  • event – Single event of a trace

  • event_attribute – Event attribute to consider

Returns

Representation of the feature name associated to a string event attribute value

Return type

rep

pm4py.objects.log.util.get_log_representation.get_string_event_attribute_succession_rep(event1, event2, event_attribute)[source]

Get a representation of the feature name associated to a succession of values of a string event attribute

Parameters
  • event1 – First event of the succession

  • event2 – Second event of the succession

  • event_attribute – Event attribute to consider

Returns

Representation of the feature name associated to a succession of values of a string event attribute

Return type

rep

pm4py.objects.log.util.get_log_representation.get_string_trace_attribute_rep(trace, trace_attribute)[source]

Get a representation of the feature name associated to a string trace attribute value

Parameters
  • trace – Trace of the log

  • trace_attribute – Attribute of the trace to consider

Returns

Representation of the feature name associated to a string trace attribute value

Return type

rep

pm4py.objects.log.util.get_log_representation.get_values_event_attribute_for_trace(trace, event_attribute)[source]

Get all the representations for the events of a trace associated to a string event attribute values

Parameters
  • trace – Trace of the log

  • event_attribute – Event attribute to consider

Returns

All feature names present for the given attribute in the given trace

Return type

values

pm4py.objects.log.util.get_log_representation.get_values_event_attribute_succession_for_trace(trace, event_attribute)[source]

Get all the representations for the events of a trace associated to a string event attribute succession values

Parameters
  • trace – Trace of the log

  • event_attribute – Event attribute to consider

Returns

All feature names present for the given attribute succession in the given trace

Return type

values

pm4py.objects.log.util.get_prefixes module

pm4py.objects.log.util.get_prefixes.get_log_traces_to_activities(log, activities, parameters=None)[source]

Get sublogs leading to each one of the specified activities

Parameters
  • log – Trace log object

  • activities – List of activities in the log

  • parameters

    Possible parameters of the algorithm, including:

    PARAMETER_CONSTANT_ACTIVITY_KEY -> activity

    PARAMETER_CONSTANT_TIMESTAMP_KEY -> timestamp

Returns

  • list_logs – List of event logs leading to the first occurrence of each activity

  • considered_activities – All activities that have actually been inserted in the list of logs (for some activities, the resulting log may be empty)

pm4py.objects.log.util.get_prefixes.get_log_traces_until_activity(log, activity, parameters=None)[source]

Gets a reduced version of the log containing, for each trace, only the events before a specified activity

Parameters
  • log – Trace log

  • activity – Activity to reach

  • parameters

    Possible parameters of the algorithm, including:

    PARAMETER_CONSTANT_ACTIVITY_KEY -> activity

    PARAMETER_CONSTANT_TIMESTAMP_KEY -> timestamp

Returns

New log

Return type

new_log
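
The reduction "only the events before a specified activity" can be sketched as follows on plain lists of dictionaries. The helper name `traces_until_activity` is hypothetical. Two choices are assumptions of this sketch: traces that never reach the activity are skipped, and the cut is made by position at the first occurrence rather than by comparing timestamps.

```python
def traces_until_activity(log, activity, activity_key="concept:name"):
    """For each trace containing `activity`, keep the events that precede
    its first occurrence. Hypothetical helper, not pm4py API."""
    new_log = []
    for trace in log:
        names = [event[activity_key] for event in trace]
        if activity not in names:
            continue  # assumption: traces without the activity are skipped
        cut = names.index(activity)
        new_log.append(trace[:cut])
    return new_log


log = [
    [{"concept:name": "register"}, {"concept:name": "check"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "reject"}],
    [{"concept:name": "check"}, {"concept:name": "pay"}],
]

new_log = traces_until_activity(log, "pay")
prefix_names = [[event["concept:name"] for event in trace] for trace in new_log]
```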

pm4py.objects.log.util.get_prefixes.get_log_with_log_prefixes(log, parameters=None)[source]

Gets an extended log that contains, in order, all the prefixes for a case of the original log

Parameters
  • log – Original log

  • parameters – Possible parameters of the algorithm

Returns

  • all_prefixes_log – Log with all the prefixes

  • change_indexes – Indexes of the extended log where there was a change between cases

pm4py.objects.log.util.get_prefixes.get_prefixes_from_log(log: pm4py.objects.log.obj.EventLog, length: int) → pm4py.objects.log.obj.EventLog[source]

Gets the prefixes of a log of a given length

Parameters
  • log – Event log

  • length – Length

Returns

Log containing the prefixes:

  • if a trace has a length lower than or equal to the given length, it is included as-is

  • if a trace has a greater length, it is cut to the given length

Return type

prefix_log
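
The rule above maps directly onto list slicing: slicing a short trace returns it unchanged, and slicing a long one cuts it. A minimal sketch on plain lists, with the hypothetical helper name `prefixes_of_length`:

```python
def prefixes_of_length(log, length):
    """Cut every trace to at most `length` events (shorter traces are kept as-is)."""
    return [trace[:length] for trace in log]


log = [
    [{"concept:name": "register"}, {"concept:name": "check"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}],
]

prefix_log = prefixes_of_length(log, 2)
prefix_lengths = [len(trace) for trace in prefix_log]
```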

pm4py.objects.log.util.index_attribute module

pm4py.objects.log.util.index_attribute.insert_event_index_as_event_attribute(stream, event_index_attr_name='@@eventindex')[source]

Insert the current event index as event attribute

Parameters
  • stream – Stream

  • event_index_attr_name – Attribute name given to the event index

pm4py.objects.log.util.index_attribute.insert_trace_index_as_event_attribute(log, trace_index_attr_name='@@traceindex')[source]

Inserts the current trace index as event attribute (overrides previous values if needed)

Parameters
  • log – Log

  • trace_index_attr_name – Attribute name given to the trace index
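
A minimal sketch of writing the trace index onto every event, overriding any previous value. The helper name `insert_trace_index` is hypothetical, the log is a plain list of lists of dictionaries, and the zero-based numbering is an assumption of this sketch (the library may number traces differently).

```python
def insert_trace_index(log, trace_index_attr_name="@@traceindex"):
    """Store the position of the enclosing trace on each event (in place)."""
    for index, trace in enumerate(log):
        for event in trace:
            # Overrides a previous value if the attribute already exists.
            event[trace_index_attr_name] = index
    return log


log = [
    [{"concept:name": "register"}, {"concept:name": "pay"}],
    [{"concept:name": "register", "@@traceindex": 99}],
]

insert_trace_index(log)
indexes = [[event["@@traceindex"] for event in trace] for trace in log]
```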

pm4py.objects.log.util.insert_classifier module

pm4py.objects.log.util.insert_classifier.insert_activity_classifier_attribute(log, classifier, force_activity_transition_insertion=False)[source]

Insert the specified classifier as additional event attribute in the log

Parameters
  • log – Trace log

  • classifier – Event classifier

  • force_activity_transition_insertion – Optionally force the activity+transition classifier insertion

Returns

  • log – Trace log (possibly with one additional event attribute as the classifier)

  • classifier_attr_key – Attribute name of the attribute that contains the classifier value

pm4py.objects.log.util.insert_classifier.insert_trace_classifier_attribute(log, classifier)[source]

Insert the specified classifier as additional trace attribute in the log

Parameters
  • log – Trace log

  • classifier – Event classifier

Returns

  • log – Trace log (possibly with one additional event attribute as the classifier)

  • classifier_attr_key – Attribute name of the attribute that contains the classifier value

pm4py.objects.log.util.insert_classifier.search_act_class_attr(log, force_activity_transition_insertion=False)[source]

Search among classifiers expressed in the log one that is good for the process model extraction

Parameters
  • log – Trace log

  • force_activity_transition_insertion – Optionally force the activity+transition classifier insertion

Returns

Trace log (possibly with one additional event attribute as the classifier)

Return type

log

pm4py.objects.log.util.interval_lifecycle module

pm4py.objects.log.util.interval_lifecycle.assign_lead_cycle_time(log, parameters=None)[source]

Assigns the lead and cycle time to an interval log

Parameters
  • log – Interval log

  • parameters – Parameters of the algorithm, including: start_timestamp_key, timestamp_key, worktiming, weekends

pm4py.objects.log.util.interval_lifecycle.to_interval(log, parameters=None)[source]

Converts a log from lifecycle format (each event has a single timestamp and a lifecycle transition) to interval format (each event has two timestamps, start and completion)

Parameters
  • log – Log (expressed in the lifecycle format)

  • parameters – Possible parameters of the method (activity, timestamp key, start timestamp key, transition …)

Returns

Interval event log

Return type

log
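
The lifecycle-to-interval conversion can be pictured as pairing each "start" event with the next "complete" event of the same activity. The sketch below works on a single trace given as a list of dictionaries. The helper name `lifecycle_to_interval`, the attribute names, the first-in-first-out pairing and the handling of a "complete" without a "start" (start time equal to completion time) are all assumptions for illustration.

```python
from collections import defaultdict, deque


def lifecycle_to_interval(trace,
                          activity_key="concept:name",
                          timestamp_key="time:timestamp",
                          start_timestamp_key="start_timestamp",
                          transition_key="lifecycle:transition"):
    """Turn start/complete event pairs into single interval events."""
    open_starts = defaultdict(deque)  # activity -> pending start timestamps
    interval_trace = []
    for event in trace:
        activity = event[activity_key]
        transition = event[transition_key].lower()
        if transition == "start":
            open_starts[activity].append(event[timestamp_key])
        elif transition == "complete":
            if open_starts[activity]:
                start = open_starts[activity].popleft()
            else:
                # Assumption: a completion without a start is instantaneous.
                start = event[timestamp_key]
            interval_trace.append({
                activity_key: activity,
                start_timestamp_key: start,
                timestamp_key: event[timestamp_key],
            })
    return interval_trace


# Integer timestamps keep the example short; datetimes work the same way.
trace = [
    {"concept:name": "A", "lifecycle:transition": "start", "time:timestamp": 1},
    {"concept:name": "B", "lifecycle:transition": "start", "time:timestamp": 2},
    {"concept:name": "A", "lifecycle:transition": "complete", "time:timestamp": 3},
    {"concept:name": "B", "lifecycle:transition": "complete", "time:timestamp": 5},
    {"concept:name": "C", "lifecycle:transition": "complete", "time:timestamp": 6},
]

intervals = lifecycle_to_interval(trace)
```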

pm4py.objects.log.util.interval_lifecycle.to_lifecycle(log, parameters=None)[source]

Converts a log from interval format (each event has two timestamps, start and completion) to lifecycle format (each event has a single timestamp and a lifecycle transition)

Parameters
  • log – Log (expressed in the interval format)

  • parameters – Possible parameters of the method (activity, timestamp key, start timestamp key, transition …)

Returns

Lifecycle event log

Return type

log

pm4py.objects.log.util.log module

pm4py.objects.log.util.log.add_artficial_start_and_end(event_log, start='[start>', end='[end]', activity_key='concept:name')[source]

pm4py.objects.log.util.log.derive_and_lift_trace_attributes_from_event_attributes(trlog, ignore=None, retain_on_event_level=False, verbose=False)[source]

pm4py.objects.log.util.log.get_event_labels(event_log, key)[source]

Fetches the labels present in a log, given a key to use within the events.

Parameters
  • event_log – Log to use

  • key – Key to use for event identification, for example “concept:name”

Returns

A list of labels

pm4py.objects.log.util.log.get_event_labels_counted(event_log, key)[source]

Fetches the labels (and their frequency) present in a log, given a key to use within the events.

Parameters
  • event_log – Log to use

  • key – Key to use for event identification, for example “concept:name”

Returns

A list of labels

pm4py.objects.log.util.log.get_trace_variants(event_log, key='concept:name')[source]

Returns a pair (list of variants, dict[index -> traces]), where the index of each variant maps to all the traces that describe that variant under the given key.

Parameters
  • event_log – Log

  • key – Key used to identify the label of an event

pm4py.objects.log.util.log.project_traces(event_log, keys='concept:name')[source]

Projects traces on a (set of) event attribute key(s). If the key provided is a string, each trace is converted into a list of strings. If the key provided is a collection, each trace is converted into a list of (smaller) dicts of key-value pairs

Parameters
  • event_log

  • keys

Returns

pm4py.objects.log.util.log_regex module

pm4py.objects.log.util.log_regex.form_encoding_dictio_from_log(log, parameters=None)[source]

Forms the encoding dictionary from the current log

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

Encoding dictionary

Return type

encoding_dictio

pm4py.objects.log.util.log_regex.form_encoding_dictio_from_two_logs(log1: pm4py.objects.log.obj.EventLog, log2: pm4py.objects.log.obj.EventLog, parameters: Optional[Dict[str, Any]] = None) → Dict[str, str][source]

Forms the encoding dictionary from a couple of logs

Parameters
  • log1 – First log

  • log2 – Second log

  • parameters – Parameters of the algorithm

Returns

Encoding dictionary

Return type

encoding_dictio

pm4py.objects.log.util.log_regex.get_encoded_log(log, mapping, parameters=None)[source]

Gets the encoding of the provided log

Parameters
  • log – Event log

  • mapping – Mapping (activity to symbol)

Returns

List of encoded strings

Return type

list_str

pm4py.objects.log.util.log_regex.get_encoded_trace(trace, mapping, parameters=None)[source]

Gets the encoding of the provided trace

Parameters
  • trace – Trace of the event log

  • mapping – Mapping (activity to symbol)

Returns

Trace string

Return type

trace_str
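
The encoding idea is to map each activity to a single symbol so that a trace becomes a string that regular expressions can match. A minimal sketch on plain lists of dictionaries; the helper names `build_mapping` and `encode_trace`, and the choice of consecutive letters starting at "A", are assumptions of this sketch rather than the library's scheme.

```python
import re


def build_mapping(log, activity_key="concept:name"):
    """Assign one character to each activity (sorted by name)."""
    activities = sorted({event[activity_key] for trace in log for event in trace})
    # Assumption: consecutive characters starting from "A".
    return {activity: chr(ord("A") + i) for i, activity in enumerate(activities)}


def encode_trace(trace, mapping, activity_key="concept:name"):
    """Concatenate the symbols of the activities of a trace."""
    return "".join(mapping[event[activity_key]] for event in trace)


log = [
    [{"concept:name": "register"}, {"concept:name": "check"}, {"concept:name": "pay"}],
    [{"concept:name": "register"}, {"concept:name": "pay"}],
]

mapping = build_mapping(log)
encoded = [encode_trace(trace, mapping) for trace in log]

# "register", then any number of "check", then "pay".
pattern = re.compile(mapping["register"] + mapping["check"] + "*" + mapping["pay"])
all_match = all(pattern.fullmatch(s) is not None for s in encoded)
```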

pm4py.objects.log.util.prefix_matrix module

pm4py.objects.log.util.prefix_matrix.get_activities_list(log, parameters=None)[source]

Gets the activities list from a log object, sorted by activity name

Parameters
  • log – Log

  • parameters – Possible parameters of the algorithm

Returns

List of activities sorted by activity name

Return type

activities_list

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_matrix(log, parameters=None)[source]

Gets the prefix matrix from a log object

Parameters
  • log – Log

  • parameters – Parameters of the algorithm: activity_key

Returns

  • prefix_matrix – Prefix matrix

  • activities – Sorted (by name) activities of the log

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_matrix_from_event_log_not_unique(event_log, activities, parameters=None)[source]

Gets a numeric matrix where each trace is associated to different rows, each one is referring to one of its prefixes.

Parameters
  • event_log – Event log

  • activities – Activities

  • parameters – Parameters of the algorithm

Returns

Prefix matrix of the log

Return type

prefix_mat

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_matrix_from_trace(trace, activities, parameters=None)[source]

Gets a numeric matrix where a trace is associated to different rows, each one is referring to one of its prefixes.

Parameters
  • trace – Trace of the event log

  • activities – Activities

  • parameters – Parameters of the algorithm

Returns

Prefix matrix of the log

Return type

prefix_mat

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_matrix_from_var_str(var_str, activities, parameters=None)[source]

Gets a numeric matrix where a variant is associated to different rows, each one is referring to one of its prefixes.

Parameters
  • var_str – String representation of a variant

  • activities – Activities

  • parameters – Parameters of the algorithm

Returns

Prefix matrix of the log

Return type

prefix_mat

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_matrix_from_variants_list(variants_list, activities, parameters=None)[source]

Gets a numeric matrix where each row is associated to a different prefix of activities happening in the variants of the log, along with the count of the particular situation

Parameters
  • variants_list – List of variants contained in the log, along with their count

  • activities – List of activities in the log

  • parameters – Parameters of the algorithm

Returns

Prefix matrix of the log

Return type

prefix_mat

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_prefix_repr(prefix, activities)[source]

Gets the numeric representation (as vector) of a prefix

Parameters
  • prefix – Prefix

  • activities – Activities

Returns

Representation of a prefix

Return type

prefix_repr

Deprecated since version 2.2.7: This will be removed in 3.0.0.
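
A numeric representation of a prefix can be sketched as a vector with one position per activity. The sketch assumes that each position counts how often the activity occurs in the prefix; the helper names `prefix_repr` and `prefix_matrix_from_trace` are hypothetical, and prefixes are given as plain lists of activity names.

```python
def prefix_repr(prefix, activities):
    """Count the occurrences of each activity in the prefix (assumed semantics)."""
    vector = [0] * len(activities)
    for activity in prefix:
        vector[activities.index(activity)] += 1
    return tuple(vector)


def prefix_matrix_from_trace(trace_activities, activities):
    """One row per non-empty prefix of the trace."""
    return [
        prefix_repr(trace_activities[: i + 1], activities)
        for i in range(len(trace_activities))
    ]


activities = ["check", "pay", "register"]  # sorted by name
vec = prefix_repr(["register", "check", "check"], activities)
matrix = prefix_matrix_from_trace(["register", "check", "check"], activities)
```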

pm4py.objects.log.util.prefix_matrix.get_prefix_variants_matrix(log, parameters=None)[source]

Gets the prefix variants matrix from a log object

Parameters
  • log – Log

  • parameters – Parameters of the algorithm: activity_key

Returns

  • prefix_matrix – Prefix matrix

  • variants_matrix – Variants matrix

  • activities – Sorted (by name) activities of the log

pm4py.objects.log.util.prefix_matrix.get_variants_list(log, parameters=None)[source]

Gets the list of variants (along with their count) from the particular log type

Parameters
  • log – Log

  • parameters – Parameters of the algorithm

Returns

List of variants of the log (along with their count)

Return type

variants_list

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_variants_matrix(log, parameters=None)[source]

Gets the variants matrix from a log object

Parameters
  • log – Log

  • parameters – Parameters of the algorithm: activity_key

Returns

  • variants_matrix – Variants matrix

  • activities – Sorted (by name) activities of the log

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.prefix_matrix.get_variants_matrix_from_variants_list(variants_list, activities, parameters=None)[source]

Gets a numeric matrix where each row is associated to a different set of activities happening in the (complete) variants of the log, along with the count of the particular situation

Parameters
  • variants_list – List of variants contained in the log, along with their count

  • activities – List of activities in the log

  • parameters – Parameters of the algorithm: keep_unique (default: True)

Returns

Variants matrix of the log

Return type

variants_matrix

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.sampling module

pm4py.objects.log.util.sampling.sample(log, n=100)[source]

Randomly sample a fixed number of traces from the original log

Parameters
  • log – Trace/event log

  • n – Number of elements that the sample should have

Returns

Filtered log

Return type

newLog

pm4py.objects.log.util.sampling.sample_log(log, no_traces=100)[source]

Randomly sample a fixed number of traces from the original log

Parameters
  • log – Log

  • no_traces – Number of traces that the sample should have

Returns

Filtered log

Return type

newLog
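
Sampling a fixed number of traces can be sketched with the standard library. The helper name `sample_traces` and the `seed` argument are hypothetical; capping the sample size at the log size is an assumption of this sketch.

```python
import random


def sample_traces(log, no_traces=100, seed=None):
    """Pick `no_traces` distinct traces at random (or all of them if fewer exist)."""
    rng = random.Random(seed)
    return rng.sample(log, min(no_traces, len(log)))


log = [[{"concept:name": f"activity_{i}"}] for i in range(10)]

small = sample_traces(log, no_traces=3, seed=0)
everything = sample_traces(log, no_traces=100, seed=0)
```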

pm4py.objects.log.util.sampling.sample_stream(event_log, no_events=100)[source]

Randomly sample a fixed number of events from the original event log

Parameters
  • event_log – Event log

  • no_events – Number of events that the sample should have

Returns

Filtered log

Return type

newLog

pm4py.objects.log.util.sorting module

pm4py.objects.log.util.sorting.sort_lambda(log, sort_function, reverse=False)[source]

Sort a log based on lambda expression

Parameters
  • log – Log

  • sort_function – Sort function

  • reverse – Boolean (sort by reverse order)

Returns

Sorted log

Return type

log

pm4py.objects.log.util.sorting.sort_lambda_log(event_log, sort_function, reverse=False)[source]

Sort a log based on a lambda expression

Parameters
  • event_log – Log

  • sort_function – Sort function

  • reverse – Boolean (sort by reverse order)

Returns

Sorted log

Return type

new_log

pm4py.objects.log.util.sorting.sort_lambda_stream(event_log, sort_function, reverse=False)[source]

Sort a stream based on a lambda expression

Parameters
  • event_log – Stream

  • sort_function – Sort function

  • reverse – Boolean (sort by reverse order)

Returns

Sorted stream

Return type

stream

pm4py.objects.log.util.sorting.sort_timestamp(log, timestamp_key='time:timestamp', reverse_sort=False)[source]

Sort a log based on timestamp key

Parameters
  • log – Trace/Event log

  • timestamp_key – Timestamp key

  • reverse_sort – If True, sorts in descending order (the default order is ascending)

Returns

Sorted Trace/Event log

Return type

log

pm4py.objects.log.util.sorting.sort_timestamp_log(event_log, timestamp_key='time:timestamp', reverse_sort=False)[source]

Sort a log based on timestamp key

Parameters
  • event_log – Log

  • timestamp_key – Timestamp key

  • reverse_sort – If True, sorts in descending order (the default order is ascending)

Returns

Sorted log

Return type

log

pm4py.objects.log.util.sorting.sort_timestamp_stream(event_log, timestamp_key='time:timestamp', reverse_sort=False)[source]

Sort an event log based on timestamp key

Parameters
  • event_log – Event log

  • timestamp_key – Timestamp key

  • reverse_sort – If True, sorts in descending order (the default order is ascending)

Returns

Sorted event log

Return type

event_log

pm4py.objects.log.util.sorting.sort_timestamp_trace(trace, timestamp_key='time:timestamp', reverse_sort=False)[source]

Sort a trace based on timestamp key

Parameters
  • trace – Trace

  • timestamp_key – Timestamp key

  • reverse_sort – If True, sorts in descending order (the default order is ascending)

Returns

Sorted trace

Return type

trace
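
Sorting a trace on its timestamp key is a one-line call to `sorted` when the trace is a plain list of event dictionaries. The helper name `sort_trace_by_timestamp` is hypothetical; the parameter names mirror the signature above.

```python
from datetime import datetime


def sort_trace_by_timestamp(trace, timestamp_key="time:timestamp", reverse_sort=False):
    """Return the events ordered by timestamp (ascending unless reverse_sort is True)."""
    return sorted(trace, key=lambda event: event[timestamp_key], reverse=reverse_sort)


trace = [
    {"concept:name": "pay", "time:timestamp": datetime(2021, 1, 3)},
    {"concept:name": "register", "time:timestamp": datetime(2021, 1, 1)},
    {"concept:name": "check", "time:timestamp": datetime(2021, 1, 2)},
]

ascending = [e["concept:name"] for e in sort_trace_by_timestamp(trace)]
descending = [e["concept:name"] for e in sort_trace_by_timestamp(trace, reverse_sort=True)]
```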

pm4py.objects.log.util.split_train_test module

pm4py.objects.log.util.split_train_test.split(log: pm4py.objects.log.obj.EventLog, train_percentage: float = 0.8) → Tuple[pm4py.objects.log.obj.EventLog, pm4py.objects.log.obj.EventLog][source]

Split an event log into a training log and a test log (for machine learning purposes)

Parameters
  • log – Event log

  • train_percentage – Fraction of traces to be included in the training log (from 0.0 to 1.0)

Returns

  • training_log – Training event log

  • test_log – Test event log
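
A train/test split at trace level can be sketched by shuffling the trace indexes and cutting at the requested fraction. The helper name `split_log`, the `seed` argument and the rounding down of the training size are assumptions of this sketch; the library's exact selection strategy may differ.

```python
import random


def split_log(log, train_percentage=0.8, seed=None):
    """Randomly assign traces to a training log and a test log."""
    indexes = list(range(len(log)))
    random.Random(seed).shuffle(indexes)
    cut = int(len(indexes) * train_percentage)  # assumption: round down
    train_indexes = set(indexes[:cut])
    # Preserve the original trace order inside each part.
    training_log = [trace for i, trace in enumerate(log) if i in train_indexes]
    test_log = [trace for i, trace in enumerate(log) if i not in train_indexes]
    return training_log, test_log


log = [[{"concept:name": f"activity_{i}"}] for i in range(10)]
training_log, test_log = split_log(log, train_percentage=0.8, seed=42)
```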

pm4py.objects.log.util.time_from_previous module

pm4py.objects.log.util.time_from_previous.insert_time_from_previous(log, parameters=None)[source]

Inserts the time from the previous event, both in normal and business hours

Parameters
  • log – Event log

  • parameters – Parameters of the algorithm

Returns

Enriched log (with the time passed from the previous event)

Return type

enriched_log

Deprecated since version 2.2.7: This will be removed in 3.0.0.

pm4py.objects.log.util.xes module

Module contents