pm4py.algo.filtering.pandas.attributes package

Submodules

pm4py.algo.filtering.pandas.attributes.attributes_filter module

class pm4py.algo.filtering.pandas.attributes.attributes_filter.Parameters(value)

Bases: enum.Enum

An enumeration.

ACTIVITY_KEY = 'pm4py:param:activity_key'
ATTRIBUTE_KEY = 'pm4py:param:attribute_key'
CASE_ID_KEY = 'case_id_glue'
DECREASING_FACTOR = 'decreasingFactor'
POSITIVE = 'positive'
STREAM_FILTER_KEY1 = 'stream_filter_key1'
STREAM_FILTER_KEY2 = 'stream_filter_key2'
STREAM_FILTER_VALUE1 = 'stream_filter_value1'
STREAM_FILTER_VALUE2 = 'stream_filter_value2'
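The members of this enumeration serve as keys of the `parameters` dictionary accepted by the functions below. The following minimal sketch mirrors three of the documented members in a local enum, so that it runs without pm4py installed, and shows how such a dictionary might be built and read with defaults. The helper `get_param` and the attribute name `org:resource` are hypothetical and only illustrate the convention; the defaults are taken from the signature of `filter_df_on_attribute_values`.

```python
from enum import Enum


class Parameters(Enum):
    # Local mirror of some documented members; values copied from the listing above.
    ATTRIBUTE_KEY = "pm4py:param:attribute_key"
    CASE_ID_KEY = "case_id_glue"
    POSITIVE = "positive"


def get_param(parameters, key, default):
    # Hypothetical helper: return the value stored under an enum key, else the default.
    if parameters is None:
        return default
    return parameters.get(key, default)


# Only the options that differ from the defaults need to be supplied.
parameters = {
    Parameters.ATTRIBUTE_KEY: "org:resource",
    Parameters.POSITIVE: False,
}

attribute_key = get_param(parameters, Parameters.ATTRIBUTE_KEY, "concept:name")
case_id_key = get_param(parameters, Parameters.CASE_ID_KEY, "case:concept:name")
positive = get_param(parameters, Parameters.POSITIVE, True)
```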
pm4py.algo.filtering.pandas.attributes.attributes_filter.apply(df, values, parameters=None)

Filter dataframe on attribute values (filter traces)

Parameters
  • df – Dataframe

  • values – Values to filter on

  • parameters

    Possible parameters of the algorithm, including:

    Parameters.CASE_ID_KEY -> Case ID column in the dataframe
    Parameters.ATTRIBUTE_KEY -> Attribute we want to filter
    Parameters.POSITIVE -> Specifies if the filter should be applied including traces (positive=True) or excluding traces (positive=False)

Returns

Filtered dataframe

Return type

df
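The trace-level behaviour can be illustrated in plain Python, with a list of dictionaries standing in for the dataframe. This is a sketch of the semantics only, assuming that a trace is selected as soon as one of its events carries a listed value; the function name `filter_traces` and the sample events are hypothetical, and the code is not the library's implementation.

```python
def filter_traces(events, values, case_id_key="case:concept:name",
                  attribute_key="concept:name", positive=True):
    # Collect the cases that contain at least one event with a matching value.
    matching_cases = {e[case_id_key] for e in events if e[attribute_key] in values}
    if positive:
        # Keep every event of the matching cases.
        return [e for e in events if e[case_id_key] in matching_cases]
    # Keep every event of the non-matching cases.
    return [e for e in events if e[case_id_key] not in matching_cases]


events = [
    {"case:concept:name": "c1", "concept:name": "register"},
    {"case:concept:name": "c1", "concept:name": "pay"},
    {"case:concept:name": "c2", "concept:name": "register"},
    {"case:concept:name": "c2", "concept:name": "cancel"},
]

kept = filter_traces(events, ["pay"], positive=True)
dropped = filter_traces(events, ["pay"], positive=False)
```

With `positive=True` both events of case `c1` survive, including the `register` event that does not match itself; with `positive=False` only case `c2` remains.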

pm4py.algo.filtering.pandas.attributes.attributes_filter.apply_auto_filter(df, parameters=None)

Apply auto filter on activity values

Parameters
  • df – Dataframe

  • parameters

    Possible parameters of the algorithm, including:

    Parameters.ACTIVITY_KEY -> Column containing the activity
    Parameters.DECREASING_FACTOR -> Decreasing factor that should be passed to the algorithm

Returns

Filtered dataframe

Return type

df

pm4py.algo.filtering.pandas.attributes.attributes_filter.apply_events(df, values, parameters=None)

Filter dataframe on attribute values (filter events)

Parameters
  • df – Dataframe

  • values – Values to filter on

  • parameters

    Possible parameters of the algorithm, including:

    Parameters.ATTRIBUTE_KEY -> Attribute we want to filter
    Parameters.POSITIVE -> Specifies if the filter should be applied including events (positive=True) or excluding events (positive=False)

Returns

Filtered dataframe

Return type

df
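In contrast to the trace-level filter, an event-level filter judges each row on its own. A plain-Python sketch of that reading, with a list of dictionaries standing in for the dataframe, is given below; `filter_events` and the sample data are hypothetical and not the library's code.

```python
def filter_events(events, values, attribute_key="concept:name", positive=True):
    if positive:
        # Keep only the events whose attribute value is listed.
        return [e for e in events if e[attribute_key] in values]
    # Keep only the events whose attribute value is not listed.
    return [e for e in events if e[attribute_key] not in values]


events = [
    {"case:concept:name": "c1", "concept:name": "register"},
    {"case:concept:name": "c1", "concept:name": "pay"},
    {"case:concept:name": "c2", "concept:name": "register"},
    {"case:concept:name": "c2", "concept:name": "cancel"},
]

kept = filter_events(events, ["register"], positive=True)
removed = filter_events(events, ["register"], positive=False)
```

Both cases stay in the result, but each is reduced to the events that pass the test.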

pm4py.algo.filtering.pandas.attributes.attributes_filter.apply_numeric(df, int1, int2, parameters=None)

Filter dataframe on attribute values (filter cases)

Parameters
  • df – Dataframe

  • int1 – Lower bound of the interval

  • int2 – Upper bound of the interval

  • parameters

    Possible parameters of the algorithm:

    Parameters.ATTRIBUTE_KEY -> Attribute to filter on
    Parameters.POSITIVE -> Whether to keep (True) or remove (False) the traces containing events whose value lies in the interval

Returns

Filtered dataframe

Return type

filtered_df
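A plain-Python sketch of the case-level numeric filter follows, with a list of dictionaries standing in for the dataframe. It assumes that both interval bounds are inclusive and that a case is selected as soon as one of its events falls in the interval; neither assumption is stated in the documentation above. The function name, the `amount` attribute, and the sample data are hypothetical.

```python
def filter_cases_numeric(events, int1, int2, attribute_key,
                         case_id_key="case:concept:name", positive=True):
    # Cases with at least one event whose value lies in [int1, int2]
    # (inclusive bounds are an assumption of this sketch).
    matching_cases = {
        e[case_id_key] for e in events if int1 <= e[attribute_key] <= int2
    }
    if positive:
        return [e for e in events if e[case_id_key] in matching_cases]
    return [e for e in events if e[case_id_key] not in matching_cases]


events = [
    {"case:concept:name": "c1", "amount": 50},
    {"case:concept:name": "c1", "amount": 500},
    {"case:concept:name": "c2", "amount": 20},
    {"case:concept:name": "c3", "amount": 900},
]

inside = filter_cases_numeric(events, 100, 600, "amount", positive=True)
outside = filter_cases_numeric(events, 100, 600, "amount", positive=False)
```

Case `c1` is kept whole under `positive=True`, including its event with amount 50, because one of its events lies in the interval.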

pm4py.algo.filtering.pandas.attributes.attributes_filter.apply_numeric_events(df, int1, int2, parameters=None)

Apply a filter on events (numerical filter)

Parameters
  • df – Dataframe

  • int1 – Lower bound of the interval

  • int2 – Upper bound of the interval

  • parameters

    Possible parameters of the algorithm:

    Parameters.ATTRIBUTE_KEY -> Attribute to filter on
    Parameters.POSITIVE -> Whether to keep (True) or remove (False) the events whose value lies in the interval

Returns

Filtered dataframe

Return type

filtered_df
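The event-level numeric filter can be sketched the same way in plain Python, with a list of dictionaries in place of the dataframe. Inclusive bounds are an assumption of this sketch, and `filter_events_numeric`, the `amount` attribute, and the sample data are hypothetical.

```python
def filter_events_numeric(events, int1, int2, attribute_key, positive=True):
    if positive:
        # Keep events whose value lies in [int1, int2] (inclusive by assumption).
        return [e for e in events if int1 <= e[attribute_key] <= int2]
    # Keep events whose value lies outside the interval.
    return [e for e in events if not (int1 <= e[attribute_key] <= int2)]


events = [
    {"case:concept:name": "c1", "amount": 50},
    {"case:concept:name": "c1", "amount": 500},
    {"case:concept:name": "c2", "amount": 20},
    {"case:concept:name": "c3", "amount": 900},
]

in_range = filter_events_numeric(events, 100, 600, "amount", positive=True)
out_of_range = filter_events_numeric(events, 100, 600, "amount", positive=False)
```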

pm4py.algo.filtering.pandas.attributes.attributes_filter.filter_df_keeping_activ_exc_thresh(df, thresh, act_count0=None, activity_key='concept:name', most_common_variant=None)

Filter a dataframe keeping activities exceeding the threshold

Parameters
  • df – Pandas dataframe

  • thresh – Threshold to use to cut activities

  • act_count0 – (If provided) Dictionary that associates each activity with its count

  • activity_key – Column in which the activity is present

Returns

Filtered dataframe

Return type

df
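The threshold rule can be sketched in plain Python as follows, with a list of dictionaries standing in for the dataframe. The sketch reads "exceeding" as a count of at least `thresh`, which is an assumption about the boundary case, and it ignores the undocumented `most_common_variant` argument. The function name and sample data are hypothetical.

```python
from collections import Counter


def keep_activities_over_threshold(events, thresh, act_count=None,
                                   activity_key="concept:name"):
    # Count activity occurrences unless a precomputed dictionary is supplied.
    if act_count is None:
        act_count = Counter(e[activity_key] for e in events)
    # Assumption: an activity is kept when its count is at least thresh.
    frequent = {a for a, c in act_count.items() if c >= thresh}
    return [e for e in events if e[activity_key] in frequent]


names = ["register", "pay", "register", "cancel", "pay", "register"]
events = [{"concept:name": n} for n in names]

filtered = keep_activities_over_threshold(events, thresh=2)
```

Passing a precomputed dictionary, in the role of `act_count0`, skips the counting pass, which may matter when the same counts are reused across several filters.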

pm4py.algo.filtering.pandas.attributes.attributes_filter.filter_df_keeping_spno_activities(df, activity_key='concept:name', max_no_activities=25)

Filter a dataframe, keeping at most the specified number of distinct activities

Parameters
  • df – Dataframe

  • activity_key – Activity key in dataframe (must be specified if different from concept:name)

  • max_no_activities – Maximum number of distinct activities to keep

Returns

Filtered dataframe

Return type

df
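A plain-Python sketch of limiting a log to a fixed number of activities is given below, with a list of dictionaries standing in for the dataframe. It assumes that the most frequent activities are the ones retained and that ties are broken by activity name; both are assumptions of this sketch rather than documented behaviour. The function name and sample data are hypothetical.

```python
from collections import Counter


def keep_top_activities(events, activity_key="concept:name", max_no_activities=25):
    counts = Counter(e[activity_key] for e in events)
    # Rank by descending count; ties broken by name for a deterministic result
    # (the tie-breaking rule is an assumption of this sketch).
    ranked = sorted(counts, key=lambda a: (-counts[a], a))
    top = set(ranked[:max_no_activities])
    return [e for e in events if e[activity_key] in top]


names = ["register", "pay", "register", "cancel", "pay", "register"]
events = [{"concept:name": n} for n in names]

top_two = keep_top_activities(events, max_no_activities=2)
```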

pm4py.algo.filtering.pandas.attributes.attributes_filter.filter_df_on_attribute_values(df, values, case_id_glue='case:concept:name', attribute_key='concept:name', positive=True)

Filter dataframe on attribute values

Parameters
  • df – Dataframe

  • values – Values to filter on

  • case_id_glue – Case ID column in the dataframe

  • attribute_key – Attribute we want to filter

  • positive – Specifies if the filter should be applied including traces (positive=True) or excluding traces (positive=False)

Returns

Filtered dataframe

Return type

df

Module contents