pm4py.algo.filtering.pandas.cases package

Submodules

pm4py.algo.filtering.pandas.cases.case_filter module

class pm4py.algo.filtering.pandas.cases.case_filter.Parameters(value)[source]

Bases: enum.Enum

An enumeration of the keys accepted in the parameters dictionary of this module's functions.

CASE_ID_KEY = 'case_id_glue'
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
pm4py.algo.filtering.pandas.cases.case_filter.apply(df, parameters=None)[source]
pm4py.algo.filtering.pandas.cases.case_filter.apply_auto_filter(df, parameters=None)[source]
pm4py.algo.filtering.pandas.cases.case_filter.filter_case_performance(df, min_case_performance=0, max_case_performance=10000000000, parameters=None)[source]
pm4py.algo.filtering.pandas.cases.case_filter.filter_on_case_performance(df, case_id_glue='case:concept:name', timestamp_key='time:timestamp', min_case_performance=0, max_case_performance=10000000000)[source]

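Filter a dataframe on case performance (the duration of each case), keeping only the cases whose performance lies between min_case_performance and max_case_performance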

Parameters
  • df – Dataframe

  • case_id_glue – Case ID column in the dataframe

  • timestamp_key – Timestamp column in the dataframe

  • min_case_performance – Minimum case performance (case duration, in seconds)

  • max_case_performance – Maximum case performance (case duration, in seconds)

Returns

Filtered dataframe

Return type

pandas.DataFrame
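The idea behind a case-performance filter can be sketched in plain pandas. This is an illustrative sketch, not the pm4py implementation: the helper name sketch_filter_on_case_performance is hypothetical, and it assumes that the performance of a case is the span between its earliest and latest timestamp, measured in seconds, with both bounds inclusive.

```python
import pandas as pd


def sketch_filter_on_case_performance(df, case_id_glue="case:concept:name",
                                      timestamp_key="time:timestamp",
                                      min_case_performance=0,
                                      max_case_performance=10000000000):
    # Hypothetical helper, not part of pm4py.
    # Assumption: case performance = latest minus earliest timestamp, in seconds.
    grouped = df.groupby(case_id_glue)[timestamp_key]
    duration = (grouped.max() - grouped.min()).dt.total_seconds()
    # Assumption: both bounds are inclusive.
    in_range = (duration >= min_case_performance) & (duration <= max_case_performance)
    cases_to_keep = duration[in_range].index
    # Keep every event that belongs to a retained case.
    return df[df[case_id_glue].isin(cases_to_keep)]


events = pd.DataFrame({
    "case:concept:name": ["a", "a", "b", "b"],
    "concept:name": ["start", "end", "start", "end"],
    "time:timestamp": pd.to_datetime([
        "2024-01-01 00:00:00", "2024-01-01 00:10:00",   # case a lasts 10 minutes
        "2024-01-01 00:00:00", "2024-01-03 00:00:00",   # case b lasts 2 days
    ]),
})

# Keep only the cases that finish within one hour.
fast_cases = sketch_filter_on_case_performance(events, max_case_performance=3600)
```

With this toy log, only the events of case a remain in fast_cases, since case b runs for two days.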

pm4py.algo.filtering.pandas.cases.case_filter.filter_on_case_size(df, case_id_glue='case:concept:name', min_case_size=2, max_case_size=None)[source]

Filter a dataframe, keeping only the traces whose number of events is at least min_case_size and, if max_case_size is given, at most max_case_size

Parameters
  • df – Dataframe

  • case_id_glue – Case ID column in the dataframe

  • min_case_size – Minimum number of events in a case

  • max_case_size – Maximum number of events in a case (None for no upper bound)

Returns

Filtered dataframe

Return type

pandas.DataFrame
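A case-size filter of this kind can be sketched in plain pandas as follows. The helper name sketch_filter_on_case_size is hypothetical and the code is not the pm4py implementation; it assumes that the size of a case is its number of rows (events) and that max_case_size=None means no upper bound.

```python
import pandas as pd


def sketch_filter_on_case_size(df, case_id_glue="case:concept:name",
                               min_case_size=2, max_case_size=None):
    # Hypothetical helper, not part of pm4py.
    # Number of events per case, broadcast back onto every event row.
    case_size = df[case_id_glue].map(df[case_id_glue].value_counts())
    mask = case_size >= min_case_size
    # Assumption: None means that no upper bound is applied.
    if max_case_size is not None:
        mask = mask & (case_size <= max_case_size)
    return df[mask]


events = pd.DataFrame({
    "case:concept:name": ["a", "b", "b", "c", "c", "c", "c"],
    "concept:name": ["x", "x", "y", "x", "y", "y", "z"],
})

# Case a has 1 event, case b has 2, case c has 4.
at_least_two = sketch_filter_on_case_size(events)
two_to_three = sketch_filter_on_case_size(events, min_case_size=2, max_case_size=3)
```

Here at_least_two drops the single-event case a, while two_to_three additionally drops the four-event case c.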

pm4py.algo.filtering.pandas.cases.case_filter.filter_on_ncases(df, case_id_glue='case:concept:name', max_no_cases=1000)[source]

Filter a dataframe, keeping at most the specified number of traces

Parameters
  • df – Dataframe

  • case_id_glue – Case ID column in the dataframe

  • max_no_cases – Maximum number of traces to keep

Returns

Filtered dataframe

Return type

pandas.DataFrame
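Capping the number of cases can be sketched in plain pandas as below. The helper name sketch_filter_on_ncases is hypothetical and the code is not the pm4py implementation. In particular, the documentation above does not say which cases are retained when the log holds more than max_no_cases; this sketch assumes the first distinct case IDs in order of appearance, and the library may choose differently.

```python
import pandas as pd


def sketch_filter_on_ncases(df, case_id_glue="case:concept:name", max_no_cases=1000):
    # Hypothetical helper, not part of pm4py.
    # Assumption: retain the first max_no_cases distinct case IDs in order of appearance.
    cases_to_keep = df[case_id_glue].drop_duplicates().head(max_no_cases)
    # Keep every event that belongs to a retained case.
    return df[df[case_id_glue].isin(cases_to_keep)]


events = pd.DataFrame({
    "case:concept:name": ["a", "b", "a", "c"],
    "concept:name": ["x", "x", "y", "x"],
})

# Keep the events of at most two cases.
two_cases = sketch_filter_on_ncases(events, max_no_cases=2)
```

All events of a retained case are kept, so the cap applies to the number of cases rather than the number of rows.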

Module contents