pm4py.algo.filtering.pandas.cases package

Submodules

pm4py.algo.filtering.pandas.cases.case_filter module

This file is part of PM4Py (More Info: https://pm4py.fit.fraunhofer.de).

PM4Py is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

PM4Py is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with PM4Py. If not, see <https://www.gnu.org/licenses/>.

class pm4py.algo.filtering.pandas.cases.case_filter.Parameters(value)

Bases: enum.Enum

An enumeration.

BUSINESS_HOURS = 'business_hours'
CASE_ID_KEY = 'pm4py:param:case_id_key'
TIMESTAMP_KEY = 'pm4py:param:timestamp_key'
WEEKENDS = 'weekends'
WORKCALENDAR = 'workcalendar'
WORKTIMING = 'worktiming'

pm4py.algo.filtering.pandas.cases.case_filter.apply(df, parameters=None)

pm4py.algo.filtering.pandas.cases.case_filter.apply_auto_filter(df, parameters=None)

pm4py.algo.filtering.pandas.cases.case_filter.filter_case_performance(df: pandas.core.frame.DataFrame, min_case_performance: float = 0, max_case_performance: float = 10000000000, parameters: Optional[Dict[Union[str, pm4py.algo.filtering.pandas.cases.case_filter.Parameters], Any]] = None) → pandas.core.frame.DataFrame
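
The parameters argument is typed as a dictionary whose keys are either strings or Parameters members. The following is a minimal sketch of how such a dictionary could be read, not PM4Py's own code: the Parameters class is re-declared here only to keep the example self-contained (its values are copied from the enumeration above), and the lookup helper is hypothetical.

```python
from enum import Enum
from typing import Any, Dict, Optional, Union


class Parameters(Enum):
    # Values copied from the enumeration documented above.
    BUSINESS_HOURS = "business_hours"
    CASE_ID_KEY = "pm4py:param:case_id_key"
    TIMESTAMP_KEY = "pm4py:param:timestamp_key"
    WEEKENDS = "weekends"
    WORKCALENDAR = "workcalendar"
    WORKTIMING = "worktiming"


def lookup(
    parameters: Optional[Dict[Union[str, Parameters], Any]],
    key: Parameters,
    default: Any,
) -> Any:
    """Hypothetical helper: read one option from a parameters dictionary.

    Accepts either the enum member or its string value as the dictionary key
    and falls back to the default when the option is absent.
    """
    if parameters is None:
        return default
    if key in parameters:
        return parameters[key]
    return parameters.get(key.value, default)


# One option keyed by enum member, one keyed by its string value.
params = {
    Parameters.CASE_ID_KEY: "case_id",
    "pm4py:param:timestamp_key": "ts",
}

case_col = lookup(params, Parameters.CASE_ID_KEY, "case:concept:name")
ts_col = lookup(params, Parameters.TIMESTAMP_KEY, "time:timestamp")
business_hours = lookup(params, Parameters.BUSINESS_HOURS, False)
```

Keying the dictionary by enum member avoids typos in the longer string keys; whether the library resolves string keys in exactly this way is an assumption of the sketch.
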
pm4py.algo.filtering.pandas.cases.case_filter.filter_on_case_performance(df: pandas.core.frame.DataFrame, case_id_glue: str = 'case:concept:name', timestamp_key: str = 'time:timestamp', min_case_performance: float = 0, max_case_performance: float = 10000000000, business_hours=False, worktiming=[7, 17], weekends=[6, 7]) → pandas.core.frame.DataFrame

Filter a dataframe on case performance, keeping only the cases whose duration lies between min_case_performance and max_case_performance

Parameters
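  • df – Dataframe

  • case_id_glue – Column of the dataframe that holds the case ID

  • timestamp_key – Column of the dataframe that holds the event timestamp

  • min_case_performance – Minimum case duration to keep (in seconds)

  • max_case_performance – Maximum case duration to keep (in seconds)

  • business_hours – Whether to measure the case duration in business hours only (default: False)

  • worktiming – Start and end hour of the working day, used when business_hours is enabled (default: [7, 17])

  • weekends – Days of the week treated as weekend, used when business_hours is enabled (default: [6, 7])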

Returns

Filtered dataframe

Return type

pandas.core.frame.DataFrame
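
The performance filter described above can be illustrated in plain pandas. This is a sketch under stated assumptions, not PM4Py's implementation: the function name sketch_filter_on_case_performance is hypothetical, case performance is taken to be the time in seconds between the first and last event of a case, both bounds are treated as inclusive, and the business-hours options are ignored.

```python
import pandas as pd


def sketch_filter_on_case_performance(
    df: pd.DataFrame,
    case_id_glue: str = "case:concept:name",
    timestamp_key: str = "time:timestamp",
    min_case_performance: float = 0,
    max_case_performance: float = 10000000000,
) -> pd.DataFrame:
    # Duration of each case in seconds: last timestamp minus first timestamp.
    grouped = df.groupby(case_id_glue)[timestamp_key]
    durations = (grouped.max() - grouped.min()).dt.total_seconds()
    # Case IDs whose duration lies within the (assumed inclusive) bounds.
    in_range = (durations >= min_case_performance) & (durations <= max_case_performance)
    keep = durations[in_range].index
    # Keep every event that belongs to one of those cases.
    return df[df[case_id_glue].isin(keep)]


events = pd.DataFrame(
    {
        "case:concept:name": ["a", "a", "b", "b", "c"],
        "concept:name": ["start", "end", "start", "end", "start"],
        "time:timestamp": pd.to_datetime(
            [
                "2024-01-01 09:00", "2024-01-01 10:00",  # case a lasts one hour
                "2024-01-01 09:00", "2024-01-03 09:00",  # case b lasts two days
                "2024-01-02 12:00",                      # case c has a single event
            ]
        ),
    }
)

# Keep cases lasting between one minute and one day.
filtered = sketch_filter_on_case_performance(
    events, min_case_performance=60, max_case_performance=86400
)
```

With these bounds only case a remains: case b is too long and the single-event case c has a duration of zero.
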

pm4py.algo.filtering.pandas.cases.case_filter.filter_on_case_size(df0: pandas.core.frame.DataFrame, case_id_glue: str = 'case:concept:name', min_case_size: int = 2, max_case_size=None)

Filter a dataframe, keeping only the traces whose number of events is at least min_case_size and, if max_case_size is given, at most max_case_size

Parameters
  • df0 – Dataframe

  • case_id_glue – Column of the dataframe that holds the case ID

  • min_case_size – Minimum number of events in a case

  • max_case_size – Maximum number of events in a case (default: None, no upper bound)

Returns

Filtered dataframe

Return type

pandas.core.frame.DataFrame
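
The size filter described above can be sketched in plain pandas as follows. This is an illustration rather than PM4Py's implementation: the function name sketch_filter_on_case_size is hypothetical, and both bounds are assumed to be inclusive.

```python
from typing import Optional

import pandas as pd


def sketch_filter_on_case_size(
    df0: pd.DataFrame,
    case_id_glue: str = "case:concept:name",
    min_case_size: int = 2,
    max_case_size: Optional[int] = None,
) -> pd.DataFrame:
    # Number of events of the case each row belongs to.
    sizes = df0[case_id_glue].map(df0[case_id_glue].value_counts())
    mask = sizes >= min_case_size
    # The upper bound applies only when one is given.
    if max_case_size is not None:
        mask = mask & (sizes <= max_case_size)
    return df0[mask]


events = pd.DataFrame(
    {
        "case:concept:name": ["a", "b", "b", "c", "c", "c", "c"],
        "concept:name": ["x", "x", "y", "x", "y", "y", "z"],
    }
)

# Case a has 1 event, case b has 2, case c has 4.
at_least_two = sketch_filter_on_case_size(events, min_case_size=2)
two_to_three = sketch_filter_on_case_size(events, min_case_size=2, max_case_size=3)
```

Here at_least_two keeps cases b and c, while two_to_three keeps only case b.
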

pm4py.algo.filtering.pandas.cases.case_filter.filter_on_ncases(df: pandas.core.frame.DataFrame, case_id_glue: str = 'case:concept:name', max_no_cases: int = 1000)

Filter a dataframe, keeping at most the specified number of traces

Parameters
  • df – Dataframe

  • case_id_glue – Column of the dataframe that holds the case ID

  • max_no_cases – Maximum number of traces to keep

Returns

Filtered dataframe

Return type

pandas.core.frame.DataFrame
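
A rough pandas sketch of limiting a dataframe to a maximum number of traces is given below. The documentation above does not say which traces are retained when there are more than max_no_cases, so this sketch assumes the first ones in order of appearance; the function name sketch_filter_on_ncases is hypothetical and the library may select cases differently.

```python
import pandas as pd


def sketch_filter_on_ncases(
    df: pd.DataFrame,
    case_id_glue: str = "case:concept:name",
    max_no_cases: int = 1000,
) -> pd.DataFrame:
    # Assumption: keep the first max_no_cases distinct case IDs as they appear.
    keep = df[case_id_glue].drop_duplicates().head(max_no_cases)
    # Keep every event that belongs to one of those cases.
    return df[df[case_id_glue].isin(keep)]


events = pd.DataFrame(
    {
        "case:concept:name": ["a", "a", "b", "c", "c"],
        "concept:name": ["x", "y", "x", "x", "y"],
    }
)

first_two = sketch_filter_on_ncases(events, max_no_cases=2)
```

With max_no_cases=2 the result contains all events of cases a and b and none of case c.
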

Module contents
