API Reference#

This page provides an overview of all public pm4py objects, functions and methods.

Input (pm4py.read)#

pm4py supports importing the following standardized event data format:

In case an event log is stored as a .csv file, pandas can be used to directly import the event log as a data frame (docs). .xes files are internally converted to a pandas dataframe, which is the default data structure used by all algorithms implemented in pm4py.

Additional file formats that are currently supported by pm4py are:

Importing object-centric event logs is possible given the following formats:

  • .csv specification pm4py.read.read_ocel_csv()

  • .jsonocel specification pm4py.read.read_ocel_jsonocel()

  • .xmlocel specification pm4py.read.read_ocel_xmlocel()

  • .sqlite specification pm4py.read.read_ocel_sqlite()

Output (pm4py.write)#

Similarly to event data importing, pm4py supports export functionalities to:

Exporting object-centric event logs is possible to the following formats:

  • .csv specification pm4py.write.write_ocel_csv()

  • .jsonocel specification pm4py.write.write_ocel_jsonocel()

  • .xmlocel specification pm4py.write.write_ocel_xmlocel()

  • .sqlite specification pm4py.write.write_ocel_sqlite()

Conversion (pm4py.convert)#

Several conversions are available from/to different objects supported by pm4py. The following conversions are currently available:

Process Discovery (pm4py.discovery)#

Process discovery algorithms discover a process model that describes the process execution as recorded in the event log. pm4py implements a variety of process discovery algorithms. These algorithms return different kinds of models, i.e., models with imprecise execution semantics, procedural process models and declarative process models.

Among the models with imprecise execution semantics, pm4py currently supports:

Among procedural process models, pm4py currently supports:

Among declarative process models, pm4py currently supports:

Conformance Checking (pm4py.conformance)#

Conformance checking techniques compare a process model with an event log of the same process. The goal is to check whether the event log conforms to the model and vice versa.

Among procedural process models, pm4py currently supports:

Among declarative process models, pm4py currently supports:

Visualization (pm4py.vis)#

The pm4py library implements basic visualizations of process models and statistics. Among the on-screen visualizations, pm4py currently supports:

We also offer some methods to store the visualizations on disk:

Statistics (pm4py.stats)#

Different statistics that could be computed on top of event logs are proposed, including:

Filtering (pm4py.filtering)#

Filtering is the restriction of the event log to a subset of the behavior. Different methods are offered in pm4py for traditional event logs (.xes, .csv), including:

Also, some filtering techniques are offered on top of object-centric event logs:

Machine Learning (pm4py.ml)#

PM4Py offers some features useful for the application of machine learning techniques. Among those:

Simulation (pm4py.sim)#

We offer different simulation algorithms that, starting from a model, produce an output that follows the model and the rules provided by the user. Among those:

Object-Centric Process Mining (pm4py.ocel)#

Traditional event logs, used by mainstream process mining techniques, require the events to be related to a case. A case is a set of events for a particular purpose. A case notion is a criterion to assign a case to the events.

However, in real processes this leads to two problems:

  • If we consider the Order-to-Cash process, an order could be related to many different deliveries. If we consider the delivery as case notion, the same event of Create Order needs to be replicated in different cases (all the deliveries involving the order). This is called the convergence problem.

  • If we consider the Order-to-Cash process, an order could contain different order items, each one with a different lifecycle. If we consider the order as case notion, several instances of the activities for the single items may be contained in the case, and this makes the frequency/performance annotation of the process problematic. This is called the divergence problem.

Object-centric event logs relax the assumption that an event is related to exactly one case. Indeed, an event can be related to different objects of different object types.
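The two problems can be reproduced with a minimal plain-Python sketch. It does not use pm4py; the event dictionaries, the `flatten` helper and the sample identifiers are hypothetical, and deliveries stand in for order items to keep the data small. Flattening on the delivery type copies Create Order into every delivery case (convergence), while flattening on the order type puts several Ship Delivery events into one case (divergence).

```python
from collections import defaultdict

# Hypothetical object-centric events: each event lists its related objects per type.
events = [
    {"id": "e1", "activity": "Create Order",
     "objects": {"order": ["o1"], "delivery": ["d1", "d2"]}},
    {"id": "e2", "activity": "Ship Delivery",
     "objects": {"order": ["o1"], "delivery": ["d1"]}},
    {"id": "e3", "activity": "Ship Delivery",
     "objects": {"order": ["o1"], "delivery": ["d2"]}},
]


def flatten(events, object_type):
    """Build one case per object of the chosen type.

    An event is copied into the case of every object of that type it relates to.
    """
    cases = defaultdict(list)
    for event in events:
        for obj in event["objects"].get(object_type, []):
            cases[obj].append(event["activity"])
    return dict(cases)


# Convergence: "Create Order" appears once per delivery case.
by_delivery = flatten(events, "delivery")

# Divergence: the single order case contains two "Ship Delivery" events.
by_order = flatten(events, "order")
```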

Essentially, we can describe the different components of an object-centric event log as:

  • Events, having an identifier, an activity, a timestamp, a list of related objects and a dictionary of other attributes.

  • Objects, having an identifier, a type and a dictionary of other attributes.

  • Attribute names, i.e., the possible keys for the attributes of the event/object attribute map.

  • Object types, i.e., the possible types for the objects.
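The four components listed above can be sketched as plain data classes. This is only an illustration of the structure: the class names `OcelEvent` and `OcelObject`, their field names and the sample values are assumptions, and pm4py's own OCEL object is organized differently.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass
class OcelObject:
    """An object: identifier, type and a dictionary of other attributes."""
    identifier: str
    object_type: str
    attributes: dict[str, Any] = field(default_factory=dict)


@dataclass
class OcelEvent:
    """An event: identifier, activity, timestamp, related objects, other attributes."""
    identifier: str
    activity: str
    timestamp: datetime
    related_objects: list[str] = field(default_factory=list)
    attributes: dict[str, Any] = field(default_factory=dict)


objects = [
    OcelObject("o1", "order", {"value": 120.0}),
    OcelObject("i1", "item"),
    OcelObject("i2", "item"),
]
events = [
    OcelEvent("e1", "Create Order", datetime(2024, 1, 1, 9, 0),
              ["o1", "i1", "i2"], {"channel": "web"}),
]

# Object types and attribute names are derived from the events and objects.
object_types = sorted({obj.object_type for obj in objects})
attribute_names = sorted(
    {key for event in events for key in event.attributes}
    | {key for obj in objects for key in obj.attributes}
)
```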

In PM4Py, we offer object-centric process mining features:

Some object-centric process discovery algorithms are also offered:

Social Network Analysis (pm4py.org)#

We offer different algorithms for the analysis of the organizational networks starting from an event log:

Privacy (pm4py.privacy)#

We offer the following algorithms for the anonymization of event logs:

Utilities (pm4py.utils)#

Other algorithms, which do not belong to the aforementioned categories, are collected in this section:

Overall List of Methods#


The pm4py.read module contains all functionality related to reading files/objects from disk.


Reads a BPMN model from a .bpmn file


Reads a DFG object from a .dfg file.

pm4py.read.read_pnml(file_path[, ...])

Reads a Petri net object from a .pnml file.


Reads a process tree object from a .ptml file

pm4py.read.read_xes(file_path[, variant, ...])

Reads an event log stored in XES format (see xes-standard). Returns a table (pandas.DataFrame) view of the event log.

pm4py.read.read_ocel_csv(file_path[, ...])

Reads an object-centric event log from a CSV file (see: http://www.ocel-standard.org/).


Reads an object-centric event log from a SQLite database (see: http://www.ocel-standard.org/).


The pm4py.write module contains all functionality related to writing files/objects to disk.

pm4py.write.write_bpmn(model, file_path[, ...])

Writes a BPMN model object to disk in the .bpmn format.

pm4py.write.write_dfg(dfg, start_activities, ...)

Writes a directly follows graph (DFG) object to disk in the .dfg format.

pm4py.write.write_pnml(petri_net, ...)

Writes a Petri net object to disk in the .pnml format (see pnml-standard)

pm4py.write.write_ptml(tree, file_path)

Writes a process tree object to disk in the .ptml format.

pm4py.write.write_xes(log, file_path[, ...])

Writes an event log to disk in the XES format (see xes-standard)

pm4py.write.write_ocel_csv(ocel, file_path, ...)

Writes an OCEL object to disk in the .csv file format.

pm4py.write.write_ocel_sqlite(ocel, file_path)

Writes an OCEL object to disk to a SQLite database (exported as .sqlite file).


The pm4py.convert module contains the cross-conversions implemented in pm4py

pm4py.convert.convert_to_event_log(obj[, ...])

Converts a DataFrame/EventStream object to an event log object

pm4py.convert.convert_to_event_stream(obj[, ...])

Converts a log object to an event stream


Converts a log object to a dataframe


Converts an object to a BPMN diagram.


Converts an input model to an (accepting) Petri net.


Converts an input model to a process tree.


Converts an input model to a reachability graph (transition system).

pm4py.convert.convert_log_to_ocel(log[, ...])

Converts an event log to an object-centric event log with one or more object types.

pm4py.convert.convert_log_to_networkx(log[, ...])

Converts an event log object to a NetworkX DiGraph object.


Converts an OCEL to a NetworkX DiGraph object.


Converts a Petri net to a NetworkX DiGraph.


The pm4py.discovery module contains the process discovery algorithms implemented in pm4py

pm4py.discovery.discover_dfg(log[, ...])

Discovers a Directly-Follows Graph (DFG) from a log.
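As an illustration of what a directly-follows graph contains, the following plain-Python sketch counts directly-follows pairs together with start and end activities. It is not pm4py's implementation; the function name `discover_dfg_sketch` and the list-of-lists log representation are assumptions made for the example.

```python
from collections import Counter

# Hypothetical log: one list of activity names per case.
traces = [["A", "B", "C", "D"], ["A", "C", "B", "D"], ["A", "D"]]


def discover_dfg_sketch(traces):
    """Count directly-follows pairs, start activities and end activities."""
    dfg, start_activities, end_activities = Counter(), Counter(), Counter()
    for trace in traces:
        if not trace:
            continue  # an empty case contributes nothing
        start_activities[trace[0]] += 1
        end_activities[trace[-1]] += 1
        # Every pair of consecutive events is one directly-follows occurrence.
        for source, target in zip(trace, trace[1:]):
            dfg[(source, target)] += 1
    return dfg, start_activities, end_activities


dfg, start_activities, end_activities = discover_dfg_sketch(traces)
```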


Discovers a performance directly-follows graph from an event log.


Discovers a Petri net using the Alpha Miner.


Discovers a Petri net using the inductive miner algorithm.


Discover a Petri net using the Heuristics Miner


Discovers a process tree using the inductive miner algorithm


Discovers a heuristics net


This algorithm computes the minimum self-distance for each activity observed in an event log.
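A minimal plain-Python sketch of the idea follows. It assumes that the self-distance is the number of events strictly between two consecutive occurrences of the same activity in a case; that convention, the function name and the sample traces are assumptions, not taken from pm4py.

```python
def minimum_self_distances_sketch(traces):
    """Return, per activity, the smallest number of events between two occurrences.

    Activities that never repeat within a case are absent from the result.
    """
    best = {}
    for trace in traces:
        last_seen = {}
        for index, activity in enumerate(trace):
            if activity in last_seen:
                # Events strictly between the previous and the current occurrence.
                distance = index - last_seen[activity] - 1
                if activity not in best or distance < best[activity]:
                    best[activity] = distance
            last_seen[activity] = index
    return best


traces = [["A", "B", "C", "A"], ["A", "B", "A", "B"]]
distances = minimum_self_distances_sketch(traces)
```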


Discovers the footprints out of the provided event log / process model
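One common way to derive footprints from a log is to classify directly-follows pairs as sequential or parallel. The sketch below shows that idea in plain Python; the function name, the returned structure and the sample traces are assumptions and do not reflect pm4py's output format.

```python
def footprints_sketch(traces):
    """Split directly-follows pairs into sequence and parallel relations."""
    directly_follows = {
        (a, b) for trace in traces for a, b in zip(trace, trace[1:])
    }
    # A pair observed in both directions is treated as parallel.
    parallel = {(a, b) for (a, b) in directly_follows if (b, a) in directly_follows}
    # A pair observed in one direction only is treated as a sequence.
    sequence = directly_follows - parallel
    return sequence, parallel


traces = [["A", "B", "C", "D"], ["A", "C", "B", "D"]]
sequence, parallel = footprints_sketch(traces)
```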


Gets the eventually follows graph from a log object.


Discovers a BPMN using the Inductive Miner algorithm


Discovers a transition system as described in the process mining book "Process Mining: Data Science in Action"

pm4py.discovery.discover_prefix_tree(log[, ...])

Discovers a prefix tree from the provided log object.


Discovers a temporal profile from a log object.

pm4py.discovery.discover_log_skeleton(log[, ...])

Discovers a log skeleton from an event log.

pm4py.discovery.discover_batches(log[, ...])

Discover batches from the provided log object


The pm4py.conformance module contains the conformance checking algorithms implemented in pm4py


Apply token-based replay for conformance checking analysis.


Apply the alignments algorithm between a log and a process model.


Calculates the fitness using token-based replay.

pm4py.conformance.fitness_alignments(log, ...)

Calculates the fitness using alignments


Calculates the precision using token-based replay

pm4py.conformance.precision_alignments(log, ...)

Calculates the precision of the model w.r.t.


Performs conformance checking on the provided log with the provided temporal profile.


Performs conformance checking using the log skeleton


The pm4py.vis module contains the visualizations offered in pm4py

pm4py.vis.view_petri_net(petri_net[, ...])

Views a (composite) Petri net

pm4py.vis.save_vis_petri_net(petri_net, ...)

Saves a Petri net visualization to a file

pm4py.vis.view_performance_dfg(dfg, ...[, ...])

Views a performance DFG

pm4py.vis.save_vis_performance_dfg(dfg, ...)

Saves the visualization of a performance DFG

pm4py.vis.view_dfg(dfg, start_activities, ...)

Views a (composite) DFG

pm4py.vis.save_vis_dfg(dfg, ...[, bgcolor])

Saves a DFG visualization to a file

pm4py.vis.view_process_tree(tree[, format, ...])

Views a process tree

pm4py.vis.save_vis_process_tree(tree, file_path)

Saves the visualization of a process tree

pm4py.vis.view_bpmn(bpmn_graph[, format, ...])

Views a BPMN graph

pm4py.vis.save_vis_bpmn(bpmn_graph, file_path)

Saves the visualization of a BPMN graph

pm4py.vis.view_heuristics_net(heu_net[, ...])

Views a heuristics net

pm4py.vis.save_vis_heuristics_net(heu_net, ...)

Saves the visualization of a heuristics net

pm4py.vis.view_dotted_chart(log[, format, ...])

Displays the dotted chart

pm4py.vis.save_vis_dotted_chart(log, file_path)

Saves the visualization of the dotted chart


Represents a SNA metric (.html)

pm4py.vis.save_vis_sna(sna_metric, file_path)

Saves the visualization of a SNA metric in a .html file

pm4py.vis.view_case_duration_graph(log[, ...])

Visualizes the case duration graph

pm4py.vis.save_vis_case_duration_graph(log, ...)

Saves the case duration graph in the specified path

pm4py.vis.view_events_per_time_graph(log[, ...])

Visualizes the events per time graph


Saves the events per time graph in the specified path

pm4py.vis.view_performance_spectrum(log, ...)

Displays the performance spectrum

pm4py.vis.save_vis_performance_spectrum(log, ...)

Saves the visualization of the performance spectrum to a file


Shows the distribution of the events in the specified dimension


Saves the distribution of the events in a picture file

pm4py.vis.view_ocdfg(ocdfg[, annotation, ...])

Views an OC-DFG (object-centric directly-follows graph) with the provided configuration.

pm4py.vis.save_vis_ocdfg(ocdfg, file_path[, ...])

Saves the visualization of an OC-DFG (object-centric directly-follows graph) with the provided configuration.

pm4py.vis.view_ocpn(ocpn[, format, bgcolor])

Visualizes on the screen the object-centric Petri net

pm4py.vis.save_vis_ocpn(ocpn, file_path[, ...])

Saves the visualization of the object-centric Petri net into a file

pm4py.vis.view_object_graph(ocel, graph[, ...])

Visualizes an object graph on the screen

pm4py.vis.save_vis_object_graph(ocel, graph, ...)

Saves the visualization of an object graph


Visualizes the network analysis

pm4py.vis.save_vis_network_analysis(...[, ...])

Saves the visualization of the network analysis

pm4py.vis.view_transition_system(...[, ...])

Views a transition system

pm4py.vis.save_vis_transition_system(...[, ...])

Persists the visualization of a transition system

pm4py.vis.view_prefix_tree(trie[, format, ...])

Views a prefix tree

pm4py.vis.save_vis_prefix_tree(trie, file_path)

Persists the visualization of a prefix tree

pm4py.vis.view_alignments(log, aligned_traces)

Views the alignment table as a figure

pm4py.vis.save_vis_alignments(log, ...)

Saves an alignment table's figure to disk

pm4py.vis.view_footprints(footprints[, format])

Views the footprints as a figure

pm4py.vis.save_vis_footprints(footprints, ...)

Saves the footprints' visualization on disk


The pm4py.stats module contains the statistics offered in pm4py

pm4py.stats.get_start_activities(log[, ...])

Returns the start activities from a log object

pm4py.stats.get_end_activities(log[, ...])

Returns the end activities of a log


Returns the attributes at the event level of the log


Gets the attributes at the trace level of a log object

pm4py.stats.get_event_attribute_values(log, ...)

Returns the values for a specified (event) attribute

pm4py.stats.get_trace_attribute_values(log, ...)

Returns the values for a specified trace attribute

pm4py.stats.get_variants(log[, ...])

Gets the variants from the log

pm4py.stats.get_variants_as_tuples(log[, ...])

Gets the variants from the log (where the keys are tuples and not strings)
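A variant is the sequence of activities of a case, so counting variants amounts to grouping cases by that sequence. The following plain-Python sketch uses tuples as keys; the `cases` dictionary is hypothetical sample data and the code does not call pm4py.

```python
from collections import Counter

# Hypothetical log: case identifier -> ordered list of activities.
cases = {
    "c1": ["A", "B", "C"],
    "c2": ["A", "C", "B"],
    "c3": ["A", "B", "C"],
}

# Tuples are hashable, so they can serve directly as dictionary keys.
variants = Counter(tuple(trace) for trace in cases.values())
```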


This algorithm computes the minimum self-distance for each activity observed in an event log.


This function derives the minimum self distance witnesses.

pm4py.stats.get_case_arrival_average(log[, ...])

Gets the average difference between the start times of two consecutive cases
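The statistic can be sketched in plain Python as the mean gap between consecutive case start times once they are sorted. The function name, the result in seconds and the value returned for fewer than two cases are assumptions of this sketch.

```python
from datetime import datetime


def case_arrival_average_sketch(start_times):
    """Average gap, in seconds, between consecutive case start times."""
    ordered = sorted(start_times)
    gaps = [
        (later - earlier).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    # With fewer than two cases there is no gap; this sketch returns 0.0.
    return sum(gaps) / len(gaps) if gaps else 0.0


case_start_times = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 8, 0),
    datetime(2024, 1, 1, 12, 0),
]
average_gap = case_arrival_average_sketch(case_start_times)
```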


Find out for which activities of the log the rework (more than one occurrence in the trace for the activity) occurs.

pm4py.stats.get_cycle_time(log[, ...])

Calculates the cycle time of the event log.

pm4py.stats.get_all_case_durations(log[, ...])

Gets the durations of the cases in the event log

pm4py.stats.get_case_duration(log, case_id)

Gets the duration of a specific case


Given an event log, returns a dictionary which summarizes the positions of the activities in the different cases of the event log.

pm4py.stats.get_stochastic_language(*args, ...)

Gets the stochastic language from the provided object


The pm4py.filtering module contains the filtering features offered in pm4py


Filters the event log keeping only the events having an attribute value which occurs:

  • in at least the specified (min_relative_stake) percentage of events, when level="events"

  • in at least the specified (min_relative_stake) percentage of cases, when level="cases"

pm4py.filtering.filter_start_activities(log, ...)

Filter cases having a start activity in the provided list

pm4py.filtering.filter_end_activities(log, ...)

Filter cases having an end activity in the provided list


Filter a log object on the values of some event attribute


Filter a log on the values of a trace attribute

pm4py.filtering.filter_variants(log, variants)

Filter a log on a specified set of variants


Retain traces that contain any of the specified 'directly follows' relations.


Retain traces that contain any of the specified 'eventually follows' relations.

pm4py.filtering.filter_time_range(log, dt1, dt2)

Filter a log on a time interval

pm4py.filtering.filter_between(log, act1, act2)

Finds all the sub-cases leading from an event with activity "act1" to an event with activity "act2" in the log, and returns a log containing only them.

pm4py.filtering.filter_case_size(log, ...[, ...])

Filters the event log, keeping the cases having a length (number of events) included between min_size and max_size

pm4py.filtering.filter_case_performance(log, ...)

Filters the event log, keeping the cases having a duration (the timestamp of the last event minus the timestamp of the first event) included between min_performance and max_performance
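The duration criterion described above can be sketched in plain Python. The function name, the use of seconds as unit and the inclusive bounds are assumptions of this sketch; it operates on a hypothetical mapping from case identifiers to event timestamps.

```python
from datetime import datetime


def filter_case_performance_sketch(cases, min_performance, max_performance):
    """Keep cases whose duration in seconds lies in [min_performance, max_performance]."""
    kept = {}
    for case_id, timestamps in cases.items():
        ordered = sorted(timestamps)
        # Duration: timestamp of the last event minus timestamp of the first event.
        duration = (ordered[-1] - ordered[0]).total_seconds()
        if min_performance <= duration <= max_performance:
            kept[case_id] = timestamps
    return kept


cases = {
    "c1": [datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 30)],  # 1800 s
    "c2": [datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 13, 0)],  # 14400 s
    "c3": [datetime(2024, 1, 1, 9, 0)],                               # 0 s
}
kept = filter_case_performance_sketch(cases, 600, 3600)
```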


Filters the event log, keeping the cases where the specified activity occurs at least min_occurrences times.


Filters the event log, either:

  • (keep=True) keeping the cases having the specified path (tuple of 2 activities) with a duration included between min_performance and max_performance

  • (keep=False) discarding the cases having the specified path with a duration included between min_performance and max_performance

pm4py.filtering.filter_variants_top_k(log, k)

Keeps the top-k variants of the log


Filters the variants of the log by a coverage percentage (e.g., if min_coverage_percentage=0.4, and we have a log with 1000 cases, of which 500 of the variant 1, 400 of the variant 2, and 100 of the variant 3, the filter keeps only the traces of variant 1 and variant 2).
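The example in the description can be replayed with a plain-Python sketch. The function name and the log representation are assumptions; the threshold is treated as inclusive, which is what the example implies since variant 2 covers exactly 40% of the cases.

```python
from collections import Counter


def filter_variants_by_coverage_sketch(cases, min_coverage_percentage):
    """Keep the cases whose variant covers at least the given share of all cases."""
    counts = Counter(tuple(trace) for trace in cases.values())
    total = len(cases)
    kept_variants = {
        variant for variant, count in counts.items()
        if count / total >= min_coverage_percentage
    }
    return {
        case_id: trace for case_id, trace in cases.items()
        if tuple(trace) in kept_variants
    }


# 1000 cases: 500 of variant 1, 400 of variant 2, 100 of variant 3.
cases = {}
cases.update({f"v1-{i}": ["A", "B"] for i in range(500)})
cases.update({f"v2-{i}": ["A", "C"] for i in range(400)})
cases.update({f"v3-{i}": ["A", "D"] for i in range(100)})

filtered = filter_variants_by_coverage_sketch(cases, 0.4)
remaining_variants = {tuple(trace) for trace in filtered.values()}
```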

pm4py.filtering.filter_prefixes(log, activity)

Filters the log, keeping the prefixes to a given activity.

pm4py.filtering.filter_suffixes(log, activity)

Filters the log, keeping the suffixes from a given activity.


Filters the object-centric event log on the provided event attributes values


Filters the object-centric event log on the provided object attributes values


Filters an object-centric event log keeping only the specified object types with the specified activity set (filters out the rest).


Filters the events of the object-centric logs which are related to at least the specified amount of objects per type.


Filters the events in which a new object for the given object type is spawned.


Filters the events in which an object for the given object type terminates its lifecycle.


Filters the object-centric event log keeping events in the provided timestamp range


Filters the cases of the log which violate the four eyes principle on the provided activities.


Filters the cases where an activity is repeated by different resources.


Filters the object types of an object-centric event log.

pm4py.filtering.filter_ocel_events(ocel, ...)

Filters the event identifiers of an object-centric event log.

pm4py.filtering.filter_ocel_objects(ocel, ...)

Filters the object identifiers of an object-centric event log.

pm4py.filtering.filter_ocel_cc_object(ocel, ...)

Returns the connected component of the object-centric event log to which the object with the provided identifier belongs.


The pm4py.ml module contains the machine learning features offered in pm4py

pm4py.ml.split_train_test(log[, ...])

Splits an event log into a training log and a test log (for machine learning purposes).

pm4py.ml.get_prefixes_from_log(log, length)

Gets the prefixes of a log of a given length.

pm4py.ml.extract_features_dataframe(log[, ...])

Extracts a dataframe containing the features of each case of the provided log object


Extracts a dataframe containing the temporal features of the provided log object


The pm4py.sim module contains the simulation algorithms offered in pm4py

pm4py.sim.play_out(*args, **kwargs)

Performs the playout of the provided model, i.e., gets a set of traces from the model.


Generates a process tree


The pm4py.ocel module contains the object-centric process mining features offered in pm4py


Gets the list of object types contained in the object-centric event log (e.g., ["order", "item", "delivery"]).


Gets the list of attributes at the event and the object level of an object-centric event log (e.g.

pm4py.ocel.ocel_flattening(ocel, object_type)

Flattens the object-centric event log to a traditional event log with the choice of an object type.


Gets the set of activities performed for each object type


Counts for each event the number of related objects per type

pm4py.ocel.discover_ocdfg(ocel[, ...])

Discovers an OC-DFG from an object-centric event log.


Discovers an object-centric Petri net from the provided object-centric event log.


Returns the "temporal summary" from an object-centric event log. The temporal summary aggregates all the events performed in the same timestamp, and reports the set of activities and the involved objects.


Gets the objects summary of an object-centric event log


Gets the objects interactions summary of an object-centric event log.

pm4py.ocel.sample_ocel_objects(ocel, num_objects)

Given an object-centric event log, returns a sampled event log with a subset of the objects that is chosen in a random way.


Given an object-centric event log, returns a sampled event log with a subset of the executions.


Drop relations between events and objects happening at the same time, with the same activity, to the same object identifier.

pm4py.ocel.ocel_merge_duplicates(ocel[, ...])

Merge events in the OCEL that happen with the same activity at the same timestamp


The pm4py.org module contains the organizational analysis techniques offered in pm4py


Calculates the handover of work network of the event log.
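A handover of work occurs when one resource's event in a case is directly followed by an event of another resource. The plain-Python sketch below counts such pairs; the function name and the sample data are hypothetical, and keeping self-handovers (the same resource twice in a row) is a choice of this sketch rather than a statement about pm4py.

```python
from collections import Counter


def handover_of_work_sketch(cases):
    """Count how often one resource is directly followed by another within a case."""
    handovers = Counter()
    for resources in cases.values():
        for giver, receiver in zip(resources, resources[1:]):
            handovers[(giver, receiver)] += 1
    return handovers


# Hypothetical log: case identifier -> resources of its events, in order.
cases = {
    "c1": ["alice", "bob", "bob", "carol"],
    "c2": ["alice", "bob"],
}
handovers = handover_of_work_sketch(cases)
```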


Calculates the working together network of the process.


Calculates similarity between the resources in the event log, based on their activity profiles.


Calculates the subcontracting network of the process.


Mines the organizational roles

pm4py.org.discover_network_analysis(log, ...)

Performs a network analysis of the log based on the provided parameters.


pm4py.analysis.solve_marking_equation(...[, ...])

Solves the marking equation of a Petri net.

pm4py.analysis.check_soundness(petri_net, ...)

Check if a given Petri net is a sound WF-net.

A Petri net is a WF-net iff:

  • it has a unique source place

  • it has a unique end place

  • every element in the WF-net is on a path from the source to the sink place

A WF-net is sound iff:

  • it contains no live-locks

  • it contains no deadlocks

  • we are able to always reach the final marking

For a formal definition of sound WF-net, consider: http://www.padsweb.rwth-aachen.de/wvdaalst/publications/p628.pdf.
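The three structural WF-net conditions can be checked with a small graph search. The following plain-Python sketch covers only those conditions and does not test soundness; the function names and the representation of a net as lists of places, transitions and arcs are assumptions, not pm4py's data structures.

```python
def reachable(start, edges):
    """Return all nodes reachable from start along the given adjacency map."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for successor in edges.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def is_workflow_net_sketch(places, transitions, arcs):
    """Check the structural WF-net conditions (not soundness)."""
    nodes = set(places) | set(transitions)
    forward, backward = {}, {}
    for source, target in arcs:
        forward.setdefault(source, []).append(target)
        backward.setdefault(target, []).append(source)
    sources = [p for p in places if p not in backward]  # places without input arcs
    sinks = [p for p in places if p not in forward]     # places without output arcs
    if len(sources) != 1 or len(sinks) != 1:
        return False
    # Every node must be reachable from the source and must reach the sink.
    return (reachable(sources[0], forward) == nodes
            and reachable(sinks[0], backward) == nodes)


arcs = [("source", "a"), ("a", "p1"), ("p1", "b"), ("b", "sink")]
valid = is_workflow_net_sketch(["source", "p1", "sink"], ["a", "b"], arcs)

# An extra place "p2" that nothing consumes is a second sink.
invalid = is_workflow_net_sketch(
    ["source", "p1", "p2", "sink"], ["a", "b"], arcs + [("a", "p2")]
)
```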


Inserts the artificial start/end activities in an event log / Pandas dataframe


Checks if the input Petri net satisfies the WF-net conditions: 1.

pm4py.analysis.maximal_decomposition(net, im, fm)

Calculate the maximal decomposition of an accepting Petri net.

pm4py.analysis.generate_marking(net, ...)

Generate a marking for a given Petri net

pm4py.analysis.compute_emd(language1, language2)

Computes the earth mover distance between two stochastic languages (for example, the first extracted from the log, and the second extracted from the process model).


Reduce the number of invisible transitions in the provided Petri net.


pm4py.utils.rebase(log_obj[, case_id, ...])

Re-base the log object, changing the case ID, activity and timestamp attributes.


Parse a process tree from a string


Serialize a PM4Py object into a bytes string


Deserialize a bytes string to a PM4Py object

pm4py.utils.parse_event_log_string(traces[, ...])

Parse a collection of traces expressed as strings (e.g., ["A,B,C,D", "A,C,B,D", "A,D"]) to a log object (Pandas dataframe)
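The idea of turning trace strings into an event table can be sketched in plain Python. The function name, the numeric case identifiers and the synthetic timestamps (one second apart) are assumptions of this sketch; the column names follow the XES attribute names, and the output is a list of dictionaries rather than a Pandas dataframe.

```python
from datetime import datetime, timedelta


def parse_traces_sketch(traces, sep=","):
    """Turn strings such as "A,B,C" into one row per event."""
    rows = []
    clock = datetime(1970, 1, 1)  # synthetic timestamps, assumed for the sketch
    for case_index, trace in enumerate(traces):
        for activity in trace.split(sep):
            rows.append({
                "case:concept:name": str(case_index),
                "concept:name": activity,
                "time:timestamp": clock,
            })
            clock += timedelta(seconds=1)
    return rows


rows = parse_traces_sketch(["A,B,C,D", "A,C,B,D", "A,D"])
```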


Project the event log on a specified event attribute.

pm4py.utils.sample_cases(log, num_cases[, ...])

(Random) Sample a given number of cases from the event log.

pm4py.utils.sample_events(log, num_events)

(Random) Sample a given number of events from the event log.