Parsing Procmon files with Python

Procmon (https://docs.microsoft.com/en-us/sysinternals/downloads/procmon) is a very powerful monitoring tool for Windows,
capable of capturing file system, registry, process/thread and network activity.
Procmon uses internal file formats for configuration (PMC) and logs (PML).
Prior to procmon-parser, PMC files could only be parsed and generated by the Procmon GUI, and PML files
could be read only using the Procmon GUI, or by converting them to CSV or XML using the Procmon command line.
The goals of procmon-parser are:
- Parsing & Building PMC files - making it possible to dynamically add/remove filter rules, which can significantly
reduce the size of the log file over time as Procmon captures millions of events.
- Parsing PML files - making it possible to directly load the raw PML file into convenient Python objects
instead of having to convert the file to CSV/XML formats prior to loading.
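For comparison, the conversion route that procmon-parser aims to replace can be sketched as follows. This is a minimal sketch, assuming Procmon's `/OpenLog` and `/SaveAs` command-line switches; the helper name `build_convert_command` and the file names are hypothetical, and the command is only built here, not executed.

```python
from pathlib import PureWindowsPath


def build_convert_command(procmon_exe, pml_path, output_path):
    """Build the argument list for converting a PML log with the Procmon CLI.

    Hypothetical helper: it only assembles the command, it does not run Procmon.
    """
    out = PureWindowsPath(output_path)
    # Only the two text formats mentioned above are accepted in this sketch.
    if out.suffix.lower() not in (".csv", ".xml"):
        raise ValueError("output file must end in .csv or .xml")
    return [str(procmon_exe), "/OpenLog", str(pml_path), "/SaveAs", str(out)]


command = build_convert_command("Procmon.exe", "LogFile.PML", "LogFile.csv")
print(command)

# On a Windows machine with Procmon available, the command could then be run
# with subprocess.run(command, check=True) before loading the CSV.
```

Each analysis run then needs this extra conversion step and a second, larger text file, which is the overhead that loading the PML directly avoids.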
PMC (Process Monitor Configuration) Parser
Usage
Loading a pre-exported Procmon configuration:
>>> from procmon_parser import load_configuration, dump_configuration, Rule
>>> with open("ProcmonConfiguration.pmc", "rb") as f:
...     config = load_configuration(f)
>>> config["DestructiveFilter"]
0
>>> config["FilterRules"]
[Rule(Column.PROCESS_NAME, RuleRelation.IS, "System", RuleAction.EXCLUDE), Rule(Column.PROCESS_NAME, RuleRelation.IS, "Procmon64.exe", RuleAction.EXCLUDE), Rule(Column.PROCESS_NAME, RuleRelation.IS, "Procmon.exe", RuleAction.EXCLUDE), Rule(Column.PROCESS_NAME, RuleRelation.IS, "Procexp64.exe", RuleAction.EXCLUDE), Rule(Column.PROCESS_NAME, RuleRelation.IS, "Procexp.exe", RuleAction.EXCLUDE), Rule(Column.PROCESS_NAME, RuleRelation.IS, "Autoruns.exe", RuleAction.EXCLUDE), Rule(Column.OPERATION, RuleRelation.BEGINS_WITH, "IRP_MJ_", RuleAction.EXCLUDE), Rule(Column.OPERATION, RuleRelation.BEGINS_WITH, "FASTIO_", RuleAction.EXCLUDE), Rule(Column.RESULT, RuleRelation.BEGINS_WITH, "FAST IO", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "pagefile.sys", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Volume", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$UpCase", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Secure", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Root", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$MftMirr", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Mft", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$LogFile", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.CONTAINS, "$Extend", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Boot", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$Bitmap", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$BadClus", RuleAction.EXCLUDE), Rule(Column.PATH, RuleRelation.ENDS_WITH, "$AttrDef", RuleAction.EXCLUDE), Rule(Column.EVENT_CLASS, RuleRelation.IS, "Profiling", RuleAction.EXCLUDE)]
Adding some new rules:
>>> new_rules = [Rule('PID', 'is', '1336', 'include'), Rule('Process_Name', 'contains', 'python')]
>>> config["FilterRules"] = new_rules + config["FilterRules"]
Dropping filtered events:
>>> config["DestructiveFilter"] = 1
Dumping the new configuration to a file:
>>> with open("ProcmonConfiguration1337.pmc", "wb") as f:
...     dump_configuration(config, f)
File Format
For the raw binary format of PMC files you can refer to the docs, or take a look at the source code in configuration_format.py.
PML (Process Monitor Log) Parser
Usage
procmon-parser exports a ProcmonLogsReader class for reading logs directly from a PML file:
>>> from procmon_parser import ProcmonLogsReader
>>> f = open("LogFile.PML", "rb")
>>> pml_reader = ProcmonLogsReader(f)
>>> len(pml_reader)
53214
>>> first_event = next(pml_reader)
>>> print(first_event)
Process Name=dwm.exe, Pid=932, Operation=RegQueryValue, Path="HKCU\Software\Microsoft\Windows\DWM\ColorPrevalence", Time=7/12/2020 1:18:10.7752429 AM
>>> print(first_event.process)
"C:\Windows\system32\dwm.exe", 932
>>> for module in first_event.process.modules[:3]:
...     print(module)
"C:\Windows\system32\dwm.exe", address=0x7ff6fa980000, size=0x18000
"C:\Windows\system32\d3d10warp.dll", address=0x7fff96700000, size=0x76c000
"C:\Windows\system32\wuceffects.dll", address=0x7fff9a920000, size=0x3f000
>>> first_event.stacktrace
[18446735291098361031, 18446735291098336505, 18446735291095097155, 140736399934388, 140736346856333, 140736346854333, 140698742953668, 140736303659045, 140736303655429, 140736303639145, 140736303628747, 140736303625739, 140736303693867, 140736303347333, 140736303383760, 140736303385017, 140736398440420, 140736399723393]
>>>
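The stack trace is a list of raw return addresses, so a natural next step is to map each address to the module that contains it. The following is a minimal sketch of that lookup using the addresses and module ranges printed above. `LoadedModule` and `resolve_address` are hypothetical names introduced for illustration, and the attribute names are assumptions, not necessarily those of the objects procmon-parser returns.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class LoadedModule(NamedTuple):
    """Hypothetical stand-in for one entry of first_event.process.modules."""
    path: str
    address: int
    size: int


def resolve_address(
    address: int, modules: Sequence[LoadedModule]
) -> Optional[Tuple[str, int]]:
    """Return (module path, offset into module) for address, or None."""
    for module in modules:
        if module.address <= address < module.address + module.size:
            return module.path, address - module.address
    return None


# Module ranges copied from the session above.
modules = [
    LoadedModule(r"C:\Windows\system32\dwm.exe", 0x7FF6FA980000, 0x18000),
    LoadedModule(r"C:\Windows\system32\d3d10warp.dll", 0x7FFF96700000, 0x76C000),
    LoadedModule(r"C:\Windows\system32\wuceffects.dll", 0x7FFF9A920000, 0x3F000),
]
# Three of the addresses from first_event.stacktrace above.
stacktrace = [18446735291098361031, 140736399934388, 140698742953668]

for frame in stacktrace:
    hit = resolve_address(frame, modules)
    if hit is None:
        # Kernel addresses and modules outside this short list end up here.
        print(f"{frame:#x}  <no matching module in this list>")
    else:
        path, offset = hit
        print(f"{frame:#x}  {path}+{offset:#x}")
```

With only the first three modules listed, most frames stay unresolved; passing the full module list of the process would cover the user-mode frames.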
File Format
For the raw binary format of PML files you can refer to the docs, or take a look at the source code in stream_logs_format.py.
Currently the parser is only tested with PML files saved by Procmon.exe v3.4.0 or higher.
TODO
The PML format is very complex, so there are some features (unchecked in the list) that are not supported yet:
There are a lot of operation types, so I didn't manage to get to all of them yet :(
If there is an unsupported operation whose details you think are interesting, please let me know :)
Tests
To test that the parsing is done correctly, there are two fairly large Procmon PML files and their respective CSV format
log files, taken from a 64-bit and a 32-bit machine. The test checks that each event in the PML parsed by procmon-parser
equals the respective event in the CSV.
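The comparison idea can be sketched roughly as below. This is not the project's actual test code: `find_mismatches` is a hypothetical helper, the parsed events are plain dictionaries standing in for parsed PML events, and the CSV excerpt is made-up sample data limited to four columns.

```python
import csv
import io

# Columns compared in this sketch; a real Procmon CSV export has more.
COLUMNS = ("Process Name", "PID", "Operation", "Path")


def find_mismatches(parsed_events, csv_file):
    """Yield (row_number, column, parsed_value, csv_value) for each differing field.

    Note: zip() stops at the shorter input, so a real test would also
    compare the number of events on both sides.
    """
    reader = csv.DictReader(csv_file)
    for row_number, (parsed, row) in enumerate(zip(parsed_events, reader), start=1):
        for column in COLUMNS:
            if str(parsed[column]) != row[column]:
                yield row_number, column, str(parsed[column]), row[column]


# Made-up CSV excerpt; the second row deliberately disagrees on Operation.
csv_text = (
    '"Process Name","PID","Operation","Path"\n'
    '"dwm.exe","932","RegQueryValue","HKCU\\Software\\Microsoft\\Windows\\DWM\\ColorPrevalence"\n'
    '"python.exe","1336","CreateFile","C:\\temp\\example.txt"\n'
)
# Hypothetical stand-ins for events parsed from the matching PML file.
parsed_events = [
    {"Process Name": "dwm.exe", "PID": 932, "Operation": "RegQueryValue",
     "Path": r"HKCU\Software\Microsoft\Windows\DWM\ColorPrevalence"},
    {"Process Name": "python.exe", "PID": 1336, "Operation": "ReadFile",
     "Path": r"C:\temp\example.txt"},
]

mismatches = list(find_mismatches(parsed_events, io.StringIO(csv_text)))
print(mismatches)
```

Comparing field by field, rather than whole rows, makes a failure point at the exact column that was parsed differently.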
Contributing
procmon-parser is developed on GitHub at eronnen/procmon-parser.
Feel free to report an issue on the issue tracker or send a pull request.