apread
previously: Catman Reader
Support this project
Cite this project
If you use this software in any of your work, please cite it using the "Cite this repository" button in the right sidebar or use this:
@software{leonbohmann_APReader,
author = {Bohmann, Leon},
doi = {10.5281/zenodo.8369804},
month = sep,
title = {{leonbohmann/APReader: v1.1.2}},
url = {https://github.com/leonbohmann/APReader},
version = {v1.1.2},
year = {2023}
}
General
Read binary files produced from catmanAP projects directly into python.
CatmanAP procudes .bin files after each measurement. While it is possible to export as a different format (i.e. txt or asc) it's not efficient because one has to change the export format after every measurement. Here comes the treat: Just export as binary and use this package to work with binary files directly.
After reading all channels from the binary file, the channels are analyzed and every measure-channel will receive a reference to a time channel, depending on the amount of entries in the channels and the fact, that the time-channel has to contain "time" or "zeit" in its name. What that means is, that a channel with x entries and the name "time - 1" will be regarded as the time-channel of any other channel with x Data Entries.
Here is an example plot, generated directly from a binary file:
Installation/Update
Anywhere with python (note the uppercase U):
pip install -U apread
How it works
The workflow of the package is straight-forward. You supply a binary file created with CatmanAP and the script will read that into python.
First of all, the binary data is analyzed and packaged into seperate Channel
objects. When all Channels
are created, each Channel.Name
will be checked against ([T|t]ime)|([Z|z]eit)
, which mark time channels which usually are the reference.
These Channels
marked as istime
are the basis for Groups
. Inside a group you will find the ChannelX
(time channel) and a bunch of other channels in ChannelsY
, which are the channels containing data in that time domain. The corresponding channels inside a Group
are found by analyzing their length. Since the total time measured is the same for all groups, it is assumed that Channels
with the same data-length belong to the same group. Connecting the matching channels to the group give a structured representation of your measurement data.
Now that the Data is available in python you are free to do with that whatever you want. Until Version 1.0.x
there were some features in which you can save the data but that feature has been removed.
Usage
Lets say you produced a file called measurements.bin
and you put it in the directory of your python script, then you can create the APReader
on that file. It's that simple. The Initialization may take some time depending on how large your .bin-File is.
from apread import APReader
reader = APReader('measurements.bin')
Print channels
Afterwards you can access the Channels
by accessing the APReader.Channels
Member. Channel
and Group
implement __str__
which will return the name and the length of data inside it.
for channel in reader.Channels:
print(channel)
for group in reader.Groups:
print(group)
Plot Channels/Groups
To review your data on the fly, you can plot every entity in the data structure by calling .plot()
. When plotting, every group will get its own figure window, in which all connected channels are plotted.
reader.plot()
for group in reader.Groups:
group.plot()
for channel in reader.Channels:
channel.plot()
As you can see, you can access the channels from the reader, which contains all channels (including time channels) or you can access them from the groups.
There are some more functions to plot specific data. When plotting multiple channels each channel gets its own y-axis.
group.plotChannel(0)
group.plotChannels(0,3)
group.plot([0, 2, 4])
The same can be applied to the APReader
. The only difference is that you can plot specific groups instead of channels.
reader.plotGroup(0)
reader.plotGroups(0,3)
reader.plot([0, 2, 4])
Thanks to (hakonbars PR13) you are now able to access external header information using channel.exthdr
, a dicitionary containing all keys as described in this sheet.
['T0']
['dt']
['SensorType']
['SupplyVoltage']
['FiltChar']
['FiltFreq']
['TareVal']
['ZeroVal']
['MeasRange']
['InChar']
['SerNo']
['PhysUnit']
['NativeUnit']
['Slot']
['SubSlot']
['AmpType']
['APType']
['kFactor']
['bFactor']
['MeasSig']
['AmpInput']
['HPFilt']
['OLImportInfo']
['ScaleType']
['SoftwareTareVal']
['WriteProtected']
['NominalRange']
['CLCFactor']
['ExportFormat']
Parallel reading of data
Only available from version v1.1.1-alpha1
and above
See test/testing.py
for a full example. Modify the following around your APReader
call:
import multiprocessing as mp
...
if __name__ == '__main__':
pool = mp.Pool()
reader = APReader(file, parallelPool=pool)
mp.close()
mp.join()
For the parallel loading to work, you have to define a parallel pool of processes in your top-level script. These processes will be accessed from within APReader
-Functions. When passing no arguments to mp.Pool()
it will automatically create as many processes as possible, according to the amount of threads your CPU allows (cores + virtual cores). It does not make sense to pass in more, since the APReader
spawns the same amount of processes as there are CPU Threads. Increasing the amount of processes in your pool does not increase the amount of parallelism. It is fixed.
Keep in mind, that parallelisation is not always faster. Spawning of processes is expensive and can be wasteful for small files.
The results from APReader
stay the same and you can continue your analysis.
Release History
Version 1.1.1-alpha1
- Added converted timestamp property on channels (
Channel.date
)
- Property
Channel.time
will be deleted at some point in the future...
- Parallel reading of binary files
- Max degree of parallelism is automatically set to amount of available cores
-
-
Version 1.1
Breaking changes
- Removed saving functions, this will be up to the user
Since these function change a lot based on current needs, I decided to remove the post-processing functionality completely. The user now needs to do the post-processing on his own, meaning the creation of plots using time and data channels...
Changes
- (hakonbar PR13) Differentiate floating point precision
- (hakonbar PR13) Reading additional header information
- (hakonbar PR13) Supplying binary format reference
- Fixed null returning string conversion function
- Using regex to find time channels
- Improved plotting with multiple axes
- Printing channels and groups will now give a summary instead of all data
Version 1.0.22
- Fixed an issue with groups where time channels are not recognized
- now, user is prompted, when suspected time channel is found
- plotting is not possible when there is no time-channel found
- save groups and channels even when there is no time channel
Version 1.0.21
- Updated serialisation-procedures to always encode in
UTF-8
Version 1.0.20
- Switched to explicit type hinting with
typing
package (compatibility issues with python <3.9.x)
Version 1.0.15/16
- Fixed an issue with saving and non-existent directories
- Added
getas
to generate formatted string without saving
Version 1.0.14
- Output file-names updated
Version 1.0.12/13
- Group channels with their time-channel into "groups"
- Multiple plot modes:
- Whole file
- Channel/Group only
- Output data
Version 1.0.11
- Progressbars indicate read-progress of files
- Multiple plot modes
Version 1.0.0
- Convert catman files to channels
Meta
Leon Bohmann – mail@leonbohmann.de
Distributed under the MIT license. See LICENSE
for more information.
This software comes with no warranty, expressed or implied. Use at your own risk!
https://github.com/leonbohmann/apreader