Introduction
What is this repository for?
- This repository includes Python scripts that transform raw sensor data collected from Android mobile devices into intermediate CSV files for various analysis purposes.
- The preprocessing scripts are intended to provide reliable, consistent, and structured intermediate data files for data analysis related to the MicroT project.
Summary of Preprocessed Intermediate Files
The discussion on which data sources to include can be found in Issue #86 of the DataProcessing repository.
1. Android Smartwatch
| Data type | Raw data location | Columns | Notes | Intermediate file location |
| --- | --- | --- | --- | --- |
| Prompt response | [participant_folder] / logs-watch / DATE / HOUR / PromptResponses.log.csv | ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response | None | [save_path]/[participant_id]/DATE/watch_prompt_response_PARTICIPANT_DATE.csv |
| Battery level | [participant_folder] / data-watch / DATE / HOUR / Battery.##.event.csv | timestamp, battery_level, battery_charging | None | [save_path]/[participant_id]/DATE/watch_battery_PARTICIPANT_DATE.csv |
| Accelerometer data | [participant_folder] / data-watch / DATE / HOUR / AndroidWearWatch-AccelerationCalibrated-NA.*.sensor.baf | header_timestamp, acceleration_meters_per_second_squared (X, Y, Z axes), MIMS-unit | None | [participant_folder] / data-watch / DATE / HOUR / 020000000000-AccelerationCalibrated.*.sensor.csv, mims_DATE_HOUR.csv; a copy of the former two in [save_path]/[participant_id]/DATE/ |
| App usage | [participant_folder] / data-watch / DATE / HOUR / AppEventCounts.csv | log_time, last_hour_timestamp, current_hour_time_stamp, app_package_name, event_time_stamp, app_event | None | [save_path]/[participant_id]/DATE/phone_app_usage_PARTICIPANT_DATE.csv |
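The intermediate files can be spot-checked with pandas. Below is a minimal sketch that loads one day's watch battery file; the path pattern and column names come from the table above, while the save path, participant ID, exact filename substitution, and timestamp format are hypothetical assumptions to adjust for your data.

import pandas as pd

save_path = "/path/to/intermediate"   # hypothetical [save_path]
participant_id = "participant_01"     # hypothetical participant ID
date = "2020-06-11"

# Assumed filename substitution for the watch_battery_PARTICIPANT_DATE.csv pattern
battery_csv = f"{save_path}/{participant_id}/{date}/watch_battery_{participant_id}_{date}.csv"
df = pd.read_csv(battery_csv)

# Parse timestamps (format assumed to be pandas-parseable) and summarize the day
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
print(df[["timestamp", "battery_level", "battery_charging"]].head())
print("Mean battery level:", df["battery_level"].mean())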
2. Android Smartphone
| Data type | Raw data location | Columns | Notes | Intermediate file location |
| --- | --- | --- | --- | --- |
| Prompt response | [participant_folder] / logs / DATE / HOUR / PromptResponses.log.csv | ID, EMA_type, Date, Prompt Timestamp, Time Zone, Completion Status, Reprompt, Response Timestamp, Q-Key, Response | None | [save_path]/[participant_id]/DATE/phone_prompt_response_PARTICIPANT_DATE.csv |
| GPS data | [participant_folder] / data / DATE / HOUR / GPS.csv | log_time, location_time, lat, long, horizontal_accuracy, provider, speed, altitude, bearing | None | [save_path]/[participant_id]/DATE/phone_GPS_PARTICIPANT_DATE.csv |
| Step count | [participant_folder] / data / DATE / HOUR / StepCounterService.csv | log_time_stamp, steps_last_hour, accumulated_steps | None | [save_path]/[participant_id]/DATE/phone_stepCount_PARTICIPANT_DATE.csv |
| Phone state and detected activities | [participant_folder] / data / DATE / HOUR / ActivityDetected.csv | log_time, in_vehicle, on_bike, on_foot, running, still, tilting, walking, unknown | None | [save_path]/[participant_id]/DATE/phone_detected_activity_PARTICIPANT_DATE.csv |
| Phone usage events and broadcasts | | | | [save_path]/[participant_id]/DATE/phone_usage_broadcasts_PARTICIPANT_DATE.csv |
| Environmental sensors | LightSensorStats.csv, ProximitySensorManagerService.csv, AmbientPressManagerService.csv, AmbientTempManagerService.csv, AmbientHumidManagerService.csv | log time, sensor value, sensor max | None | |
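As another illustration, the sketch below labels each row of a phone detected-activity file with its most likely activity. The column names come from the table above; the file path is hypothetical, and treating the activity columns as numeric per-class scores is an assumption to verify against the code book.

import pandas as pd

# Hypothetical path following the [save_path]/[participant_id]/DATE pattern above
activity_csv = "/path/to/intermediate/participant_01/2020-06-11/phone_detected_activity_participant_01_2020-06-11.csv"
df = pd.read_csv(activity_csv)

# Assumption: these columns hold numeric scores, so the column-wise maximum
# gives the dominant detected activity per row
activity_cols = ["in_vehicle", "on_bike", "on_foot", "running",
                 "still", "tilting", "walking", "unknown"]
df["dominant_activity"] = df[activity_cols].idxmax(axis=1)
print(df[["log_time", "dominant_activity"]].head())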
Code Book
A detailed explanation of the columns in the intermediate files can be found in the code book.
Python Version
Python 3.6+ (Other versions haven't been tested but should be fine)
Dependencies
- For users who want to include accelerometer data and MIMS-unit computation, the following extra setup is required:
- MIMS-unit computation depends on an R package, so install R on your system.
- Add R to your environment variables. For Windows users, add a path similar to "C:\Program Files\R\R-4.0.2\bin\x64" to the Path entry in your system variables, then reboot your computer (a quick way to verify the setup is sketched below).
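To confirm that R is reachable from your environment after the steps above, a small check like the following can be run. It is only a convenience sketch and not part of this repository.

import shutil

# Look for the R executables on PATH; both should be found after the setup above
for exe in ("R", "Rscript"):
    location = shutil.which(exe)
    if location:
        print(f"{exe} found at {location}")
    else:
        print(f"{exe} not found on PATH; check your environment variables")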
Usage Option 1: Install and use as a Python package
Install package
The PyPI page can be accessed here.
> pip install microt-preprocessing
Import package in Python

from microt_preprocessing import time_study_preprocessing_main

microT_root_path = "<microT_root_path>"  # path to the data source folder
intermediate_file_save_path = "<intermediate_file_save_path>"  # path to the destination folder
decrypt_password = "<decrypt_password>"  # decryption password for the GPS file
delete_raw = "0"  # "0" keeps the raw data source, "1" deletes the raw data source after preprocessing
date = "2020-06-11"  # date to be preprocessed (YYYY-MM-DD)

time_study_preprocessing_main.preprocessing_all_ema.run_ema_main(microT_root_path, intermediate_file_save_path, decrypt_password, delete_raw, date)
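If several days need to be preprocessed, the call can be repeated per date. The sketch below assumes run_ema_main handles one date string per call and uses hypothetical paths and a hypothetical password; confirm the exact date handling against the package documentation.

from datetime import date, timedelta
from microt_preprocessing import time_study_preprocessing_main

microT_root_path = "/path/to/microT_root"              # hypothetical data source folder
intermediate_file_save_path = "/path/to/intermediate"  # hypothetical destination folder
decrypt_password = "my_password"                       # hypothetical GPS decryption password
delete_raw = "0"                                       # keep the raw data source

# Process each day in an example date range, one call per date
start, end = date(2020, 6, 11), date(2020, 6, 14)
current = start
while current <= end:
    time_study_preprocessing_main.preprocessing_all_ema.run_ema_main(
        microT_root_path, intermediate_file_save_path, decrypt_password, delete_raw,
        current.strftime("%Y-%m-%d"))
    current += timedelta(days=1)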
Usage Option 2: Clone this project and run scripts
Run script
python preprocessing_all_ema.py <microT_root_path> <intermediate_file_save_path> [date_start]
python preprocessing_all_uema.py <microT_root_path> <intermediate_file_save_path> <participants_included_text_file_path> [date_start] [date_end]
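For scripted use, the same command can also be launched from Python. The sketch below uses hypothetical paths and simply mirrors the positional arguments of preprocessing_all_ema.py shown above.

import subprocess

# Hypothetical paths; argument order follows the command above
subprocess.run([
    "python", "preprocessing_all_ema.py",
    "/path/to/microT_root",      # <microT_root_path>
    "/path/to/intermediate",     # <intermediate_file_save_path>
    "2020-06-11",                # [date_start] (optional)
], check=True)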
Who do I talk to?
Maintained by Aditya and Jixin