---
title: "Audit logging data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Audit logging data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  warning = FALSE,
  echo = TRUE,
  comment = "#>"
)
```

The `KoboToolbox` audit logging feature records all activity related to a specific form submission in a log file. The log can include when the form was opened, when individual questions were answered, when the form was saved, and when it was finally submitted. It provides a detailed record of the timing and sequence of events associated with each form submission.

This feature is useful for several reasons:

- **Data Quality Control:** Audit logs help data managers verify that data collection is happening as planned. For example, if a survey is supposed to take 20 minutes on average and many submissions are completed in 2 minutes, that could be a sign of rushed or careless data entry.

- **Troubleshooting:** If issues with data collection arise, the audit logs can provide clues as to what is going wrong. For example, if a particular question is often skipped or answered incorrectly, that could suggest a problem with the question wording or placement.

- **Security and Accountability:** If data are altered or deleted, audit logs provide a trail of what happened and who was involved. This can be important for maintaining the integrity of the data and holding individuals accountable for their actions.

- **Workflow Management:** By reviewing the timestamps in the audit logs, managers can see how long different parts of the data collection process take and look for ways to increase efficiency.

## Audit logging data

The form below provides a toy example to showcase how audit logs can be read using `robotoolbox`.

```{r setup, echo = FALSE}
library(robotoolbox)
library(dplyr)
```

```{r asset_list, echo = FALSE}
l <- asset_list
```

- **Survey questions**

| type     | name     | label                  | parameters |
|:---------|:---------|:-----------------------|:-----------|
| start    | start    |                        |            |
| end      | end      |                        |            |
| username | username |                        |            |
| audit    | audit    |                        | identify-user=true location-priority=balanced location-min-interval=60 location-max-age=120 track-changes=true track-changes-reasons=on-form-edit |
| text     | Q1       | Q1. What is your name? |            |
| integer  | Q2       | Q2. How old are you?   |            |

The form has four metadata questions: `start`, `end`, `username` and `audit`. The `audit` metadata question must be enabled to use this feature. It also has two regular questions: `Q1` and `Q2`.

### Loading the project

The above form was uploaded to the server. It is the only project named `Audit multi params`, and can be selected from the list of assets `asset_list`.

```{r, eval = FALSE}
library(robotoolbox)
library(dplyr)
asset_list <- kobo_asset_list()
uid <- filter(asset_list, name == "Audit multi params") |>
  pull(uid)
asset <- kobo_asset(uid)
asset
```

```{r, echo = FALSE}
asset <- asset_audit
asset
```

### Extracting the audit data

To get the audit logging data, use the `kobo_audit` function.

```{r, eval = FALSE}
df <- kobo_audit(asset)
glimpse(df)
```

```{r, echo = FALSE}
df <- data_audit
glimpse(df)
```

The columns in the audit logging data include:

- `_id`: This column, generated by `robotoolbox`, lets you map each audit record to the `_id` of the corresponding submission in `kobo_data`.
- `event`: This column records the action that took place. The different event types include form start, form exit, question, group questions, end screen, and device or metadata audit.
- `node`: This column records the name of the question or group related to the event.
- `name`: This column is appended by `robotoolbox` so that question names in the audit data match those in the data from `kobo_data`.
- `start_int`: This column records the timestamp when the event started, as an integer.
- `end_int`: This column records the timestamp when the event ended, as an integer.
- `start`: This column records the timestamp when the event started, in date-time format (`POSIXct`).
- `end`: This column records the timestamp when the event ended, in date-time format (`POSIXct`).
- `latitude`: This column records the latitude of the device when the event occurred.
- `longitude`: This column records the longitude of the device when the event occurred.
- `accuracy`: This column records the GPS accuracy of the location data.
- `old-value`: This column records the previous value of the question before it was changed in this event.
- `new-value`: This column records the new value of the question after it was changed in this event.
- `user`: This column records the username of the data collector.
- `change-reason`: This column records the reason the data collector gave when saving changes to a form.

The structure of the output depends on the audit logging parameters set in your form. For instance, if you set the parameter `track-changes=true`, the columns `old-value` and `new-value` become available. The columns `latitude`, `longitude` and `accuracy` are associated with the parameter `location-priority`. The `user` column is available when you use the `identify-user=true` parameter. The parameter `track-changes-reasons=on-form-edit` prevents you from editing a filled-out form without giving a reason. These reasons are recorded in the column `change-reason`.

You can learn how to use audit logging in the documentation of `KoboToolbox` and `ODK`.

- ODK's guide on audit logging: https://docs.getodk.org/form-audit-log/
- KoboToolbox's guide on audit logging: https://support.kobotoolbox.org/audit_logging.html
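
The data quality use case mentioned in the introduction (spotting surveys completed implausibly fast) can be sketched with `dplyr`. The code below is a minimal illustration and not part of `robotoolbox`: `audit_toy` is a hypothetical, hand-made data frame that mimics a few columns of the `kobo_audit` output, and its values are invented. With real data you would start from the `df` returned by `kobo_audit(asset)`, assuming question-level events are recorded with `event == "question"`.

```r
library(dplyr)

# Hypothetical stand-in for the output of kobo_audit(); the values are
# invented and only the columns used below are included.
t0 <- as.POSIXct("2023-01-01 10:00:00", tz = "UTC")
audit_toy <- data.frame(
  `_id` = c(1L, 1L, 1L, 2L, 2L, 2L),
  event = c("form start", "question", "question",
            "form start", "question", "question"),
  name = c(NA, "Q1", "Q2", NA, "Q1", "Q2"),
  start = t0 + c(0, 5, 20, 100, 103, 110),
  end = t0 + c(NA, 20, 32, NA, 110, 115),
  check.names = FALSE
)

# Time spent on each question, in seconds, for each submission
time_per_question <- audit_toy |>
  filter(event == "question") |>
  mutate(duration = as.numeric(difftime(end, start, units = "secs"))) |>
  group_by(`_id`, name) |>
  summarise(duration = sum(duration), .groups = "drop")

time_per_question
```

Because the result keeps the `_id` column, it can be joined to the submissions returned by `kobo_data` to flag interviews that were completed unusually quickly.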