We’ve had some performance issues with Eramba when working with large data sets in the Online Assessments module (context below). I’d like to see if anyone can help me answer some more specific questions:
- Are there any config tweaks that can be made to improve queries made against 13k+ OA items? (The kind of tuning we’re considering is sketched after this list.)
- Are there any ways to programmatically export answered OA questionnaires as user-readable PDFs (e.g., generating a PDF of an OA’s questions, answers, comments, findings, attachments, etc. for each OA that meets X criteria)? A rough sketch of the shape we’re after also follows this list.
- Other than running a filtered query and using the “export as CSV” option, are there better ways to export large sets of OAs and OA items (including comments, findings, and attachments) for external processing or archival purposes? (See the streaming-export sketch below.)
- Assuming we can get all the info we need copied out of Eramba, can OAs and related OA items be readily purged en masse to reduce query times and improve performance? (A cautious sketch of what we have in mind is also below.)
- Is it correct to assume that the runtime of other actions, such as running Eramba updates, is directly tied to the amount of data in the system’s database?
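For the first question, to make it concrete: the my.cnf tuning we’re considering is below. These values are assumptions sized for our 16GB host (MySQL sharing the box with the app), based on general MySQL guidance rather than anything Eramba-specific, so corrections are very welcome:

```ini
# /etc/mysql/conf.d/tuning.cnf -- assumed values for a 16GB host running Eramba + MySQL together
[mysqld]
# Keep the OA tables and indexes resident in memory
innodb_buffer_pool_size = 8G
# Skip the OS page cache on a dedicated data disk
innodb_flush_method     = O_DIRECT
# Let large sorts/joins stay in memory instead of spilling to on-disk temp tables
tmp_table_size          = 256M
max_heap_table_size     = 256M
# Log anything slow so we can see which generated queries hurt the most
slow_query_log          = 1
long_query_time         = 5
log_queries_not_using_indexes = 1
```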
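To illustrate the PDF question, the shape of what we’re after is roughly the following. This is only a sketch: it assumes direct read access to the Eramba database, and the table and column names (online_assessments, online_assessment_items, question, answer, comment) are hypothetical placeholders, not Eramba’s actual schema:

```python
# Sketch: render one OA's Q&A into a PDF. Table/column names are HYPOTHETICAL --
# they stand in for whatever Eramba's real schema calls these entities.
import mysql.connector
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

def export_oa_pdf(oa_id: int, outfile: str) -> None:
    conn = mysql.connector.connect(
        host="localhost", user="report_ro", password="...", database="eramba"
    )
    cur = conn.cursor(dictionary=True)
    # Hypothetical query: pull every item (question/answer/comment) for one OA
    cur.execute(
        "SELECT question, answer, comment FROM online_assessment_items "
        "WHERE online_assessment_id = %s ORDER BY id",
        (oa_id,),
    )
    styles = getSampleStyleSheet()
    story = []
    for row in cur.fetchall():
        # (real code should escape XML-special characters before Paragraph())
        story.append(Paragraph(f"Q: {row['question']}", styles["Heading3"]))
        story.append(Paragraph(f"A: {row['answer'] or '(no answer)'}", styles["Normal"]))
        if row["comment"]:
            story.append(Paragraph(f"Comment: {row['comment']}", styles["Normal"]))
        story.append(Spacer(1, 12))
    SimpleDocTemplate(outfile).build(story)
    conn.close()

export_oa_pdf(42, "oa_42.pdf")  # 42 is just an example OA id
```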
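For the bulk-export question, the fallback we’re weighing is to bypass the web UI and stream straight from the database to CSV, roughly like this (same caveats: hypothetical table names, and a read-only DB account is assumed):

```python
# Sketch: stream all OA items to CSV without going through the web UI.
# Table/column names are HYPOTHETICAL placeholders for Eramba's real schema.
import csv
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="report_ro", password="...", database="eramba"
)
cur = conn.cursor()
cur.execute(
    "SELECT oa.title, i.question, i.answer, i.comment "
    "FROM online_assessment_items AS i "
    "JOIN online_assessments AS oa ON oa.id = i.online_assessment_id"
)
with open("oa_items_export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["oa_title", "question", "answer", "comment"])
    # fetchmany keeps memory flat even at 13k+ rows
    while True:
        rows = cur.fetchmany(1000)
        if not rows:
            break
        writer.writerows(rows)
conn.close()
```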
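And on the purge question, what we have in mind is roughly the following, to be rehearsed against a restored backup first. The table names and delete order are assumptions; the real schema presumably has more related tables (comments, findings, attachments) that would need the same treatment:

```python
# Sketch: purge a batch of OAs and their items in one transaction.
# HYPOTHETICAL table names; children deleted before parents to satisfy FKs.
import mysql.connector

oa_ids = [101, 102, 103]  # example ids selected by whatever "X criteria" we settle on
placeholders = ", ".join(["%s"] * len(oa_ids))

conn = mysql.connector.connect(
    host="localhost", user="eramba_admin", password="...", database="eramba"
)
cur = conn.cursor()
try:
    cur.execute(
        f"DELETE FROM online_assessment_items WHERE online_assessment_id IN ({placeholders})",
        oa_ids,
    )
    cur.execute(f"DELETE FROM online_assessments WHERE id IN ({placeholders})", oa_ids)
    conn.commit()
except Exception:
    conn.rollback()
    raise
finally:
    conn.close()
```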
Context:
We have been using Eramba to distribute and track risk assessment questionnaires sent to system owners in our tenant*. As a result, we have hundreds of Online Assessments (OAs) containing over 13,900 OA items (anywhere from 12 to 70 questions per OA, plus comments/findings). To meet some reporting needs, we export detailed CSV files of both OAs and OA items so we can analyze the data further and generate metrics.
These activities have led to VERY long query times when searching for and exporting data; times are especially long when querying OA items. For example, it took ~17 minutes to return 13,972 OA items this morning. The amount of data in the system also appears to correlate with how long updates take to run.
I’ve watched our production instance’s CPU usage while a query runs, and only 1 of the 8 vCPUs allocated to the VM is busy, sitting at ~91% usage attributable to mysqld. (As far as I understand, MySQL executes a single query on a single thread, so one long-running query can’t spread across the other cores.) The DB is installed on the same host, which has more than adequate disk I/O and 16GB RAM. Ordinary usage, even with multiple simultaneous users in the OA module, is okay. Running similar queries against much smaller data sets in our non-production instance is much faster despite that instance having fewer resources (4 vCPUs, 4GB RAM, similar disk I/O capabilities) and being otherwise configured the same (same OS, packages, and just about the same in-app settings).
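One check I’m planning, to confirm the bottleneck is MySQL itself rather than Eramba’s app layer, is to time a representative query directly against the database (hypothetical table name again):

```python
# Sketch: time a representative OA-items query directly against MySQL to see
# whether the DB or the app layer dominates. Table name is HYPOTHETICAL.
import time
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="report_ro", password="...", database="eramba"
)
cur = conn.cursor()
t0 = time.monotonic()
cur.execute("SELECT COUNT(*) FROM online_assessment_items")
(count,) = cur.fetchone()
print(f"{count} rows counted in {time.monotonic() - t0:.1f}s")
conn.close()
```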
*My org’s current requirements have us distribute recurring risk assessment questionnaires to internal stakeholders instead of having security team members shoulder all of the assessment work. The process in general isn’t great, but it largely comes from requirements outside our control. We DO want to explore other ways to support the “business requirement” of running recurring assessments, but that’s a bit out of scope for this thread and the short-term issues we’re facing.