Skip to content

Architecture Overview

This page describes QueryAudit's internal architecture, module structure, key interfaces, and the full lifecycle of a query from execution to report.


High-Level Flow

  JUnit Test Lifecycle
  ====================

  @BeforeAll
      |
      +- 1. Resolve DataSource (Spring context or static field)
      +- 2. Wrap DataSource with datasource-proxy
      +- 3. Collect index metadata (via database-specific provider)
      +- 4. Register Hibernate LazyLoadTracker (if Hibernate present)
      +- 5. Load query count baseline

  @BeforeEach
      |
      +- 6. interceptor.start() -- begin capturing queries

  Test Method Executes
      |
      +- Application code (JPA, JDBC, MyBatis, etc.)
      |       |
      |       v
      +- DataSource Proxy (datasource-proxy)
      |       |
      |       v
      +- QueryInterceptor.afterQuery()
      |       +- Records: SQL text, execution time, stack trace
      |       +- Normalizes: literals -> ?, whitespace collapsed, lowercased
      |
      +- Hibernate events (if applicable)
              +- INIT_COLLECTION -> LazyLoadTracker (collection lazy load)
              +- POST_LOAD -> LazyLoadTracker (proxy resolution)

  @AfterEach
      |
      +-  7. interceptor.stop()
      +-  8. QueryAuditAnalyzer.analyze()
      |       +- Run 57 DetectionRules against recorded queries
      |       +- Cross-reference index metadata
      |       +- Filter suppressed issues
      |       +- Split into CONFIRMED / INFO / ACKNOWLEDGED
      +-  9. Merge Hibernate N+1 issues (if any)
      +- 10. Detect query count regression (vs baseline)
      +- 11. ConsoleReporter.report() -> print to stdout
      +- 12. HtmlReportAggregator.addReport() -> accumulate for HTML
      +- 13. Check @ExpectMaxQueryCount
      +- 14. Check @DetectNPlusOne
      +- 15. Check failOnDetection -> throw AssertionError if needed

  @AfterAll
      |
      +- 16. Write HTML report to build/reports/query-audit/
      +- 17. Write JSON report (if configured)
      +- 18. Update query count baseline (if -DqueryGuard.updateBaseline=true)
      +- 19. Auto-open report in browser (if configured)

Module Structure

query-audit/
+-- query-audit-core                   Core engine (no framework dependencies)
|   +-- detector/                      57 detection rules
|   +-- interceptor/                   QueryInterceptor, DataSourceProxyFactory
|   +-- parser/                        SQL parser (regex-based)
|   +-- model/                         Issue, QueryRecord, IssueType, Severity
|   +-- regression/                    Query count baseline & regression detection
|   +-- baseline/                      Issue baseline (acknowledged issues)
|   +-- config/                        QueryAuditConfig builder
|   +-- reporter/                      ConsoleReporter, JsonReporter, HtmlReporter
|
+-- query-audit-mysql                  MySQL-specific implementation
|   +-- MySqlIndexMetadataProvider     SHOW INDEX + INFORMATION_SCHEMA
|
+-- query-audit-postgresql             PostgreSQL-specific implementation
|   +-- PostgreSqlIndexMetadataProvider  pg_catalog system tables
|
+-- query-audit-junit5                 JUnit 5 integration
|   +-- QueryAuditExtension            Lifecycle callbacks (BeforeAll/Each, AfterAll/Each)
|   +-- @QueryAudit                    Full analysis annotation
|   +-- @EnableQueryInspector          Report-only annotation
|   +-- @DetectNPlusOne               N+1 focused annotation
|   +-- @ExpectMaxQueryCount           Query count assertion annotation
|
+-- query-audit-spring-boot-starter    Spring Boot auto-configuration
    +-- QueryAuditAutoConfiguration    BeanPostProcessor for DataSource wrapping
    +-- QueryAuditProperties           application.yml binding

Module Dependency Graph

query-audit-spring-boot-starter
       |
       +-->  query-audit-junit5
       |         |
       |         +-->  query-audit-core
       |
       +-->  query-audit-core

query-audit-mysql
       |
       +-->  query-audit-core
       +-->  query-audit-junit5

query-audit-postgresql
       |
       +-->  query-audit-core
       +-->  query-audit-junit5

Design Principle: Separation of Concerns

Module Knows About Does NOT Know About
core SQL parsing, detection rules, config, reporting JUnit, Spring, any specific database
mysql MySQL SHOW INDEX, INFORMATION_SCHEMA JUnit, Spring
postgresql PostgreSQL pg_catalog system tables JUnit, Spring
junit5 JUnit 5 lifecycle, DataSource resolution Spring, any specific database
spring-boot-starter Spring Boot auto-configuration Any specific database

Core Components

QueryInterceptor

The entry point for query capture. Implements datasource-proxy's QueryExecutionListener.

  Application Code
       |
       v
  DataSource (wrapped by datasource-proxy)
       |
       |  SQL execution
       v
  QueryInterceptor.afterQuery()
       |
       +- Filter: skip if not active (outside test method)
       +- Capture: SQL text, execution time (nanos), timestamp
       +- Stack trace: up to 10 non-framework frames
       +- Store: Collections.synchronizedList(ArrayList<QueryRecord>)

Stack trace capture filters out framework classes to show only application code:

Filtered out: java.lang.Thread, sun.*, jdk.internal.*, org.springframework.*,
              org.hibernate.*, org.junit.*, net.ttddyy.*, com.zaxxer.*, ...

Kept: com.example.OrderService.findOrders:42
      com.example.OrderController.getOrders:28

This stack trace is used for:

  • N+1 call-site grouping (same query from same location = N+1)
  • Source location in the report

QueryRecord

Immutable record storing a captured query:

record QueryRecord(
    String sql,              // Raw SQL: "SELECT * FROM orders WHERE id = 123"
    String normalizedSql,    // Normalized: "select * from orders where id = ?"
    long executionTimeNanos, // Execution time
    long timestamp,          // Capture time
    String stackTrace,       // Application stack frames
    int fullStackHash        // Hash for call-site grouping
)

Normalization rules:

Input Output Purpose
WHERE id = 123 WHERE id = ? Group identical queries with different params
WHERE name = 'John' WHERE name = ? Handle string literals
IN (1, 2, 3) IN (?) Collapse IN lists
SELECT * FROM select * from Collapse whitespace + lowercase

QueryAuditAnalyzer

Central coordinator that runs all detection rules:

QueryAuditAnalyzer.analyze(testName, queries, indexMetadata)
       |
       +- 1. Filter suppressed queries
       |
       +- 2. Run 57 DetectionRules
       |      +- NPlusOneDetector
       |      +- SelectAllDetector
       |      +- MissingIndexDetector
       |      +- UpdateWithoutWhereDetector
       |      +- DmlWithoutIndexDetector
       |      +- RepeatedSingleInsertDetector
       |      +- ... (51 additional rules)
       |      +- Each returns List<Issue>
       |
       +- 3. Filter suppressed patterns
       |
       +- 4. Check against baseline (acknowledged issues)
       |
       +- 5. Split results
       |      +- CONFIRMED: ERROR + WARNING severity
       |      +- INFO: INFO severity
       |      +- ACKNOWLEDGED: baseline-matched issues
       |
       +- 6. Return QueryAuditReport

DetectionRule Interface

Every detection rule implements this single interface:

public interface DetectionRule {
    List<Issue> evaluate(List<QueryRecord> queries, IndexMetadata indexMetadata);
}

Rules receive all recorded queries (SELECT + INSERT + UPDATE + DELETE) and the collected index metadata. Each rule returns zero or more Issue objects.

Issue Model

record Issue(
    IssueType type,       // Enum: N_PLUS_ONE, MISSING_WHERE_INDEX, ...
    Severity severity,    // ERROR, WARNING, INFO
    String query,         // The normalized SQL that triggered this issue
    String table,         // Affected table
    String column,        // Affected column (nullable)
    String detail,        // Human-readable description of the problem
    String suggestion,    // How to fix it
    String sourceLocation // Stack trace (nullable)
)

Extension Points

QueryAudit is designed for extensibility via Java ServiceLoader. No changes to core modules are required when adding support for new databases or custom rules.

Adding a Custom Detection Rule

Implement DetectionRule and register via ServiceLoader:

// 1. Implement the interface
public class MyCustomDetector implements DetectionRule {
    @Override
    public List<Issue> evaluate(List<QueryRecord> queries, IndexMetadata indexMetadata) {
        // Your detection logic
    }
}

// 2. Register in META-INF/services/io.queryaudit.core.detector.DetectionRule
//    com.example.MyCustomDetector

The rule is automatically discovered and runs alongside all built-in detectors.

Adding a New Database

Implement IndexMetadataProvider and register via ServiceLoader:

// 1. Implement the interface
public class MariaDbIndexMetadataProvider implements IndexMetadataProvider {
    @Override
    public String supportedDatabase() { return "mariadb"; }

    @Override
    public IndexMetadata getIndexMetadata(Connection conn) throws SQLException {
        // Query system catalogs
    }
}

// 2. Register in META-INF/services/io.queryaudit.core.analyzer.IndexMetadataProvider
//    com.example.MariaDbIndexMetadataProvider

Adding a New Reporter

Implement the Reporter interface:

public class SlackReporter implements Reporter {
    @Override
    public void report(QueryAuditReport report) {
        // Send report summary to Slack
    }
}

Extension Points Summary

Extension Point Interface ServiceLoader File Purpose
Detection rule DetectionRule io.queryaudit.core.detector.DetectionRule Add custom query analysis rules
Database support IndexMetadataProvider io.queryaudit.core.analyzer.IndexMetadataProvider Add index metadata for new databases
EXPLAIN analysis ExplainAnalyzer io.queryaudit.core.analyzer.ExplainAnalyzer Parse database-specific EXPLAIN output
Reporter Reporter (programmatic registration) Add custom report output formats

Detection Rules (57 Active Rules)

SELECT-Focused Rules

Rule Issue Type Severity What it detects
NPlusOneDetector N_PLUS_ONE ERROR Repeated query patterns from same call site
SelectAllDetector SELECT_ALL INFO SELECT * usage
CountInsteadOfExistsDetector COUNT_INSTEAD_OF_EXISTS INFO COUNT(*) where EXISTS is better
UnboundedResultSetDetector UNBOUNDED_RESULT_SET WARNING SELECT without LIMIT
SlowQueryDetector SLOW_QUERY WARNING/ERROR Queries exceeding time thresholds
~~DuplicateQueryDetector~~ DUPLICATE_QUERY WARNING Exact duplicate SQL (disabled -- awaiting parameter tracking)
CoveringIndexDetector COVERING_INDEX_OPPORTUNITY INFO Queries that could use covering indexes
DistinctMisuseDetector DISTINCT_MISUSE WARNING Unnecessary DISTINCT
HavingMisuseDetector HAVING_MISUSE WARNING HAVING on non-aggregate columns
UnionWithoutAllDetector UNION_WITHOUT_ALL INFO UNION without ALL
SelectCountStarWithoutWhereDetector COUNT_STAR_WITHOUT_WHERE INFO COUNT(*) without WHERE scans entire table
ExcessiveColumnFetchDetector EXCESSIVE_COLUMN_FETCH INFO Query fetches too many columns

WHERE Clause Rules

Rule Issue Type Severity What it detects
WhereFunctionDetector WHERE_FUNCTION ERROR DATE(), LOWER(), etc. wrapping columns
OrAbuseDetector OR_ABUSE WARNING Excessive OR conditions (>= threshold)
LikeWildcardDetector LIKE_LEADING_WILDCARD WARNING Leading wildcard LIKE '%...'
NullComparisonDetector NULL_COMPARISON ERROR = NULL instead of IS NULL
RedundantFilterDetector REDUNDANT_FILTER INFO Duplicate WHERE conditions
StringConcatInWhereDetector STRING_CONCAT_IN_WHERE WARNING String concatenation in WHERE prevents index usage
CaseInWhereDetector CASE_IN_WHERE WARNING CASE expression in WHERE prevents index usage
Rule Issue Type Severity What it detects
MissingIndexDetector MISSING_WHERE_INDEX ERROR Unindexed WHERE columns
MissingIndexDetector MISSING_JOIN_INDEX ERROR Unindexed JOIN columns
MissingIndexDetector MISSING_ORDER_BY_INDEX WARNING Unindexed ORDER BY columns
MissingIndexDetector MISSING_GROUP_BY_INDEX WARNING Unindexed GROUP BY columns
CompositeIndexDetector COMPOSITE_INDEX_LEADING_COLUMN WARNING Composite index leading column unused
IndexRedundancyDetector REDUNDANT_INDEX WARNING One index is prefix of another
ForUpdateWithoutIndexDetector FOR_UPDATE_WITHOUT_INDEX ERROR FOR UPDATE without index
RangeLockDetector RANGE_LOCK_RISK WARNING Range + FOR UPDATE on unindexed column

DML Rules

Rule Issue Type Severity What it detects
UpdateWithoutWhereDetector UPDATE_WITHOUT_WHERE ERROR UPDATE/DELETE without WHERE
DmlWithoutIndexDetector DML_WITHOUT_INDEX WARNING UPDATE/DELETE WHERE without index
RepeatedSingleInsertDetector REPEATED_SINGLE_INSERT WARNING Same INSERT pattern repeated >= 3 times
InsertSelectAllDetector INSERT_SELECT_ALL WARNING INSERT ... SELECT *
InsertOnDuplicateKeyDetector INSERT_ON_DUPLICATE_KEY WARNING INSERT ON DUPLICATE KEY UPDATE may cause deadlocks
InsertSelectLocksSourceDetector INSERT_SELECT_LOCKS_SOURCE INFO INSERT...SELECT locks source table rows
ImplicitColumnsInsertDetector IMPLICIT_COLUMNS_INSERT WARNING INSERT without explicit column list
SubqueryInDmlDetector SUBQUERY_IN_DML WARNING Subquery in UPDATE/DELETE cannot use semijoin optimization

JOIN Rules

Rule Issue Type Severity What it detects
CartesianJoinDetector CARTESIAN_JOIN ERROR JOIN without ON condition
TooManyJoinsDetector TOO_MANY_JOINS WARNING Excessive number of JOINs
ImplicitJoinDetector IMPLICIT_JOIN WARNING Implicit comma-separated join syntax
UnusedJoinDetector UNUSED_JOIN WARNING LEFT JOIN table never referenced in query
CorrelatedSubqueryDetector CORRELATED_SUBQUERY WARNING Correlated subquery in SELECT

Locking Rules

Rule Issue Type Severity What it detects
ForUpdateNonUniqueIndexDetector FOR_UPDATE_NON_UNIQUE WARNING FOR UPDATE on non-unique index causes gap locks
ForUpdateWithoutTimeoutDetector FOR_UPDATE_WITHOUT_TIMEOUT WARNING FOR UPDATE without NOWAIT or SKIP LOCKED

Pagination and Ordering Rules

Rule Issue Type Severity What it detects
OffsetPaginationDetector OFFSET_PAGINATION WARNING Large OFFSET value
OrderByLimitWithoutIndexDetector ORDER_BY_LIMIT_WITHOUT_INDEX WARNING ORDER BY + LIMIT without index
OrderByRandDetector ORDER_BY_RAND ERROR ORDER BY RAND() causes full table scan and sort
NonDeterministicPaginationDetector NON_DETERMINISTIC_PAGINATION INFO ORDER BY + LIMIT on non-unique column
LimitWithoutOrderByDetector LIMIT_WITHOUT_ORDER_BY WARNING LIMIT without ORDER BY returns non-deterministic rows

ORM / Hibernate Rules

Rule Issue Type Severity What it detects
CollectionManagementDetector COLLECTION_DELETE_REINSERT WARNING DELETE-all + re-INSERT pattern indicates inefficient collection management
DerivedDeleteDetector DERIVED_DELETE_LOADS_ENTITIES WARNING Derived delete loads entities before individual deletes
MergeableQueriesDetector MERGEABLE_QUERIES INFO Multiple queries to same table could be merged

Other Rules

Rule Issue Type Severity What it detects
SargabilityDetector NON_SARGABLE_EXPRESSION ERROR Arithmetic on columns (col + 1 = ?)
ImplicitTypeConversionDetector IMPLICIT_TYPE_CONVERSION WARNING String column compared to number
LargeInListDetector LARGE_IN_LIST WARNING IN clause with too many values
WriteAmplificationDetector WRITE_AMPLIFICATION WARNING Table with > 6 indexes
NotInSubqueryDetector NOT_IN_SUBQUERY ERROR NOT IN subquery returns empty when subquery contains NULL
GroupByFunctionDetector GROUP_BY_FUNCTION WARNING Function in GROUP BY prevents index usage
RegexpInsteadOfLikeDetector REGEXP_INSTEAD_OF_LIKE WARNING REGEXP/RLIKE prevents index usage
FindInSetDetector FIND_IN_SET_USAGE WARNING FIND_IN_SET indicates comma-separated values violating 1NF
WindowFunctionWithoutPartitionDetector WINDOW_FUNCTION_WITHOUT_PARTITION WARNING Window function without PARTITION BY
ForceIndexHintDetector FORCE_INDEX_HINT INFO FORCE/USE/IGNORE INDEX hint overrides optimizer
QueryCountRegressionDetector QUERY_COUNT_REGRESSION WARNING Query count regression vs baseline

Index Metadata Collection

QueryAudit collects index information from two sources and merges them:

                    +-------------------------+
                    | IndexMetadata (merged)   |
                    +-----------+-------------+
                                |
                  +-------------+-------------+
                  |                           |
   +--------------+----------+   +------------+-----------+
   | Database metadata       |   | JPA @Table metadata    |
   | (SHOW INDEX / pg_catalog)|   | (@Index annotations)   |
   |                         |   |                        |
   | Authoritative:          |   | Supplementary:         |
   | real indexes from       |   | indexes declared in    |
   | the test database       |   | JPA entity classes     |
   +-------------------------+   +------------------------+

Database metadata (via IndexMetadataProvider SPI) is authoritative -- it reflects the actual indexes that exist in the test database. JPA metadata supplements this with indexes declared in @Table(indexes = ...) that might not yet exist in the test schema (e.g., when using ddl-auto=create-drop).

IndexMetadataProvider SPI

public interface IndexMetadataProvider {
    String supportedDatabase();   // e.g., "mysql"
    IndexMetadata getIndexMetadata(Connection connection) throws SQLException;
}

Implementations are discovered via Java ServiceLoader. The provider's supportedDatabase() return value is matched against Connection.getMetaData().getDatabaseProductName().


Query Count Regression Detection

QueryAudit tracks query counts per test method across runs using a baseline file (.query-audit-counts).

  Run 1:  OrderServiceTest.findOrders -> 5 SELECT, 0 INSERT -> saved to baseline
  Run 2:  OrderServiceTest.findOrders -> 15 SELECT, 0 INSERT -> regression detected!
          (3x increase, +10 queries)

Regression thresholds (both must be met):

Condition Threshold Rationale
Increase ratio >= 1.5x (50%) Catches significant increases
Absolute increase >= 5 queries Avoids false positives on small tests (2 -> 4 = 2x but only +2)

Severity mapping:

Ratio Severity
>= 3.0x ERROR
>= 2.0x WARNING
>= 1.5x WARNING

Updating the Baseline

./gradlew test -DqueryGuard.updateBaseline=true

Report Generation

QueryAudit includes three reporter implementations, all in query-audit-core:

Console Report (ConsoleReporter)

ANSI-colored output printed to stdout after each test method. The default reporter.

JSON Report (JsonReporter)

Machine-readable JSON output. Uses StringBuilder exclusively to avoid external JSON library dependencies.

HTML Report (HtmlReporter / HtmlReportAggregator)

Aggregated report generated by HtmlReportAggregator and written to build/reports/query-audit/index.html after all tests complete. Provides filtering, search, and drill-down by test class. Self-contained single HTML file with embedded CSS and JavaScript.


Design Principles

Zero False Positives on CONFIRMED Issues

CONFIRMED issues are structurally certain based on SQL parsing and index metadata. QueryAudit never guesses. Issues that depend on data volume or query planner behavior are classified as INFO, not CONFIRMED.

Test-Time Only

QueryAudit is a test dependency. It never runs in production. The datasource proxy wrapping only happens in the test classpath.

Transparent to Application Code

Application code does not need to know about QueryAudit. The DataSource proxy is injected transparently via BeanPostProcessor (Spring Boot) or reflection-based DataSource resolution (plain JUnit 5).

Extensibility via SPI

  • New database: Implement IndexMetadataProvider in a new module, register via ServiceLoader
  • New detection rule: Implement DetectionRule, register via ServiceLoader or programmatically
  • New reporter: Implement Reporter interface

See Also