How to trace back from SQL queries to the app code?

Published on July 28, 2024

You open your RDS monitoring view and notice some queries that are hampering performance. Or perhaps you don't use fancy dashboards and prefer to directly query the pg_stat_activity view. Either way, the problem remains the same: Where in the application code are these SQL queries being executed?

While investigating the internals of Knex.js and the pg client it uses, I discovered a way to modify queries as they're processed. The solution lies in harnessing the power of stack traces. The idea is to attach a trace of the query caller as a comment to the original query.

Your knex config

 import type { Knex } from "knex";
+import { PostgresClientWithTrace } from "./knex-tracing";
 
 const config: { [key: string]: Knex.Config } = {
   development: {
-    client: "postgresql",
+    client: PostgresClientWithTrace as any,
     connection: {
       database: "knex_test",
       user: "postgres",

`knex-tracing.ts`

//@ts-expect-error - This isn't in the type definitions
import Client_PG from "knex/lib/dialects/postgres/index.js";
export class PostgresClientWithTrace extends Client_PG {
  query(connection: any, obj: unknown) {
    const formattedStackTrace = formatStackTraceForSQL(captureStackTrace());

    if (typeof obj === "string") {
      return super.query(connection, formattedStackTrace + obj);
    } else if (obj && typeof obj === "object" && typeof (obj as { sql: string }).sql === "string") {
      (obj as { sql: string }).sql = formattedStackTrace + (obj as { sql: string }).sql;
    }
    return super.query(connection, obj);
  }
}

function captureStackTrace() {
  const stack = new Error().stack || "";
  // Modify the stack trace extraction to get rid of stack frames
  // that are part of this captureStackTrace function
  // and other not-so-useful stack frames.
  const filteredStack = stack
    .split("\n")
    .slice(3)
    .filter((line) => {
      return (
        !line.includes("node_modules") &&
        !line.includes("internal/modules/cjs/loader.js") &&
        !line.includes("(node:") &&
        !line.includes("at processTicksAndRejections")
      );
    })
    .join("\n");
  return filteredStack;
}

function formatStackTraceForSQL(stack: string) {
  // Took inspiration from https://github.com/knex/knex/pull/5289/files#diff-064c6305016fff5f7df80ad10883851565d770f919dcbd05c6b40f6d9828e8eaR245-R248
  const sanitized = stack.replace(/\/\*|\*\/|\?/g, "");
  return `/* ${sanitized} */\n`;
}

How does queries look?

Before (without stack trace)

After (with stack trace)

Limitations:

This example uses Knex and PostgreSQL, but similar concepts can be applied to other libraries and databases.
It doesn't support stream API but adding support for it should be straightforward.
The stack traces are generated from the "built" JavaScript code. While there's no source mapping, a trace generally provides sufficient hints to pinpoint the exact location in the source code.