diff --git a/README.md b/README.md index 8f612cb..a39fdea 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,142 @@ -# c77_dbh +# c77_dbh - Postgres Database Health Extension + +**c77_dbh** is a PostgreSQL extension designed to monitor and maintain database health, providing actionable insights for database administrators and non-technical users alike. It offers a suite of functions and tables to track vacuum performance, index efficiency, materialized view refresh health, and TOAST table overhead—all packaged as a self-contained, drop-in solution for Postgres environments. + +The extension pushes logic into the database itself, minimizing reliance on external frameworks, and outputs JSON-formatted data ideal for dashboards. It dynamically adjusts thresholds based on your database’s behavior and adapts to optional Postgres extensions (e.g., `pgstattuple`, `pg_stat_statements`) for enhanced precision. + +## Features + +- **Vacuum Health Monitoring**: Tracks table bloat, vacuum history, and TOAST overhead with dynamic thresholds. +- **Index Health Analysis**: Monitors index usage and bloat, excluding or isolating TOAST indexes as needed. +- **Materialized View Refresh Tracking**: Measures refresh performance with event triggers, identifying bottlenecks (I/O, CPU, RAM). +- **Human-Readable Insights**: Provides status (Red/Yellow/Green), severity scores, insights, and actions for each metric. +- **Dashboard-Ready**: JSON outputs structured for easy integration into graphical interfaces. +- **Resource Awareness**: Gracefully falls back when optional extensions are unavailable. + +## Installation + +### Prerequisites +- PostgreSQL 13 or higher (tested up to 15 as of March 2025). +- Optional extensions for enhanced functionality: + - `pgstattuple`: Precise bloat estimates for tables and indexes. + - `pg_stat_statements`: Detailed query and refresh performance metrics. + - `pg_cron`: Automate periodic stat updates (recommended). + +### Steps +1. **Clone the Repository**: + ```bash + git clone https://git.jctr3.com/trogers1884/c77_dbh.git + cd c77_dbh + ``` + +2. **Build and Install**: + ```bash + make + sudo make install + ``` + (Note: Assumes a standard Postgres extension build process; adjust paths if needed.) + +3. **Enable in Database**: + ```sql + CREATE EXTENSION c77_dbh; + ``` + +4. **Grant Permissions** (optional, adjust roles as needed): + ```sql + GRANT SELECT ON ALL TABLES IN SCHEMA public TO public; + GRANT ALL ON ALL TABLES IN SCHEMA public TO your_admin_role; + ``` + +## Usage + +### Core Components +- **Tables**: + - `c77_dbh_vacuum_stats`: Stores table vacuum and bloat stats. + - `c77_dbh_index_stats`: Tracks index size, usage, and bloat. + - `c77_dbh_mv_stats`: Records materialized view refresh metrics. + +- **Functions**: + - `c77_dbh_update_vacuum_stats()`: Updates vacuum stats for all tables. + - `c77_dbh_update_index_stats()`: Updates index stats. + - `c77_dbh_get_vacuum_health()`: Returns JSON with table health status. + - `c77_dbh_get_index_health()`: JSON for non-TOAST index health. + - `c77_dbh_get_toast_health()`: JSON for TOAST index health. + - `c77_dbh_get_mv_health()`: JSON for materialized view refresh health. + - `c77_dbh_get_mv_details(mv_name text)`: Detailed JSON for a specific MV. + +- **Event Triggers**: Automatically track materialized view creation, alteration, and refreshes. + +### Example Workflow +1. **Initialize Stats**: + ```sql + SELECT c77_dbh_update_vacuum_stats(); + SELECT c77_dbh_update_index_stats(); + ``` + +2. **Query Health**: + ```sql + SELECT * FROM c77_dbh_get_vacuum_health(); + SELECT * FROM c77_dbh_get_index_health(); + SELECT * FROM c77_dbh_get_mv_health(); + ``` + +3. **Automate Updates** (with `pg_cron`): + ```sql + SELECT cron.schedule('update_vacuum_stats', '0 * * * *', $$SELECT c77_dbh_update_vacuum_stats()$$); + SELECT cron.schedule('update_index_stats', '0 * * * *', $$SELECT c77_dbh_update_index_stats()$$); + ``` + +4. **Sample Output** (from `c77_dbh_get_vacuum_health()`): + ```json + [ + { + "metric": "Table Vacuum Health", + "table_name": "sales.orders", + "status": "Red", + "severity": 5, + "insight": "Table bloat is critically high; TOAST overhead significant.", + "action": "Run VACUUM FULL; review large fields.", + "table_size": "128 MB", + "bloat_estimate": "65%", + "details": {"toast_bloat": "40%", "median_bloat": "20%"} + } + ] + ``` + +### Dashboard Integration +- Use the JSON outputs to populate a dashboard: + - **Vacuum Health**: Pie chart of statuses, table of critical issues. + - **Index Health**: Bar chart of usage categories, recommendations. + - **MV Refresh**: Progress bars per view, bottleneck alerts. +- Link to detail pages using `details` fields or separate functions like `c77_dbh_get_toast_health()`. + +## Configuration + +- **Frequency**: Adjust update schedules based on database activity (hourly for high-write systems, daily for stable ones). +- **Permissions**: Default setup grants `SELECT` to `public` and full access to an admin role (e.g., `homestead`). Modify as needed. +- **Extensions**: Install `pgstattuple` and `pg_stat_statements` for best results; fallbacks work without them. + +## Limitations + +- TOAST bloat estimates are crude without `pgstattuple`. +- Materialized view bottleneck analysis is less precise without `pg_stat_statements`. +- Dynamic thresholds rely on sufficient historical data; new databases may need time to stabilize. + +## Future Development + +- Query performance monitoring with `pg_stat_statements`. +- Automated maintenance actions (e.g., trigger `VACUUM` on Red status). +- Integration with external frameworks (e.g., Laravel package). + +## Contributing + +Contributions are welcome! Fork the repo, submit pull requests, or open issues at `https://git.jctr3.com/trogers1884/c77_dbh`. + +## License + +MIT License – see `LICENSE` file for details. + +## Credits + +Developed by [Your Name] with assistance from xAI’s Grok. Inspired by Dr. W. Edwards Deming’s statistical quality control principles. -Postgres Database Health \ No newline at end of file diff --git a/dasboardexample.html b/dasboardexample.html new file mode 100644 index 0000000..71d6fb1 --- /dev/null +++ b/dasboardexample.html @@ -0,0 +1,122 @@ + + + + Database Health Dashboard + + + +
+
+

Database Health Dashboard

+
+
+
Vacuum Health
+
Index Health
+
MV Refresh
+
+ + +
+

Vacuum Health Overview

+
+ + + +
+
+

Tables Scanned: 25

+

Critical Issues: 2 tables

+

Warnings: 5 tables

+

Healthy: 18 tables

+
+
+

+ Recommendation: Run VACUUM FULL on 2 tables with critical bloat; TOAST overhead notable in 1 case. +

+ View Detailed Vacuum Stats +
+ + +
+

Index Health Overview

+
+ + + +
+
+

Indexes Monitored: 19

+

Critical Issues: 1 unused

+

Warnings: 3 low usage

+

Healthy: 15 indexes

+
+
+

+ Recommendation: Consider dropping 1 unused index; review query patterns for 3 others. +

+ View Detailed Index Stats + View TOAST Index Stats +
+ + +
+

Materialized View Refresh Overview

+
+ +
+ + +
+
+ + +
+
+
+

MVs Monitored: 2

+

Critical Issues: 0 MVs

+

Warnings: 1 MV

+

Healthy: 1 MV

+
+
+

+ Recommendation: Monitor MV2 refresh; consider scheduling during off-peak hours. +

+ View Detailed MV Stats +
+
+ + \ No newline at end of file diff --git a/health_functions.sql b/health_functions.sql index 61448bb..e9ad889 100644 --- a/health_functions.sql +++ b/health_functions.sql @@ -1,110 +1,95 @@ CREATE OR REPLACE FUNCTION public.c77_dbh_get_mv_health() RETURNS JSON AS $$ DECLARE - v_has_pg_stat_statements boolean; -- Check if extension is available - v_stats_reset timestamp; -- Time of last stats reset - v_base_io_load numeric; -- Fallback I/O load - v_base_mem_hit_ratio numeric; -- Fallback memory hit ratio + v_has_pg_stat_statements boolean; + v_stats_reset timestamp; + v_base_io_load numeric; + v_base_mem_hit_ratio numeric; + v_avg_refresh_time interval; -- Average refresh time across all MVs BEGIN - -- Check if pg_stat_statements is installed SELECT EXISTS ( SELECT 1 FROM pg_extension WHERE extname = 'pg_stat_statements' ) INTO v_has_pg_stat_statements; - -- Get stats_reset, base I/O load, and base memory hit ratio for the current database SELECT d.stats_reset, (d.blks_read + d.blks_hit) / EXTRACT(EPOCH FROM (NOW() - d.stats_reset)), - CASE - WHEN b.buffers_clean + b.buffers_backend = 0 THEN 1.0 - ELSE b.buffers_clean::numeric / (b.buffers_clean + b.buffers_backend) - END + CASE WHEN b.buffers_clean + b.buffers_backend = 0 THEN 1.0 + ELSE b.buffers_clean::numeric / (b.buffers_clean + b.buffers_backend) END INTO v_stats_reset, v_base_io_load, v_base_mem_hit_ratio FROM pg_stat_database d CROSS JOIN pg_stat_bgwriter b WHERE d.datname = current_database(); + -- Calculate average refresh time across all MVs + SELECT AVG(refresh_mv_time_total / GREATEST(refresh_count, 1)) + INTO v_avg_refresh_time + FROM public.c77_dbh_mv_stats + WHERE refresh_count > 0; + RETURN ( SELECT json_agg( json_build_object( 'metric', 'Materialized View Refresh', 'mv_name', m.mv_name, 'status', CASE - WHEN m.refresh_mv_time_last > m.refresh_mv_time_max * 0.9 THEN 'Red' - WHEN m.refresh_mv_time_last > (m.refresh_mv_time_total / GREATEST(m.refresh_count, 1)) * 1.2 THEN 'Yellow' + WHEN m.refresh_mv_time_last > v_avg_refresh_time * 2 THEN 'Red' + WHEN m.refresh_mv_time_last > v_avg_refresh_time * 1.5 THEN 'Yellow' ELSE 'Green' END, 'severity', CASE - WHEN m.refresh_mv_time_last > m.refresh_mv_time_max * 0.9 THEN 5 - WHEN m.refresh_mv_time_last > (m.refresh_mv_time_total / GREATEST(m.refresh_count, 1)) * 1.2 THEN 3 + WHEN m.refresh_mv_time_last > v_avg_refresh_time * 2 THEN 5 + WHEN m.refresh_mv_time_last > v_avg_refresh_time * 1.5 THEN 3 ELSE 1 END, 'insight', CASE - WHEN v_io_load > 100 THEN 'Refreshes are slower due to high I/O load.' - WHEN v_cpu_time > (CASE WHEN v_has_pg_stat_statements THEN 1000 ELSE 0.5 END) THEN 'Refreshes are slower due to CPU usage.' - WHEN v_mem_hit_ratio < 0.9 THEN 'Refreshes are slower due to memory pressure.' - ELSE 'Refreshes are performing normally.' + WHEN v_io_load > 100 THEN 'Refreshes are slow due to high I/O load.' + WHEN v_cpu_time > (CASE WHEN v_has_pg_stat_statements THEN 1000 ELSE 0.5 END) THEN 'Refreshes are slow due to CPU usage.' + WHEN v_mem_hit_ratio < 0.9 THEN 'Refreshes are slow due to memory pressure.' + ELSE 'Refresh performance is within normal range.' END, 'action', CASE WHEN v_io_load > 100 THEN 'Optimize I/O or schedule during off-peak hours.' WHEN v_cpu_time > (CASE WHEN v_has_pg_stat_statements THEN 1000 ELSE 0.5 END) THEN 'Increase CPU capacity or optimize queries.' - WHEN v_mem_hit_ratio < 0.9 THEN 'Adjust memory settings (e.g., work_mem, shared_buffers).' + WHEN v_mem_hit_ratio < 0.9 THEN 'Adjust memory settings (e.g., work_mem).' ELSE 'No action needed.' END, 'last_refresh_time', m.refresh_mv_time_last::text, 'refresh_count', m.refresh_count, 'io_load', round(v_io_load)::text || ' blocks/sec', - 'cpu_time', (CASE - WHEN v_has_pg_stat_statements THEN round(v_cpu_time)::text || ' ms/call' - ELSE (v_cpu_time * 100)::text || '% waiting' - END), + 'cpu_time', (CASE WHEN v_has_pg_stat_statements THEN round(v_cpu_time)::text || ' ms/call' + ELSE (v_cpu_time * 100)::text || '% waiting' END), 'buffer_hit_ratio', round(v_mem_hit_ratio * 100)::text || '%', 'bottleneck', CASE WHEN v_io_load > 100 THEN 'I/O' WHEN v_cpu_time > (CASE WHEN v_has_pg_stat_statements THEN 1000 ELSE 0.5 END) THEN 'CPU' WHEN v_mem_hit_ratio < 0.9 THEN 'RAM' ELSE 'None' - END + END, + 'details', json_build_object( + 'avg_refresh_time', v_avg_refresh_time::text, + 'explanation', 'Thresholds based on average refresh time (' || v_avg_refresh_time::text || ').' + ) ) ) FROM public.c77_dbh_mv_stats m CROSS JOIN LATERAL ( - -- Calculate bottleneck stats per MV SELECT - CASE - WHEN v_has_pg_stat_statements THEN - COALESCE(SUM(s.shared_blks_read) / EXTRACT(EPOCH FROM (NOW() - v_stats_reset)), 0) - ELSE - v_base_io_load - END AS v_io_load, - CASE - WHEN v_has_pg_stat_statements THEN - COALESCE(SUM(s.total_time) / GREATEST(SUM(s.calls), 1), 0) - ELSE - CASE - WHEN EXISTS ( - SELECT 1 - FROM pg_stat_activity - WHERE query ILIKE '%REFRESH MATERIALIZED VIEW ' || m.mv_name || '%' - AND wait_event_type = 'CPU' - AND state = 'active' - ) THEN 1 - ELSE 0 - END - END AS v_cpu_time, - CASE - WHEN v_has_pg_stat_statements THEN - CASE - WHEN SUM(s.shared_blks_hit + s.shared_blks_read) = 0 THEN 1.0 - ELSE SUM(s.shared_blks_hit)::numeric / (SUM(s.shared_blks_hit) + SUM(s.shared_blks_read)) - END - ELSE - v_base_mem_hit_ratio - END AS v_mem_hit_ratio - FROM ( - -- Subquery to avoid duplicate logic - SELECT 1 AS dummy - ) AS dummy + CASE WHEN v_has_pg_stat_statements THEN + COALESCE(SUM(s.shared_blks_read) / EXTRACT(EPOCH FROM (NOW() - v_stats_reset)), 0) + ELSE v_base_io_load END AS v_io_load, + CASE WHEN v_has_pg_stat_statements THEN + COALESCE(SUM(s.total_time) / GREATEST(SUM(s.calls), 1), 0) + ELSE CASE WHEN EXISTS ( + SELECT 1 FROM pg_stat_activity + WHERE query ILIKE '%REFRESH MATERIALIZED VIEW ' || m.mv_name || '%' + AND wait_event_type = 'CPU' AND state = 'active' + ) THEN 1 ELSE 0 END END AS v_cpu_time, + CASE WHEN v_has_pg_stat_statements THEN + CASE WHEN SUM(s.shared_blks_hit + s.shared_blks_read) = 0 THEN 1.0 + ELSE SUM(s.shared_blks_hit)::numeric / (SUM(s.shared_blks_hit) + SUM(s.shared_blks_read)) END + ELSE v_base_mem_hit_ratio END AS v_mem_hit_ratio + FROM (SELECT 1 AS dummy) AS dummy LEFT JOIN pg_stat_statements s ON v_has_pg_stat_statements AND s.query ILIKE '%REFRESH MATERIALIZED VIEW ' || m.mv_name || '%' AND s.dbid = (SELECT datid FROM pg_stat_database WHERE datname = current_database()) @@ -114,7 +99,6 @@ END; $$ LANGUAGE plpgsql; - CREATE OR REPLACE FUNCTION public.c77_dbh_get_mv_details(p_mv_name text) RETURNS JSON AS $$ DECLARE diff --git a/public.c77_dbh_get_vacuum_health.sql b/public.c77_dbh_get_vacuum_health.sql index a163c08..a3e4c7e 100644 --- a/public.c77_dbh_get_vacuum_health.sql +++ b/public.c77_dbh_get_vacuum_health.sql @@ -1,49 +1,96 @@ CREATE OR REPLACE FUNCTION public.c77_dbh_get_vacuum_health() RETURNS JSON AS $$ DECLARE + v_median_bloat numeric; + v_bloat_stddev numeric; v_has_pgstattuple boolean; BEGIN - -- Check if pgstattuple is installed SELECT EXISTS ( SELECT 1 FROM pg_extension WHERE extname = 'pgstattuple' ) INTO v_has_pgstattuple; + -- Calculate median and stddev including TOAST-adjusted bloat + WITH toast_bloat AS ( + SELECT + (n.nspname || '.' || c.relname) AS table_name, + CASE WHEN v_has_pgstattuple THEN + COALESCE((pgstattuple(c.oid)).free_percent + (pgstattuple(c.oid)).dead_tuple_percent, 0) + ELSE + CASE WHEN s.n_live_tup + s.n_dead_tup = 0 THEN 0 + ELSE (s.n_dead_tup::numeric / (s.n_live_tup + s.n_dead_tup)) * 100 END + END AS bloat_estimate + FROM pg_stat_all_tables s + JOIN pg_class c ON c.oid = s.relid + JOIN pg_namespace n ON n.oid = c.relnamespace + WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') + ) + SELECT + percentile_cont(0.5) WITHIN GROUP (ORDER BY bloat_estimate), + stddev(bloat_estimate) + INTO v_median_bloat, v_bloat_stddev + FROM toast_bloat; + RETURN ( SELECT json_agg( json_build_object( 'metric', 'Table Vacuum Health', - 'table_name', table_name, + 'table_name', v.table_name, 'status', CASE - WHEN bloat_estimate > 50 THEN 'Red' - WHEN bloat_estimate > 20 THEN 'Yellow' + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + 2 * v_bloat_stddev THEN 'Red' + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + v_bloat_stddev THEN 'Yellow' ELSE 'Green' END, 'severity', CASE - WHEN bloat_estimate > 50 THEN 5 - WHEN bloat_estimate > 20 THEN 3 + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + 2 * v_bloat_stddev THEN 5 + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + v_bloat_stddev THEN 3 ELSE 1 END, 'insight', CASE - WHEN bloat_estimate > 50 THEN 'Table has critical bloat; vacuum and analyze urgently.' - WHEN bloat_estimate > 20 THEN 'Table has moderate bloat; consider vacuuming.' - ELSE 'Table is in good health.' + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + 2 * v_bloat_stddev THEN + 'Table bloat is critically high' || + (CASE WHEN t.toast_bloat > v.bloat_estimate THEN '; TOAST overhead is significant.' ELSE '.' END) + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + v_bloat_stddev THEN + 'Table bloat is above average' || + (CASE WHEN t.toast_bloat > v.bloat_estimate THEN '; TOAST overhead notable.' ELSE '.' END) + ELSE 'Table bloat is within normal range.' END, 'action', CASE - WHEN bloat_estimate > 50 THEN 'Run VACUUM FULL or CLUSTER; check autovacuum settings.' - WHEN bloat_estimate > 20 THEN 'Run VACUUM and ANALYZE; monitor autovacuum.' - ELSE 'No action needed.' + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + 2 * v_bloat_stddev THEN + 'Run VACUUM FULL or CLUSTER' || + (CASE WHEN t.toast_bloat > v.bloat_estimate THEN '; review large fields for TOAST impact.' ELSE '.' END) + WHEN (v.bloat_estimate + COALESCE(t.toast_bloat, 0)) > v_median_bloat + v_bloat_stddev THEN + 'Run VACUUM and ANALYZE' || + (CASE WHEN t.toast_bloat > v.bloat_estimate THEN '; check TOAST usage.' ELSE '.' END) + ELSE 'No immediate action required.' END, - 'last_vacuum', last_vacuum::text, - 'last_autovacuum', last_autovacuum::text, - 'vacuum_count', vacuum_count, - 'dead_tuples', dead_tuples, - 'live_tuples', live_tuples, - 'table_size', table_size::text || ' bytes', - 'bloat_estimate', bloat_estimate::text || '%' || - (CASE WHEN v_has_pgstattuple THEN ' (precise)' ELSE ' (estimated)' END) + 'last_vacuum', v.last_vacuum::text, + 'last_autovacuum', v.last_autovacuum::text, + 'vacuum_count', v.vacuum_count, + 'dead_tuples', v.dead_tuples, + 'live_tuples', v.live_tuples, + 'table_size', pg_size_pretty(v.table_size), + 'bloat_estimate', v.bloat_estimate::text || '%' || + (CASE WHEN v_has_pgstattuple THEN ' (precise)' ELSE ' (estimated)' END), + 'details', json_build_object( + 'median_bloat', round(v_median_bloat)::text || '%', + 'bloat_stddev', round(v_bloat_stddev)::text || '%', + 'toast_bloat', round(COALESCE(t.toast_bloat, 0))::text || '%', + 'explanation', 'Includes TOAST bloat in total health assessment.' + ) ) ) - FROM public.c77_dbh_vacuum_stats + FROM public.c77_dbh_vacuum_stats v + LEFT JOIN ( + SELECT + (n.nspname || '.' || c.relname) AS table_name, + CASE WHEN v_has_pgstattuple THEN + COALESCE((pgstattuple(tst.oid)).free_percent + (pgstattuple(tst.oid)).dead_tuple_percent, 0) + ELSE 0 END AS toast_bloat + FROM pg_class c + JOIN pg_namespace n ON n.oid = c.relnamespace + LEFT JOIN pg_class tst ON tst.oid = c.reltoastrelid + WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') AND c.reltoastrelid != 0 + ) t ON v.table_name = t.table_name ); END; $$ LANGUAGE plpgsql; \ No newline at end of file diff --git a/public.c77_dbh_index_stats.sql b/public.c77_dbh_index_stats.sql new file mode 100644 index 0000000..6e294a2 --- /dev/null +++ b/public.c77_dbh_index_stats.sql @@ -0,0 +1,174 @@ +CREATE TABLE IF NOT EXISTS public.c77_dbh_index_stats ( + index_name text COLLATE pg_catalog."default" PRIMARY KEY, -- Fully qualified index name + table_name text, -- Associated table + index_size bigint DEFAULT 0, -- Size in bytes + scan_count bigint DEFAULT 0, -- Number of scans + bloat_estimate numeric DEFAULT 0, -- Estimated bloat percentage + last_updated timestamp without time zone DEFAULT now() -- Last stats update +); + +CREATE OR REPLACE FUNCTION public.c77_dbh_update_index_stats() + RETURNS void AS $$ +DECLARE + v_has_pgstattuple boolean; +BEGIN + SELECT EXISTS ( + SELECT 1 FROM pg_extension WHERE extname = 'pgstattuple' + ) INTO v_has_pgstattuple; + + INSERT INTO public.c77_dbh_index_stats ( + index_name, table_name, index_size, scan_count, bloat_estimate, last_updated + ) + SELECT + (n.nspname || '.' || i.relname) AS index_name, + (n.nspname || '.' || t.relname) AS table_name, + pg_relation_size(i.oid) AS index_size, + COALESCE(s.idx_scan, 0) AS scan_count, + CASE WHEN v_has_pgstattuple THEN + ROUND(CAST(COALESCE((pgstattuple(i.oid)).free_percent + (pgstattuple(i.oid)).dead_tuple_percent, 0) AS numeric), 2) + ELSE + 0 -- Placeholder; could estimate based on size vs. usage + END AS bloat_estimate, + now() AS last_updated + FROM pg_index ix + JOIN pg_class i ON i.oid = ix.indexrelid + JOIN pg_class t ON t.oid = ix.indrelid + JOIN pg_namespace n ON n.oid = i.relnamespace + LEFT JOIN pg_stat_all_indexes s ON s.indexrelid = i.oid + WHERE n.nspname NOT IN ('pg_catalog', 'information_schema') + ON CONFLICT (index_name) DO UPDATE + SET + table_name = EXCLUDED.table_name, + index_size = EXCLUDED.index_size, + scan_count = EXCLUDED.scan_count, + bloat_estimate = EXCLUDED.bloat_estimate, + last_updated = EXCLUDED.last_updated; +END; +$$ LANGUAGE plpgsql; + +CREATE OR REPLACE FUNCTION public.c77_dbh_get_index_health() + RETURNS JSON AS $$ +DECLARE + v_median_bloat numeric; + v_median_usage numeric; +BEGIN + SELECT + percentile_cont(0.5) WITHIN GROUP (ORDER BY bloat_estimate), + percentile_cont(0.5) WITHIN GROUP (ORDER BY scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) + INTO v_median_bloat, v_median_usage + FROM public.c77_dbh_index_stats + WHERE index_name NOT LIKE 'pg_toast.%'; + + RETURN ( + SELECT json_agg( + json_build_object( + 'metric', 'Index Health', + 'index_name', index_name, + 'table_name', table_name, + 'status', CASE + WHEN bloat_estimate > v_median_bloat * 2 OR scan_count = 0 THEN 'Red' + WHEN bloat_estimate > v_median_bloat * 1.2 OR (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'Yellow' + ELSE 'Green' + END, + 'severity', CASE + WHEN bloat_estimate > v_median_bloat * 2 OR scan_count = 0 THEN 5 + WHEN bloat_estimate > v_median_bloat * 1.2 OR (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 3 + ELSE 1 + END, + 'insight', CASE + WHEN scan_count = 0 THEN 'Index is unused; consider dropping.' + WHEN bloat_estimate > v_median_bloat * 2 THEN 'Index bloat is critically high.' + WHEN (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'Index usage is low relative to size.' + ELSE 'Index is healthy and well-utilized.' + END, + 'action', CASE + WHEN scan_count = 0 THEN 'Drop the index if not needed.' + WHEN bloat_estimate > v_median_bloat * 2 THEN 'Run REINDEX; check autovacuum.' + WHEN (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'Review query patterns or consider reindexing.' + ELSE 'No action needed.' + END, + 'index_size', pg_size_pretty(index_size), + 'scan_count', scan_count, + 'bloat_estimate', bloat_estimate::text || '%', + 'details', json_build_object( + 'usage_rate', round((scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0))::numeric, 2)::text || ' scans/MB', + 'median_bloat', round(v_median_bloat)::text || '%', + 'median_usage', round(v_median_usage, 2)::text || ' scans/MB' + ) + ) + ) + FROM public.c77_dbh_index_stats + WHERE index_name NOT LIKE 'pg_toast.%' + ); +END; +$$ LANGUAGE plpgsql; + + +CREATE OR REPLACE FUNCTION public.c77_dbh_get_toast_health() + RETURNS JSON AS $$ +DECLARE + v_median_bloat numeric; + v_median_usage numeric; + v_has_pgstattuple boolean; +BEGIN + SELECT EXISTS ( + SELECT 1 FROM pg_extension WHERE extname = 'pgstattuple' + ) INTO v_has_pgstattuple; + + SELECT + percentile_cont(0.5) WITHIN GROUP (ORDER BY bloat_estimate), + percentile_cont(0.5) WITHIN GROUP (ORDER BY scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) + INTO v_median_bloat, v_median_usage + FROM public.c77_dbh_index_stats + WHERE index_name LIKE 'pg_toast.%'; + + RETURN ( + SELECT json_agg( + json_build_object( + 'metric', 'TOAST Index Health', + 'index_name', index_name, + 'parent_table', table_name, + 'status', CASE + WHEN bloat_estimate > v_median_bloat * 2 OR scan_count = 0 THEN 'Red' + WHEN bloat_estimate > v_median_bloat * 1.2 OR (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'Yellow' + ELSE 'Green' + END, + 'severity', CASE + WHEN bloat_estimate > v_median_bloat * 2 OR scan_count = 0 THEN 5 + WHEN bloat_estimate > v_median_bloat * 1.2 OR (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 3 + ELSE 1 + END, + 'insight', CASE + WHEN scan_count = 0 THEN 'TOAST index is unused; parent table may not need toasted columns indexed.' + WHEN bloat_estimate > v_median_bloat * 2 THEN 'TOAST index bloat is critically high.' + WHEN (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'TOAST index usage is low relative to size.' + ELSE 'TOAST index is healthy and well-utilized.' + END, + 'action', CASE + WHEN scan_count = 0 THEN 'Review parent table queries; consider dropping TOAST index.' + WHEN bloat_estimate > v_median_bloat * 2 THEN 'Run REINDEX on TOAST table; optimize large fields in parent.' + WHEN (scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0)) < v_median_usage * 0.5 THEN 'Check parent table usage patterns.' + ELSE 'No action needed.' + END, + 'index_size', pg_size_pretty(index_size), + 'scan_count', scan_count, + 'bloat_estimate', bloat_estimate::text || '%' || + (CASE WHEN v_has_pgstattuple THEN ' (precise)' ELSE ' (estimated)' END), + 'details', json_build_object( + 'usage_rate', round((scan_count::numeric / NULLIF(index_size::numeric / 1024^2, 0))::numeric, 2)::text || ' scans/MB', + 'median_bloat', round(v_median_bloat)::text || '%', + 'median_usage', round(v_median_usage, 2)::text || ' scans/MB', + 'parent_columns', ( + SELECT string_agg(attname, ', ') + FROM pg_attribute a + JOIN pg_class c ON c.oid = (SELECT oid FROM pg_class WHERE relname = split_part(table_name, '.', 2) AND relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = split_part(table_name, '.', 1))) + WHERE a.attrelid = c.oid AND a.attlen = -1 -- Variable-length columns likely toasted + ) + ) + ) + ) + FROM public.c77_dbh_index_stats + WHERE index_name LIKE 'pg_toast.%' + ); +END; +$$ LANGUAGE plpgsql;