Dead rows in a materialized view

栏目: IT技术 · 发布时间: 5年前

内容简介：In the application that I work on, we have a materialized view. It was created from several joined tables. It’s used to speed up searching for data without joining seven or eight tables in every query. At least it should make searching faster but in practi

In the application that I work on, we have a materialized view. It was created from several joined tables. It’s used to speed up searching for data without joining seven or eight tables in every query. At least it should make searching faster but in practice, it didn’t and I will describe to you why.

Queries to that view were really slow, but only on the production server, on staging everything was fine. That view was exactly the same as on staging server, but queries were ~50 times slower on production.

The first step was to check execution plans for any query on both servers.

EXPLAIN ANALYZE SELECT COUNT(*) FROM 'table_name';

On staging execution plan involved parallel scan, on production index scan.

When I discovered that, the first thing that came to my mind was the possibility of different configurations that disabled parallel queries on production. I checked that and it turned out to be false. My next attempts were to check various Postgres views that kept some data about tables. After looking through many numbers that didn’t help I finally discovered that our table (materialized view actually, but in this context, it makes no difference) on the production server it took 7GB of disc space and on staging only 300MB.

SELECT pg_size_pretty( pg_total_relation_size('table_name') );

I expected that row count had to be totally different but unexpectedly it was ~320k on production and ~310k on staging.

After a really long investigation, I found the reason why those numbers were so different, the table on production contained an enormous number of dead rows.

SELECT n_dead_tup 
FROM pg_stat_all_tables
WHERE relname = 'table_name'

At this time I was almost sure that’s the reason for our whole problem (or at least I hoped so). I started researching those dead rows, what they are, and why they appear. The major defect of that approach is that after every single update on one of the records from joined tables, the whole materialized view is refreshed. Refreshed i.e. whole view is dropped and then it’s building query is run. We have two types of a refresh in PostgreSQL:

non concurrently

This refresh type does not produce any dead rows, but for the time of refreshing view is locked, and no data can be read from it. This is an unacceptable solution for my project.

concurrently

It allows reading data during the process. It’s possible thanks to duplicating all data before deleting it. During refresh, all SELECT queries see that duplicated data, and after the process, all queries have access to newly created view, and duplicates remain as dead rows. It’s the way how the view is bloated with tons of unnecessary data.

And here comes VACUUM mechanism that is used to remove all dead rows from the table or materialized view.

VACUUM table_name

This command removes all dead queries from the given table, but it has to be run ‘manually’, or by some application code. There is also a mechanism called autovacuum . PostgreSQL has some workers (quantity set in configuration) that all the time search our database and run VACUUM on tables that are in need. What do we know now? We have many dead rows in the materialized view, we also have a mechanism that should clean them. Why isn’t it? In order to check that I added query that logs last auto and vacuum time with dead rows count.

SELECT n_dead_tup, last_autovacuum, last_vacuum 
FROM pg_stat_all_tables
WHERE relname = 'table_name'

After a few days, I checked that logs. The first thing to notice was that the Autovacuum process is working but it’s triggered only in evenings. All dead rows are cleaned and the next day from something like 7 a.m. they appear and it’s count is growing. I checked our search in the morning and indeed it was as fast as it should be, but within a day it was getting much slower. Following really long research why is that happening I found that Autovacuum cannot be run on the table when it’s locked by SHARE UPDATE EXCLUSIVE lock. What locks a table that way? Obviously it’s REFRESH MATERIALIZED VIEW CONCURRENTLY. When that view is refreshed in our application? Other logs added and the answer is: the view is refreshed almost whole time (during a workday from morning to evening). Refresh is as I mentioned triggered by every data update on each of the tables that problematic view is made of. And here comes our final answer. Dead rows aren’t cleaned because Autovacuum cannot be run during refresh and refresh is running continuously from morning to evening.

Possible solutions:

Architectural changes that would prevent an application from refreshing whole materialized view during every data update. This is definitely possible to just update those rows that really changed without dropping and building from the scratch whole table.

This is definitely the best solution and I would choose that if I had much more time to spend on fixing that problem, but it would be too long to refactor every place in the application that can update one of the included tables.
The second possible fix is to append some breaks between refreshes that AUTOVACUUM process could be fired on our view. Theoretically, it is a fine idea, it would reduce database overload, it should be quite fast to implement, but I think there is too much room for unforeseen consequences and finally, I decided to give up that idea. Even though adding breaks would not take much time to implement, testing it carefully would and as I noticed above there were no more time after that really long investigation
The third and final solution I anticipated was the simplest one. In the job used to refresh the materialized view, I added code that ‘manually’ vacuums view from dead rows. During that whole research on I run vacuum manually many times and I knew it worked. Going this way assures us that there is no possibility that dead rows stay in view because they are vacuumed as soon as refresh (that creates them) ends.

Weighing up the pros and cons I finally chose the third solution. Time was a problem and it was the only one that could be tested immediately. It worked, but I think refactoring the application to not refresh the whole view would be the best option.

We create dedicated software for companies, bring ideas to life and enjoy what we do.

Looking for professional team?

以上就是本文的全部内容，希望对大家的学习有所帮助，也希望大家多多支持码农网

查看所有标签

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

深入理解C#（第3版）

斯基特 (Jon Skeet) / 姚琪琳 / 人民邮电出版社 / 2014-4-1 / 99.00元

本书是世界顶级技术专家“十年磨一剑”的经典之作，在C#和.NET领域享有盛誉。与其他泛泛介绍C#的书籍不同，本书深度探究C#的特性，并结合技术发展，引领读者深入C#的时空。作者从语言设计的动机出发，介绍支持这些特性的核心概念。作者将新的语言特性放在C#语言发展的背景之上，用极富实际意义的示例，向读者展示编写代码和设计解决方案的最佳方式。同时作者将多年的C#开发经验与读者分享，读者可咀其精华、免走弯......一起来看看《深入理解C#（第3版）》这本书的介绍吧!

码农工具

Dead rows in a materialized view

We create dedicated software for companies, bring ideas to life and enjoy what we do.

深入理解C#（第3版）

Base64 编码/解码

UNIX 时间戳转换

HEX HSV 转换工具