PostgreSQL in business scenarios: outstanding problems and optimizations

Pankaj kushwaha
4 min read · Jan 20, 2021


Traditionally, software vendors made money by selling licenses. As the market saturated, sustaining growth meant shipping new versions so that customers would keep paying for upgrades. But with client-installed software it is easy for customers to move to other products, and with full competition in the industry, traditional software companies have been thinking about transformation.

Offering PaaS and SaaS services changes the model: users buy the software as a subscription, the data lives in the cloud, migrating away becomes harder, and fees are charged on a recurring (for example, annual) basis. Subscriptions also make it easy to identify users, configure features per user, and reach a wider market. A free tier grows the customer base and trains user habits; a commercial tier drives revenue. Together this forms a stronger moat.

Beyond the difference in sales model, the other major difference is who supplies the hardware: with one-off customer installations, the customer provided it; with long-term PaaS and SaaS subscriptions, the cloud vendor does.

Software vendors, particularly database vendors, now bear the hardware costs. Multiple customers may share one database, or each customer may get their own. Sharing a database among multiple customers (with data isolation) is clearly more cost-effective and reduces the vendor's costs.

Each business then needs its customer data separated into its own schema or database. What problems does a database with many schemas, or an instance with many databases, bring?

Suppose a database holds 10,000 schemas, each with 2,000 tables and 5,000 indexes; that is at least 70 million objects.
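The back-of-the-envelope arithmetic behind that estimate (counting only tables and indexes, not columns, constraints, or TOAST relations):

```python
# Catalog-object estimate for the multi-tenant scenario above.
schemas = 10_000
tables_per_schema = 2_000
indexes_per_schema = 5_000

objects = schemas * (tables_per_schema + indexes_per_schema)
print(objects)  # 70,000,000 relations in the system catalogs
```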

Challenges

1. The number of connections can be very large, especially at peak times when many businesses are active.

In database mode (one database per business), connections cannot be reused across databases, so far more connections are needed.

In schema mode (one schema per business), if every business connects as the same user, connections can be reused across businesses, and far fewer connections are needed than in database mode.

Recommendation: schema mode with a shared user. It is fine to use different users for different services, but within the same service the same user is recommended so connections can be pooled.
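A minimal sketch of the schema-per-tenant layout (role and tenant names are illustrative): every tenant gets a schema owned by one shared role, and a pooled connection is switched between tenants by changing `search_path` instead of reconnecting.

```sql
-- One shared application role for all tenants (name is illustrative).
CREATE ROLE app_user LOGIN;

-- One schema per tenant, owned by the shared role.
CREATE SCHEMA tenant_a AUTHORIZATION app_user;
CREATE SCHEMA tenant_b AUTHORIZATION app_user;

-- A pooled connection is "switched" to a tenant without reconnecting:
SET search_path TO tenant_a;
SELECT * FROM orders;   -- resolves to tenant_a.orders

-- Reset before returning the connection to the pool.
RESET search_path;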

2. With many connections comes the OOM problem: each connection accesses many objects (for example, the same user touches many business tables, and the structure and indexes of every accessed table are cached in that connection's local memory), so each backend's local memory grows very large.

With long-lived connections, this memory keeps growing over time.

Recommendation: reduce the overall number of connections. This matters especially in microservice scenarios, where every service is configured with a large maximum pool size; during an avalanche the services together can exhaust database connections, causing OOM and connection-limit errors.

Suggestion: set a maximum connection lifetime in the connection pool; when a connection reaches it, it is automatically closed and replaced. This stops each connection's memory from growing without bound.

Suggestion: configure an idle timeout in the connection pool, so connections idle for a certain time are automatically closed. The benefit is the same as above.
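Both suggestions map directly onto PgBouncer settings; the values here are illustrative starting points, not tuned recommendations:

```ini
[pgbouncer]
pool_mode = transaction        ; share server connections across clients
max_client_conn = 2000         ; cap on inbound client connections
default_pool_size = 50         ; server connections per database/user pair
server_lifetime = 3600         ; close a server connection after 1 hour
server_idle_timeout = 600      ; close server connections idle > 10 minutes
```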

Recommendation: use huge pages to reduce page-table memory usage.
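In PostgreSQL this maps to the `huge_pages` setting plus an OS-level huge-page reservation; the page count below is illustrative and should be sized to cover `shared_buffers`:

```
# /etc/sysctl.conf -- reserve huge pages for the Postgres shared memory
vm.nr_hugepages = 4500   # illustrative; size to fit shared_buffers

# postgresql.conf
huge_pages = on          # or 'try' to fall back silently when none are free
```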

Recommendation: support a global metadata cache (table structures, index structures, and so on) at the kernel level, to avoid every connection caching its own copy of the metadata it accesses.

3. When a business bug causes slow queries, a chain reaction can be triggered at peak time: the slower the queries, the more connections are opened, eventually leading to an avalanche.

Recommendation: on the front end, prevent repeated clicks and duplicate requests.

Recommendation: to avoid avalanches, configure a SQL statement timeout.
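PostgreSQL supports this natively; the role name and timeout values below are illustrative:

```sql
-- Cap statement runtime for the application role.
ALTER ROLE app_user SET statement_timeout = '3s';

-- Or per transaction, for especially latency-sensitive paths:
SET LOCAL statement_timeout = '500ms';

-- Also bound how long a session may sit idle inside a transaction:
ALTER ROLE app_user SET idle_in_transaction_session_timeout = '60s';
```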

Recommendation: queue requests, degrade service gracefully, and shed load by discarding some requests, so that the remaining customer requests are served normally.

4. Interference between complex and critical requests in a mixed workload.

Recommendation: add read-only instances to spread the load; route complex requests (those with relaxed response-time requirements) to dedicated read-only instances.
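With libpq (PostgreSQL 14+), a client can ask for a standby directly in a multi-host connection string, so complex queries land on a replica without application-side routing logic; the host names here are placeholders:

```
postgresql://primary.example,replica.example/appdb?target_session_attrs=prefer-standby
```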

5. Bloat. If the database has long-running transactions while a large number of update and delete operations run at the same time, the dead tuples produced during that window cannot be reclaimed in time, which eventually leads to bloat.

Suggestion: avoid long transactions, keep an eye on bloat, and use pg_repack to reduce it.
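A quick way to spot the long-running transactions that block dead-tuple cleanup, plus a sketch of the pg_repack invocation (the database and table names are placeholders):

```sql
-- Sessions that have held a transaction open for more than 5 minutes
SELECT pid, usename, state, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_age DESC;

-- pg_repack rebuilds a bloated table online (run from the shell):
--   pg_repack --dbname=appdb --table=tenant_a.orders
```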

6. Tenants interfere with one another.

Recommendation: provide multi-tenant features at the kernel level, including isolation, flow control, and resource control.

I like to learn new and better ways of doing things when working at scale, so feel free to ask questions and make suggestions.
Also, check out another story on this.
Thanks for reading this.


Pankaj kushwaha

Database/System Administrator | DevOps | Cloud Specialist