PostgreSQL: Up and Running - Deepstash

Explore the World's Best Ideas

Join today and uncover 100+ curated journeys from 50+ topics. Unlock access to our mobile app with extensive features.

Textual data types

Textual data types

Postgres offers 3 textual data types, each with its own purpose, but offering the same query performance:

  • char(n) → when you expect your data to always be N characters long (if it's shorter, Postgres adds spaces at the end)
  • varchar(n) → when you want to limit your data's length to N characters
  • text → when you want to store texts of any length

varchar and text can store strings that use up to 1GB, but if you want to change the limit of a varchar field, you have to rewrite the table. This process requires an exclusive lock on the table, meaning nobody can use it until the rewrite is complete.

10

53 reads

Table storage layout

Table storage layout

The basic unit of storage in Postgres is a page (or block). Each table and index is organized in 8KB logical pages, which are grouped into 1GB segments on disk. Whenever Postgres has to read a row from disk, it reads its physical location from the ctid column and then loads its entire corresponding page from disk.

10

49 reads

Materialized views

Materialized views

Materialized views (MV) are views that persist the results of their underlying query, can be refreshed periodically and let you add indexes on them. They're useful when you want to speed up an expensive query and you're ok working with stale data.

There are two ways to refresh a MV:

  1. Blocking - which completes faster, but blocks all clients from querying the MV;
  2. Concurrently - which completes slower, but lets clients query the MV, although their queries take longer to run. This option works only if the MV has a unique index.

E.g. Naming convention: [app]_vw_[mv_name]_materialized

11

35 reads

Table statistics are essential for optimal query plans

Table statistics are essential for optimal query plans

The query planner looks at available indexes, cost settings, strategy settings, and distribution of data, in order to plan the most efficient way to execute a query.

Stats about the distribution of data are stored in the pg_statistic table and presented in a more readable form in the pg_stats view. They are generated based on ~20% of the table data (less for large tables) and are constantly updated by the autovacuum daemon. You can also manually update them using ANALYZE or VACUUM ANALYZE, the later which is strongly recommended after inserting or deleting a large number of rows.

10

25 reads

How to read query plans

How to read query plans

You can view a query plan using the EXPLAIN (ANALYZE, VERBOSE, BUFFERS) command. The plan is organized as a tree, where each node has these details:

  • operation e.g. Index Scan, Group Aggregate, Sort
  • cost=startup cost..total cost → The startup is the time before retrieval of data and includes scanning of indexes, joins of tables, organizing additional parallel workers
  • rows=N → estimated number of rows retrieved by the node
  • width=W → estimated avg. size of rows produced by the node, in bytes
  • buffers → number of pages read from cache (hit) and from disk (read)

11

33 reads

IDEAS CURATED BY

ocpodariu

Alt account of @ocp. I use it to stash ideas about software engineering

Discover Key Ideas from Books on Similar Topics

Read & Learn

20x Faster

without
deepstash

with
deepstash

with

deepstash

Personalized microlearning

100+ Learning Journeys

Access to 200,000+ ideas

Access to the mobile app

Unlimited idea saving

Unlimited history

Unlimited listening to ideas

Downloading & offline access

Supercharge your mind with one idea per day

Enter your email and spend 1 minute every day to learn something new.

Email

I agree to receive email updates