Citus Notes: COPY

:: postgres, citus, internals

(These are some notes I took while studying the Citus code, so they are probably more detail-oriented than big-picture.)

Citus overrides the utility hook with multi_ProcessUtility. This function calls ProcessCopyStmt() for COPY statements, which calls CitusCopyFrom(), which calls CopyToExistingShards().

CopyToExistingShards() uses the postgres/src/include/commands/copy.h API to read tuples:

  • BeginCopyFrom()
  • NextCopyFrom()
  • EndCopyFrom()

and it uses the CitusCopyDestReceiver API to write tuples. CitusCopyDestReceiver is a specialization of postgres’ DestReceiver, which contains the following methods:

  • rStartup/rShutdown: per-executor-run initialization and shutdown
  • rDestroy: destroy the object itself.
  • receiveSlot: called for each tuple to be output.
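To make the read/write split concrete, here is a rough Python model of that loop. It is only a sketch of the pattern, not Citus or PostgreSQL code: ListReceiver and copy_rows are hypothetical names, the comments mark which C function each step loosely stands in for, and the exact ordering of the shutdown calls in Citus may differ.

```python
class ListReceiver:
    """Hypothetical stand-in for a DestReceiver-style object."""

    def __init__(self):
        self.rows = []
        self.events = []

    def startup(self):
        # Plays the role of rStartup: per-run initialization.
        self.events.append("startup")

    def receive(self, row):
        # Plays the role of receiveSlot: called once per tuple.
        self.rows.append(row)

    def shutdown(self):
        # Plays the role of rShutdown: per-run cleanup.
        self.events.append("shutdown")

    def destroy(self):
        # Plays the role of rDestroy: release the receiver itself.
        self.events.append("destroy")


def copy_rows(source, receiver):
    """Read every row from source and hand it to receiver."""
    reader = iter(source)          # loosely: BeginCopyFrom()
    receiver.startup()
    count = 0
    try:
        for row in reader:         # loosely: NextCopyFrom() until no rows remain
            receiver.receive(row)
            count += 1
    finally:
        # A plain iterator has nothing to close here, so this sketch
        # has no counterpart to EndCopyFrom().
        receiver.shutdown()
        receiver.destroy()
    return count
```

For example, copy_rows([(1, "a"), (2, "b")], ListReceiver()) returns 2, and the receiver sees startup, two receive calls, shutdown, and destroy, in that order.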

... More ...

A better viewer for PostgreSQL debug trees

:: postgres, racket

If you set debug_print_parse, debug_print_rewritten, or debug_print_plan to true, PostgreSQL will log some of its interesting internal data structures during query execution. But these logs are usually too long and difficult to inspect.

I recently switched to using Frog to generate my blog. Last week I wrote a little Frog plugin that lets me embed these trees in a nice tree view in blog posts.

... More ...

PostgreSQL Internals: TRUNCATE

:: postgres, internals

You can use TRUNCATE in postgres to delete all of the rows in a table. Its main advantage over DELETE is performance. For example, using DELETE to delete all rows in a table with 1 million rows takes about 2.3 seconds, but truncating the same table takes about 10 ms.

But how does postgres implement TRUNCATE so that it is this fast?

... More ...

cstore_fdw and ‘Files are Hard’

:: postgres

I recently came across the "Files are hard" article, and it made me wonder how reliable cstore_fdw’s design and implementation are. cstore_fdw is a columnar store for PostgreSQL that I designed and developed in my previous job at Citus Data.

I am writing this post so that more people can review my design decisions for cstore_fdw, and so that I can get some feedback and improve the design.

... More ...

Haskell Skyline

:: programming, haskell, fp

Recently I started learning Haskell by taking the Intro to Functional Programming course on edX. Since then, I have been trying to model different problems using Haskell.

One of these problems is the Skyline problem, which goes like this:

You are given a set of rectangular buildings in a city, and you should return the skyline view of the city. Input is a sequence of tuples (x_{left}, height, x_{right}), each describing a building. The output is a sequence of pairs (x, height) meaning that the height of skyline changed to height at the given x coordinate.
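To pin down the input and output formats, here is a small brute-force sketch in Python. It is not the Haskell solution from the post, only an illustration of the problem statement. It assumes that a building covers the half-open range from x_{left} up to but not including x_{right}, and that the skyline drops to height 0 where no building stands. The function name skyline is my own.

```python
def skyline(buildings):
    """Return the skyline as a list of (x, height) change points.

    buildings is a sequence of (x_left, height, x_right) tuples.
    This is a simple O(n^2) approach, chosen for clarity over speed.
    """
    # The skyline can only change at a left or right edge of some building.
    xs = sorted({x for left, _, right in buildings for x in (left, right)})

    result = []
    current = 0  # height of the skyline just before the edge being examined
    for x in xs:
        # Tallest building covering x, assuming left <= x < right.
        height = max(
            (h for left, h, right in buildings if left <= x < right),
            default=0,
        )
        if height != current:
            result.append((x, height))
            current = height
    return result
```

For the buildings [(1, 11, 5), (2, 6, 7), (3, 13, 9)], this returns [(1, 11), (3, 13), (9, 0)]: the skyline rises to 11 at x = 1, rises to 13 at x = 3, and falls to the ground at x = 9. The second building never shows because it is always hidden behind a taller one.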

... More ...