WEBVTT
1
00:00:14.469 --> 00:00:17.180
Thank you very much for attending my lecture.
2
00:00:17.180 --> 00:00:21.910
I'm very happy to be with you.
3
00:00:21.910 --> 00:00:25.320
And today I will talk about JSONB performance.
4
00:00:25.320 --> 00:00:29.090
The slides are already available.
5
00:00:29.090 --> 00:00:33.040
This is a joint talk with my colleague Nikita Glukhov.
6
00:00:33.040 --> 00:00:39.470
This is a picture of an elephant with the projects I was working on.
7
00:00:39.470 --> 00:00:41.199
Maybe you know some of them.
8
00:00:41.199 --> 00:00:53.699
I'm a research scientist at Moscow University, and, most interesting to me, I'm a major Postgres contributor.
9
00:00:53.699 --> 00:01:00.780
My colleague Nikita Glukhov also works at the Postgres Professional company.
10
00:01:00.780 --> 00:01:08.310
He's also a Postgres contributor, and over several years he has done several big projects.
11
00:01:08.310 --> 00:01:13.009
So today we will talk about JSON performance.
12
00:01:13.009 --> 00:01:22.400
And the reason I decided to talk about this here: we know that JSON is, as we say,
13
00:01:22.400 --> 00:01:24.340
one type that fits all.
14
00:01:24.340 --> 00:01:44.370
You know that modern architecture is microservice architecture, and JSON is very good for this architecture, because client applications, front end, back end, and now the database all use JSON.
15
00:01:44.370 --> 00:01:48.630
It is very, very easy for a startup to start their project.
16
00:01:48.630 --> 00:02:03.909
You don't need a relational schema. When you start your project, when you start your business, it's very difficult to predict what the schema will be
17
00:02:03.909 --> 00:02:05.480
in a few months.
18
00:02:05.480 --> 00:02:07.890
With JSON you don't have any problem.
19
00:02:07.890 --> 00:02:10.000
Just have JSON.
20
00:02:10.000 --> 00:02:23.530
And all server-side languages support JSON, and SQL has SQL/JSON, so it's not something different from SQL.
21
00:02:23.530 --> 00:02:29.470
What's very important is that JSON relaxes the object-relational mismatch.
22
00:02:29.470 --> 00:02:35.660
In code you work with objects; in a relational database you work with relations.
23
00:02:35.660 --> 00:02:41.370
And you have some contradictions between the programmers and the DBAs.
24
00:02:41.370 --> 00:02:47.510
But when you use JSON here, there isn't any contradiction.
25
00:02:47.510 --> 00:02:52.950
So this is how JSON became very, very popular.
26
00:02:52.950 --> 00:02:57.540
And I would say that now I see a JSONB rush.
27
00:02:57.540 --> 00:03:12.000
Because I speak in many countries, and many people ask me about JSON, and they say: we don't know much about SQL.
28
00:03:12.000 --> 00:03:13.670
We use JSON.
29
00:03:13.670 --> 00:03:17.860
And what we need is just to have JavaScript instead of SQL.
30
00:03:17.860 --> 00:03:22.000
This is a very interesting idea, actually.
31
00:03:22.000 --> 00:03:24.370
I have been thinking about this.
32
00:03:24.370 --> 00:03:28.459
Because actually it's easy.
33
00:03:28.459 --> 00:03:33.510
I can internally transform JavaScript to SQL and execute it.
34
00:03:33.510 --> 00:03:38.310
But that's a future project.
35
00:03:38.310 --> 00:03:46.650
And JSONB is actually a main driver of Postgres popularity.
36
00:03:46.650 --> 00:03:50.489
You see this create table with just a single JSONB column; it's a common mistake.
37
00:03:50.489 --> 00:03:54.230
They put everything into JSONB.
38
00:03:54.230 --> 00:04:00.440
That is because people don't know how JSONB performs.
39
00:04:00.440 --> 00:04:03.680
I will talk about this later.
40
00:04:03.680 --> 00:04:14.180
Another reason for this talk: nobody has made a comparison of the performance of the JSONB operators.
41
00:04:14.180 --> 00:04:22.680
So in this talk I will explain which operator to use, which is better, and so on.
42
00:04:22.680 --> 00:04:30.430
And another reason is that I have worked on Postgres for 25 years, maybe 26.
43
00:04:30.430 --> 00:04:34.440
I started in 1995.
44
00:04:34.440 --> 00:04:43.100
And almost all my projects are connected to extending Postgres to support unstructured data.
45
00:04:43.100 --> 00:05:02.310
So we started with arrays, hstore, full-text search, and now we are working on JSONB and SQL/JSON. This is my interest, and I believe that JSON is very useful for the Postgres community.
46
00:05:02.310 --> 00:05:04.610
You see this picture.
47
00:05:04.610 --> 00:05:14.729
This is how the popularity of four databases changed over time.
48
00:05:14.729 --> 00:05:18.520
The only database which grows is Postgres.
49
00:05:18.520 --> 00:05:27.210
I used the official numbers on relational database popularity from DB-Engines.
50
00:05:27.210 --> 00:05:35.900
Postgres became popular from the time we committed JSONB into Postgres.
51
00:05:35.900 --> 00:05:43.720
I believe that JSONB is one of the main drivers of this popularity.
52
00:05:43.720 --> 00:06:03.070
NoSQL people became upset and went to Postgres, because Postgres got a good JSON data type.
53
00:06:03.070 --> 00:06:14.009
Our work on JSONB made the SQL:2016 standard possible.
54
00:06:14.009 --> 00:06:25.180
So the success of Postgres made this happen; all the other databases now have JSON, and that's why we have an SQL standard for it.
55
00:06:25.180 --> 00:06:31.500
To me it's very important to continue the work on JSON in Postgres.
56
00:06:31.500 --> 00:06:35.750
Because we have many, many users.
57
00:06:35.750 --> 00:06:49.480
These are numbers from PostgreSQL; you see that the most popular is JSONB.
58
00:06:49.480 --> 00:06:51.380
It's the biggest.
59
00:06:51.380 --> 00:07:19.479
If we take the popularity of the Telegram chat on PostgreSQL, it has several thousand people online at any time, and JSON and JSONB is the third most popular word used by Postgres people.
60
00:07:19.479 --> 00:07:23.470
The first is select, the second is SQL and the third is JSON.
61
00:07:23.470 --> 00:07:29.449
This is like some argument that JSON is very popular in the Postgres community.
62
00:07:29.449 --> 00:07:38.600
So we were working on several big projects, some of them already committed, some of them waiting for commit.
63
00:07:38.600 --> 00:07:43.860
But now we have changed the priorities of our development.
64
00:07:43.860 --> 00:07:49.040
So we want to make JSONB a first-class citizen in Postgres.
65
00:07:49.040 --> 00:08:02.080
That means we want to have efficient storage, select, update, and a good API. SQL/JSON is important, of course.
66
00:08:02.080 --> 00:08:04.460
Because it's part of the standard.
67
00:08:04.460 --> 00:08:12.729
But actually, people who work with Postgres have no intention of being compatible with Oracle or Microsoft.
68
00:08:12.729 --> 00:08:20.330
I know that startups start with Postgres and never change it.
69
00:08:20.330 --> 00:08:23.200
And JSONB is already a mature data type.
70
00:08:23.200 --> 00:08:31.259
We have a load of functionality for Postgres, and not enough resources in the community
71
00:08:31.259 --> 00:08:35.940
even to review and commit the patches.
72
00:08:35.940 --> 00:08:41.930
You see that for four years we have had patches for the SQL/JSON functions.
73
00:08:41.930 --> 00:08:47.490
JSON_TABLE has also been waiting for four years.
74
00:08:47.490 --> 00:08:49.759
Maybe for PG15 we'll have some committed.
75
00:08:49.759 --> 00:08:53.370
But I understand that community just has no resources.
76
00:08:53.370 --> 00:09:02.449
And my interest is mostly concentrated on improving JSONB, not the standard.
77
00:09:02.449 --> 00:09:07.709
So I mostly care about Postgres users.
78
00:09:07.709 --> 00:09:18.439
And I'm not very interested in compatibility with Oracle or Microsoft SQL Server.
79
00:09:18.439 --> 00:09:22.300
So this is a popular mistake.
80
00:09:22.300 --> 00:09:25.999
People put everything into JSON, and that's not a good idea.
81
00:09:25.999 --> 00:09:35.790
You can see it very easily: an ID outside of the JSONB versus an ID inside.
82
00:09:35.790 --> 00:09:43.749
If the JSONB grows, the performance degrades very quickly.
83
00:09:43.749 --> 00:09:50.089
Don't do this.
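For illustration, a minimal sketch of the two layouts, with hypothetical table and column names; querying an ID stored inside a large JSONB forces the whole document to be deTOASTed, while a plain column does not.

-- ID as a regular column: lookup cost does not depend on document size
CREATE TABLE orders_good (id int PRIMARY KEY, data jsonb);
SELECT data FROM orders_good WHERE id = 42;

-- everything inside the document: the whole JSONB must be deTOASTed
-- and decompressed just to compare one key
CREATE TABLE orders_bad (data jsonb);
SELECT data FROM orders_bad WHERE (data ->> 'id')::int = 42;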
84
00:09:50.089 --> 00:09:56.209
We want to demonstrate the performance of nested containers.
85
00:09:56.209 --> 00:10:04.839
We created simple tables with nested objects, and we just tested several operators.
86
00:10:04.839 --> 00:10:07.749
The first is the arrow operator.
87
00:10:07.749 --> 00:10:12.329
Most people use the arrow operator to access a key.
88
00:10:12.329 --> 00:10:17.029
Then the hash-arrow operator, extract path.
89
00:10:17.029 --> 00:10:19.959
The new one is subscripting.
90
00:10:19.959 --> 00:10:24.649
So you have an array-like syntax.
91
00:10:24.649 --> 00:10:27.079
Another one, the fourth, is JSON path.
92
00:10:27.079 --> 00:10:33.489
Do you know which one is better?
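A minimal sketch of the four access methods on a hypothetical table test(js jsonb); the subscripting syntax needs PostgreSQL 14 or later.

SELECT js -> 'a' -> 'b' FROM test;               -- arrow operator
SELECT js #> '{a,b}' FROM test;                  -- hash arrow (extract path)
SELECT js['a']['b'] FROM test;                   -- subscripting (PG14+)
SELECT jsonb_path_query(js, '$.a.b') FROM test;  -- JSON path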
93
00:10:33.489 --> 00:10:36.790
Nobody actually knows, and we needed to find out.
94
00:10:36.790 --> 00:10:45.709
So we did a lot of experiments, and now I will show you some graphs.
95
00:10:45.709 --> 00:10:58.009
This is raw JSONB size versus execution time.
96
00:10:58.009 --> 00:11:00.550
This is the arrow operator.
97
00:11:00.550 --> 00:11:06.360
The JSONB grows, and the execution time grows.
98
00:11:06.360 --> 00:11:10.139
So the reason for this I will explain later.
99
00:11:10.139 --> 00:11:13.569
The other operators actually show the same behavior.
100
00:11:13.569 --> 00:11:18.019
But it's a bit different.
101
00:11:18.019 --> 00:11:22.149
It is interesting that subscripting behaves very well.
102
00:11:22.149 --> 00:11:27.199
And the color is the nesting level.
103
00:11:27.199 --> 00:11:34.199
So we know the execution time for the different nesting levels.
104
00:11:34.199 --> 00:11:44.980
At two kilobytes the JSONB becomes TOASTed.
105
00:11:44.980 --> 00:11:50.989
So we have degradation of performance.
106
00:11:50.989 --> 00:11:57.299
But before that, we have constant time, with some overhead for nesting.
107
00:11:57.299 --> 00:12:09.690
So the deeper we go, the worse the performance.
108
00:12:09.690 --> 00:12:16.189
Here is the slowdown relative to the root level.
109
00:12:16.189 --> 00:12:23.430
So the picture is also very strange relative to the root level for the same operator.
110
00:12:23.430 --> 00:12:31.269
This is JSON path and subscripting.
111
00:12:31.269 --> 00:12:38.430
The arrow operator again shows some instability; its behavior is not very predictable.
112
00:12:38.430 --> 00:12:46.029
If you look at the slowdown relative to the extract path operator, the situation becomes a bit better,
113
00:12:46.029 --> 00:12:53.410
a bit clearer, because we see that JSON path is the slowest.
114
00:12:53.410 --> 00:12:56.230
And subscripting is the fastest.
115
00:12:56.230 --> 00:13:01.999
Very unexpected behavior, but now you know this.
116
00:13:01.999 --> 00:13:10.959
After 2 kilobytes, everything becomes more or less the same, because deTOAST dominates.
117
00:13:10.959 --> 00:13:19.750
It makes the major contribution to the execution time.
118
00:13:19.750 --> 00:13:28.230
This picture shows the best operator depending on size and nesting level.
119
00:13:28.230 --> 00:13:33.410
So it's the same data, but in different picture, different format.
120
00:13:33.410 --> 00:13:41.579
We see that the arrow operator is good only for small JSONB and the root level.
121
00:13:41.579 --> 00:13:47.110
I would say for the root level you can safely use the arrow operator.
122
00:13:47.110 --> 00:13:54.939
For the first level, I would not use it for big JSONB.
123
00:13:54.939 --> 00:14:00.529
Subscripting is the most universally useful.
124
00:14:00.529 --> 00:14:08.089
And JSON path: you see some JSON path here.
125
00:14:08.089 --> 00:14:12.230
But really JSON path is not about performance.
126
00:14:12.230 --> 00:14:19.970
JSON path is very flexible; for very complex queries you need JSON path.
127
00:14:19.970 --> 00:14:31.019
But for simple queries like this, the overhead of JSON path is too big.
128
00:14:31.019 --> 00:14:37.470
So all operators have a common overhead: deTOAST and iteration time.
129
00:14:37.470 --> 00:14:54.470
But the arrow operator is very fast for small JSONB and the root level, because it has minimal initialization; however, it needs to copy intermediate results to temporary datums.
130
00:14:54.470 --> 00:14:56.209
Let's see this one.
131
00:14:56.209 --> 00:15:02.029
This picture shows how the arrow operator executes.
132
00:15:02.029 --> 00:15:11.239
So when you have TOAST (this is the TOAST pointer), then you find key 1.
133
00:15:11.239 --> 00:15:23.959
You find some value, and you copy this whole nested JSONB container into an intermediate datum, which goes to the second operator.
134
00:15:23.959 --> 00:15:29.439
Then key 2, and then you copy the string to the result.
135
00:15:29.439 --> 00:15:34.899
This example is just for the root and the first level.
136
00:15:34.899 --> 00:15:42.509
But you can imagine, if you have 9 levels, or more levels, you have to repeat this operation.
137
00:15:42.509 --> 00:15:52.019
You have to copy every nested container to memory, to an intermediate datum.
138
00:15:52.019 --> 00:15:57.929
This is the price of the abstraction.
139
00:15:57.929 --> 00:16:00.920
In Postgres you can chain operators.
140
00:16:00.920 --> 00:16:05.609
So that is why the arrow operator works like this.
141
00:16:05.609 --> 00:16:09.279
But extract path works differently.
142
00:16:09.279 --> 00:16:18.279
Extract path and JSON path share the same scheme.
143
00:16:18.279 --> 00:16:20.609
Again, you have TOAST pointer.
144
00:16:20.609 --> 00:16:24.699
You deTOAST, and then you just copy the string to the result.
145
00:16:24.699 --> 00:16:34.790
You find everything inside in one pass, and there is no copying here.
146
00:16:34.790 --> 00:16:44.339
So the conclusion is that you can safely use the arrow operator for the root level at any size,
147
00:16:44.339 --> 00:16:46.979
and for the first level of small JSONB.
148
00:16:46.979 --> 00:16:55.869
Then use subscripting and extract path for large JSONB and if you have higher nesting levels.
149
00:16:55.869 --> 00:16:56.869
This is my recommendation.
150
00:16:56.869 --> 00:17:03.359
JSON path is the slowest, but it's very useful for complex queries.
151
00:17:03.359 --> 00:17:15.209
Now we want to analyze the performance of another very important query: contains.
152
00:17:15.209 --> 00:17:20.939
So you want to find some key or value in a JSONB.
153
00:17:20.939 --> 00:17:27.390
So again, we have a table with arrays of various sizes.
154
00:17:27.390 --> 00:17:32.799
From 1 to 1 million entries.
155
00:17:32.799 --> 00:17:39.779
And we try several operations.
156
00:17:39.779 --> 00:17:42.139
The contains operator.
157
00:17:42.139 --> 00:17:45.880
The JSON path match operator.
158
00:17:45.880 --> 00:17:51.490
The JSON path exists operator with a filter.
159
00:17:51.490 --> 00:17:55.029
And SQL, two variants of SQL.
160
00:17:55.029 --> 00:18:01.529
One is EXISTS, and the other one is an optimized version of it.
161
00:18:01.529 --> 00:18:03.049
And we use this query.
162
00:18:03.049 --> 00:18:09.160
We look for the first element, which exists.
163
00:18:09.160 --> 00:18:10.380
Actually it's a zero.
164
00:18:10.380 --> 00:18:15.059
And we look for a nonexistent element, which is minus 1.
165
00:18:15.059 --> 00:18:20.179
And we see how different queries execute.
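A sketch of the tested variants, assuming a hypothetical table test(js jsonb) whose documents hold an array under key "a":

SELECT count(*) FROM test WHERE js @> '{"a": [0]}';         -- contains
SELECT count(*) FROM test WHERE js @@ '$.a[*] == 0';        -- JSON path match
SELECT count(*) FROM test WHERE js @? '$.a[*] ? (@ == 0)';  -- JSON path exists with filter
SELECT count(*) FROM test                                   -- plain SQL EXISTS
WHERE EXISTS (SELECT 1 FROM jsonb_array_elements(js -> 'a') AS e(v)
              WHERE v = '0'::jsonb);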
166
00:18:20.179 --> 00:18:30.750
We see that if we search for the first element, the behavior is more or less the same, except for two.
167
00:18:30.750 --> 00:18:34.230
The contains operator is the fastest.
168
00:18:34.230 --> 00:18:43.639
Before the JSONB is TOASTed, we have constant time.
169
00:18:43.639 --> 00:18:50.390
And also we have the match operator in lax mode.
170
00:18:50.390 --> 00:19:09.450
JSON path has two modes, which instruct the JSON path interpreter how to deal with errors, for example.
171
00:19:09.450 --> 00:19:15.190
In lax mode.
172
00:19:15.190 --> 00:19:19.880
Execution stops when you find the result.
173
00:19:19.880 --> 00:19:27.929
And since zero is the first member in the array, it happens very fast.
174
00:19:27.929 --> 00:19:37.450
But in strict mode, the green one, you have to check all elements.
175
00:19:37.450 --> 00:19:44.950
Because in strict mode you have to see all possible errors.
176
00:19:44.950 --> 00:19:47.919
Fortunately, lax mode is the default.
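A small illustration of the two modes, same hypothetical table as before; the strict variant is spelled out in the path itself.

SELECT count(*) FROM test WHERE js @? '$.a[*] ? (@ == 0)';        -- lax (default): may stop at the first match
SELECT count(*) FROM test WHERE js @? 'strict $.a[*] ? (@ == 0)'; -- strict: must visit every element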
177
00:19:47.919 --> 00:20:04.110
So that was searching for the first element; for a nonexistent element, the behavior is the same, because you have to check all elements.
178
00:20:04.110 --> 00:20:08.289
You check all elements, and you never find minus one.
179
00:20:08.289 --> 00:20:14.899
And the difference in performance is just different overheads.
180
00:20:14.899 --> 00:20:16.639
Operator overheads.
181
00:20:16.639 --> 00:20:20.940
So this is the speedup relative to SQL EXISTS.
182
00:20:20.940 --> 00:20:26.200
And the conclusion is that contains is the fastest.
183
00:20:26.200 --> 00:20:30.559
SQL EXISTS is the slowest.
184
00:20:30.559 --> 00:20:47.059
And the performance difference between the match and exists operators for JSON path depends on how many items you have to iterate over.
185
00:20:47.059 --> 00:20:57.220
Most interesting: you see on all the pictures that performance depends heavily on how big the JSONB is.
186
00:20:57.220 --> 00:21:04.399
After the JSONB exceeds 2 kilobytes, the performance degrades.
187
00:21:04.399 --> 00:21:08.930
That's why we analyzed the TOAST details.
188
00:21:08.930 --> 00:21:13.470
This part I call the curse of TOAST.
189
00:21:13.470 --> 00:21:16.330
This is unpredictable performance of JSONB.
190
00:21:16.330 --> 00:21:22.909
People actually ask me why, when they start the project, we have very good performance,
191
00:21:22.909 --> 00:21:28.600
but then sometimes we see that performance is very unstable.
192
00:21:28.600 --> 00:21:34.129
And this query, this example demonstrates unpredictable behavior.
193
00:21:34.129 --> 00:21:40.330
So very simple JSON with ID and some array.
194
00:21:40.330 --> 00:21:47.899
And at first the select is very fast.
195
00:21:47.899 --> 00:21:56.129
So we see 2,500 buffer hits and a few milliseconds.
196
00:21:56.129 --> 00:21:59.500
Then you run a very simple update.
197
00:21:59.500 --> 00:22:03.700
You have 30,000 buffers.
198
00:22:03.700 --> 00:22:07.080
And 6 milliseconds.
199
00:22:07.080 --> 00:22:18.880
So people ask me why this happens; it happens because the rows get TOASTed.
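A sketch of how to observe this, with hypothetical names; the numbers in the comments are the ones from the slide.

-- before the update: documents are stored inline
EXPLAIN (ANALYZE, BUFFERS) SELECT js -> 'id' FROM test;  -- ~2,500 buffer hits
-- a very simple update
UPDATE test SET js = js || '{"x": 1}'::jsonb;
-- after it, every row is TOASTed and the same select reads ~30,000 buffers
EXPLAIN (ANALYZE, BUFFERS) SELECT js -> 'id' FROM test;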
200
00:22:18.880 --> 00:22:30.009
So TOAST is a very useful technology in Postgres, which allows you to store very long values.
201
00:22:30.009 --> 00:22:34.889
We have a limitation of the 8-kilobyte page size.
202
00:22:34.889 --> 00:22:38.389
We have a tuple limit of 2 kilobytes.
203
00:22:38.389 --> 00:22:42.679
Everything bigger than 2 kilobytes we move to TOAST storage.
204
00:22:42.679 --> 00:22:50.460
And we have an implicit join when we get the value.
205
00:22:50.460 --> 00:22:53.520
But the situation is much worse, as I will show later.
206
00:22:53.520 --> 00:22:55.259
But here is explanation.
207
00:22:55.259 --> 00:23:12.129
You can install the very useful extension pageinspect and see that the original JSON, stored inline in the heap, not TOASTed, takes 2,500 pages.
208
00:23:12.129 --> 00:23:16.169
And just 4 tuples per page.
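A sketch of such an inspection with the pageinspect extension; the table name is hypothetical.

CREATE EXTENSION pageinspect;
-- count the tuples on heap page 0 of the table
SELECT count(*) FROM heap_page_items(get_raw_page('test', 0));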
209
00:23:16.169 --> 00:23:27.659
After the update, we see something very strange: we now have 64 pages with 157 tuples per page.
210
00:23:27.659 --> 00:23:28.659
How does this happen?
211
00:23:28.659 --> 00:23:35.750
Because the long JSON is replaced in the tuple by just a TOAST pointer.
212
00:23:35.750 --> 00:23:40.029
So the tuple becomes very small.
213
00:23:40.029 --> 00:23:43.840
But everything moves to the TOAST table.
214
00:23:43.840 --> 00:23:54.670
And using this query, we can find the name of the TOAST relation very easily.
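Likely the kind of query meant here (table name hypothetical):

SELECT reltoastrelid::regclass FROM pg_class WHERE relname = 'test';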
215
00:23:54.670 --> 00:24:05.179
And then we can inspect the chunks where this long value is stored in the TOAST table.
216
00:24:05.179 --> 00:24:10.340
And access to TOAST requires reading at least 3 additional buffers.
217
00:24:10.340 --> 00:24:19.220
Two TOAST index buffers, because you access TOAST not directly but through a B-tree index.
218
00:24:19.220 --> 00:24:25.450
So you access 2 TOAST index buffers and one TOAST heap buffer.
219
00:24:25.450 --> 00:24:33.240
So an easy calculation explains why we have 30,000 pages after the update.
220
00:24:33.240 --> 00:24:41.390
We have 64 heap buffers, plus an overhead of 3 buffers multiplied by the number of rows,
221
00:24:41.390 --> 00:24:42.700
10,000.
222
00:24:42.700 --> 00:24:49.120
So 64 + 3 × 10,000 ≈ 30,000, and this explains the performance of the small update.
223
00:24:49.120 --> 00:24:52.250
And in case you don't know what TOAST is,
224
00:24:52.250 --> 00:24:56.309
I can explain it in detail.
225
00:24:56.309 --> 00:25:06.639
The value is compressed, then split into 2-kilobyte chunks, and stored in a normal heap relation.
226
00:25:06.639 --> 00:25:08.220
You just don't see it.
227
00:25:08.220 --> 00:25:10.830
It's hidden from you.
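A sketch of looking at the hidden chunks; the pg_toast table name is hypothetical and comes from the previous query.

SELECT chunk_id, chunk_seq, length(chunk_data)
FROM pg_toast.pg_toast_16385   -- substitute your TOAST relation name
ORDER BY chunk_id, chunk_seq
LIMIT 5;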
228
00:25:10.830 --> 00:25:14.299
And how do you access these chunks?
229
00:25:14.299 --> 00:25:17.320
You have to use an index.
230
00:25:17.320 --> 00:25:23.470
Even if you need only bytes from the first chunk,
231
00:25:23.470 --> 00:25:30.269
you hit 3, 4, 5, or more additional blocks.
232
00:25:30.269 --> 00:25:32.650
That's the problem with TOAST.
233
00:25:32.650 --> 00:25:37.759
And TOAST is also a very complicated algorithm.
234
00:25:37.759 --> 00:25:40.360
It uses four passes.
235
00:25:40.360 --> 00:25:47.500
So Postgres tries to compact the tuple to 2 kilobytes.
236
00:25:47.500 --> 00:25:52.320
It tries to compress the longest fields first.
237
00:25:52.320 --> 00:26:06.820
If that doesn't help, if the resulting tuple is still more than 2 kilobytes, it replaces fields with TOAST pointers, and the fields move to the TOAST relation.
238
00:26:06.820 --> 00:26:12.029
You can actually see pass 1, pass 2, pass 3, pass 4.
239
00:26:12.029 --> 00:26:13.600
So it's not simple.
240
00:26:13.600 --> 00:26:18.679
The original tuple is replaced by this.
241
00:26:18.679 --> 00:26:24.529
This is a plain field, one which is not touched.
242
00:26:24.529 --> 00:26:33.639
This is a compressed field, and 4 TOAST pointers which point to the TOAST storage.
243
00:26:33.639 --> 00:26:35.820
That is what we have here.
244
00:26:35.820 --> 00:26:44.149
And when you access, for example, this one, you have to read all of this.
245
00:26:44.149 --> 00:26:50.840
If you access a plain attribute, you don't touch the TOAST at all.
246
00:26:50.840 --> 00:27:01.879
But once the attribute is TOASTed, you have to combine all these chunks.
247
00:27:01.879 --> 00:27:10.990
First you need to find all the chunks and combine them into one buffer.
248
00:27:10.990 --> 00:27:12.820
And then decompress.
249
00:27:12.820 --> 00:27:15.529
So a lot of overhead.
250
00:27:15.529 --> 00:27:18.809
Let's see this example.
251
00:27:18.809 --> 00:27:20.139
The example is very easy.
252
00:27:20.139 --> 00:27:31.110
We have a hundred JSONBs of different sizes, and each JSONB looks like: key 1, a very long key 2 array,
253
00:27:31.110 --> 00:27:34.379
key 3 and key 4.
254
00:27:34.379 --> 00:27:38.749
We measure the time of the arrow operator.
255
00:27:38.749 --> 00:27:43.190
We actually repeat it 1,000 times in the query,
256
00:27:43.190 --> 00:27:46.710
in order to have a more stable result.
257
00:27:46.710 --> 00:27:48.169
This is the result.
258
00:27:48.169 --> 00:27:49.629
You see?
259
00:27:49.629 --> 00:27:53.000
So we generate the query.
260
00:27:53.000 --> 00:28:01.270
We execute it 1,000 times, and then divide the time by 1,000, to have a more stable measurement.
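A sketch of such a micro-benchmark, names hypothetical; the generated query repeats the same operator expression many times so that per-query noise averages out.

EXPLAIN (ANALYZE, BUFFERS)
SELECT js -> 'key3', js -> 'key3', js -> 'key3'
       -- ... the same expression repeated 1,000 times in the real test ...
FROM test;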
261
00:28:01.270 --> 00:28:08.750
And we see the key access time increase with the JSONB size,
262
00:28:08.750 --> 00:28:12.910
regardless of the key size and position.
263
00:28:12.910 --> 00:28:19.119
And this is a big surprise for many people, because they say: I have just one small key.
264
00:28:19.119 --> 00:28:23.610
Why is my access time so big?
265
00:28:23.610 --> 00:28:31.629
Because after the hundred kilobytes here, everything is in the TOAST.
266
00:28:31.629 --> 00:28:38.649
And to get this small key, you have to deTOAST the whole JSONB.
267
00:28:38.649 --> 00:28:41.220
And then decompress.
268
00:28:41.220 --> 00:28:43.919
You see three areas.
269
00:28:43.919 --> 00:28:46.920
One is inline.
270
00:28:46.920 --> 00:28:51.009
The performance is good, and time is constant.
271
00:28:51.009 --> 00:28:53.940
The second one is compressed inline,
272
00:28:53.940 --> 00:29:00.980
when Postgres succeeds in compressing it and putting it inline.
273
00:29:00.980 --> 00:29:06.360
So hundred kilobytes is actually 2 kilobytes compressed.
274
00:29:06.360 --> 00:29:10.019
Because this is the raw size.
275
00:29:10.019 --> 00:29:14.149
After compression, 100 kilobytes becomes 2 kilobytes.
276
00:29:14.149 --> 00:29:21.279
And you have some growth in access time, because you have to decompress.
277
00:29:21.279 --> 00:29:30.340
And after 100 kilobytes, everything is TOASTed.
278
00:29:30.340 --> 00:29:33.350
Here is the number of blocks.
279
00:29:33.350 --> 00:29:43.230
Up to 200 kilobytes, you see no additional blocks to read, because everything is inline.
280
00:29:43.230 --> 00:29:47.549
After that you read more and more blocks; you see 30 blocks, it's too much.
281
00:29:47.549 --> 00:29:56.779
>> AUDIENCE: [off microphone] ...have you done a comparison to Mongo? I'll ask the question again.
282
00:29:56.779 --> 00:30:04.049
Have you done a comparison with Mongo, and in what areas does Mongo suffer these same issues?
283
00:30:04.049 --> 00:30:11.440
>> OLEG BARTUNOV: The last slide will be about some comparison with Mongo, because that's what all people are interested in.
284
00:30:11.440 --> 00:30:12.590
Yes.
285
00:30:12.590 --> 00:30:17.190
This is the same.
286
00:30:17.190 --> 00:30:20.470
Now this is the compressed size.
287
00:30:20.470 --> 00:30:24.820
So we see only two areas.
288
00:30:24.820 --> 00:30:26.860
Inline: this is size up to 2 kilobytes.
289
00:30:26.860 --> 00:30:30.100
After 2 kilobytes it is compressed inline.
290
00:30:30.100 --> 00:30:38.720
The inline areas are seen more clearly, because their size is the compressed 2 kilobytes.
291
00:30:38.720 --> 00:30:48.740
And the problem is that access time doesn't depend on the key size and position.
292
00:30:48.740 --> 00:30:53.980
Everything suffers from TOAST.
293
00:30:53.980 --> 00:30:56.610
Another problem is partial update.
294
00:30:56.610 --> 00:31:02.110
People also complain: I just want to update a small key.
295
00:31:02.110 --> 00:31:04.460
Why is the performance very bad?
296
00:31:04.460 --> 00:31:16.970
Again, because the TOAST mechanism, the algorithm, works with JSONB as a black box.
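For illustration, a small single-key update, hypothetical names; even this rewrites the whole TOASTed document today.

-- only one small key changes, but the whole document is deTOASTed,
-- modified, re-compressed, and re-TOASTed
UPDATE test SET js = jsonb_set(js, '{counter}', '42');
-- equivalent subscripting form (PostgreSQL 14+)
UPDATE test SET js['counter'] = '42';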
297
00:31:16.970 --> 00:31:27.230
Because it was developed when all data types were atomic.
298
00:31:27.230 --> 00:31:32.740
But now JSONB has a structure, as do other data types.
299
00:31:32.740 --> 00:31:36.100
And TOAST should be smarter.
300
00:31:36.100 --> 00:31:44.149
But currently the TOAST storage is duplicated when we update; WAL traffic increases and performance becomes very slow.
301
00:31:44.149 --> 00:31:47.960
Also you see the example.
302
00:31:47.960 --> 00:31:57.940
We have a hundred gigabytes of heap and relation 7.
303
00:31:57.940 --> 00:32:15.409
After the update the TOAST table has doubled, but we also have 130 megabytes of WAL traffic.
304
00:32:15.409 --> 00:32:17.720
After small update.
305
00:32:17.720 --> 00:32:22.999
Because Postgres doesn't know anything about the structure of JSONB.
306
00:32:22.999 --> 00:32:31.039
It's just a black box; the size doubles, and this is the problem.
307
00:32:31.039 --> 00:32:37.519
So we have a project, started at the beginning of this year:
308
00:32:37.519 --> 00:32:42.620
JSONB deTOAST improvement. And our goal, the ideal goal:
309
00:32:42.620 --> 00:32:48.129
We want no dependency on JSONB size and key position.
310
00:32:48.129 --> 00:32:57.320
Our access time should be proportional to the nesting level, and update time should be proportional to the nesting level
311
00:32:57.320 --> 00:32:59.870
and the size of the key we update.
312
00:32:59.870 --> 00:33:04.809
Not the whole JSONB size, but only what we update.
313
00:33:04.809 --> 00:33:08.379
The original TOAST doesn't use the inline space.
314
00:33:08.379 --> 00:33:16.649
Once you TOAST, a lot of space in the heap is just free.
315
00:33:16.649 --> 00:33:26.480
So we want to utilize the inline space as much as possible, because access to inline data is many times faster than to the TOAST.
316
00:33:26.480 --> 00:33:33.889
And we want to compress long fields in TOAST chunks separately, for independent access and update.
317
00:33:33.889 --> 00:33:38.640
So if you want to update something, you don't need to touch all the chunks.
318
00:33:38.640 --> 00:33:44.039
You need to find the one chunk, update it, and log it.
319
00:33:44.039 --> 00:33:45.039
That's all.
320
00:33:45.039 --> 00:33:47.399
But this is the ideal.
321
00:33:47.399 --> 00:33:51.399
And we have done several experiments.
322
00:33:51.399 --> 00:33:54.090
So we have partial decompression.
323
00:33:54.090 --> 00:33:57.429
We sort JSONB object keys by length.
324
00:33:57.429 --> 00:34:04.159
So short keys are stored at the beginning.
325
00:34:04.159 --> 00:34:12.700
Partial deTOAST, partial decompression, inline TOAST, TOAST with compressed fields; there is not much time.
326
00:34:12.700 --> 00:34:15.660
So I will just say their names.
327
00:34:15.660 --> 00:34:18.830
And in-place updates.
328
00:34:18.830 --> 00:34:22.030
And here we see the results.
329
00:34:22.030 --> 00:34:24.770
This is master.
330
00:34:24.770 --> 00:34:27.310
This is how master behaves.
331
00:34:27.310 --> 00:34:40.179
After the partial decompression, for example, some keys become faster.
332
00:34:40.179 --> 00:34:43.639
Here all keys behave the same.
333
00:34:43.639 --> 00:34:52.059
But after partial decompression some keys become faster, because they're at the beginning and are decompressed faster.
334
00:34:52.059 --> 00:34:56.230
After sorting the keys, we see other keys come down.
335
00:34:56.230 --> 00:35:03.490
Because key 3, for example, was blocked by the long key 2.
336
00:35:03.490 --> 00:35:09.810
It was blocked; after the sorting, key 3 is no longer behind the long objects.
337
00:35:09.810 --> 00:35:15.099
And this, you see, becomes lower.
338
00:35:15.099 --> 00:35:22.480
And after all these experiments, we get very interesting, very good results.
339
00:35:22.480 --> 00:35:29.470
So we still have growing time, but this we understand: it is for the big arrays.
340
00:35:29.470 --> 00:35:31.710
Big arrays.
341
00:35:31.710 --> 00:35:39.280
And the first element in array we access faster than the last, of course.
342
00:35:39.280 --> 00:35:41.619
But what to do with this?
343
00:35:41.619 --> 00:35:44.799
We found that it's another problem.
344
00:35:44.799 --> 00:35:59.890
But you see, with some very simple optimizations, while still staying with the heap and TOAST, we can get something like several orders of magnitude of performance gain.
345
00:35:59.890 --> 00:36:10.080
Here is a very interesting picture of how much the different optimizations contribute to the performance.
346
00:36:10.080 --> 00:36:13.380
So we see that these are the short keys.
347
00:36:13.380 --> 00:36:17.200
Key 1 and key 3 for example here.
348
00:36:17.200 --> 00:36:20.849
Key 3: the green one is sorting.
349
00:36:20.849 --> 00:36:28.270
Because it was blocked by key 2, but after sorting the keys it got a lot of performance gain.
350
00:36:28.270 --> 00:36:29.270
And so on.
351
00:36:29.270 --> 00:36:34.280
So it's easy to interpret this picture.
352
00:36:34.280 --> 00:36:37.369
The slides are available, but there is not much time.
353
00:36:37.369 --> 00:36:45.810
And if we return to this popular mistake, the mistake becomes not very serious.
354
00:36:45.810 --> 00:37:00.450
Because now this ID stored inside the JSONB is not growing infinitely but stays constant; there is still some overhead, but it's more or less okay.
355
00:37:00.450 --> 00:37:03.930
After all these experiments.
356
00:37:03.930 --> 00:37:08.609
We have also made an experimental sliced deTOAST,
357
00:37:08.609 --> 00:37:13.950
to improve access to array elements stored in chunks.
358
00:37:13.950 --> 00:37:30.830
And at the last one you see these array elements: they are now not growing infinitely. But this is very experimental.
359
00:37:30.830 --> 00:37:31.830
Update.
360
00:37:31.830 --> 00:37:40.109
Update is a very, very painful process in Postgres.
361
00:37:40.109 --> 00:37:43.780
And for JSONB especially.
362
00:37:43.780 --> 00:37:47.790
Here is master.
363
00:37:47.790 --> 00:37:56.339
Next, the shared TOAST: shared TOAST means we update only the selected chunks.
364
00:37:56.339 --> 00:38:01.710
The other chunks are shared.
365
00:38:01.710 --> 00:38:17.650
After the shared TOAST we have good results; only the last array elements still grow, because they are the last elements in the array.
366
00:38:17.650 --> 00:38:32.119
With in-place update, we have, I think, good results for updates.
367
00:38:32.119 --> 00:38:37.049
Again, for updates we have several orders of magnitude.
368
00:38:37.049 --> 00:38:45.790
Updates are very important, because people use JSONB in an OLTP environment.
369
00:38:45.790 --> 00:38:48.559
So update is very important.
370
00:38:48.559 --> 00:38:52.180
Access is good for analytical processing.
371
00:38:52.180 --> 00:38:58.710
But for OLTP, update is very important.
372
00:38:58.710 --> 00:39:01.319
This is the number of blocks read.
373
00:39:01.319 --> 00:39:09.069
You see clearly that we have many fewer blocks to read.
374
00:39:09.069 --> 00:39:16.470
Just to remind you, this is not a linear scale, this is a logarithmic scale.
375
00:39:16.470 --> 00:39:19.790
This is WAL traffic.
376
00:39:19.790 --> 00:39:23.370
In master you have very, very big WAL traffic.
377
00:39:23.370 --> 00:39:26.540
With shared TOAST it's smaller.
378
00:39:26.540 --> 00:39:30.740
And here we have very controlled WAL traffic.
379
00:39:30.740 --> 00:39:38.170
You log only what you update.
380
00:39:38.170 --> 00:39:49.740
The remaining question people ask: if I use JSONB versus a relational structure, which is better?
381
00:39:49.740 --> 00:39:57.060
So we again made several tests accessing the whole document.
382
00:39:57.060 --> 00:40:13.640
So this is JSONB, and these are relational tables. For relational, you need to have a join; for JSON you don't need a join. We have several queries, and this is the resulting picture.
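A sketch of the two layouts being compared, names hypothetical: the relational side has to join and aggregate, the document side just reads.

-- relational: reassembling the document needs a join and aggregation
SELECT c.id, jsonb_agg(to_jsonb(o)) AS orders
FROM customers c JOIN orders o ON o.customer_id = c.id
GROUP BY c.id;

-- document: the aggregate is already stored
SELECT id, doc FROM customers_json;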
383
00:40:13.640 --> 00:40:25.900
So you see that this is very, very fast; the darker blue is good.
384
00:40:25.900 --> 00:40:27.599
This is bad.
385
00:40:27.599 --> 00:40:34.530
Accessing the whole document in JSONB is, no surprise, very fast.
386
00:40:34.530 --> 00:40:37.869
Because you don't need to do anything; you just need to read it.
387
00:40:37.869 --> 00:40:49.200
But when you want to transfer the JSONB, you have a problem.
388
00:40:49.200 --> 00:40:53.730
Because you need to convert the JSONB to text.
389
00:40:53.730 --> 00:40:57.070
Conversion to text in Postgres is also painful.
390
00:40:57.070 --> 00:41:21.420
You need to check all the bytes, you know, convert them to characters, so it's not an easy operation. And we see that for large JSONB the time is not very good, and that shows here as well.
391
00:41:21.420 --> 00:41:30.740
Here we made an experiment where we don't transfer in textual form, as text.
392
00:41:30.740 --> 00:41:37.960
This is called UBJSON; we transfer to the client with just a binary transfer.
393
00:41:37.960 --> 00:41:40.640
We see it becomes better.
394
00:41:40.640 --> 00:41:45.880
There is no degradation.
395
00:41:45.880 --> 00:41:52.721
For the relational way, you see the problem.
396
00:41:52.721 --> 00:41:58.339
This is the time for select.
397
00:41:58.339 --> 00:42:03.670
This is for select plus transfer as text, and this is binary transfer.
398
00:42:03.670 --> 00:42:07.339
There is a method: for arrays you can transfer binary.
399
00:42:07.339 --> 00:42:14.290
And clearly we see that JSONB here is the winner.
400
00:42:14.290 --> 00:42:18.760
And this explains why it's so popular.
401
00:42:18.760 --> 00:42:21.510
Because of microservices. What is a microservice?
402
00:42:21.510 --> 00:42:29.720
It's a small service which expects some predefined query, some aggregate.
403
00:42:29.720 --> 00:42:34.020
And JSONB is actually an aggregate.
404
00:42:34.020 --> 00:42:42.120
You don't need to join data from different tables.
405
00:42:42.120 --> 00:42:43.920
You just have aggregation.
406
00:42:43.920 --> 00:42:48.770
And the microservice accesses this JSONB, and performance is very good.
407
00:42:48.770 --> 00:42:50.599
Very simple.
408
00:42:50.599 --> 00:42:56.980
So if you use a microservice architecture, you'll just be happy with JSONB.
409
00:42:56.980 --> 00:42:58.320
That's why it is popular.
410
00:42:58.320 --> 00:43:06.530
But when you access a key and update, you get a somewhat different result.
411
00:43:06.530 --> 00:43:15.410
The last one is select.
412
00:43:15.410 --> 00:43:27.849
The relational one, the current situation, and the situation with JSONB after all our optimizations.
413
00:43:27.849 --> 00:43:33.500
So you see that after all the optimizations, it is as fast as relational access.
414
00:43:33.500 --> 00:43:41.940
We understand that for JSONB, to get some key, you have to do some operations.
415
00:43:41.940 --> 00:43:47.760
But for relational, you just get it; you don't have any overhead.
416
00:43:47.760 --> 00:43:57.550
And it is very nice that after our optimization we behave the same as relational.
417
00:43:57.550 --> 00:44:01.670
But the current situation is like this.
418
00:44:01.670 --> 00:44:08.020
So if you especially want to update, to access a key and update, relational is the winner.
419
00:44:08.020 --> 00:44:12.619
Our optimization helps a lot.
420
00:44:12.619 --> 00:44:15.470
This is the slowdown.
421
00:44:15.470 --> 00:44:23.230
You see that JSONB slows down compared to relational.
422
00:44:23.230 --> 00:44:28.220
But after our optimization, you see, it is about the same.
423
00:44:28.220 --> 00:44:31.799
Like relational.
424
00:44:31.799 --> 00:44:34.650
And the same for the update slowdown.
425
00:44:34.650 --> 00:44:46.770
So for update we still have some gap: this is the original JSONB in master, and here is ours.
426
00:44:46.770 --> 00:44:51.359
So our optimization helps for updates.
427
00:44:51.359 --> 00:44:54.880
And also WAL traffic.
428
00:44:54.880 --> 00:45:00.809
So here, for master, we hit a lot of WAL traffic.
429
00:45:00.809 --> 00:45:02.609
We log a lot.
430
00:45:02.609 --> 00:45:08.500
This is relational and this is our optimization.
431
00:45:08.500 --> 00:45:12.670
They look about the same.
432
00:45:12.670 --> 00:45:16.110
Also we have accessing an array member.
433
00:45:16.110 --> 00:45:17.300
A very popular operation.
434
00:45:17.300 --> 00:45:21.530
You have an array and you want to get some member of this array.
435
00:45:21.530 --> 00:45:32.099
We have the first key, a middle key, and the last key.
436
00:45:32.099 --> 00:45:39.680
And here's relational, optimized, and non optimized JSONB.
437
00:45:39.680 --> 00:45:50.000
So we see that accessing an array member is not very good for JSONB, but with our optimization,
438
00:45:50.000 --> 00:45:56.059
again we approach the relational performance.
439
00:45:56.059 --> 00:46:02.650
And updating an array member: we also compared how updating an array member performs.
440
00:46:02.650 --> 00:46:05.960
And here is JSONB.
441
00:46:05.960 --> 00:46:09.920
This is JSONB optimized, and this is relational.
442
00:46:09.920 --> 00:46:15.869
Of course, when you update an array member in a relational table, it's very easy.
443
00:46:15.869 --> 00:46:21.020
You just update one row.
444
00:46:21.020 --> 00:46:27.539
And for JSONB, you update the whole JSONB,
445
00:46:27.539 --> 00:46:30.839
which can be very big.
446
00:46:30.839 --> 00:46:37.670
But our optimization helps a lot again.
447
00:46:37.670 --> 00:46:39.420
This is the WAL.
448
00:46:39.420 --> 00:46:46.109
And the conclusion is that JSONB is good for full object access.
449
00:46:46.109 --> 00:46:47.329
So microservices.
450
00:46:47.329 --> 00:46:49.190
It's much faster than the relational way.
451
00:46:49.190 --> 00:46:55.480
In the relational way you have to join, you have to aggregate, and it is very difficult to tune the process.
452
00:46:55.480 --> 00:46:57.900
With JSONB you have no problem.
453
00:46:57.900 --> 00:47:00.799
Also JSONB is very good for storing metadata.
454
00:47:00.799 --> 00:47:07.710
In short: keep metadata in a separate JSONB field.
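For illustration, a hypothetical schema in that spirit: hot, frequently queried columns stay relational, and the miscellaneous metadata lives in one JSONB field.

CREATE TABLE items (
    id       int PRIMARY KEY,
    price    numeric,   -- frequently queried attribute stays relational
    metadata jsonb      -- everything else goes into one JSONB field
);
SELECT id, price FROM items WHERE metadata @> '{"color": "red"}';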
455
00:47:07.710 --> 00:47:14.289
And currently PG14 is not optimized, as I showed you, for update
456
00:47:14.289 --> 00:47:16.430
or for access to an array member.
457
00:47:16.430 --> 00:47:27.400
But we demonstrated all our optimizations, which resulted in orders of magnitude for select and update.
458
00:47:27.400 --> 00:47:35.270
And the question is how to integrate all of this, our patches, into Postgres.
459
00:47:35.270 --> 00:47:38.920
And the first step is to make a data-type-aware TOAST.
460
00:47:38.920 --> 00:47:45.470
Because currently TOAST is common to all data types.
461
00:47:45.470 --> 00:47:52.330
But we suggest that TOAST should be extendable,
462
00:47:52.330 --> 00:47:57.990
so the data type knows better how to TOAST itself.
463
00:47:57.990 --> 00:48:10.819
And that allows us, and many other developers, to get a lot of performance improvement.
464
00:48:10.819 --> 00:48:15.900
We have an example where we improved streaming.
465
00:48:15.900 --> 00:48:23.880
You know, some people want to stream data into Postgres.
466
00:48:23.880 --> 00:48:25.460
For example, a movie.
467
00:48:25.460 --> 00:48:29.410
It's crazy, but they stream it there.
468
00:48:29.410 --> 00:48:34.970
You can imagine how slowly it behaved before, because it's logged every time.
469
00:48:34.970 --> 00:48:43.540
You add one byte and you log the whole gigabyte.
470
00:48:43.540 --> 00:48:49.930
After our optimization we have a special extension, and it works very fast.
471
00:48:49.930 --> 00:48:52.180
We log only this one byte.
472
00:48:52.180 --> 00:48:54.650
That's all.
473
00:48:54.650 --> 00:49:01.810
Because we made a special TOAST for bytea here.
474
00:49:01.810 --> 00:49:09.650
So we need the community to accept data-type-aware TOAST.
475
00:49:09.650 --> 00:49:11.710
Just two slides.
476
00:49:11.710 --> 00:49:14.309
So, what we have to do.
477
00:49:14.309 --> 00:49:20.849
On the physical level we provide random access to keys and arrays.
478
00:49:20.849 --> 00:49:22.809
On the physical level this is easy.
479
00:49:22.809 --> 00:49:26.720
We already have sliced deTOAST.
480
00:49:26.720 --> 00:49:27.880
We need to do some compression.
481
00:49:27.880 --> 00:49:32.220
But it's most important to make it work at the logical level,
482
00:49:32.220 --> 00:49:40.170
so that we know which TOAST chunk we need; we have to work at the logical level.
483
00:49:40.170 --> 00:49:43.150
And this is the number of patches.
484
00:49:43.150 --> 00:49:47.359
Some exist, some do not exist yet.
485
00:49:47.359 --> 00:49:57.490
But these are all our patches, and our roadmap for working with the community to submit all of this.
486
00:49:57.490 --> 00:50:00.579
All of this is our result.
487
00:50:00.579 --> 00:50:03.049
And references.
488
00:50:03.049 --> 00:50:06.880
And I invite you to join our development team.
489
00:50:06.880 --> 00:50:12.650
This is not a company project; this is an open-source community project.
490
00:50:12.650 --> 00:50:14.480
So everybody is invited to join us.
491
00:50:14.480 --> 00:50:23.950
This is what I was asked; Simon asked about a nonscientific comparison of Postgres with Mongo.
492
00:50:23.950 --> 00:50:30.369
I said nonscientific because a scientific benchmark is a very, very complicated task.
493
00:50:30.369 --> 00:50:31.750
Very, very.
494
00:50:31.750 --> 00:50:35.900
But here is a nonscientific one.
495
00:50:35.900 --> 00:50:46.339
Mongo needs double the size of Postgres, because Mongo keeps uncompressed data in memory.
496
00:50:46.339 --> 00:50:55.000
And we see that this is vanilla Postgres.
497
00:50:55.000 --> 00:50:59.680
After our optimization, we have performance better than Mongo.
498
00:50:59.680 --> 00:51:11.010
But if we turn on parallel support, because all of this was without any parallelism, to compare.
499
00:51:11.010 --> 00:51:18.599
After the parallel support we have a very fast Postgres compared to Mongo.
500
00:51:18.599 --> 00:51:30.610
But as I said, this is Mongo, when memory is just 4 gigabytes, and Mongo is not very good when you don't have enough memory.
501
00:51:30.610 --> 00:51:33.510
So it behaves like Postgres.
502
00:51:33.510 --> 00:51:35.440
Postgres is much better.
503
00:51:35.440 --> 00:51:39.650
It works better with memory.
504
00:51:39.650 --> 00:51:51.530
So that means that our community has a good chance to attract Mongo users to Postgres.
505
00:51:51.530 --> 00:51:54.890
Because Postgres is a very good, solid, database.
506
00:51:54.890 --> 00:51:55.980
Good community.
507
00:51:55.980 --> 00:51:59.250
We're all open source, independent.
508
00:51:59.250 --> 00:52:00.490
And we have JSON.
509
00:52:00.490 --> 00:52:08.770
We just need better performance and to be more friendly to the young people who start working with Postgres.
510
00:52:08.770 --> 00:52:11.089
This is a picture of my kids.
511
00:52:11.089 --> 00:52:13.470
They climb trees.
512
00:52:13.470 --> 00:52:16.300
And sometimes they tear their pants.
513
00:52:16.300 --> 00:52:17.740
And I have two options.
514
00:52:17.740 --> 00:52:19.070
I can forbid them to do this.
515
00:52:19.070 --> 00:52:21.380
Or I can teach them.
516
00:52:21.380 --> 00:52:26.640
So let's say that JSON is not the wrong technology.
517
00:52:26.640 --> 00:52:29.380
Let's make it a first class citizen in Postgres.
518
00:52:29.380 --> 00:52:31.359
Be friendly.
519
00:52:31.359 --> 00:52:44.680
Because some senior Postgres people still say: oh, relational databases, relations, you just need to read these books.
520
00:52:44.680 --> 00:52:47.970
But young people, startups, they don't have time.
521
00:52:47.970 --> 00:52:52.640
They can't hire some very sophisticated senior database architect.
522
00:52:52.640 --> 00:52:56.180
They want to make their business.
523
00:52:56.180 --> 00:52:58.359
They need JSON.
524
00:52:58.359 --> 00:53:05.000
So my position is to make Postgres friendly to these people.
525
00:53:05.000 --> 00:53:08.220
This is actually our duty.
526
00:53:08.220 --> 00:53:11.180
Database should be smart.
527
00:53:11.180 --> 00:53:15.490
People should work, should do their projects.
528
00:53:15.490 --> 00:53:25.359
So that's why I say that, at least in the end, we are a universal database.
529
00:53:25.359 --> 00:53:28.940
So I say that all you need is just Postgres.
530
00:53:28.940 --> 00:53:39.160
With JSON we will have a lot of fun, a lot of new people, and our popularity will continue to grow.
531
00:53:39.160 --> 00:53:41.530
That is where I want to finish.
532
00:53:41.530 --> 00:53:42.530
Thank you for attending.
533
00:53:42.530 --> 00:53:44.530
[ Applause ] I think we don't have much time for questions and answers.
534
00:53:44.530 --> 00:53:45.530
I will be available the whole day.
535
00:53:45.530 --> 00:53:46.530
You can ask me, you can discuss with me; I'm very interested in your data.
536
00:53:46.530 --> 00:53:47.530
In your data, in your queries.
537
00:53:47.530 --> 00:53:48.530
You know it's very difficult to optimize something if you don't have real data and a real query.
538
00:53:48.530 --> 00:53:49.530
So I don't need personal data.
539
00:53:49.530 --> 00:53:49.542
If you can share, it will be very useful and will help us.