- Created a complete presentation on PostgreSQL JSONB performance optimization - Includes analysis of the TOAST threshold problem and a performance comparison of JSONB operators - Discusses partial-update challenges and advanced optimization techniques - Compares PostgreSQL JSONB performance with MongoDB - Adds related charts and speaker notes
WEBVTT

1
00:00:14.469 --> 00:00:17.180
Thank you very much for attending my lecture.

2
00:00:17.180 --> 00:00:21.910
I'm very happy to be with you.

3
00:00:21.910 --> 00:00:25.320
And today I will talk about JSONB performance.

4
00:00:25.320 --> 00:00:29.090
These slides are already available.

5
00:00:29.090 --> 00:00:33.040
This is a joint talk with my colleague Nikita Glukhov.

6
00:00:33.040 --> 00:00:39.470
This is a picture of an elephant with the projects I have been working on.

7
00:00:39.470 --> 00:00:41.199
Maybe you know some of them.

8
00:00:41.199 --> 00:00:53.699
I'm a research scientist at Moscow University, and most interesting to me, I'm a major Postgres contributor.

9
00:00:53.699 --> 00:01:00.780
And my colleague Nikita Glukhov is also working at the Postgres Professional company.

10
00:01:00.780 --> 00:01:08.310
He's also a Postgres contributor, and over several years he has done several big projects.

11
00:01:08.310 --> 00:01:13.009
So today we will talk about JSON performance.

12
00:01:13.009 --> 00:01:22.400
And the reason why I decided to talk about this here: we know that JSON is, as we say,

13
00:01:22.400 --> 00:01:24.340
one type fits all.

14
00:01:24.340 --> 00:01:44.370
You know that modern architecture is microservice architecture, and JSON is very good for this architecture because client applications, front end, back end and now the database all use JSON.

15
00:01:44.370 --> 00:01:48.630
It is very, very easy for a startup to start their project.

16
00:01:48.630 --> 00:02:03.909
You don't need a relational schema or to get it right; when you start your project, when you start your business, it's very difficult to predict what the schema will be

17
00:02:03.909 --> 00:02:05.480
in a few months.

18
00:02:05.480 --> 00:02:07.890
With JSON you don't have any problem.

19
00:02:07.890 --> 00:02:10.000
You just have JSON.

20
00:02:10.000 --> 00:02:23.530
And all server-side languages support JSON, and SQL/JSON, so it's not something separate from SQL.

21
00:02:23.530 --> 00:02:29.470
What's very important is that JSON relaxes the object-relational mismatch.

22
00:02:29.470 --> 00:02:35.660
In code you work with objects, in a relational database you work with relations.

23
00:02:35.660 --> 00:02:41.370
And you have some contradictions between the programmers and the DBAs.

24
00:02:41.370 --> 00:02:47.510
But when you use JSON here, there isn't any contradiction.

25
00:02:47.510 --> 00:02:52.950
So this way JSON became very, very popular.

26
00:02:52.950 --> 00:02:57.540
And I would say that now I see a JSONB rush.

27
00:02:57.540 --> 00:03:12.000
Because I speak in many countries, and I see that many people ask me about JSON, and they say: we don't know much about SQL.

28
|
||
00:03:12.000 --> 00:03:13.670
|
||
We use JSON.
|
||
|
||
|
||
29
|
||
00:03:13.670 --> 00:03:17.860
|
||
And what we need is just to have JavaScript instead of SQL.
|
||
|
||
|
||
30
|
||
00:03:17.860 --> 00:03:22.000
|
||
This is very interesting career actually.
|
||
|
||
|
||
31
|
||
00:03:22.000 --> 00:03:24.370
|
||
I have some thinking about this.
|
||
|
||
|
||
32
|
||
00:03:24.370 --> 00:03:28.459
|
||
Because actually it's easy.
|
||
|
||
|
||
33
|
||
00:03:28.459 --> 00:03:33.510
|
||
I can internally transform JavaScript to SQL and execute it.
|
||
|
||
|
||
34
|
||
00:03:33.510 --> 00:03:38.310
|
||
But that's about my future project.
|
||
|
||
|
||
35
|
||
00:03:38.310 --> 00:03:46.650
|
||
And JSONB is actually a main driver of Postgres popularity.
|
||
|
||
|
||
36
|
||
00:03:46.650 --> 00:03:50.489
|
||
You see this create table with just a JSONB column; it's a common mistake.
|
||
|
||
|
||
37
|
||
00:03:50.489 --> 00:03:54.230
|
||
They put everything into JSONB.
|
||
|
||
|
||
38
|
||
00:03:54.230 --> 00:04:00.440
|
||
That is because people don't know how JSONB performs.
|
||
|
||
|
||
39
|
||
00:04:00.440 --> 00:04:03.680
|
||
I will talk about this later.
|
||
|
||
|
||
40
|
||
00:04:03.680 --> 00:04:14.180
|
||
Another reason for this talk: nobody has made a comparison of the performance of the JSONB operators.
|
||
|
||
|
||
41
|
||
00:04:14.180 --> 00:04:22.680
|
||
So this talk I will explain which operator to use, what is better and so on.
|
||
|
||
|
||
42
|
||
00:04:22.680 --> 00:04:30.430
|
||
And another reason is that I work 25 years in Postgres, maybe 26.
|
||
|
||
|
||
43
|
||
00:04:30.430 --> 00:04:34.440
|
||
I started from 1995.
|
||
|
||
|
||
44
|
||
00:04:34.440 --> 00:04:43.100
|
||
And almost all my projects, they connected to the extending Postgres to support this nonstructural data.
|
||
|
||
|
||
45
|
||
00:04:43.100 --> 00:05:02.310
|
||
So we started from arrays, h store, full text search, now working on JSONB and SQL and this is my interest and I believe that JSON is very useful for Postgres community.
|
||
|
||
|
||
46
|
||
00:05:02.310 --> 00:05:04.610
|
||
You see this picture.
|
||
|
||
|
||
47
|
||
00:05:04.610 --> 00:05:14.729
|
||
This how popularity of four databases change over time.
|
||
|
||
|
||
48
|
||
00:05:14.729 --> 00:05:18.520
|
||
The only database which grows is Postgres.
|
||
|
||
|
||
49
|
||
00:05:18.520 --> 00:05:27.210
|
||
I use the official numbers from DB-Engines on relational database popularity.
|
||
|
||
|
||
50
|
||
00:05:27.210 --> 00:05:35.900
|
||
Postgres becomes popular from the time we committed JSONB into Postgres.
|
||
|
||
|
||
51
|
||
00:05:35.900 --> 00:05:43.720
|
||
I believe that JSONB is one of the main driver of popularity.
|
||
|
||
|
||
52
|
||
00:05:43.720 --> 00:06:03.070
|
||
And because NoSQL people became upset and went to Postgres, because Postgres got a good JSON data type.
|
||
|
||
|
||
53
|
||
00:06:03.070 --> 00:06:14.009
|
||
Our work on JSONB made the SQL:2016 standard possible.
|
||
|
||
|
||
54
|
||
00:06:14.009 --> 00:06:25.180
|
||
So the success of Postgres made this, all other databases now have JSON and that's why we have SQL standard on this.
|
||
|
||
|
||
55
|
||
00:06:25.180 --> 00:06:31.500
|
||
To me it's very important to have to continue work on JSON in Postgres.
|
||
|
||
|
||
56
|
||
00:06:31.500 --> 00:06:35.750
|
||
Because we have many, many users.
|
||
|
||
|
||
57
|
||
00:06:35.750 --> 00:06:49.480
|
||
These are numbers from PostGreSQL you see that the most popular is JSONB.
|
||
|
||
|
||
58
|
||
00:06:49.480 --> 00:06:51.380
|
||
It's the biggest.
|
||
|
||
|
||
59
|
||
00:06:51.380 --> 00:07:19.479
|
||
If we take the popularity in the Telegram chat on PostgreSQL, there are several thousand people online at any time, and JSON and JSONB is the third most popular word used by Postgres people.
|
||
|
||
|
||
60
|
||
00:07:19.479 --> 00:07:23.470
|
||
The first is select, the second is SQL and the third is JSON.
|
||
|
||
|
||
61
|
||
00:07:23.470 --> 00:07:29.449
|
||
This is like some argument that JSON is very popular in the Postgres community.
|
||
|
||
|
||
62
|
||
00:07:29.449 --> 00:07:38.600
|
||
So we were working on several big projects, some of them already committed, some of them waiting for commit.
|
||
|
||
|
||
63
|
||
00:07:38.600 --> 00:07:43.860
|
||
But now we change the priority for our development.
|
||
|
||
|
||
64
|
||
00:07:43.860 --> 00:07:49.040
|
||
So we want to have JSONB the first class citizen in Postgres.
|
||
|
||
|
||
65
|
||
00:07:49.040 --> 00:08:02.080
|
||
It's means we want to have efficient storage, select, update, good API, the reason for this is SQL JSON is important, of course.
|
||
|
||
|
||
66
|
||
00:08:02.080 --> 00:08:04.460
|
||
Because this part of standard.
|
||
|
||
|
||
67
|
||
00:08:04.460 --> 00:08:12.729
|
||
But actually people who work with Postgres, they have no idea to be compatible with Oracle or Microsoft.
|
||
|
||
|
||
68
|
||
00:08:12.729 --> 00:08:20.330
|
||
I know that people, the startups use started with Postgres and never change it.
|
||
|
||
|
||
69
|
||
00:08:20.330 --> 00:08:23.200
|
||
And JSONB is already a mature data type.
|
||
|
||
|
||
70
|
||
00:08:23.200 --> 00:08:31.259
|
||
We have a load of functionality in Postgres and we have not enough resources in community.
|
||
|
||
|
||
71
|
||
00:08:31.259 --> 00:08:35.940
|
||
To even to review and commit the patches.
|
||
|
||
|
||
72
|
||
00:08:35.940 --> 00:08:41.930
|
||
You see that four years we have patches for SQL JSON functions.
|
||
|
||
|
||
73
|
||
00:08:41.930 --> 00:08:47.490
|
||
JSON table will wait also four years.
|
||
|
||
|
||
74
|
||
00:08:47.490 --> 00:08:49.759
|
||
Maybe for PG15 we'll have some committed.
|
||
|
||
|
||
75
|
||
00:08:49.759 --> 00:08:53.370
|
||
But I understand that community just has no resources.
|
||
|
||
|
||
76
|
||
00:08:53.370 --> 00:09:02.449
|
||
And my interest mostly is concentrate on improving JSONB, not standard.
|
||
|
||
|
||
77
|
||
00:09:02.449 --> 00:09:07.709
|
||
So I mostly aware about Postgres users.
|
||
|
||
|
||
78
|
||
00:09:07.709 --> 00:09:18.439
|
||
And I'm not very interested in the compatibility to Oracle or Microsoft SQL server.
|
||
|
||
|
||
79
|
||
00:09:18.439 --> 00:09:22.300
|
||
So this is a popular mistake.
|
||
|
||
|
||
80
|
||
00:09:22.300 --> 00:09:25.999
|
||
People put everything into JSONB, and that's not a good idea.
|
||
|
||
|
||
81
|
||
00:09:25.999 --> 00:09:35.790
|
||
You can see it very easily: an ID stored outside of the JSONB versus an ID inside.
|
||
|
||
|
||
82
|
||
00:09:35.790 --> 00:09:43.749
|
||
If the JSONB grows, the performance degrades very quickly.
|
||
|
||
|
||
83
|
||
00:09:43.749 --> 00:09:50.089
|
||
Don't do this.
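
NOTE (editor)
A minimal sketch of the mistake discussed above, with hypothetical table and column
names. Keeping a hot scalar key such as the id as a regular column means filtering on
it never has to deTOAST the whole document:
  -- anti-pattern: everything, including the id, lives inside the document
  CREATE TABLE orders_bad (doc jsonb);
  SELECT * FROM orders_bad WHERE (doc->>'id')::int = 42;
  -- better: hot scalar keys stay relational, the rest stays in JSONB
  CREATE TABLE orders_good (id int PRIMARY KEY, doc jsonb);
  SELECT * FROM orders_good WHERE id = 42;
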
|
||
|
||
|
||
84
|
||
00:09:50.089 --> 00:09:56.209
|
||
We want to demonstrate the performance of nested containers.
|
||
|
||
|
||
85
|
||
00:09:56.209 --> 00:10:04.839
|
||
You usually, we created simple tables with nested objects, and we just test several operators.
|
||
|
||
|
||
86
|
||
00:10:04.839 --> 00:10:07.749
|
||
The first is the arrow operator.
|
||
|
||
|
||
87
|
||
00:10:07.749 --> 00:10:12.329
|
||
Most people use the arrow operator to access a key.
|
||
|
||
|
||
88
|
||
00:10:12.329 --> 00:10:17.029
|
||
The second is the hash arrow, the extract path operator.
|
||
|
||
|
||
89
|
||
00:10:17.029 --> 00:10:19.959
|
||
The new one is subscripting.
|
||
|
||
|
||
90
|
||
00:10:19.959 --> 00:10:24.649
|
||
So you can you have like an array syntax.
|
||
|
||
|
||
91
|
||
00:10:24.649 --> 00:10:27.079
|
||
Another one, the fourth, is JSON path.
|
||
|
||
|
||
92
|
||
00:10:27.079 --> 00:10:33.489
|
||
The other one do you know which is better?
|
||
|
||
|
||
93
|
||
00:10:33.489 --> 00:10:36.790
|
||
Nobody knows actually and we need to.
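
NOTE (editor)
A sketch of the four access methods being compared, assuming a hypothetical table
t(js jsonb) holding a nested document; all four extract the same nested value:
  SELECT js -> 'a' -> 'b' ->> 'c'         FROM t;  -- arrow operators
  SELECT js #>> '{a,b,c}'                 FROM t;  -- hash arrow (extract path)
  SELECT js['a']['b']['c']                FROM t;  -- subscripting (PostgreSQL 14+)
  SELECT jsonb_path_query(js, '$.a.b.c')  FROM t;  -- SQL/JSON path
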
|
||
|
||
|
||
94
|
||
00:10:36.790 --> 00:10:45.709
|
||
So we did a lot of experiments and now I will show you some graphics.
|
||
|
||
|
||
95
|
||
00:10:45.709 --> 00:10:58.009
|
||
This is a raw JSONB size and execution time.
|
||
|
||
|
||
96
|
||
00:10:58.009 --> 00:11:00.550
|
||
This is arrow operator.
|
||
|
||
|
||
97
|
||
00:11:00.550 --> 00:11:06.360
|
||
JSONB grow and execution times grow.
|
||
|
||
|
||
98
|
||
00:11:06.360 --> 00:11:10.139
|
||
So the reason for this I will explain later.
|
||
|
||
|
||
99
|
||
00:11:10.139 --> 00:11:13.569
|
||
The same behavior actually is for other operator.
|
||
|
||
|
||
100
|
||
00:11:13.569 --> 00:11:18.019
|
||
But it's a bit different.
|
||
|
||
|
||
101
|
||
00:11:18.019 --> 00:11:22.149
|
||
Interesting that subscripting behave very good.
|
||
|
||
|
||
102
|
||
00:11:22.149 --> 00:11:27.199
|
||
And we have the color is nesting level.
|
||
|
||
|
||
103
|
||
00:11:27.199 --> 00:11:34.199
|
||
So we know execution time for the different nesting level.
|
||
|
||
|
||
104
|
||
00:11:34.199 --> 00:11:44.980
|
||
At two kilobytes the JSONB becomes TOASTed.
|
||
|
||
|
||
105
|
||
00:11:44.980 --> 00:11:50.989
|
||
So we have degradation of performance.
|
||
|
||
|
||
106
|
||
00:11:50.989 --> 00:11:57.299
|
||
But before, we have the constant and we have over here for nesting.
|
||
|
||
|
||
107
|
||
00:11:57.299 --> 00:12:09.690
|
||
So the deeper we go, the worse the performance.
|
||
|
||
|
||
108
|
||
00:12:09.690 --> 00:12:16.189
|
||
Here is a slowdown, relative to the root level.
|
||
|
||
|
||
109
|
||
00:12:16.189 --> 00:12:23.430
|
||
So the picture is also very strange relative to the root level for the same operator.
|
||
|
||
|
||
110
|
||
00:12:23.430 --> 00:12:31.269
|
||
So this JSON path and subscription.
|
||
|
||
|
||
111
|
||
00:12:31.269 --> 00:12:38.430
|
||
Arrow operator again shows some instability is not very predictable behavior.
|
||
|
||
|
||
112
|
||
00:12:38.430 --> 00:12:46.029
|
||
If you see the slowdown relative to the extract pass operator, situation becomes a bit better,
|
||
|
||
|
||
113
|
||
00:12:46.029 --> 00:12:53.410
|
||
a bit clearer because we see that JSON path is slowest.
|
||
|
||
|
||
114
|
||
00:12:53.410 --> 00:12:56.230
|
||
And subscripting is the fastest.
|
||
|
||
|
||
115
|
||
00:12:56.230 --> 00:13:01.999
|
||
Very unexpected behavior, but you know now this.
|
||
|
||
|
||
116
|
||
00:13:01.999 --> 00:13:10.959
|
||
After 2 kilobytes, everything becomes more or less the same because deTOASTing dominates.
|
||
|
||
|
||
117
|
||
00:13:10.959 --> 00:13:19.750
|
||
You have major contribution to the performance.
|
||
|
||
|
||
118
|
||
00:13:19.750 --> 00:13:28.230
|
||
This picture demonstrates best operator depending on size and nesting level.
|
||
|
||
|
||
119
|
||
00:13:28.230 --> 00:13:33.410
|
||
So it's the same data, but in different picture, different format.
|
||
|
||
|
||
120
|
||
00:13:33.410 --> 00:13:41.579
|
||
We see that arrow operator is good only for small JSONB and the root level.
|
||
|
||
|
||
121
|
||
00:13:41.579 --> 00:13:47.110
|
||
I would say for root level you can safely use arrow operator.
|
||
|
||
|
||
122
|
||
00:13:47.110 --> 00:13:54.939
|
||
For the second, for the first level, I would not use for the big JSONB.
|
||
|
||
|
||
123
|
||
00:13:54.939 --> 00:14:00.529
|
||
Subscripting is the most useful good.
|
||
|
||
|
||
124
|
||
00:14:00.529 --> 00:14:08.089
|
||
And JSON path, you see here, you see some JSON path.
|
||
|
||
|
||
125
|
||
00:14:08.089 --> 00:14:12.230
|
||
But really JSON path is not about performance.
|
||
|
||
|
||
126
|
||
00:14:12.230 --> 00:14:19.970
|
||
Because JSON path is very flexible for the very complex queries you need JSON path.
|
||
|
||
|
||
127
|
||
00:14:19.970 --> 00:14:31.019
|
||
But for simple queries like this, the overhead of JSON path is too big.
|
||
|
||
|
||
128
|
||
00:14:31.019 --> 00:14:37.470
|
||
So all operators have a common overhead: deTOASTing and iteration time.
|
||
|
||
|
||
129
|
||
00:14:37.470 --> 00:14:54.470
|
||
But the arrow operator is very fast for small JSONB and the root level because it has minimal initialization, but it needs to copy intermediate results to temporary datums.
|
||
|
||
|
||
130
|
||
00:14:54.470 --> 00:14:56.209
|
||
Let's see this one.
|
||
|
||
|
||
131
|
||
00:14:56.209 --> 00:15:02.029
|
||
This picture how arrow operator executes.
|
||
|
||
|
||
132
|
||
00:15:02.029 --> 00:15:11.239
|
||
So when you have TOAST, this is TOAST pointer, then you find key 1.
|
||
|
||
|
||
133
|
||
00:15:11.239 --> 00:15:23.959
|
||
You find some value and you copy all this nested JSONB container into some intermediate data which go to the second execution.
|
||
|
||
|
||
134
|
||
00:15:23.959 --> 00:15:29.439
|
||
So key 2 and then you copy string to result.
|
||
|
||
|
||
135
|
||
00:15:29.439 --> 00:15:34.899
|
||
This example just for root in the first level.
|
||
|
||
|
||
136
|
||
00:15:34.899 --> 00:15:42.509
|
||
But you can imagine if you have 9 levels, big more levels, you have to repeat this operation.
|
||
|
||
|
||
137
|
||
00:15:42.509 --> 00:15:52.019
|
||
You have to copy all nested container to memory, to some intermediate datum.
|
||
|
||
|
||
138
|
||
00:15:52.019 --> 00:15:57.929
|
||
This surprise for the abstraction.
|
||
|
||
|
||
139
|
||
00:15:57.929 --> 00:16:00.920
|
||
In Postgres you can combine operator.
|
||
|
||
|
||
140
|
||
00:16:00.920 --> 00:16:05.609
|
||
So this way this arrow operator works like this.
|
||
|
||
|
||
141
|
||
00:16:05.609 --> 00:16:09.279
|
||
But extract path works different.
|
||
|
||
|
||
142
|
||
00:16:09.279 --> 00:16:18.279
|
||
JSON path, they have they share the same schema.
|
||
|
||
|
||
143
|
||
00:16:18.279 --> 00:16:20.609
|
||
Again, you have TOAST pointer.
|
||
|
||
|
||
144
|
||
00:16:20.609 --> 00:16:24.699
|
||
You deTOAST and then you just copy string to result.
|
||
|
||
|
||
145
|
||
00:16:24.699 --> 00:16:34.790
|
||
You find everything in one insight and there is no inside and there is no copy over here.
|
||
|
||
|
||
146
|
||
00:16:34.790 --> 00:16:44.339
|
||
So the conclusion is that you can safely use the arrow operator for the root level at any size,
|
||
|
||
|
||
147
|
||
00:16:44.339 --> 00:16:46.979
|
||
and for first level for small JSONB.
|
||
|
||
|
||
148
|
||
00:16:46.979 --> 00:16:55.869
|
||
Then use subscripting and extract path for large JSONB and if you have higher nesting level.
|
||
|
||
|
||
149
|
||
00:16:55.869 --> 00:16:56.869
|
||
This is my recommendation.
|
||
|
||
|
||
150
|
||
00:16:56.869 --> 00:17:03.359
|
||
JSON path is slowest, but it's very useful for complex queries.
|
||
|
||
|
||
151
|
||
00:17:03.359 --> 00:17:15.209
|
||
Now we want to analyze the performance of another very important kind of query: contains.
|
||
|
||
|
||
152
|
||
00:17:15.209 --> 00:17:20.939
|
||
So you want to find some key value in JSONB.
|
||
|
||
|
||
153
|
||
00:17:20.939 --> 00:17:27.390
|
||
So again, we have a table with arrays of various size.
|
||
|
||
|
||
154
|
||
00:17:27.390 --> 00:17:32.799
|
||
From 1 to 1 million entries.
|
||
|
||
|
||
155
|
||
00:17:32.799 --> 00:17:39.779
|
||
And we try to find several operations.
|
||
|
||
|
||
156
|
||
00:17:39.779 --> 00:17:42.139
|
||
It's contains operator.
|
||
|
||
|
||
157
|
||
00:17:42.139 --> 00:17:45.880
|
||
JSON pass match operator.
|
||
|
||
|
||
158
|
||
00:17:45.880 --> 00:17:51.490
|
||
JSON path exist operator with filter.
|
||
|
||
|
||
159
|
||
00:17:51.490 --> 00:17:55.029
|
||
And SQL, two variants of SQL.
|
||
|
||
|
||
160
|
||
00:17:55.029 --> 00:18:01.529
|
||
One is exists and another one is optimized version of this one.
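
NOTE (editor)
A hedged sketch of the compared predicates, for a hypothetical table t(js jsonb)
where each document holds an array under key 'a' and the searched element is 0:
  SELECT count(*) FROM t WHERE js @> '{"a": [0]}';          -- containment operator
  SELECT count(*) FROM t WHERE js @@ '$.a[*] == 0';         -- jsonpath match
  SELECT count(*) FROM t WHERE js @? '$.a[*] ? (@ == 0)';   -- jsonpath exists with filter
  SELECT count(*) FROM t WHERE EXISTS
    (SELECT 1 FROM jsonb_array_elements(js->'a') e WHERE e = '0');  -- plain SQL EXISTS
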
|
||
|
||
|
||
161
|
||
00:18:01.529 --> 00:18:03.049
|
||
And we use this query.
|
||
|
||
|
||
162
|
||
00:18:03.049 --> 00:18:09.160
|
||
Like we use we look for the first element existed.
|
||
|
||
|
||
163
|
||
00:18:09.160 --> 00:18:10.380
|
||
Actually it's a zero.
|
||
|
||
|
||
164
|
||
00:18:10.380 --> 00:18:15.059
|
||
And we look for the nonexistent operators which is minus 1.
|
||
|
||
|
||
165
|
||
00:18:15.059 --> 00:18:20.179
|
||
And we see how different queries execute.
|
||
|
||
|
||
166
|
||
00:18:20.179 --> 00:18:30.750
|
||
We see that if we apply search first element the behavior more or less the same, except two.
|
||
|
||
|
||
167
|
||
00:18:30.750 --> 00:18:34.230
|
||
The contains operator is the fastest.
|
||
|
||
|
||
168
|
||
00:18:34.230 --> 00:18:43.639
|
||
Before TOAST, before JSONB TOASTed we have constant and constant time.
|
||
|
||
|
||
169
|
||
00:18:43.639 --> 00:18:50.390
|
||
And also we have the match operator in lax mode.
|
||
|
||
|
||
170
|
||
00:18:50.390 --> 00:19:09.450
|
||
JSON path has two modes, which you instruct interpreter of JSON path how to work with errors for example.
|
||
|
||
|
||
171
|
||
00:19:09.450 --> 00:19:15.190
|
||
In lax mode.
|
||
|
||
|
||
172
|
||
00:19:15.190 --> 00:19:19.880
|
||
Execution stops when you find the result.
|
||
|
||
|
||
173
|
||
00:19:19.880 --> 00:19:27.929
|
||
And since zero is the first member in the array, it happens very fast.
|
||
|
||
|
||
174
|
||
00:19:27.929 --> 00:19:37.450
|
||
But in strict mode, green, you have to check all elements.
|
||
|
||
|
||
175
|
||
00:19:37.450 --> 00:19:44.950
|
||
Because in strict mode you have to see all errors, possible errors.
|
||
|
||
|
||
176
|
||
00:19:44.950 --> 00:19:47.919
|
||
Fortunately lax mode is default.
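
NOTE (editor)
A brief illustration of the lax/strict difference (the document shape is assumed, not
taken from the slides). In lax mode, the default, evaluation can stop at the first
match; in strict mode structural errors are not suppressed and more work may be done:
  SELECT jsonb_path_exists('{"a": [0, 1, 2]}', '$.a[*] ? (@ == 0)');         -- lax (default)
  SELECT jsonb_path_exists('{"a": [0, 1, 2]}', 'strict $.a[*] ? (@ == 0)');  -- strict
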
|
||
|
||
|
||
177
|
||
00:19:47.919 --> 00:20:04.110
|
||
So that was searching for the first element; for a nonexistent element the behavior is the same for all of them, because you have to check all elements.
|
||
|
||
|
||
178
|
||
00:20:04.110 --> 00:20:08.289
|
||
You check all elements, and you cannot find minus one.
|
||
|
||
|
||
179
|
||
00:20:08.289 --> 00:20:14.899
|
||
And the difference in performance is just different overheads.
|
||
|
||
|
||
180
|
||
00:20:14.899 --> 00:20:16.639
|
||
Operator.
|
||
|
||
|
||
181
|
||
00:20:16.639 --> 00:20:20.940
|
||
So this is speed up relative to SQL exists.
|
||
|
||
|
||
182
|
||
00:20:20.940 --> 00:20:26.200
|
||
And conclusions is that contains is the fastest.
|
||
|
||
|
||
183
|
||
00:20:26.200 --> 00:20:30.559
|
||
SQL exists is the slowest.
|
||
|
||
|
||
184
|
||
00:20:30.559 --> 00:20:47.059
|
||
And the performance difference between the match and exists operators for JSON path depends on how many items you have to iterate.
|
||
|
||
|
||
185
|
||
00:20:47.059 --> 00:20:57.220
|
||
The most interesting, you see on all pictures, you see that performance very dependent on how big is JSON.
|
||
|
||
|
||
186
|
||
00:20:57.220 --> 00:21:04.399
|
||
After JSONB exceeds 2 kilobytes, the performance degrades.
|
||
|
||
|
||
187
|
||
00:21:04.399 --> 00:21:08.930
|
||
That's why we analyze the TOAST details.
|
||
|
||
|
||
188
|
||
00:21:08.930 --> 00:21:13.470
|
||
This part I call the curse of TOAST.
|
||
|
||
|
||
189
|
||
00:21:13.470 --> 00:21:16.330
|
||
This is unpredictable performance of JSONB.
|
||
|
||
|
||
190
|
||
00:21:16.330 --> 00:21:22.909
|
||
People actually ask me why when they start the project we have very good performance,
|
||
|
||
|
||
191
|
||
00:21:22.909 --> 00:21:28.600
|
||
but then we see sometimes that performance very unstable.
|
||
|
||
|
||
192
|
||
00:21:28.600 --> 00:21:34.129
|
||
And this query, this example demonstrates unpredictable behavior.
|
||
|
||
|
||
193
|
||
00:21:34.129 --> 00:21:40.330
|
||
So very simple JSON with ID and some array.
|
||
|
||
|
||
194
|
||
00:21:40.330 --> 00:21:47.899
|
||
And first you have select very fast.
|
||
|
||
|
||
195
|
||
00:21:47.899 --> 00:21:56.129
|
||
So we see that buffers hit 2,500 and some milliseconds.
|
||
|
||
|
||
196
|
||
00:21:56.129 --> 00:21:59.500
|
||
After you update, very simple update.
|
||
|
||
|
||
197
|
||
00:21:59.500 --> 00:22:03.700
|
||
You have 30,000 buffers.
|
||
|
||
|
||
198
|
||
00:22:03.700 --> 00:22:07.080
|
||
And 6 milliseconds.
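
NOTE (editor)
A sketch of how such numbers can be observed, with a hypothetical table
js_test(id int, val jsonb):
  EXPLAIN (ANALYZE, BUFFERS) SELECT val->'id' FROM js_test;
  EXPLAIN (ANALYZE, BUFFERS) UPDATE js_test SET val = jsonb_set(val, '{id}', '1');
Once the documents cross the ~2 kB TOAST threshold, reading or updating the column
also touches the TOAST index and TOAST heap, which is what inflates the buffer counts.
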
|
||
|
||
|
||
199
|
||
00:22:07.080 --> 00:22:18.880
|
||
So people ask me why this happens; it happens because the rows get TOASTed.
|
||
|
||
|
||
200
|
||
00:22:18.880 --> 00:22:30.009
|
||
So TOAST is a very useful technology in Postgres which allows you to store very long objects.
|
||
|
||
|
||
201
|
||
00:22:30.009 --> 00:22:34.889
|
||
We have limitation for 8 kilobyte page size.
|
||
|
||
|
||
202
|
||
00:22:34.889 --> 00:22:38.389
|
||
We have tuple limit to 2 kilobytes.
|
||
|
||
|
||
203
|
||
00:22:38.389 --> 00:22:42.679
|
||
Everything that's bigger to 2 kilobytes we move to storage.
|
||
|
||
|
||
204
|
||
00:22:42.679 --> 00:22:50.460
|
||
And we have implicit join when we get value.
|
||
|
||
|
||
205
|
||
00:22:50.460 --> 00:22:53.520
|
||
But situation is much worse, I will show later.
|
||
|
||
|
||
206
|
||
00:22:53.520 --> 00:22:55.259
|
||
But here is explanation.
|
||
|
||
|
||
207
|
||
00:22:55.259 --> 00:23:12.129
|
||
You can install the very useful extension pageinspect and see that for the original JSON, which is stored inline in the heap, not TOASTed, we have 2,500 pages.
|
||
|
||
|
||
208
|
||
00:23:12.129 --> 00:23:16.169
|
||
And just 4tuples per page.
|
||
|
||
|
||
209
|
||
00:23:16.169 --> 00:23:27.659
|
||
After update, we see very strange that we have now 64 pages with 157 tuples per page.
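
NOTE (editor)
One possible way to see this with the pageinspect extension (table name assumed):
  CREATE EXTENSION IF NOT EXISTS pageinspect;
  SELECT count(*) AS tuples_on_first_page
  FROM heap_page_items(get_raw_page('js_test', 0));
Run it before and after the UPDATE: inline documents leave room for only a few tuples
per page; after TOASTing, each heap tuple shrinks to little more than a TOAST pointer,
so many more fit on a page.
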
|
||
|
||
|
||
210
|
||
00:23:27.659 --> 00:23:28.659
|
||
How this happens?
|
||
|
||
|
||
211
|
||
00:23:28.659 --> 00:23:35.750
|
||
Because the long JSON is replaced by just a TOAST pointer in the tuple.
|
||
|
||
|
||
212
|
||
00:23:35.750 --> 00:23:40.029
|
||
So tuple becomes very small.
|
||
|
||
|
||
213
|
||
00:23:40.029 --> 00:23:43.840
|
||
But number but everything move to the TOAST.
|
||
|
||
|
||
214
|
||
00:23:43.840 --> 00:23:54.670
|
||
And we can find using this query, we can find the name of TOAST relation very easy.
|
||
|
||
|
||
215
|
||
00:23:54.670 --> 00:24:05.179
|
||
And then we can inspect the chunks where this long value is stored in the TOAST.
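
NOTE (editor)
A sketch of the catalog lookup referred to above (table name assumed):
  SELECT c.relname AS toast_relation
  FROM pg_class t JOIN pg_class c ON c.oid = t.reltoastrelid
  WHERE t.relname = 'js_test';
  -- then count the ~2 kB chunks per stored value, substituting the name found above
  SELECT chunk_id, count(*) FROM pg_toast.pg_toast_XXXXX GROUP BY chunk_id;
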
|
||
|
||
|
||
216
|
||
00:24:05.179 --> 00:24:10.340
|
||
And access to the TOAST requires reading at least 3 additional buffers.
|
||
|
||
|
||
217
|
||
00:24:10.340 --> 00:24:19.220
|
||
Two TOAST index buffers, because you access TOAST not directly, but using the B tree index.
|
||
|
||
|
||
218
|
||
00:24:19.220 --> 00:24:25.450
|
||
So you access 2 TOAST index buffers and one TOAST from heap buffer.
|
||
|
||
|
||
219
|
||
00:24:25.450 --> 00:24:33.240
|
||
So easy calculation you explain that why we have 30,000 pages after update.
|
||
|
||
|
||
220
|
||
00:24:33.240 --> 00:24:41.390
|
||
We have 64 buffers, and overhead 3 buffers multiplied by number of rows.
|
||
|
||
|
||
221
|
||
00:24:41.390 --> 00:24:42.700
|
||
10,000.
|
||
|
||
|
||
222
|
||
00:24:42.700 --> 00:24:49.120
|
||
So this explains performance of the small update.
|
||
|
||
|
||
223
|
||
00:24:49.120 --> 00:24:52.250
|
||
And if you know what is TOAST.
|
||
|
||
|
||
224
|
||
00:24:52.250 --> 00:24:56.309
|
||
I can explain it's very detailed.
|
||
|
||
|
||
225
|
||
00:24:56.309 --> 00:25:06.639
|
||
TOAST compressed and then split into 2 kilobyte chunks and stored in the normal heap relations.
|
||
|
||
|
||
226
|
||
00:25:06.639 --> 00:25:08.220
|
||
You just don't see.
|
||
|
||
|
||
227
|
||
00:25:08.220 --> 00:25:10.830
|
||
It's hidden from you.
|
||
|
||
|
||
228
|
||
00:25:10.830 --> 00:25:14.299
|
||
And how to access these chunks.
|
||
|
||
|
||
229
|
||
00:25:14.299 --> 00:25:17.320
|
||
You have to use index.
|
||
|
||
|
||
230
|
||
00:25:17.320 --> 00:25:23.470
|
||
Over here only bytes from the first chunk.
|
||
|
||
|
||
231
|
||
00:25:23.470 --> 00:25:30.269
|
||
You hid 3, 4, 5 or additional blocks.
|
||
|
||
|
||
232
|
||
00:25:30.269 --> 00:25:32.650
|
||
That's the problem of TOAST.
|
||
|
||
|
||
233
|
||
00:25:32.650 --> 00:25:37.759
|
||
And TOAST also very complicated algorithm.
|
||
|
||
|
||
234
|
||
00:25:37.759 --> 00:25:40.360
|
||
It use four passes.
|
||
|
||
|
||
235
|
||
00:25:40.360 --> 00:25:47.500
|
||
So Postgres try to compact tuple to 2 kilobytes.
|
||
|
||
|
||
236
|
||
00:25:47.500 --> 00:25:52.320
|
||
We tried to compress the longest fields.
|
||
|
||
|
||
237
|
||
00:25:52.320 --> 00:26:06.820
|
||
If compression is not enough, if the resulting tuple is still more than 2 kilobytes, it replaces fields by TOAST pointers and the fields move to the TOAST relation.
|
||
|
||
|
||
238
|
||
00:26:06.820 --> 00:26:12.029
|
||
Actually you see that pass 1, pass 2, pass 3, pass 4.
|
||
|
||
|
||
239
|
||
00:26:12.029 --> 00:26:13.600
|
||
So it's not easy.
|
||
|
||
|
||
240
|
||
00:26:13.600 --> 00:26:18.679
|
||
The original tuple replaced by this.
|
||
|
||
|
||
241
|
||
00:26:18.679 --> 00:26:24.529
|
||
This is plain, some field which not touched.
|
||
|
||
|
||
242
|
||
00:26:24.529 --> 00:26:33.639
|
||
This is compressed field and 4 tuple pointers which pointed to the TOAST storage.
|
||
|
||
|
||
243
|
||
00:26:33.639 --> 00:26:35.820
|
||
We have here.
|
||
|
||
|
||
244
|
||
00:26:35.820 --> 00:26:44.149
|
||
And when you access, for example, this one, you have to read all this one.
|
||
|
||
|
||
245
|
||
00:26:44.149 --> 00:26:50.840
|
||
Even if you access plain attribute, you don't touch all this TOAST.
|
||
|
||
|
||
246
|
||
00:26:50.840 --> 00:27:01.879
|
||
But once the attribute is TOASTed, you have to combine all these chunks.
|
||
|
||
|
||
247
|
||
00:27:01.879 --> 00:27:10.990
|
||
First you need to find all chunks, combine to one buffer.
|
||
|
||
|
||
248
|
||
00:27:10.990 --> 00:27:12.820
|
||
And then decompress.
|
||
|
||
|
||
249
|
||
00:27:12.820 --> 00:27:15.529
|
||
So a lot of overhead.
|
||
|
||
|
||
250
|
||
00:27:15.529 --> 00:27:18.809
|
||
Let's see this example.
|
||
|
||
|
||
251
|
||
00:27:18.809 --> 00:27:20.139
|
||
Example is very easy.
|
||
|
||
|
||
252
|
||
00:27:20.139 --> 00:27:31.110
|
||
We have hundred JSONB of different sizes and JSONB looks like key 1 very long key 2 array,
|
||
|
||
|
||
253
|
||
00:27:31.110 --> 00:27:34.379
|
||
key 3 and key 4.
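
NOTE (editor)
A hedged sketch of how a test table of this shape can be generated; the names and
sizes are assumptions, only the structure (short keys around one long array) follows
the description above:
  CREATE TABLE test_toast AS
  SELECT id,
         jsonb_build_object(
           'key1', 'short value',
           'key2', (SELECT jsonb_agg(i) FROM generate_series(1, id * 1000) i),
           'key3', 'short value',
           'key4', 'short value') AS js
  FROM generate_series(1, 100) AS id;
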
|
||
|
||
|
||
254
|
||
00:27:34.379 --> 00:27:38.749
|
||
We measure time of arrow operator.
|
||
|
||
|
||
255
|
||
00:27:38.749 --> 00:27:43.190
|
||
We actually we repeat 1,000 times in query.
|
||
|
||
|
||
256
|
||
00:27:43.190 --> 00:27:46.710
|
||
Then to have a more stable result.
|
||
|
||
|
||
257
|
||
00:27:46.710 --> 00:27:48.169
|
||
This result.
|
||
|
||
|
||
258
|
||
00:27:48.169 --> 00:27:49.629
|
||
You see?
|
||
|
||
|
||
259
|
||
00:27:49.629 --> 00:27:53.000
|
||
So we generate the query.
|
||
|
||
|
||
260
|
||
00:27:53.000 --> 00:28:01.270
|
||
We execute 1,000, and then result the time, divide by 1,000, to have more stable time.
|
||
|
||
|
||
261
|
||
00:28:01.270 --> 00:28:08.750
|
||
And we see the key access time literally increases with JSONB size.
|
||
|
||
|
||
262
|
||
00:28:08.750 --> 00:28:12.910
|
||
Regardless of size and position.
|
||
|
||
|
||
263
|
||
00:28:12.910 --> 00:28:19.119
|
||
And this is very surprise for many people because they say I have just one small key.
|
||
|
||
|
||
264
|
||
00:28:19.119 --> 00:28:23.610
|
||
Why I my access time is so big?
|
||
|
||
|
||
265
|
||
00:28:23.610 --> 00:28:31.629
|
||
Because after a hundred kilobytes, everything is in the TOAST.
|
||
|
||
|
||
266
|
||
00:28:31.629 --> 00:28:38.649
|
||
And to get this small key, you have to deTOAST all this JSONB.
|
||
|
||
|
||
267
|
||
00:28:38.649 --> 00:28:41.220
|
||
And then decompress.
|
||
|
||
|
||
268
|
||
00:28:41.220 --> 00:28:43.919
|
||
You see three areas.
|
||
|
||
|
||
269
|
||
00:28:43.919 --> 00:28:46.920
|
||
One is in line.
|
||
|
||
|
||
270
|
||
00:28:46.920 --> 00:28:51.009
|
||
The performance is good, and time is constant.
|
||
|
||
|
||
271
|
||
00:28:51.009 --> 00:28:53.940
|
||
The second one is compressed inline.
|
||
|
||
|
||
272
|
||
00:28:53.940 --> 00:29:00.980
|
||
When Postgres success to compress and put into inline.
|
||
|
||
|
||
273
|
||
00:29:00.980 --> 00:29:06.360
|
||
So hundred kilobytes is actually 2 kilobytes compressed.
|
||
|
||
|
||
274
|
||
00:29:06.360 --> 00:29:10.019
|
||
Because here's a row.
|
||
|
||
|
||
275
|
||
00:29:10.019 --> 00:29:14.149
|
||
After compressed 100 kilobyte becomes 2 kilobyte.
|
||
|
||
|
||
276
|
||
00:29:14.149 --> 00:29:21.279
|
||
And you have some growth access time because you have to decompress.
|
||
|
||
|
||
277
|
||
00:29:21.279 --> 00:29:30.340
|
||
And after 100 kilobytes, everything toasted.
|
||
|
||
|
||
278
|
||
00:29:30.340 --> 00:29:33.350
|
||
Here is a number of blocks.
|
||
|
||
|
||
279
|
||
00:29:33.350 --> 00:29:43.230
|
||
200 kilobytes, you see no additional block to read because everything in line.
|
||
|
||
|
||
280
|
||
00:29:43.230 --> 00:29:47.549
|
||
After you read more, more, more, more blocks, you see 30 blocks, it's too much.
|
||
|
||
|
||
281
|
||
00:29:47.549 --> 00:29:56.779
|
||
>> Obviously these blocks are it might be worrying [off microphone] have you done a comparison to Mongo I'll ask the question again.
|
||
|
||
|
||
282
|
||
00:29:56.779 --> 00:30:04.049
|
||
Have you done a comparison with Mongo and in what areas does Mongo suffer these same issues?
|
||
|
||
|
||
283
|
||
00:30:04.049 --> 00:30:11.440
|
||
>> OLEG BARTUNOV: The last slide will be about some comparison with Mongo because that's all people is interested.
|
||
|
||
|
||
284
|
||
00:30:11.440 --> 00:30:12.590
|
||
Yes.
|
||
|
||
|
||
285
|
||
00:30:12.590 --> 00:30:17.190
|
||
This is the same.
|
||
|
||
|
||
286
|
||
00:30:17.190 --> 00:30:20.470
|
||
Now we see this is compressed size.
|
||
|
||
|
||
287
|
||
00:30:20.470 --> 00:30:24.820
|
||
So we see that only two areas.
|
||
|
||
|
||
288
|
||
00:30:24.820 --> 00:30:26.860
|
||
Inline this is size 2 kilobytes.
|
||
|
||
|
||
289
|
||
00:30:26.860 --> 00:30:30.100
|
||
After 2 kilobytes it would compress in line.
|
||
|
||
|
||
290
|
||
00:30:30.100 --> 00:30:38.720
|
||
In line they are more clearly seen because of their size is compressed 2 kilobytes.
|
||
|
||
|
||
291
|
||
00:30:38.720 --> 00:30:48.740
|
||
And the problem is access time doesn't depends on the key size and position.
|
||
|
||
|
||
292
|
||
00:30:48.740 --> 00:30:53.980
|
||
Everything suffers from the TOAST.
|
||
|
||
|
||
293
|
||
00:30:53.980 --> 00:30:56.610
|
||
Another problem is partial update.
|
||
|
||
|
||
294
|
||
00:30:56.610 --> 00:31:02.110
|
||
Because people also complain that I just want to update small key.
|
||
|
||
|
||
295
|
||
00:31:02.110 --> 00:31:04.460
|
||
Why the performance very bad?
|
||
|
||
|
||
296
|
||
00:31:04.460 --> 00:31:16.970
|
||
Again, again, because Postgres TOAST, because TOAST mechanism, algorithm works with JSONB as a black box.
|
||
|
||
|
||
297
|
||
00:31:16.970 --> 00:31:27.230
|
||
Because it was developed by when all data types were atomic.
|
||
|
||
|
||
298
|
||
00:31:27.230 --> 00:31:32.740
|
||
But now JSONB has a structure as other data type.
|
||
|
||
|
||
299
|
||
00:31:32.740 --> 00:31:36.100
|
||
And TOAST should be more smart.
|
||
|
||
|
||
300
|
||
00:31:36.100 --> 00:31:44.149
|
||
But currently the TOAST storage is duplicated when we update, the WAL traffic increases and performance becomes very slow.
|
||
|
||
|
||
301
|
||
00:31:44.149 --> 00:31:47.960
|
||
Also you see the example.
|
||
|
||
|
||
302
|
||
00:31:47.960 --> 00:31:57.940
|
||
We have hundred gigabytes heap and relation 7.
|
||
|
||
|
||
303
|
||
00:31:57.940 --> 00:32:15.409
|
||
After the update the TOAST table is doubled, but we also have 130 megabytes of WAL traffic.
|
||
|
||
|
||
304
|
||
00:32:15.409 --> 00:32:17.720
|
||
After small update.
|
||
|
||
|
||
305
|
||
00:32:17.720 --> 00:32:22.999
|
||
Because Postgres doesn't know anything about structure of JSONB.
|
||
|
||
|
||
306
|
||
00:32:22.999 --> 00:32:31.039
|
||
It just black box, double size and this is problem.
|
||
|
||
|
||
307
|
||
00:32:31.039 --> 00:32:37.519
|
||
So we have our project started this year in the beginning of this year.
|
||
|
||
|
||
308
|
||
00:32:37.519 --> 00:32:42.620
|
||
JSONB deTOAST improvement and our goal, ideal goal.
|
||
|
||
|
||
309
|
||
00:32:42.620 --> 00:32:48.129
|
||
So we want to have no dependency on JSONB size and key position.
|
||
|
||
|
||
310
|
||
00:32:48.129 --> 00:32:57.320
|
||
Our access time should be proportional to the nesting level and update time should be proportional to nesting level.
|
||
|
||
|
||
311
|
||
00:32:57.320 --> 00:32:59.870
|
||
And the key size what we update.
|
||
|
||
|
||
312
|
||
00:32:59.870 --> 00:33:04.809
|
||
Not the whole JSONB size but what we update only.
|
||
|
||
|
||
313
|
||
00:33:04.809 --> 00:33:08.379
|
||
Original TOAST doesn't use inline.
|
||
|
||
|
||
314
|
||
00:33:08.379 --> 00:33:16.649
|
||
Once you TOAST, we have a lot of space in heap just free.
|
||
|
||
|
||
315
|
||
00:33:16.649 --> 00:33:26.480
|
||
So we want to utilize inline as much as possible because access to inline many times faster than to the toast.
|
||
|
||
|
||
316
|
||
00:33:26.480 --> 00:33:33.889
|
||
And we want to compress long fields in TOAST chunks separately for independent access and update.
|
||
|
||
|
||
317
|
||
00:33:33.889 --> 00:33:38.640
|
||
So if you want to update something, you don't need to touch all chunks.
|
||
|
||
|
||
318
|
||
00:33:38.640 --> 00:33:44.039
|
||
You need to find some chunk updated and log.
|
||
|
||
|
||
319
|
||
00:33:44.039 --> 00:33:45.039
|
||
That's all.
|
||
|
||
|
||
320
|
||
00:33:45.039 --> 00:33:47.399
|
||
But this ideal.
|
||
|
||
|
||
321
|
||
00:33:47.399 --> 00:33:51.399
|
||
And we have several experiments.
|
||
|
||
|
||
322
|
||
00:33:51.399 --> 00:33:54.090
|
||
So we have partial decompression.
|
||
|
||
|
||
323
|
||
00:33:54.090 --> 00:33:57.429
|
||
We sort JSONB object keys by length.
|
||
|
||
|
||
324
|
||
00:33:57.429 --> 00:34:04.159
|
||
So short keys, they stored in the beginning.
|
||
|
||
|
||
325
|
||
00:34:04.159 --> 00:34:12.700
|
||
Partial deTOAST, partial decompression, inline TOAST, compressed TOAST chunks; there is not much time.
|
||
|
||
|
||
326
|
||
00:34:12.700 --> 00:34:15.660
|
||
So I just say their names.
|
||
|
||
|
||
327
|
||
00:34:15.660 --> 00:34:18.830
|
||
But inplace updates.
|
||
|
||
|
||
328
|
||
00:34:18.830 --> 00:34:22.030
|
||
And here we see the results.
|
||
|
||
|
||
329
|
||
00:34:22.030 --> 00:34:24.770
|
||
This is a master.
|
||
|
||
|
||
330
|
||
00:34:24.770 --> 00:34:27.310
|
||
How master behave.
|
||
|
||
|
||
331
|
||
00:34:27.310 --> 00:34:40.179
|
||
After partial decompression, for example, some keys become faster.
|
||
|
||
|
||
332
|
||
00:34:40.179 --> 00:34:43.639
|
||
Here is all keys behave the same.
|
||
|
||
|
||
333
|
||
00:34:43.639 --> 00:34:52.059
|
||
But after partial decompression some keys become faster because they're in the beginning and decompressed faster.
|
||
|
||
|
||
334
|
||
00:34:52.059 --> 00:34:56.230
|
||
After the sorting keys, we see another keys came down.
|
||
|
||
|
||
335
|
||
00:34:56.230 --> 00:35:03.490
|
||
Because key 3 for example was blocked by long key 2.
|
||
|
||
|
||
336
|
||
00:35:03.490 --> 00:35:09.810
|
||
Was blocked after the sorting key 3 becomes behind the long objects.
|
||
|
||
|
||
337
|
||
00:35:09.810 --> 00:35:15.099
|
||
And this is, you see, it becomes lower.
|
||
|
||
|
||
338
|
||
00:35:15.099 --> 00:35:22.480
|
||
And after all this experiment, we get very interesting results, very good result.
|
||
|
||
|
||
339
|
||
00:35:22.480 --> 00:35:29.470
|
||
So we have still growing time but this we understand for the big arrays.
|
||
|
||
|
||
340
|
||
00:35:29.470 --> 00:35:31.710
|
||
Big arrays.
|
||
|
||
|
||
341
|
||
00:35:31.710 --> 00:35:39.280
|
||
And the first element in array we access faster than the last, of course.
|
||
|
||
|
||
342
|
||
00:35:39.280 --> 00:35:41.619
|
||
But what to do with this?
|
||
|
||
|
||
343
|
||
00:35:41.619 --> 00:35:44.799
|
||
We find it's another problem.
|
||
|
||
|
||
344
|
||
00:35:44.799 --> 00:35:59.890
|
||
But you see some very simple optimization with still stay with heap with TOAST, we can find like several orders of magnitude performance gain.
|
||
|
||
|
||
345
|
||
00:35:59.890 --> 00:36:10.080
|
||
Here is a very interesting picture how much different optimization gain to the performance.
|
||
|
||
|
||
346
|
||
00:36:10.080 --> 00:36:13.380
|
||
So we see that this is the short keys.
|
||
|
||
|
||
347
|
||
00:36:13.380 --> 00:36:17.200
|
||
Key 1 and key 3 for example here.
|
||
|
||
|
||
348
|
||
00:36:17.200 --> 00:36:20.849
|
||
Key 3 green sorting.
|
||
|
||
|
||
349
|
||
00:36:20.849 --> 00:36:28.270
|
||
Because it was blocked by key 2 but after sorting keys it got a lot of performance gain.
|
||
|
||
|
||
350
|
||
00:36:28.270 --> 00:36:29.270
|
||
And so on.
|
||
|
||
|
||
351
|
||
00:36:29.270 --> 00:36:34.280
|
||
So it's easy to interpret this picture.
|
||
|
||
|
||
352
|
||
00:36:34.280 --> 00:36:37.369
|
||
Slides available but not much time.
|
||
|
||
|
||
353
|
||
00:36:37.369 --> 00:36:45.810
|
||
And if we return back to this popular mistakes, the mistakes becomes not very serious.
|
||
|
||
|
||
354
|
||
00:36:45.810 --> 00:37:00.450
|
||
Because now access to this ID, which is stored inside the JSONB, does not grow infinitely but stays constant; we still have some overhead, but it's okay, more or less.
|
||
|
||
|
||
355
|
||
00:37:00.450 --> 00:37:03.930
|
||
After all this experiment.
|
||
|
||
|
||
356
|
||
00:37:03.930 --> 00:37:08.609
|
||
We have also make some experimental sliced deTOAST.
|
||
|
||
|
||
357
|
||
00:37:08.609 --> 00:37:13.950
|
||
To improve access to array elements stored in chunks.
|
||
|
||
|
||
358
|
||
00:37:13.950 --> 00:37:30.830
|
||
And the last you see this arrays elements, they now not growing infinitely but this is very experimental.
|
||
|
||
|
||
359
|
||
00:37:30.830 --> 00:37:31.830
|
||
Update.
|
||
|
||
|
||
360
|
||
00:37:31.830 --> 00:37:40.109
|
||
Update is very, very painful process in Postgres.
|
||
|
||
|
||
361
|
||
00:37:40.109 --> 00:37:43.780
|
||
And for JSONB especially.
|
||
|
||
|
||
362
|
||
00:37:43.780 --> 00:37:47.790
|
||
Here is a master.
|
||
|
||
|
||
363
|
||
00:37:47.790 --> 00:37:56.339
|
||
After the shared TOAST, shared TOAST means we update only selected chunks.
|
||
|
||
|
||
364
|
||
00:37:56.339 --> 00:38:01.710
|
||
Other chunks were shared.
|
||
|
||
|
||
365
|
||
00:38:01.710 --> 00:38:17.650
|
||
After the shared TOAST we have good results and only on the last array elements still grow because the last element in the array.
|
||
|
||
|
||
366
|
||
00:38:17.650 --> 00:38:32.119
|
||
In in place update, we have I think it's good, not good results for updates.
|
||
|
||
|
||
367
|
||
00:38:32.119 --> 00:38:37.049
|
||
Again for updates we have again several orders of magnitude.
|
||
|
||
|
||
368
|
||
00:38:37.049 --> 00:38:45.790
|
||
Updates are very important, because people use JSONB in an OLTP environment.
|
||
|
||
|
||
369
|
||
00:38:45.790 --> 00:38:48.559
|
||
So update is very important.
|
||
|
||
|
||
370
|
||
00:38:48.559 --> 00:38:52.180
|
||
Access is good for analytical processing.
|
||
|
||
|
||
371
|
||
00:38:52.180 --> 00:38:58.710
|
||
But for OLTP, update is very important.
|
||
|
||
|
||
372
|
||
00:38:58.710 --> 00:39:01.319
|
||
This is a number of blocks read.
|
||
|
||
|
||
373
|
||
00:39:01.319 --> 00:39:09.069
|
||
You see clearly we have much less blocks to read.
|
||
|
||
|
||
374
|
||
00:39:09.069 --> 00:39:16.470
|
||
Just to remind you, this is not a linear scale, this is a logarithmic scale.
|
||
|
||
|
||
375
|
||
00:39:16.470 --> 00:39:19.790
|
||
This is WAL traffic.
|
||
|
||
|
||
376
|
||
00:39:19.790 --> 00:39:23.370
|
||
In master you have very, very big WAL traffic.
|
||
|
||
|
||
377
|
||
00:39:23.370 --> 00:39:26.540
|
||
In shared it's smaller.
|
||
|
||
|
||
378
|
||
00:39:26.540 --> 00:39:30.740
|
||
And here we have very controlled WAL traffic.
|
||
|
||
|
||
379
|
||
00:39:30.740 --> 00:39:38.170
|
||
You log only what you update.
|
||
|
||
|
||
380
|
||
00:39:38.170 --> 00:39:49.740
|
||
The remain people asking what if I use JSONB and relational structure, which is better?
|
||
|
||
|
||
381
|
||
00:39:49.740 --> 00:39:57.060
|
||
So we made several again access the whole document testing.
|
||
|
||
|
||
382
|
||
00:39:57.060 --> 00:40:13.640
|
||
So this JSONB, this relational tables for relational, you need to have join, for JSON you don't need join and we have several queries and this is the result picture.
|
||
|
||
|
||
383
|
||
00:40:13.640 --> 00:40:25.900
|
||
So you see that this is a very, very fast the darker blue is good.
|
||
|
||
|
||
384
|
||
00:40:25.900 --> 00:40:27.599
|
||
This is bad.
|
||
|
||
|
||
385
|
||
00:40:27.599 --> 00:40:34.530
|
||
To access the whole document in JSONB it's no surprise, very fast.
|
||
|
||
|
||
386
|
||
00:40:34.530 --> 00:40:37.869
|
||
Because you don't need to just need to read it.
|
||
|
||
|
||
387
|
||
00:40:37.869 --> 00:40:49.200
|
||
And for but when you want to transfer JSONB, you have problem.
|
||
|
||
|
||
388
|
||
00:40:49.200 --> 00:40:53.730
|
||
Because you need to convert JSONB to the text.
|
||
|
||
|
||
389
|
||
00:40:53.730 --> 00:40:57.070
|
||
Conversion to the text in Postgres is also painful.
|
||
|
||
|
||
390
|
||
00:40:57.070 --> 00:41:21.420
|
||
You need to check all bits, you know, to characters so it's not easy operation and we see for the large JSONB the time is not very good and it becomes also here.
|
||
|
||
|
||
391
|
||
00:41:21.420 --> 00:41:30.740
|
||
Here we made experiment when we don't need to transfer in textual forum as text.
|
||
|
||
|
||
392
|
||
00:41:30.740 --> 00:41:37.960
|
||
This is called UB JSON we transfer to the client just binary transfer.
|
||
|
||
|
||
393
|
||
00:41:37.960 --> 00:41:40.640
|
||
We see it becomes better.
|
||
|
||
|
||
394
|
||
00:41:40.640 --> 00:41:45.880
|
||
There is no degradation.
|
||
|
||
|
||
395
|
||
00:41:45.880 --> 00:41:52.721
|
||
For the relational, you see the problem.
|
||
|
||
|
||
396
|
||
00:41:52.721 --> 00:41:58.339
|
||
We transfer this is the time for select.
|
||
|
||
|
||
397
|
||
00:41:58.339 --> 00:42:03.670
|
||
This is for select transfer and this is transfer binary.
|
||
|
||
|
||
398
|
||
00:42:03.670 --> 00:42:07.339
|
||
There is a method, for arrays you can transfer binary.
|
||
|
||
|
||
399
|
||
00:42:07.339 --> 00:42:14.290
|
||
And clearly we see that JSONB here is the winner.
|
||
|
||
|
||
400
|
||
00:42:14.290 --> 00:42:18.760
|
||
And this explain why it's so popular.
|
||
|
||
|
||
401
|
||
00:42:18.760 --> 00:42:21.510
|
||
Because micro service, what is a micro service?
|
||
|
||
|
||
402
|
||
00:42:21.510 --> 00:42:29.720
|
||
It's a small service which expects some predefined query, some aggregate.
|
||
|
||
|
||
403
|
||
00:42:29.720 --> 00:42:34.020
|
||
And JSONB is a actually aggregate.
|
||
|
||
|
||
404
|
||
00:42:34.020 --> 00:42:42.120
|
||
You don't need to join data from different tables.
|
||
|
||
|
||
405
|
||
00:42:42.120 --> 00:42:43.920
|
||
You just have aggregation.
|
||
|
||
|
||
406
|
||
00:42:43.920 --> 00:42:48.770
|
||
And microservice access this JSONB and performance is very good.
|
||
|
||
|
||
407
|
||
00:42:48.770 --> 00:42:50.599
|
||
Very simple.
|
||
|
||
|
||
408
|
||
00:42:50.599 --> 00:42:56.980
|
||
So if you use a microservice architecture, you will just be happy with JSONB.
|
||
|
||
|
||
409
|
||
00:42:56.980 --> 00:42:58.320
|
||
That's popular.
|
||
|
||
|
||
410
|
||
00:42:58.320 --> 00:43:06.530
|
||
But when you access key and update, you have a bit different result.
|
||
|
||
|
||
411
|
||
00:43:06.530 --> 00:43:15.410
|
||
Because the first one is the last one is select.
|
||
|
||
|
||
412
|
||
00:43:15.410 --> 00:43:27.849
|
||
The relational, current situation, and this situation with JSONB after all our optimization.
|
||
|
||
|
||
413
|
||
00:43:27.849 --> 00:43:33.500
|
||
So you see that after all optimization, as fast as relational access.
|
||
|
||
|
||
414
|
||
00:43:33.500 --> 00:43:41.940
|
||
We understand that for JSONB to get some key, you have to do some operation.
|
||
|
||
|
||
415
|
||
00:43:41.940 --> 00:43:47.760
|
||
But for relational, you just need to get some you don't need any overhead.
|
||
|
||
|
||
416
|
||
00:43:47.760 --> 00:43:57.550
|
||
And very nice is that after our optimization, we behave the same as relational.
|
||
|
||
|
||
417
|
||
00:43:57.550 --> 00:44:01.670
|
||
But for current situation like this.
|
||
|
||
|
||
418
|
||
00:44:01.670 --> 00:44:08.020
|
||
So if you want to update especially, update access key, relational is the winner.
|
||
|
||
|
||
419
|
||
00:44:08.020 --> 00:44:12.619
|
||
Our optimization helps a lot.
|
||
|
||
|
||
420
|
||
00:44:12.619 --> 00:44:15.470
|
||
This is slowdown.
|
||
|
||
|
||
421
|
||
00:44:15.470 --> 00:44:23.230
|
||
You see that JSONB slow down against this relational.
|
||
|
||
|
||
422
|
||
00:44:23.230 --> 00:44:28.220
|
||
But after our optimization, we have you see like the same.
|
||
|
||
|
||
423
|
||
00:44:28.220 --> 00:44:31.799
|
||
Like relational.
|
||
|
||
|
||
424
|
||
00:44:31.799 --> 00:44:34.650
|
||
And the same is update slowdown.
|
||
|
||
|
||
425
|
||
00:44:34.650 --> 00:44:46.770
|
||
So for update we still have some slowdown; this is the original JSONB in master, and here is ours.
|
||
|
||
|
||
426
|
||
00:44:46.770 --> 00:44:51.359
|
||
So our optimization helps for updates.
|
||
|
||
|
||
427
|
||
00:44:51.359 --> 00:44:54.880
|
||
And also WAL traffic.
|
||
|
||
|
||
428
|
||
00:44:54.880 --> 00:45:00.809
|
||
So we're here for master, we hit a lot of WAL traffic.
|
||
|
||
|
||
429
|
||
00:45:00.809 --> 00:45:02.609
|
||
We log a lot.
|
||
|
||
|
||
430
|
||
00:45:02.609 --> 00:45:08.500
|
||
This is relational and this is our optimization.
|
||
|
||
|
||
431
|
||
00:45:08.500 --> 00:45:12.670
|
||
There looks like the same.
|
||
|
||
|
||
432
|
||
00:45:12.670 --> 00:45:16.110
|
||
Also we have access array member.
|
||
|
||
|
||
433
|
||
00:45:16.110 --> 00:45:17.300
|
||
Very popular operation.
|
||
|
||
|
||
434
|
||
00:45:17.300 --> 00:45:21.530
|
||
We have array and you want to get some member of this array.
|
||
|
||
|
||
435
|
||
00:45:21.530 --> 00:45:32.099
|
||
We have first key, middle key and the last key.
|
||
|
||
|
||
436
|
||
00:45:32.099 --> 00:45:39.680
|
||
And here's relational, optimized, and non optimized JSONB.
|
||
|
||
|
||
437
|
||
00:45:39.680 --> 00:45:50.000
|
||
So we see access array member is not very good for JSONB, but with our optimization,
|
||
|
||
|
||
438
|
||
00:45:50.000 --> 00:45:56.059
|
||
again, again we approach to the relational.
|
||
|
||
|
||
439
|
||
00:45:56.059 --> 00:46:02.650
|
||
And update array member, we also compare how to update array member.
|
||
|
||
|
||
440
|
||
00:46:02.650 --> 00:46:05.960
|
||
And here is JSONB.
|
||
|
||
|
||
441
|
||
00:46:05.960 --> 00:46:09.920
|
||
This is JSONB optimize and this is relational.
|
||
|
||
|
||
442
|
||
00:46:09.920 --> 00:46:15.869
|
||
Of course when you update array member in relational table, it's very easy.
|
||
|
||
|
||
443
|
||
00:46:15.869 --> 00:46:21.020
|
||
You just update one row.
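
NOTE (editor)
A rough illustration of the contrast discussed here, with assumed schemas: relationally
one array element is one small row, while with JSONB the whole document is read,
modified and rewritten (including its TOAST chunks once it is large):
  -- relational: update a single element-row
  UPDATE test_items SET value = 42 WHERE doc_id = 1 AND idx = 500;
  -- jsonb: rewrite the whole document to change one element
  UPDATE test_docs SET js = jsonb_set(js, '{items,500}', '42') WHERE id = 1;
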
|
||
|
||
|
||
444
|
||
00:46:21.020 --> 00:46:27.539
|
||
And for the JSONB, you update the whole JSONB.
|
||
|
||
|
||
445
|
||
00:46:27.539 --> 00:46:30.839
|
||
We can be very big.
|
||
|
||
|
||
446
|
||
00:46:30.839 --> 00:46:37.670
|
||
But our optimization helps a lot again.
|
||
|
||
|
||
447
|
||
00:46:37.670 --> 00:46:39.420
|
||
This is a WAL.
|
||
|
||
|
||
448
|
||
00:46:39.420 --> 00:46:46.109
|
||
And conclusion is that JSONB is good for full object access.
|
||
|
||
|
||
449
|
||
00:46:46.109 --> 00:46:47.329
|
||
So microservices.
|
||
|
||
|
||
450
|
||
00:46:47.329 --> 00:46:49.190
|
||
It's much faster than relational way.
|
||
|
||
|
||
451
|
||
00:46:49.190 --> 00:46:55.480
|
||
In relational way you have to join, have you to aggregate and very difficult to tune the process.
|
||
|
||
|
||
452
|
||
00:46:55.480 --> 00:46:57.900
|
||
With JSONB you have no problem.
|
||
|
||
|
||
453
|
||
00:46:57.900 --> 00:47:00.799
|
||
Also JSONB is very good for storing metadata.
|
||
|
||
|
||
454
|
||
00:47:00.799 --> 00:47:07.710
|
||
In short, metadata in separate JSONB field.
|
||
|
||
|
||
455
|
||
00:47:07.710 --> 00:47:14.289
|
||
And currently PG14 not optimized, as I showed you, for update.
|
||
|
||
|
||
456
|
||
00:47:14.289 --> 00:47:16.430
|
||
And access to array member.
|
||
|
||
|
||
457
|
||
00:47:16.430 --> 00:47:27.400
|
||
But we demonstrated all our optimizations, which resulted in orders of magnitude of improvement for select and update.
|
||
|
||
|
||
458
|
||
00:47:27.400 --> 00:47:35.270
|
||
And the question is how to integrate now all this, our patches, to the Postgres.
|
||
|
||
|
||
459
|
||
00:47:35.270 --> 00:47:38.920
|
||
And the first step is to make a data type aware TOAST.
|
||
|
||
|
||
460
|
||
00:47:38.920 --> 00:47:45.470
|
||
Because currently TOAST is the common for all data type.
|
||
|
||
|
||
461
|
||
00:47:45.470 --> 00:47:52.330
|
||
But we suggest that TOAST should be extended.
|
||
|
||
|
||
462
|
||
00:47:52.330 --> 00:47:57.990
|
||
So data type knows better how to TOAST it.
|
||
|
||
|
||
463
|
||
00:47:57.990 --> 00:48:10.819
|
||
And that allows us to, allows many other developers give a lot of performance improving.
|
||
|
||
|
||
464
|
||
00:48:10.819 --> 00:48:15.900
|
||
We have example when we improve streaming.
|
||
|
||
|
||
465
|
||
00:48:15.900 --> 00:48:23.880
|
||
You know, some people want to stream data into the Postgres.
|
||
|
||
|
||
466
|
||
00:48:23.880 --> 00:48:25.460
|
||
For example, movie.
|
||
|
||
|
||
467
|
||
00:48:25.460 --> 00:48:29.410
|
||
It's crazy, but they stream there.
|
||
|
||
|
||
468
|
||
00:48:29.410 --> 00:48:34.970
|
||
Before you image how it behaves slowly because it's logged every time.
|
||
|
||
|
||
469
|
||
00:48:34.970 --> 00:48:43.540
|
||
You add one byte and you log the whole one gigabyte.
|
||
|
||
|
||
470
|
||
00:48:43.540 --> 00:48:49.930
|
||
After our optimization we have special extension, it works very fast.
|
||
|
||
|
||
471
|
||
00:48:49.930 --> 00:48:52.180
|
||
We log only this one byte.
|
||
|
||
|
||
472
|
||
00:48:52.180 --> 00:48:54.650
|
||
That's all.
|
||
|
||
|
||
473
|
||
00:48:54.650 --> 00:49:01.810
|
||
Because we made a special bytea here, a special TOAST.
|
||
|
||
|
||
474
|
||
00:49:01.810 --> 00:49:09.650
|
||
So we need to the community accept some data type.
|
||
|
||
|
||
475
|
||
00:49:09.650 --> 00:49:11.710
|
||
Just two slides.
|
||
|
||
|
||
476
|
||
00:49:11.710 --> 00:49:14.309
|
||
So we have to do.
|
||
|
||
|
||
477
|
||
00:49:14.309 --> 00:49:20.849
|
||
On physical level we provide random access to keys and arrays.
|
||
|
||
|
||
478
|
||
00:49:20.849 --> 00:49:22.809
|
||
On physical level this is easy.
|
||
|
||
|
||
479
|
||
00:49:22.809 --> 00:49:26.720
|
||
We already have sliced deTOAST.
|
||
|
||
|
||
480
|
||
00:49:26.720 --> 00:49:27.880
|
||
We need to do some compression.
|
||
|
||
|
||
481
|
||
00:49:27.880 --> 00:49:32.220
|
||
But it's most important to make it at the logical level.
|
||
|
||
|
||
482
|
||
00:49:32.220 --> 00:49:40.170
|
||
So to know which TOAST chunk we need, we have to work on the logical level.
|
||
|
||
|
||
483
|
||
00:49:40.170 --> 00:49:43.150
|
||
And this number of patches.
|
||
|
||
|
||
484
|
||
00:49:43.150 --> 00:49:47.359
|
||
This not exists, not exists yes.
|
||
|
||
|
||
485
|
||
00:49:47.359 --> 00:49:57.490
|
||
But this all our patches and our roadmap how to work with community to submit all this picture.
|
||
|
||
|
||
486
|
||
00:49:57.490 --> 00:50:00.579
|
||
All this our result.
|
||
|
||
|
||
487
|
||
00:50:00.579 --> 00:50:03.049
|
||
And references.
|
||
|
||
|
||
488
|
||
00:50:03.049 --> 00:50:06.880
|
||
And I invited you to join our development team.
|
||
|
||
|
||
489
|
||
00:50:06.880 --> 00:50:12.650
|
||
This is not a company project, this is open source community project.
|
||
|
||
|
||
490
|
||
00:50:12.650 --> 00:50:14.480
|
||
So everybody invited to join us.
|
||
|
||
|
||
491
|
||
00:50:14.480 --> 00:50:23.950
|
||
This is what Simon asked me about: a non-scientific comparison of Postgres with Mongo.
|
||
|
||
|
||
492
|
||
00:50:23.950 --> 00:50:30.369
|
||
I said nonscientific because scientific benchmark is very, very complicated task.
|
||
|
||
|
||
493
|
||
00:50:30.369 --> 00:50:31.750
|
||
Very, very.
|
||
|
||
|
||
494
|
||
00:50:31.750 --> 00:50:35.900
|
||
But here is nonscientific.
|
||
|
||
|
||
495
|
||
00:50:35.900 --> 00:50:46.339
|
||
Mongo need double size of Postgres because Mongo keep uncompressed data in memory.
|
||
|
||
|
||
496
|
||
00:50:46.339 --> 00:50:55.000
|
||
And we see that this is manual progress.
|
||
|
||
|
||
497
|
||
00:50:55.000 --> 00:50:59.680
|
||
After our optimization, we have performance better than Mongo.
|
||
|
||
|
||
498
|
||
00:50:59.680 --> 00:51:11.010
|
||
But if we turn on parallel support, because all this without any parallelism to compare.
|
||
|
||
|
||
499
|
||
00:51:11.010 --> 00:51:18.599
|
||
But after the parallel support we have very fast Postgres compared to Mongo.
|
||
|
||
|
||
500
|
||
00:51:18.599 --> 00:51:30.610
|
||
But as I said, this is Mongo, when memory is just 4 gigabytes, and Mongo is not very good when you don't have enough memory.
|
||
|
||
|
||
501
|
||
00:51:30.610 --> 00:51:33.510
|
||
So it behave like the Postgres.
|
||
|
||
|
||
502
|
||
00:51:33.510 --> 00:51:35.440
|
||
Postgres is much better.
|
||
|
||
|
||
503
|
||
00:51:35.440 --> 00:51:39.650
|
||
It works with memory.
|
||
|
||
|
||
504
|
||
00:51:39.650 --> 00:51:51.530
|
||
So that means that our community has a good chance to attract Mongo users to Postgres.
|
||
|
||
|
||
505
|
||
00:51:51.530 --> 00:51:54.890
|
||
Because Postgres is a very good, solid, database.
|
||
|
||
|
||
506
|
||
00:51:54.890 --> 00:51:55.980
|
||
Good community.
|
||
|
||
|
||
507
|
||
00:51:55.980 --> 00:51:59.250
|
||
We're all open source, independent.
|
||
|
||
|
||
508
|
||
00:51:59.250 --> 00:52:00.490
|
||
And we have JSON.
|
||
|
||
|
||
509
|
||
00:52:00.490 --> 00:52:08.770
|
||
Just need to better performance and to be more friendly to the young people who started working with Postgres.
|
||
|
||
|
||
510
|
||
00:52:08.770 --> 00:52:11.089
|
||
This is picture of my kids.
|
||
|
||
|
||
511
|
||
00:52:11.089 --> 00:52:13.470
|
||
They climb trees.
|
||
|
||
|
||
512
|
||
00:52:13.470 --> 00:52:16.300
|
||
And sometimes they tear their pants.
|
||
|
||
|
||
513
|
||
00:52:16.300 --> 00:52:17.740
|
||
And I have two options.
|
||
|
||
|
||
514
|
||
00:52:17.740 --> 00:52:19.070
|
||
I can forbid them to do this.
|
||
|
||
|
||
515
|
||
00:52:19.070 --> 00:52:21.380
|
||
I can teach them.
|
||
|
||
|
||
516
|
||
00:52:21.380 --> 00:52:26.640
|
||
So let's say that JSON is not the wrong technology.
|
||
|
||
|
||
517
|
||
00:52:26.640 --> 00:52:29.380
|
||
Let's make it a first class citizen in Postgres.
|
||
|
||
|
||
518
|
||
00:52:29.380 --> 00:52:31.359
|
||
Be friendly.
|
||
|
||
|
||
519
|
||
00:52:31.359 --> 00:52:44.680
|
||
Because still some senior Postgres people still say that oh, relational database, relations, you need just to read these books.
|
||
|
||
|
||
520
|
||
00:52:44.680 --> 00:52:47.970
|
||
But young people, startups, they don't have time.
|
||
|
||
|
||
521
|
||
00:52:47.970 --> 00:52:52.640
|
||
They can hire some very sophisticated senior database architecture.
|
||
|
||
|
||
522
|
||
00:52:52.640 --> 00:52:56.180
|
||
They want to make their business.
|
||
|
||
|
||
523
|
||
00:52:56.180 --> 00:52:58.359
|
||
They need JSON.
|
||
|
||
|
||
524
|
||
00:52:58.359 --> 00:53:05.000
|
||
So my position is to make Postgres friendly to these people.
|
||
|
||
|
||
525
|
||
00:53:05.000 --> 00:53:08.220
|
||
This is actually our duty.
|
||
|
||
|
||
526
|
||
00:53:08.220 --> 00:53:11.180
|
||
Database should be smart.
|
||
|
||
|
||
527
|
||
00:53:11.180 --> 00:53:15.490
|
||
People should work should do their project.
|
||
|
||
|
||
528
|
||
00:53:15.490 --> 00:53:25.359
|
||
So that's why I say that at least at the end, we are universal database.
|
||
|
||
|
||
529
|
||
00:53:25.359 --> 00:53:28.940
|
||
So I say that all you need is just Postgres.
|
||
|
||
|
||
530
|
||
00:53:28.940 --> 00:53:39.160
|
||
With JSON we will have a lot of fun, a lot of new people, and our popularity will continue to grow.
|
||
|
||
|
||
531
|
||
00:53:39.160 --> 00:53:41.530
|
||
That is what I want to finish.
|
||
|
||
|
||
532
|
||
00:53:41.530 --> 00:53:42.530
|
||
Thank you for attendance.
|
||
|
||
|
||
533
|
||
00:53:42.530 --> 00:53:44.530
|
||
[ Applause ] I think we have not much time for questions and answers.
|
||
|
||
|
||
534
|
||
00:53:44.530 --> 00:53:45.530
|
||
I will be available the whole day.
|
||
|
||
|
||
535
|
||
00:53:45.530 --> 00:53:46.530
|
||
You can ask me, you can discuss with me what I'm very interested in your data.
|
||
|
||
|
||
536
|
||
00:53:46.530 --> 00:53:47.530
|
||
In your data, in your query.
|
||
|
||
|
||
537
|
||
00:53:47.530 --> 00:53:48.530
|
||
You know it's very difficult to optimize something if you don't have real data and a real query.
|
||
|
||
|
||
538
|
||
00:53:48.530 --> 00:53:49.530
|
||
So I don't need personal data.
|
||
|
||
|
||
539
|
||
00:53:49.530 --> 00:53:49.542
|
||
If you can share, it will be very useful and help us.
|