R Series: Drawing Venn Diagrams and UpSet Plots

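This post covers drawing Venn diagrams with the VennDiagram package and works around its two main drawbacks: the plot cannot be displayed directly and has to be saved to a file before it can be viewed, and every run produces a log file. It then covers drawing UpSet plots with UpSetR.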

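Venn diagrams

I have long known that VennDiagram can draw Venn diagrams, but I avoided it because of two serious drawbacks:

  • The plot cannot be displayed directly; it has to be saved to a file before it can be viewed, which makes tuning a figure tedious
  • Every run produces a log file, so repeated debugging runs leave a pile of useless log files behind

It turns out that both problems can be solved. This post records the fixes while walking through how to use the package.

Displaying the plot without saving it

The following shows the plot directly, with no need to save it to a file first: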

library(VennDiagram)
temp <- venn.diagram(list(B = 1:1800, A = 1571:2020),
                     fill = c("red", "green"), alpha = c(0.5, 0.5), cex = 2, cat.fontface = 4,
                     lty = 2, filename = NULL)
# Show the plot directly on the current graphics device
## Start a fresh blank page so the new plot is not drawn over an earlier one
grid.newpage()
## Draw the diagram on that blank page
grid.draw(temp)

# Save the plot the same way as any other grid graphic
pdf(file = "test.pdf")
grid.newpage()
grid.draw(temp)
dev.off()

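Key points:

  • Pass filename = NULL when building the plot, so no file name needs to be given
  • Use grid.draw to display the result

Reference link (also covers saving several Venn diagrams into a single PDF)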
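Saving several diagrams into one PDF follows from the same two key points. The sketch below is one possible way to do it, not necessarily the method behind the reference link: set_lists and out_file are hypothetical names, and the log threshold line is only there to keep the example from leaving log files behind.

```r
library(VennDiagram)  # also attaches grid and futile.logger

# Keep this example from writing a log file on every venn.diagram call
futile.logger::flog.threshold(futile.logger::ERROR, name = "VennDiagramLogger")

# Hypothetical input: each element is one list of sets, drawn on its own page
set_lists <- list(
  first  = list(A = 1:150, B = 121:170),
  second = list(B = 1:1800, A = 1571:2020)
)

# Build every diagram in memory; filename = NULL returns the grid objects
plots <- lapply(set_lists, function(sets) {
  venn.diagram(sets, filename = NULL)
})

# Write all diagrams to one PDF, one page per diagram
out_file <- file.path(tempdir(), "venn_multi.pdf")
pdf(file = out_file)
for (p in plots) {
  grid.newpage()
  grid.draw(p)
}
dev.off()
```

Each grid.newpage() call starts a new page of the open PDF device, so the number of pages matches the number of entries in set_lists.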


Suppressing the log file

Log file contents:

INFO [2020-04-01 09:42:08] [[1]]
INFO [2020-04-01 09:42:08] list(B = 1:1800, A = 1571:2020)
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $fill
INFO [2020-04-01 09:42:08] c("red", "green")
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $alpha
INFO [2020-04-01 09:42:08] c(0.5, 0.5)
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $cex
INFO [2020-04-01 09:42:08] [1] 2
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $cat.fontface
INFO [2020-04-01 09:42:08] [1] 4
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $lty
INFO [2020-04-01 09:42:08] [1] 2
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $fontfamily
INFO [2020-04-01 09:42:08] [1] 3
INFO [2020-04-01 09:42:08]
INFO [2020-04-01 09:42:08] $filename
INFO [2020-04-01 09:42:08] NULL
INFO [2020-04-01 09:42:08]

Nothing in it is important, so suppressing it outright does no harm.

VennDiagram writes its log through the futile.logger package, which is visible when VennDiagram is loaded:

library(VennDiagram)
Loading required package: grid
Loading required package: futile.logger

Once the logging package is known, the output can be limited by raising the log threshold (every entry in the log file shown above is at level INFO):

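# Add this line to raise the log threshold to ERROR, which drops the INFO entries
futile.logger::flog.threshold(futile.logger::ERROR, name = "VennDiagramLogger")
temp <- venn.diagram(list(B = 1:1800, A = 1571:2020),
                     fill = c("red", "green"), alpha = c(0.5, 0.5), cex = 2, cat.fontface = 4,
                     lty = 2, filename = NULL)
grid.newpage()
grid.draw(temp)

# To restore logging, set the threshold back to INFO
futile.logger::flog.threshold(futile.logger::INFO, name = "VennDiagramLogger")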
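The threshold switch can also be wrapped in a small helper so that it is restored automatically. This is a minimal sketch: quiet_venn is a hypothetical name, and it assumes that INFO is the level to go back to, as in the restore line above.

```r
library(VennDiagram)

# Hypothetical helper: build a Venn diagram in memory without INFO log entries
quiet_venn <- function(x, ...) {
  # Raise the threshold so the INFO entries are dropped
  futile.logger::flog.threshold(futile.logger::ERROR, name = "VennDiagramLogger")
  # Put the threshold back to INFO when the function exits, even on error
  on.exit(
    futile.logger::flog.threshold(futile.logger::INFO, name = "VennDiagramLogger"),
    add = TRUE
  )
  venn.diagram(x, filename = NULL, ...)
}

temp <- quiet_venn(list(B = 1:1800, A = 1571:2020),
                   fill = c("red", "green"), alpha = c(0.5, 0.5), lty = 2)
grid.newpage()
grid.draw(temp)
```

Using on.exit keeps the change local to the call, so other code that relies on the logger is not affected.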


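Usage

With both obstacles out of the way, the package itself can be explored. Below are examples of figures that VennDiagram can produce, kept here for quick reference: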

## Not run: 
# Example to print to screen
venn.plot <- venn.diagram(
x = list(
sample1 = c(1:40),
sample2 = c(30:60)
),
filename = NULL
);

# Save picture to non-TIFF file type
# currently working on adding this functionality directly into venn.diagram
venn.plot <- venn.diagram(
x = list (
A = 1:10,
B = 6:25
),
filename = NULL
);

jpeg("venn_jpeg.jpg");
grid.draw(venn.plot);
dev.off();

## End(Not run)

#dontrun-starts-here
### NB: All figures from the paper can be run, but are turned off from
### automatic execution to reduce burden on CRAN computing resources.
## Not run:
# Figure 1A
venn.plot <- venn.diagram(
x = list(
Label = 1:100
),
filename = "1A-single_Venn.tiff",
col = "black",
lwd = 9,
fontface = "bold",
fill = "grey",
alpha = 0.75,
cex = 4,
cat.cex = 3,
cat.fontface = "bold",
);

# Figure 1B
venn.plot <- venn.diagram(
x = list(
X = 1:150,
Y = 121:180
),
filename = "1B-double_Venn.tiff",
lwd = 4,
fill = c("cornflowerblue", "darkorchid1"),
alpha = 0.75,
label.col = "white",
cex = 4,
fontfamily = "serif",
fontface = "bold",
cat.col = c("cornflowerblue", "darkorchid1"),
cat.cex = 3,
cat.fontfamily = "serif",
cat.fontface = "bold",
cat.dist = c(0.03, 0.03),
cat.pos = c(-20, 14)
);

# Figure 1C
venn.plot <- venn.diagram(
x = list(
R = c(1:70, 71:110, 111:120, 121:140),
B = c(141:200, 71:110, 111:120, 201:230),
G = c(231:280, 111:120, 121:140, 201:230)
),
filename = "1C-triple_Venn.tiff",
col = "transparent",
fill = c("red", "blue", "green"),
alpha = 0.5,
label.col = c("darkred", "white", "darkblue", "white",
"white", "white", "darkgreen"),
cex = 2.5,
fontfamily = "serif",
fontface = "bold",
cat.default.pos = "text",
cat.col = c("darkred", "darkblue", "darkgreen"),
cat.cex = 2.5,
cat.fontfamily = "serif",
cat.dist = c(0.06, 0.06, 0.03),
cat.pos = 0
);

# Figure 1D
venn.plot <- venn.diagram(
x = list(
I = c(1:60, 61:105, 106:140, 141:160, 166:175, 176:180, 181:205,
206:220),
IV = c(531:605, 476:530, 336:375, 376:405, 181:205, 206:220, 166:175,
176:180),
II = c(61:105, 106:140, 181:205, 206:220, 221:285, 286:335, 336:375,
376:405),
III = c(406:475, 286:335, 106:140, 141:160, 166:175, 181:205, 336:375,
476:530)
),
filename = "1D-quadruple_Venn.tiff",
col = "black",
lty = "dotted",
lwd = 4,
fill = c("cornflowerblue", "green", "yellow", "darkorchid1"),
alpha = 0.50,
label.col = c("orange", "white", "darkorchid4", "white", "white", "white",
"white", "white", "darkblue", "white",
"white", "white", "white", "darkgreen", "white"),
cex = 2.5,
fontfamily = "serif",
fontface = "bold",
cat.col = c("darkblue", "darkgreen", "orange", "darkorchid4"),
cat.cex = 2.5,
cat.fontfamily = "serif"
);

# Figure 2-1
venn.plot <- venn.diagram(
x = list(
A = 1:105,
B = 101:115
),
filename = "2-1_special_case_ext-text.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = c(-20, 20),
ext.line.lty = "dotted",
ext.line.lwd = 2,
ext.pos = 12,
ext.dist = -0.12,
ext.length = 0.85
);

# Figure 2-2
venn.plot <- venn.diagram(
x = list(
A = 1:100,
B = 1:10
),
filename = "2-2_special_case_pairwise-inclusion.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = 0
);

# Figure 2-3
venn.plot <- venn.diagram(
x = list(
A = 1:150,
B = 151:250
),
filename = "2-3_special_case_pairwise-exclusion.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = c(0, 0),
cat.dist = 0.05
);

# Figure 2-4
venn.plot <- venn.diagram(
x = list(
A = c(1:50, 101:140, 141:160, 161:170),
B = c(171:230, 101:140, 161:170, 291:320),
C = c(141:160, 161:170, 291:320)
),
filename = "2-4_triple_special_case-001.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.dist = c(0.05, 0.05, -0.1)
);

# Figure 2-5
venn.plot <- venn.diagram(
x = list(
A = c(1:100),
B = c(61:70, 71:100),
C = c(41:60, 61:70)
),
filename = "2-5_triple_special_case-012AA.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = c(-25, 0, 30),
cat.dist = c(0.05, 0.05, 0.02)
);

# Figure 2-6
venn.plot <- venn.diagram(
x = list(
A = c(1:90),
B = c(1:25),
C = c(1:5)
),
filename = "2-6_triple_special_case-022AAAO.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = 0,
cat.dist = c(0.03, 0.03, 0.01)
);

# Figure 2-7
venn.plot <- venn.diagram(
x = list(
A = c(1:20),
B = c(21:80),
C = c(81:210)
),
filename = "2-7_triple_special_case-100.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.dist = 0.05
);

# Figure 2-8
venn.plot <- venn.diagram(
x = list(
A = c(1:80),
B = c(41:150),
C = c(71:100)
),
filename = "2-8_triple_special_case-011A.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.dist = c(0.07, 0.07, 0.02),
cat.pos = c(-20, 20, 20)
);

# Figure 2-9
venn.plot <- venn.diagram(
x = list(
A = c(1:10),
B = c(11:90),
C = c(81:90)
),
filename = "2-9_triple_special_case-121AO.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = 0,
cat.dist = c(0.04, 0.04, 0.02),
reverse = TRUE
);

# Complex three-way Venn with labels & sub-/super-scripts
venn.plot <- venn.diagram(
x = list(
I = c(1:60, 61:105, 106:140, 141:160, 166:175, 176:180, 181:205,
206:220),
II = c(531:605, 476:530, 336:375, 376:405, 181:205, 206:220, 166:175,
176:180),
III = c(61:105, 106:140, 181:205, 206:220, 221:285, 286:335, 336:375,
376:405)
),
category.names = c(
expression( bold('A'['1: subscript']) ),
expression( bold('B'^'2: going up') ),
expression( paste(bold('C'^'3'), bold('X'['i' <= 'r'^'2']^'2') ) )
),
filename = 'Fig3-1_triple_labels_sub_and_superscripts.tiff',
output = TRUE,
height = 3000,
width = 3000,
resolution = 300,
compression = 'lzw',
units = 'px',
lwd = 6,
lty = 'blank',
fill = c('yellow', 'purple', 'green'),
cex = 3.5,
fontface = "bold",
fontfamily = "sans",
cat.cex = 3,
cat.fontface = "bold",
cat.default.pos = "outer",
cat.pos = c(-27, 27, 135),
cat.dist = c(0.055, 0.055, 0.085),
cat.fontfamily = "sans",
rotation = 1
);

# Complex 3-way Venn using expressions
venn.plot <- venn.diagram(
x = list(
"Num A" = paste("Num", 1:100),
"Num B" = c(paste("Num", 61:70), paste("Num", 71:100)),
"Num C" = c(paste("Num", 41:60), paste("Num", 61:70))),
category.names = c(
expression( bold('A'['1']) ),
expression( bold('A'['2']) ),
expression( bold('A'['3']) )
),
euler.d = TRUE,
filename = "Fig3-2_Euler_3set_simple_with_subscripts.tiff",
cat.pos = c(-20, 0, 20),
cat.dist = c(0.05, 0.05, 0.02),
cex = 2.5,
cat.cex = 2.5,
reverse = TRUE
);

venn.plot <- venn.diagram(
list(A = 1:150, B = 121:170),
"Venn_2set_simple.tiff"
);
venn.plot <- venn.diagram(
list(A = 1:150, B = 121:170, C = 101:200),
"Venn_3set_simple.tiff"
);

## End(Not run)

# a more elaborate two-set Venn diagram with title and subtitle
venn.plot <- venn.diagram(
x = list(
"A" = 1:100,
"B" = 96:140
),
filename = "Venn_2set_complex.tiff",
scaled = TRUE,
ext.text = TRUE,
ext.line.lwd = 2,
ext.dist = -0.15,
ext.length = 0.9,
ext.pos = -4,
inverted = TRUE,
cex = 2.5,
cat.cex = 2.5,
rotation.degree = 45,
main = "Complex Venn Diagram",
sub = "Featuring: rotation and external lines",
main.cex = 2,
sub.cex = 1
);

## Not run:
# sample three-set Euler diagram
venn.plot <- venn.diagram(
x = list(
"Num A" = paste("Num", 1:100),
"Num B" = c(paste("Num", 61:70), paste("Num", 71:100)),
"Num C" = c(paste("Num", 41:60), paste("Num", 61:70))),
euler.d = TRUE,
filename = "Euler_3set_simple.tiff",
cat.pos = c(-20, 0, 20),
cat.dist = c(0.05, 0.05, 0.02),
cex = 2.5,
cat.cex = 2.5,
reverse = TRUE
);

# sample three-set Euler diagram
venn.plot <- venn.diagram(
x = list(
A = c(1:10),
B = c(11:90),
C = c(81:90)
),
euler.d = TRUE,
filename = "Euler_3set_scaled.tiff",
cex = 2.5,
cat.cex = 2.5,
cat.pos = 0
);

## End(Not run)

# sample four-set Venn Diagram
A <- sample(1:1000, 400, replace = FALSE);
B <- sample(1:1000, 600, replace = FALSE);
C <- sample(1:1000, 350, replace = FALSE);
D <- sample(1:1000, 550, replace = FALSE);
E <- sample(1:1000, 375, replace = FALSE);

venn.plot <- venn.diagram(
x = list(
A = A,
D = D,
B = B,
C = C
),
filename = "Venn_4set_pretty.tiff",
col = "transparent",
fill = c("cornflowerblue", "green", "yellow", "darkorchid1"),
alpha = 0.50,
label.col = c("orange", "white", "darkorchid4", "white",
"white", "white", "white", "white", "darkblue", "white",
"white", "white", "white", "darkgreen", "white"),
cex = 1.5,
fontfamily = "serif",
fontface = "bold",
cat.col = c("darkblue", "darkgreen", "orange", "darkorchid4"),
cat.cex = 1.5,
cat.pos = 0,
cat.dist = 0.07,
cat.fontfamily = "serif",
rotation.degree = 270,
margin = 0.2
);

# sample five-set Venn Diagram
venn.plot <- venn.diagram(
x = list(
A = A,
B = B,
C = C,
D = D,
E = E
),
filename = "Venn_5set_pretty.tiff",
col = "black",
fill = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
alpha = 0.50,
cex = c(1.5, 1.5, 1.5, 1.5, 1.5, 1, 0.8, 1, 0.8, 1, 0.8, 1, 0.8,
1, 0.8, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 0.55, 1, 1, 1, 1, 1, 1.5),
cat.col = c("dodgerblue", "goldenrod1", "darkorange1", "seagreen3", "orchid3"),
cat.cex = 1.5,
cat.fontface = "bold",
margin = 0.05
);

R_venn_VennDiagram_all.png


Common settings

  • Title position: main.pos = c(0.5, 1.05), where c(0.5, 1.05) is the (x, y) coordinate of the title
  • Place the set labels outside the circles: cat.default.pos = "outer"
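The two settings above can be combined in one call. The sketch below reuses a two-set input from the examples earlier in this post; the title text is made up for illustration.

```r
library(VennDiagram)

# Keep this example from writing a log file
futile.logger::flog.threshold(futile.logger::ERROR, name = "VennDiagramLogger")

venn.plot <- venn.diagram(
  x = list(A = 1:150, B = 121:170),
  filename = NULL,
  main = "Two example sets",     # illustrative title
  main.pos = c(0.5, 1.05),       # (x, y) position of the title
  cat.default.pos = "outer"      # labels outside the circles
)
grid.newpage()
grid.draw(venn.plot)
```

Changing the second value of main.pos moves the title up or down, which is usually the quickest way to tune its distance from the diagram.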

Reference link



UpSet plot

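An UpSet plot is essentially an unrolled Venn diagram: a bar chart shows the size of each intersection, and connected dots show which sets make up that intersection. It exists because a Venn diagram becomes hard to read once the number of sets grows, so it no longer works well as a visualization.

Installation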

# Install the package
install.packages("UpSetR")
# Load the package
library("UpSetR")

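Input data

UpSetR mainly accepts three kinds of input:

  • A data frame; the set intersections are then computed from its columns
  • A vector of intersection sizes, where & in a name marks an intersection between sets and the value is the size of that intersection
  • The very common form of a named list of vectors; UpSetR works out the intersections between the vectors automatically

Data frame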

movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")

# Inspect the data
movies[1:5, 1:5]
                                Name ReleaseDate Action Adventure Children
1                   Toy Story (1995)        1995      0         0        1
2                     Jumanji (1995)        1995      0         1        1
3            Grumpier Old Men (1995)        1995      0         0        0
4           Waiting to Exhale (1995)        1995      0         0        0
5 Father of the Bride Part II (1995)        1995      0         0        0

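The data set classifies movies by genre: action, adventure, and so on. Some movies belong to a single genre, while others belong to several; Jumanji, for example, is both Adventure and Children. Grouping the movies by genre therefore produces sets that overlap.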
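The overlap can be checked directly from the 0/1 genre columns. The sketch below assumes that columns 3 to 19 of movies hold the genre flags; genre_cols, genres_per_movie and jumanji_genres are names introduced here for illustration.

```r
library(UpSetR)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

# Assumption: columns 3 to 19 are the 0/1 genre indicator columns
genre_cols <- names(movies)[3:19]

# Number of genres each movie belongs to
genres_per_movie <- rowSums(movies[, genre_cols])

# How many movies fall into 0, 1, 2, ... genres
table(genres_per_movie)

# Genres of the second row (Jumanji in the listing above)
jumanji_genres <- genre_cols[unlist(movies[2, genre_cols]) == 1]
jumanji_genres
```

Every movie counted under two or more genres contributes to an intersection in the UpSet plot rather than to a single-set bar.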


Intersection expression

expressionInput <- c(one = 2, two = 1, three = 2, `one&two` = 1, `one&three` = 4, 
`two&three` = 1, `one&two&three` = 2)

Named list of vectors

listInput <- list(one = c(1, 2, 3, 5, 7, 8, 11, 12, 13), two = c(1, 2, 4, 5, 
10), three = c(1, 5, 6, 7, 8, 9, 10, 12, 13))
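The list input and the expression input describe the same sets in two ways. The base R sketch below counts, for every element of listInput, the exact combination of sets it belongs to, and compares the result with expressionInput; elements, membership, combo and sizes are names introduced here for illustration.

```r
listInput <- list(
  one   = c(1, 2, 3, 5, 7, 8, 11, 12, 13),
  two   = c(1, 2, 4, 5, 10),
  three = c(1, 5, 6, 7, 8, 9, 10, 12, 13)
)
expressionInput <- c(one = 2, two = 1, three = 2, `one&two` = 1, `one&three` = 4,
                     `two&three` = 1, `one&two&three` = 2)

# All distinct elements across the three sets
elements <- sort(unique(unlist(listInput)))

# Logical membership matrix: one row per element, one column per set
membership <- sapply(listInput, function(s) elements %in% s)

# Label each element with the combination of sets that contain it
combo <- apply(membership, 1, function(r) {
  paste(colnames(membership)[r], collapse = "&")
})

# Size of each exclusive intersection, as a named integer vector
sizes <- vapply(split(combo, combo), length, integer(1))

# Same numbers as the hand-written expression input
all(sizes[names(expressionInput)] == expressionInput)
```

The sizes are exclusive: an element that sits in all three sets is counted under one&two&three only, which is also how the bars of an UpSet plot are counted.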

Basic usage

Basic Usage (official documentation)

Data frame input

upset(movies, nsets = 7, nintersects = 30, mb.ratio = c(0.5, 0.5),
order.by = c("freq", "degree"), decreasing = c(TRUE,FALSE))

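Key parameters:

  • nsets: how many sets to use, picked from the largest to the smallest; intersections between a selected set and an unselected one are not shown. The default is 5
  • sets: the specific sets to look at, e.g. sets = c("Action", "Drama", "Horror") shows every intersection that exists among these three
  • nintersects: number of intersections to plot, 40 by default; set it to NA to show all of them
  • keep.order: logical; whether to keep the sets in the order given in sets. The default, FALSE, orders the sets by size. It only takes effect when sets is used
  • mb.ratio: ratio between the bar plot and the sets matrix
  • order.by: how the intersections are ordered:
    • freq: sorts intersections by size. When combined with degree, this is the ordering within each group of intersections of the same kind (elements unique to one set, two-set intersections, three-set intersections, and so on)
    • degree: sorts by the number of sets involved in an intersection, i.e. it orders the groups that freq has sorted internally
  • decreasing: whether to sort in decreasing order; its values pair up with those given in order.by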
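As a small illustration of the parameters listed above, the sketch below restricts the plot to three genres, keeps them in the given order and shows every intersection. The variable chosen is introduced here, and the mb.ratio value is an arbitrary choice.

```r
library(UpSetR)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

# Sets to compare, in the order they should appear in the matrix
chosen <- c("Action", "Drama", "Horror")

upset(movies,
      sets = chosen,
      keep.order = TRUE,        # respect the order of 'chosen'
      nintersects = NA,         # show all intersections
      mb.ratio = c(0.6, 0.4),   # bar plot gets 60% of the height
      order.by = "freq")
```

With only three sets there are at most seven intersections, so nintersects = NA keeps the plot complete without making it crowded.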

r_upset_simple1.png


Intersection expression

Use the fromExpression function to convert it to a data frame:

upset(fromExpression(expressionInput), order.by = "freq")

r_upset_simple2.png

Named list of vectors

Use the fromList function to convert it to a data frame:

upset(fromList(listInput), order.by = "freq")
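It can help to look at the converted data frame before plotting it. The sketch below only inspects the result of fromList; membership is a name introduced here.

```r
library(UpSetR)

listInput <- list(
  one   = c(1, 2, 3, 5, 7, 8, 11, 12, 13),
  two   = c(1, 2, 4, 5, 10),
  three = c(1, 5, 6, 7, 8, 9, 10, 12, 13)
)

# One row per distinct element, one 0/1 column per set
membership <- fromList(listInput)
head(membership)
dim(membership)

# Column sums recover the size of each input set
colSums(membership)
lengths(listInput)
```

This is the same 0/1 layout as the genre columns of the movies data, which is why the converted object can be passed straight to upset.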

r_upset_simple2.png


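Queries and highlighting

Part of the intersection result can be highlighted with a query. For the movie genres, for example, one can look at how the movies released between 1970 and 1980 are distributed across the intersections: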

# Function that selects a range of years
# It is written like a function passed to apply: it receives one row at a time
between <- function(row, min, max){
  newData <- (row["ReleaseDate"] < max) & (row["ReleaseDate"] > min)
}
# Pass the function and its parameters through the queries argument
upset(movies, sets = c("Drama", "Comedy", "Action", "Thriller", "Western", "Documentary"),
      queries = list(list(query = intersects, params = list("Drama", "Action")),
                     list(query = between, params = list(1970, 1980), color = "red", active = TRUE)))

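Key parameters:

  • queries: a list in which each element is itself a list describing one query
  • query: the query function
  • params: the arguments passed to the query function, given as a list rather than a vector
  • color: color used for the query result
  • active: if TRUE, the query result is shown as a stacked bar, like the year query in the figure below; if FALSE (the default), a triangle marks the size of the query result, like the intersection of "Drama" and "Action" in the figure below
  • query.name: name of the query, used in the legend
  • query.legend: position of the legend, e.g. query.legend = "top"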
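A query function can be tried on its own before it is handed to upset. The base R sketch below assumes, as the apply-style comment on between suggests, that the function is called once per row with the params appended; toy is a made-up data frame, not the movies data.

```r
# Same query function as above, with the result returned explicitly
between <- function(row, min, max) {
  (row["ReleaseDate"] < max) & (row["ReleaseDate"] > min)
}

# Hypothetical stand-in for the real data
toy <- data.frame(
  ReleaseDate = c(1969, 1970, 1975, 1980, 1985),
  Drama       = c(1, 0, 1, 1, 0)
)

# Call the query once per row, the way apply would
selected <- as.vector(apply(toy, 1, between, min = 1970, max = 1980))
selected

# Rows picked by the query
toy[selected, ]
```

Both comparisons are strict, so movies released in exactly 1970 or 1980 are not selected; only the 1975 row of toy passes.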

r_upset_queries1.png

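Querying the Data (official documentation)

Adding attribute plots

Besides the intersections of the categories, other attributes can be displayed as well. The movies above, for instance, also carry a release date and a rating; these can be added with the attribute.plots argument.

Attribute Plots (official documentation)

Built-in plotting functions

attribute.plots comes with built-in plots such as a histogram (histogram) and a scatter plot (scatter_plot):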

upset(movies,attribute.plots=list(gridrows=60,
plots=list(list(plot=scatter_plot, x="ReleaseDate", y="AvgRating"),
list(plot=scatter_plot, x="ReleaseDate", y="Watches"),
list(plot=scatter_plot, x="Watches", y="AvgRating"),
list(plot=histogram, x="ReleaseDate")),
ncols = 2))

r_upset_attributeplots1.png


Custom plotting functions

Custom plotting functions written with ggplot2 are supported, and they can be combined with queries:

library(ggplot2)
# Custom histogram
plot1 <- function(mydata, x){
  myplot <- (ggplot(mydata, aes_string(x = x, fill = "color"))
             + geom_histogram() + scale_fill_identity()
             + theme(plot.margin = unit(c(0,0,0,0), "cm")))
}
# Custom scatter plot
plot2 <- function(mydata, x, y){
  myplot <- (ggplot(data = mydata, aes_string(x = x, y = y, colour = "color"), alpha = 0.5)
             + geom_point() + scale_color_identity()
             + theme_bw() + theme(plot.margin = unit(c(0,0,0,0), "cm")))
}
# The queries element of each plot decides whether the queries given below are applied to that attribute plot
attributeplots <- list(gridrows = 55,
                       plots = list(list(plot = plot1, x = "ReleaseDate", queries = FALSE),
                                    list(plot = plot1, x = "ReleaseDate", queries = TRUE),
                                    list(plot = plot2, x = "ReleaseDate", y = "AvgRating", queries = FALSE),
                                    list(plot = plot2, x = "ReleaseDate", y = "AvgRating", queries = TRUE)),
                       ncols = 3)
# Draw the plot
upset(movies, attribute.plots = attributeplots,
      queries = list(list(query = between, params = list(1920, 1940), query.name = "query1"),
                     list(query = intersects, params = list("Drama"), color = "red", query.name = "query2"),
                     list(query = elements, params = list("ReleaseDate", 1990, 1991, 1992), query.name = "query3")),
      query.legend = "top",
      main.bar.color = "yellow")
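A custom plotting function can be tested on its own before it goes into attribute.plots. The sketch below assumes, as the fill = "color" mapping above implies, that the function receives a data frame with a color column; attr_hist and toy are hypothetical, and the .data pronoun is used here in place of aes_string.

```r
library(ggplot2)

# Hypothetical variant of the custom histogram that returns the plot explicitly
attr_hist <- function(mydata, x) {
  ggplot(mydata, aes(x = .data[[x]], fill = color)) +
    geom_histogram(bins = 10) +
    scale_fill_identity() +     # use the values in 'color' as literal colors
    theme(plot.margin = unit(c(0, 0, 0, 0), "cm"))
}

# Made-up stand-in for the data the function would receive
toy <- data.frame(
  ReleaseDate = c(1925, 1931, 1938, 1975, 1990, 1991, 1992),
  color       = c("red", "red", "red", "gray23", "gray23", "blue", "blue")
)

p <- attr_hist(toy, "ReleaseDate")
print(p)
```

Checking the function this way separates ggplot2 errors from problems in the upset call itself.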

r_upset_attributeplots2.png


Adding set metadata

Extra information can also be attached to annotate the sets:

sets <- names(movies[3:19])
avgRottenTomatoesScore <- round(runif(17, min = 0, max = 90))
metadata <- as.data.frame(cbind(sets, avgRottenTomatoesScore))
names(metadata) <- c("sets", "avgRottenTomatoesScore")
# Make sure this column is numeric so that it can be plotted later
metadata$avgRottenTomatoesScore <- as.numeric(as.character(metadata$avgRottenTomatoesScore))
head(metadata)
         sets avgRottenTomatoesScore
1      Action                     48
2   Adventure                     31
3    Children                     46
4      Comedy                     71
5       Crime                     40
6 Documentary                     26
upset(movies, set.metadata = list(data = metadata,
                                  plots = list(list(type = "hist",
                                                    column = "avgRottenTomatoesScore",
                                                    assign = 20))))

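Key parameters:

  • data: a data frame whose first column holds the set names and whose remaining columns hold the metadata
  • plots
    • type
      • numeric data: bar plot ("hist") or heat map ("heat")
      • boolean data: "bool" heat map
      • categorical character data: heat map ("heat") or text ("text")
    • column: the metadata column to plot
    • assign: size given to the metadata plot. Without metadata, the UpSet plot is laid out on a 100x100 grid; setting this to 20 adds 20 for the metadata plot, so the whole figure becomes 100x120
    • colors:
      • bar plot: accepts a single color only
      • "heat" or "bool": one color for each category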

r_upset_meteadata1.png
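The metadata table can also be built with data.frame directly, which keeps the score column numeric and makes the later conversion unnecessary. This is a sketch under the same assumption as above, that columns 3 to 19 of movies are the sets; the seed is added here only to make the random scores repeatable.

```r
library(UpSetR)
movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"),
                   header = TRUE, sep = ";")

set.seed(1)  # arbitrary seed, only for repeatable random scores

# First column: set names; second column: numeric metadata
metadata <- data.frame(
  sets = names(movies)[3:19],
  avgRottenTomatoesScore = round(runif(17, min = 0, max = 90))
)
str(metadata)

upset(movies, set.metadata = list(data = metadata,
                                  plots = list(list(type = "hist",
                                                    column = "avgRottenTomatoesScore",
                                                    assign = 20))))
```

Going through cbind first turns every value into text, which is why the original version needs as.numeric(as.character(...)) afterwards.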

Incorporating Set Metadata (official documentation)




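Title: R Series: Drawing Venn Diagrams and UpSet Plots

Author: showteeth

Published: 2020-04-03 14:10

Last updated: 2020-04-04 16:23

Original link: http://showteeth.tech/posts/4408.html

License: Attribution-NonCommercial-NoDerivatives 4.0 International. Please keep the original link and the author when reposting.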
