
[Figure]
- Graph-level prediction: predict the class or properties of an entire graph or subgraph
 

[Figure]
How: Workflow

[Figure]
Taking fraud detection as an example: the software stack
- The TabFormer dataset
 
[Figure]
- Workflow
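
As a concrete (hypothetical) illustration of this workflow's data plane, the sketch below turns TabFormer-style transaction rows into a heterogeneous graph, assuming PyTorch Geometric's HeteroData; the column names and features are made up for illustration, not TabFormer's exact schema:

```python
# Build a card/merchant transaction graph from tabular rows.
# Assumption: PyTorch Geometric is the GNN library; TabFormer's real
# preprocessing differs -- this only shows the tabular -> graph step.
import torch
from torch_geometric.data import HeteroData

# Toy transaction table: (card_id, merchant_id, amount).
card_id     = torch.tensor([0, 0, 1, 2])
merchant_id = torch.tensor([0, 1, 1, 2])
amount      = torch.tensor([[12.5], [3.0], [99.9], [7.2]])

data = HeteroData()
data['card'].num_nodes = 3
data['merchant'].num_nodes = 3
# One edge per transaction; the amount becomes an edge feature, and
# fraud detection becomes edge (or node) classification on this graph.
data['card', 'pays', 'merchant'].edge_index = torch.stack([card_id, merchant_id])
data['card', 'pays', 'merchant'].edge_attr = amount
```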
 
[Figure]

[Figure]
- Compute plane
 

[Figure]
- Data plane
 

[Figure]
SW Challenges
Graph Sampler
For many small-graph datasets, full-batch training works most of the time; full-batch training means we train on the whole graph at once. When it comes to a single large graph, real scenarios often run into the neighbor explosion problem: a node's multi-hop neighborhood grows exponentially with the number of GNN layers. The graph sampler comes to the rescue: sample only a fraction of the target nodes and, for each target node, a subgraph of its ego-network to train on. This is called mini-batch training. Graph sampling is triggered on every data load, and the number of hops in the sampled subgraph equals the number of GNN layers, which makes the graph sampler in the data loader a critical piece of GNN training.
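
PyTorch Geometric's NeighborLoader exposes exactly this pattern; a minimal sketch (the dataset and fan-outs below are arbitrary choices, assuming PyG is available):

```python
# Mini-batch training via neighbor sampling (PyTorch Geometric).
# num_neighbors has one fan-out entry per hop, so a 2-layer GNN samples
# a 2-hop ego-network around each seed node; sampling runs on every
# data-loading step, which puts the sampler on the training critical path.
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import NeighborLoader

data = Planetoid(root='/tmp/Cora', name='Cora')[0]

loader = NeighborLoader(
    data,
    num_neighbors=[10, 5],        # fan-out for hop 1 and hop 2
    batch_size=256,               # number of target ("seed") nodes per batch
    input_nodes=data.train_mask,  # only sample around training nodes
    shuffle=True,
)

for batch in loader:
    # batch is a sampled subgraph; its first batch.batch_size nodes
    # are the seeds whose loss we would compute in a training step.
    pass
```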
[Figure]

[Figure]
Challenge: How to optimize the sampler, both standalone and in the training pipeline?
When the graph gets huge (billions of nodes, tens of billions of edges), we face new at-scale challenges:
- How to store the huge graph across nodes? -> graph partitioning
- How to build a training system with not only distributed model computation, but also a distributed graph store and distributed sampling?
- How to cut the graph while minimizing cross-partition connections? (see the sketch after this list)
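
A toy sketch of why the cut matters: a naive hash partition of a random graph, and the fraction of edges it cuts. Production systems wrap METIS-style partitioners (e.g. DGL's partition_graph or PyG's ClusterData) precisely to shrink this number:

```python
# Count cross-partition edges under a naive hash partition (plain PyTorch).
# Every cut edge turns into network traffic during distributed sampling.
import torch

num_nodes, num_parts = 10_000, 4
edge_index = torch.randint(num_nodes, (2, 100_000))  # random toy graph

part = torch.arange(num_nodes) % num_parts           # hash-style partition
cut = (part[edge_index[0]] != part[edge_index[1]]).sum()
print(f'cut edges: {cut.item()} / {edge_index.size(1)}')
# On a random graph roughly (num_parts-1)/num_parts of edges are cut;
# a good min-cut partitioner exploits locality to do far better.
```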
 

[Figure]
A possible GNN distributed training architecture:

[Figure]
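
A rough sketch of how the compute plane of such an architecture might look in code, assuming plain torch.distributed with one graph partition per rank; `local_loader` and the model are placeholders, and the distributed graph store / sampler services are elided:

```python
# Skeleton of the model-computing side: per-rank local sampling plus
# gradient synchronization via DistributedDataParallel (launch with torchrun).
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def train(model, local_loader, epochs=1):
    dist.init_process_group('nccl')  # reads rank/world size from torchrun env
    model = DDP(model.cuda())
    opt = torch.optim.Adam(model.parameters())
    for _ in range(epochs):
        for batch in local_loader:   # subgraphs sampled from the local
            opt.zero_grad()          # partition (+ remote neighbor fetches)
            loss = model(batch)      # placeholder: model returns the loss
            loss.backward()          # DDP all-reduces gradients here
            opt.step()
```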
Scatter-Gather
- Fuse adjacent graph ops
One common fusion pattern for GCN & GraphSAGE:
Challenge: How to fuse more GNN patterns, across different ApplyEdge and ApplyVertex functions, automatically?
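
A minimal sketch of that pattern in plain PyTorch: sum-aggregation written as gather + scatter-add, and the fused SpMM it can be rewritten into (toy graph, identity ApplyEdge):

```python
# Scatter-gather vs. fused SpMM for GCN/GraphSAGE-style sum aggregation.
# The unfused form materializes one message per edge; fusing Aggregate
# into a sparse matmul avoids that edge-sized intermediate entirely.
import torch

num_nodes, feat = 5, 8
x = torch.randn(num_nodes, feat)
edge_index = torch.tensor([[0, 1, 2, 3],    # src
                           [1, 2, 3, 4]])   # dst
src, dst = edge_index

# Unfused: gather -> (identity ApplyEdge) -> scatter-add.
messages = x[src]                                        # one row per edge
out_scatter = torch.zeros(num_nodes, feat).index_add_(0, dst, messages)

# Fused: a single SpMM with the (dst, src) adjacency matrix.
A = torch.sparse_coo_tensor(torch.stack([dst, src]),
                            torch.ones(src.numel()),
                            (num_nodes, num_nodes))
out_spmm = torch.sparse.mm(A, x)

assert torch.allclose(out_scatter, out_spmm, atol=1e-6)
```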

[Figure]
 - How to implement a fused Aggregate?

[Figure]
Challenge:
- Different graph data structures lead to different implementations of the same logical operations;
 - Different graph characteristics favor different data structures (e.g. low-degree graphs favor COO, high-degree graphs favor CSR);
 - How to find the applicable zone for each, and hide this complexity from data scientists?
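
To see how the data structure leaks into the kernel, here is the same sum-aggregation written against COO and CSR in plain PyTorch (toy graph; the Python loop stands in for a per-row segmented reduction):

```python
# COO iterates edges (trivially edge-parallel, but needs atomics/scatter);
# CSR walks each destination row via row pointers (contiguous per-node
# segments, which suits high-degree graphs).
import torch

x = torch.randn(4, 3)

# COO: edge list sorted by destination -- edges 0->1, 2->1, 1->3.
src = torch.tensor([0, 2, 1])
dst = torch.tensor([1, 1, 3])
out_coo = torch.zeros(4, 3).index_add_(0, dst, x[src])  # per-edge scatter

# CSR over destinations: rowptr[v]..rowptr[v+1] spans v's incoming edges.
rowptr = torch.tensor([0, 0, 2, 2, 3])  # rows: [], [0,2], [], [1]
col    = torch.tensor([0, 2, 1])        # source of each incoming edge
out_csr = torch.zeros(4, 3)
for v in range(4):                      # per-node segment reduction
    out_csr[v] = x[col[rowptr[v]:rowptr[v + 1]]].sum(0)

assert torch.allclose(out_coo, out_csr)
```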
 
- Inference challenges
 - GNN inference requires full-batch inference; how to make it efficient? (one common answer is sketched after this list)
 - Distributed inference for big graphs?
 - Vector quantization for node and edge features?
 - Distilling GNNs into MLPs?
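
One standard answer to the first question is layer-wise inference: compute layer l for all nodes from layer l-1's stored outputs, so multi-hop fan-out never compounds and each node is processed exactly once per layer. A minimal sketch, with linear-plus-sum-aggregation standing in for a real GNN layer:

```python
# Layer-wise full-graph inference (plain PyTorch sketch).
# `weights` is one projection matrix per layer -- a stand-in for real
# GNN layers; in practice each sweep is itself batched over nodes,
# which is exactly where the distributed-inference question shows up.
import torch

@torch.no_grad()
def layerwise_inference(weights, x, edge_index):
    src, dst = edge_index
    h = x
    for W in weights:  # one full-graph sweep per layer, no L-hop sampling
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])  # Aggregate
        h = torch.relu(agg @ W)                               # ApplyVertex
    return h
```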
- SW-HW co-design challenges
 - How to relieve the irregular memory access in scatter-gather?
 - Do we need a dataflow engine for acceleration?
 - …
 

[Figure]
References
- Graph + AI: What’s Next? Progress in Democratizing Graph for All
 - Recent Advances in Efficient and Scalable Graph Neural Networks
 - Crossing the Chasm – Technology adoption lifecycle
 - Understanding and Bridging the Gaps in Current GNN Performance Optimizations
 - Automatic Generation of High-Performance Inference Kernels for Graph Neural Networks on Multi-Core Systems
 - Understanding GNN Computational Graph: A Coordinated Computation, IO, And Memory Perspective
		  	
 



