Learned, data-dependent weighting of a node's neighbors when aggregating their features on a graph. An umbrella term with no single agreed definition. Usually means one of:
Additive scoring (per-edge MLP or weight vector, sparse, over neighbors only):
GAT, GATv2
Dot-product scoring (query/key/value projections, often dense/full attention over all node pairs):
graph transformer
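The two scoring schemes above can be sketched side by side. This is a minimal, single-head illustration in plain Python, not any library's implementation: the helper names, the toy graph, and the weights are all made up for the example, the additive score follows the GAT form LeakyReLU(a . [W h_i || W h_j]), and the dot-product variant omits the value projection and edge or positional features that real graph transformers typically add.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def additive_attention(h, neighbors, W, a, i):
    """GAT-style weights for node i: softmax over its neighbors only (sparse).

    Score: e_ij = LeakyReLU(a . [W h_i || W h_j]).
    (GATv2 instead applies the nonlinearity before the dot product with a.)
    """
    zi = matvec(W, h[i])
    # List '+' concatenates the two projected vectors: [W h_i || W h_j].
    scores = [leaky_relu(dot(a, zi + matvec(W, h[j]))) for j in neighbors[i]]
    return dict(zip(neighbors[i], softmax(scores)))

def dot_product_attention(h, Wq, Wk, i):
    """Transformer-style weights for node i: softmax over all nodes (dense).

    Score: e_ij = (Wq h_i) . (Wk h_j) / sqrt(d). The graph structure is ignored here.
    """
    q = matvec(Wq, h[i])
    scale = math.sqrt(len(q))
    scores = [dot(q, matvec(Wk, hj)) / scale for hj in h]
    return dict(enumerate(softmax(scores)))

# Hypothetical toy graph: 4 nodes with 2-d features; self-loops listed explicitly.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2, 3], 3: [2, 3]}
identity = [[1.0, 0.0], [0.0, 1.0]]   # stand-in for learned projections
a = [0.5, -0.5, 1.0, 0.25]            # stand-in for the learned weight vector

sparse = additive_attention(h, neighbors, identity, a, i=0)
dense = dot_product_attention(h, identity, identity, i=0)
print(sorted(sparse))  # node 0 attends only to its neighbors
print(sorted(dense))   # node 0 attends to every node
```

The practical difference is where the softmax runs: the additive variant normalizes over the edge list, so cost scales with the number of edges, while the dense dot-product variant normalizes over all nodes and scales quadratically with node count unless it is masked or sparsified.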