Learned, data-dependent weighting of neighbors on a graph. Vague umbrella term. Usually means one of:

Additive scoring (per-edge MLP or weight vector, sparse, over neighbors only):
GAT, GATv2

Dot-product scoring (QKV, often dense/full attention):
graph transformer


attention (ML), GNN