Conventional federated learning directly averaging model weights is only possible if all local models have the same model structure. Naturally, it poses a restrictive constraint for collaboration between models with heterogeneous architectures. Sharing prediction instead of weight remo