深度学习或强化学习在组合优化方面有哪些应用

文章	评论	标签
135	150	23

作者：三天哥
链接：原文链接！直达！原文看起来更舒服，我搬运的这个没有考虑格式
来源：知乎
> 好的东西我总是怕人家删除！

Pointer Network:
Pointer Networks
arxiv.org/abs/1506.03134
这篇是之后几篇的基础了吧，提出了pointer decoding的方式，求解TSP问题。

Neural combinatorial optimization with reinforcement learning:
https://arxiv.org/pdf/1611.09940.pdfarxiv.org/pdf/1611.09940.pdf
代码实现参考：https://github.com/pemami4911/neural-combinatorial-rl-pytorchgithub.com/pemami4911/neural-combinatorial-rl-pytorch
higgsfield/np-hard-deep-reinforcement-learning
github.com/higgsfield/np-hard-deep-reinforcement-learning

Google的这篇借用pointer network加上attention mechanism，用policy gradient优化，actor-critic训练。也求解了knapsack的问题。我实现的经验来说，moving average的效果比actor-critic训练更稳定一些，actor-critic有时候会爆掉。这篇应该是被ICLR拒了。

Reinforcement learning for solving vehicle routing problem:
https://arxiv.org/pdf/1802.04240.pdfarxiv.org/pdf/1802.04240.pdf

代码实现参考：mveres01/pytorch-drl4vrpgithub.com/mveres01/pytorch-drl4vrp

Leigh发的这篇的基础是前面两篇，简化的pointer network的encoding过程，直接进行embedding。主要延伸到解VRP问题，也解了TSP问题，和之前的进行了对比。

Learning Combinatorial Optimization Algorithms over Graphs:
arxiv.org/abs/1704.01665
作者的github：Hanjun-Dai/graph_comb_optgithub.com/Hanjun-Dai/graph_comb_opt这篇先graph embedding的思路(structure to vector)，然后Reinforce to train。从small scale training transfer到large scale表现也还不错。作者是用c++写的，后来也发了pytorch版本，但是底层还是c++。我用pytorch实现了一下，但是原文中graph embedding也是属于训练的部分，在pytorch中backward会有问题。如果有深入了解的，可以交流一下。
Attention: Learn to solve routing problems!
https://openreview.net/pdf?id=ByxBFsRqYmopenreview.net/pdf?id=ByxBFsRqYm
原作者的github: wouterkool/attention-tspgithub.com/wouterkool/attention-tspICLR2019的一篇。这篇的整体思路也是encoder(一个复杂的attention)+decoder。这篇包罗万象，解了各种tsp和vrp变种，还有其它，也和pointer network做了对比。我初略地看了一下，没有细看。
仓储优化中的beer game问题：https://arxiv.org/pdf/1708.05924.pdfarxiv.org/pdf/1708.05924.pdf用deep reinforcement learning的方法，分别对4个agent(manufacturing, distributor, warehouse, retailer) 建立network，然后用一个feedback scheme去让agent向一个目标前进。综上，TSP和VRP问题主要的思路是encoder + decoder。encoder有很多方法，graph embedding, attention, glimpse, multi-head等，decoder也同样。但是训练好的model的robustness还有待考验。有相关研究的同学，可以一起交流一下。个人github: jingw2 - Overviewgithub.com/jingw2

深度学习或强化学习在组合优化方面有哪些应用

Y.H.2HANG