Embedding on PMEM
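Benefits of storing Embedding Variables on PMEM
DeepRec supports very large-scale distributed training, including models trained on trillions of samples and embedding processing at the hundred-billion scale. In sparse models, Embedding Variables account for more than 90% of the data and are what make training over extremely large feature sets possible. At this scale, memory capacity becomes one of the bottlenecks. Storing Embedding Variables on PMEM (persistent memory) brings the following benefits:
Larger memory capacity for large-scale distributed training;
Lower TCO (total cost of ownership).
Configuring PMEM in Memory Mode to hold Embedding Variables
Use the open-source tool ipmctl to configure 100% of the persistent memory on the physical machine as Memory Mode: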
# ipmctl create -goal memorymode=100
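After a reboot, the system's available memory equals the total PMEM capacity, and Embedding Variables are then stored in PMEM.
Note: the host's PMEM cannot be configured into Memory Mode from inside a virtual machine guest instance (such as re7p).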
Configuring PMEM in App Direct FSDAX mode to hold Embedding Variables
Configure PMEM in FSDAX mode on bare metal:
# ipmctl create -goal persistentmemorytype=appdirect
# reboot
# ndctl create-namespace --region region0 --mode fsdax
# mkfs.ext4 /dev/pmem0
# mkdir -p /mnt/pmem0
# mount -o dax /dev/pmem0 /mnt/pmem0
After configuration, check that FSDAX mode was set up successfully:
# ipmctl show -memoryresources
MemoryType   | DDR         | PMemModule   | Total
==========================================================
Volatile     | 256.000 GiB | 0.000 GiB    | 256.000 GiB
AppDirect    | -           | 1008.000 GiB | 1008.000 GiB
Cache        | 0.000 GiB   | -            | 0.000 GiB
Inaccessible | 0.000 GiB   | 5.937 GiB    | 5.937 GiB
Physical     | 256.000 GiB | 1013.937 GiB | 1269.937 GiB
# ndctl list -NR
{
  "regions":[
    {
      "dev":"region0",
      "size":1082331758592,
      "available_size":0,
      "max_available_extent":0,
      "type":"pmem",
      "iset_id":9218623383794094352,
      "persistence_domain":"memory_controller",
      "namespaces":[
        {
          "dev":"namespace0.0",
          "mode":"fsdax",
          "map":"dev",
          "size":1065418227712,
          "uuid":"c5c8759c-abb8-4f75-a402-2bbdba76ebf0",
          "sector_size":512,
          "align":2097152,
          "blockdev":"pmem0"
        }
      ]
    }
  ]
}
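Beyond the ipmctl and ndctl output, it can also be useful to confirm that the filesystem really is mounted with the DAX option. The following is a minimal sketch, not part of DeepRec: `is_dax_mounted` is a hypothetical helper name, and it assumes a mount table in the `/proc/mounts` format where DAX shows up as the option `dax` or (on newer kernels) `dax=always`.

```shell
# Hypothetical helper (not part of DeepRec or ndctl).
# Usage: is_dax_mounted <mount_point> [mount_table_file]
# Succeeds if <mount_point> is listed in the mount table with a DAX option.
# The mount table defaults to /proc/mounts.
is_dax_mounted() {
    awk -v mp="$1" '
        $2 == mp {
            # Field 4 holds the comma-separated mount options.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "dax" || opts[i] == "dax=always") found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Example: check the mount point used in this guide.
if is_dax_mounted /mnt/pmem0; then
    echo "/mnt/pmem0 is mounted with DAX"
else
    echo "/mnt/pmem0 is not mounted with DAX"
fi
```

If the check fails after `mount -o dax`, the kernel log (`dmesg`) is usually the first place to look for the reason.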
Configure host PMEM in FSDAX mode on an Alibaba Cloud re7p virtual machine instance:
The following commands use the instance type ecs.re7p.16xlarge (https://help.aliyun.com/document_detail/25378.html?spm=a2c4g.11186623.6.605.68ec600d0TJFNo#re7p) as an example.
# vi /etc/default/grub
Append the following line at the end of the file:
GRUB_CMDLINE_LINUX="memmap=1008G!257G"
# sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot
# sudo mkfs.ext4 /dev/pmem0
# mkdir /mnt/pmem0
# mount -o dax /dev/pmem0 /mnt/pmem0
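The `memmap=<size>!<offset>` kernel parameter used above marks the physical address range from `<offset>` to `<offset> + <size>` as persistent memory, which is why `/dev/pmem0` appears after the reboot. The sketch below only decodes such a value to make the range explicit; `describe_memmap` is a hypothetical helper, and it assumes both numbers are given with a `G` suffix as in this guide.

```shell
# Hypothetical helper: explain a "memmap=<size>G!<offset>G" kernel parameter.
# Only the G-suffixed form used in this guide is handled.
describe_memmap() {
    # Extract "<size> <offset>" if the value matches the expected shape.
    parsed=$(printf '%s\n' "$1" |
        sed -n 's/^memmap=\([0-9][0-9]*\)G!\([0-9][0-9]*\)G$/\1 \2/p')
    if [ -z "$parsed" ]; then
        echo "unsupported memmap value: $1" >&2
        return 1
    fi
    size=${parsed% *}      # part before the space
    offset=${parsed#* }    # part after the space
    echo "reserve ${size} GiB as persistent memory, from ${offset} GiB to $((offset + size)) GiB"
}

describe_memmap 'memmap=1008G!257G'
# prints: reserve 1008 GiB as persistent memory, from 257 GiB to 1265 GiB
```

The size and offset have to match the memory layout of the instance type, so the values for ecs.re7p.16xlarge should not be copied to other instance sizes without checking.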
Build and install DeepRec in a Docker container in FSDAX mode:
# docker pull registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-cpu-py36-ubuntu18.04
# git clone https://github.com/alibaba/DeepRec.git
# git clone https://github.com/memkind/memkind.git
# docker run -it --name test -v /host_code_path:/work -v /mnt/pmem0:/mnt/pmem0 --privileged registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-cpu-py36-ubuntu18.04 /bin/bash
# apt update
# apt install libpmem-dev gzip numactl gdb autoconf -y
# pip install pandas
# pip install numpy==1.16.0
# cd /work/memkind/
# ./autogen.sh && ./configure
# make clean;make -j;make install
# cd /work/DeepRec
# export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
# ./configure
Do you wish to build TensorFlow with PMEM support? [y/N]: y
# bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --copt="-L/usr/local/lib" --copt="-lpmem" --copt="-lmemkind" --config=opt //tensorflow/tools/pip_package:build_pip_package
# ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+deeprec2201-cp36-cp36m-linux_x86_64.whl
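Configuring PMEM as a NUMA node to hold Embedding Variables
Configure PMEM as a NUMA node on bare metal:
Install ndctl and daxctl v66 or later. Create a namespace in devdax mode on the PMEM, then reconfigure the persistent memory from devdax mode to system-ram mode. On a two-socket machine, repeat the equivalent steps for the PMEM on socket 1.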
# ndctl create-namespace --mode=devdax --map=mem
# daxctl reconfigure-device --mode=system-ram --region=0 dax0.0
After this step, the persistent memory is exposed as a separate NUMA node and can be used as volatile memory.
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
node 0 size: 191904 MB
node 0 free: 109899 MB
node 1 cpus:
node 1 size: 759808 MB
node 1 free: 759807 MB
node distances:
node 0 1
0: 10 17
1: 17 10
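In the output above, the PMEM shows up as the node with an empty `cpus:` list (node 1). Since the build steps later in this guide set `MEMKIND_DAX_KMEM_NODES` to that node ID, a small filter can pick it out of `numactl -H` rather than reading it by eye. This is a sketch under the assumption that every CPU-less node is PMEM, which may not hold on all systems; `cpuless_nodes` is a hypothetical helper name.

```shell
# Hypothetical helper: read "numactl -H" output on stdin and print the IDs of
# NUMA nodes whose "cpus:" list is empty, joined by commas.
cpuless_nodes() {
    # A line such as "node 1 cpus:" has exactly three fields when no CPUs follow.
    awk '$1 == "node" && $3 == "cpus:" && NF == 3 { print $2 }' | paste -sd, -
}

# Example with an abbreviated copy of the output shown above.
cpuless_nodes <<'EOF'
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3
node 0 size: 191904 MB
node 1 cpus:
node 1 size: 759808 MB
node distances:
node 0 1
EOF
# prints: 1

# On a configured machine this could be used as:
#   export MEMKIND_DAX_KMEM_NODES="$(numactl -H | cpuless_nodes)"
```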
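Configure host PMEM as a NUMA node on an Alibaba Cloud re7p virtual machine instance:
The following commands use the instance type ecs.re7p.16xlarge (https://help.aliyun.com/document_detail/25378.html?spm=a2c4g.11186623.6.605.68ec600d0TJFNo#re7p) as an example.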
# vi /etc/default/grub
Remove the following line if it is present:
GRUB_CMDLINE_LINUX="memmap=1008G!257G"
# sudo grub2-mkconfig -o /boot/grub2/grub.cfg
# reboot
# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
node 0 size: 249568 MB
node 0 free: 247930 MB
node 1 cpus:
node 1 size: 1016062 MB
node 1 free: 1015842 MB
Build and install DeepRec in a Docker container in KMEM DAX mode:
# docker pull registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-cpu-py36-ubuntu18.04
# git clone https://github.com/alibaba/DeepRec.git
# git clone https://github.com/memkind/memkind.git
# docker run -i -t -v /root:/root --privileged registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-developer:deeprec-dev-cpu-py36-ubuntu18.04 /bin/bash
# apt update
# apt install libpmem-dev gzip numactl gdb autoconf -y
# pip install pandas
# pip install numpy==1.16.0
# cd /root/memkind/
# ./autogen.sh && ./configure
# make clean;make -j;make install
# cd /root/DeepRec
# export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
# export MEMKIND_DAX_KMEM_NODES=1
# ./configure
Do you wish to build TensorFlow with PMEM support? [y/N]: y
# bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --host_cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" -c opt --copt="-L/usr/local/lib" --copt="-lmemkind" --config=opt //tensorflow/tools/pip_package:build_pip_package
# ./bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# pip3 install /tmp/tensorflow_pkg/tensorflow-1.15.5+deeprec2201-cp36-cp36m-linux_x86_64.whl
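Verifying WDL model performance with Embedding Variables stored on PMEM
Running WDL stand-alone training with PMEM in Memory Mode
1. On the command line, set the storage type of the WDL model's Embedding Variables to DRAM;
2. Run the WDL training process.
Running WDL stand-alone training with PMEM in FSDAX mode
1. On the command line, set the storage type of the WDL model's Embedding Variables to PMEM_LIBPMEM, point the storage path at the mounted persistent-memory directory, and set the amount of space the stored data may occupy on persistent memory;
2. Run the WDL training process.
Running WDL stand-alone training with PMEM in KMEM DAX mode
1. On the command line, set the storage type of the WDL model's Embedding Variables to PMEM_MEMKIND;
2. Run the WDL training process.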
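The three cases above differ only in which storage type is selected for the Embedding Variables. The mapping can be summarized in a small lookup, sketched here as a shell function. The function name `ev_storage_type` and the mode labels `memory`, `fsdax`, and `kmem-dax` are hypothetical and chosen for illustration; only the storage type names DRAM, PMEM_LIBPMEM, and PMEM_MEMKIND come from this guide. How the value is passed to the WDL training script depends on that script and is not shown.

```shell
# Hypothetical lookup: map a PMEM configuration mode to the Embedding Variable
# storage type named in this guide. The mode labels are illustrative only.
ev_storage_type() {
    case "$1" in
        memory)   echo "DRAM" ;;          # Memory Mode: PMEM appears as ordinary RAM
        fsdax)    echo "PMEM_LIBPMEM" ;;  # FSDAX: needs a storage path and size as well
        kmem-dax) echo "PMEM_MEMKIND" ;;  # KMEM DAX: PMEM exposed as a NUMA node
        *)
            echo "unknown PMEM mode: $1" >&2
            return 1
            ;;
    esac
}

# Print the mapping for all three modes.
for mode in memory fsdax kmem-dax; do
    printf '%-9s -> %s\n' "$mode" "$(ev_storage_type "$mode")"
done
```

Only the FSDAX case needs the extra path and size settings, because it is the only mode in which the data lives in files on a mounted filesystem.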