Implementation of the retrieval-augmented translation (RAT) approaches described in Hoang et al., *Improving Retrieval Augmented Neural Machine Translation by Controlling Source and Fuzzy-Match Interactions*, Findings of EACL 2023, plus a new approach (RAT-SEPD), in which the dual encoder D_ENC shares its self-attention and feed-forward (FFN) weights with the decoder DEC.
| Model | Module | INPUT |
| --- | --- | --- |
| Baseline | ENC | SRC |
| | DEC | TGT |
| RAT-CAT | ENC | SRC + tCtx_1 + ... + tCtx_k |
| | DEC | TGT |
| RAT-SEP | ENC | SRC |
| | ENC2 | tCtx_1 |
| | | ... |
| | | tCtx_k |
| | DEC | TGT |
| RAT-SI | ENC | SRC |
| | | tCtx_1 + SRC |
| | | ... |
| | | tCtx_k + SRC |
| | DEC | TGT |
| RAT-SEPD | ENC | SRC |
| | D_ENC | tCtx_1 |
| | | ... |
| | | tCtx_k |
| | DEC | TGT |
RAT-CAT and RAT-SI use an additional token '⦅SENSEP⦆' to mark the boundaries between sentences in the encoder input.
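As a rough illustration, the sketch below assembles the encoder inputs for RAT-CAT and RAT-SI at the string level. This is a minimal sketch: the separator token is the one above, but the function names and example sentences are illustrative and not taken from this repository's preprocessing code.

```python
SENSEP = "⦅SENSEP⦆"

def rat_cat_input(src: str, contexts: list[str]) -> str:
    """RAT-CAT: a single encoder input holding the source plus all
    fuzzy-match targets, with ⦅SENSEP⦆ marking each sentence boundary."""
    return f" {SENSEP} ".join([src] + contexts)

def rat_si_inputs(src: str, contexts: list[str]) -> list[str]:
    """RAT-SI: the encoder sees SRC on its own, plus one (tCtx_i + SRC)
    sequence per retrieved context (context first; see the note below)."""
    return [src] + [f"{ctx} {SENSEP} {src}" for ctx in contexts]

print(rat_cat_input("ein Test", ["a test", "the test"]))
# ein Test ⦅SENSEP⦆ a test ⦅SENSEP⦆ the test
```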
Note that RAT-SI is implemented slightly differently from the paper (the encoder input is tCtx_k + SRC rather than SRC + tCtx_k), but the result should be identical.
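To make the RAT-SEPD idea concrete, the sketch below shows one way a dual encoder can reuse the decoder's self-attention and FFN sublayers, so that encoding the fuzzy-match contexts adds no parameters beyond the decoder's own. This is a minimal TensorFlow sketch under assumed names and dimensions; it is not the actual code in `src/RatSEPDTransformer.py` (layer norms and cross-attention are omitted).

```python
import tensorflow as tf

class SharedSublayers(tf.keras.layers.Layer):
    """Self-attention + FFN created once and used by both DEC and D_ENC."""

    def __init__(self, num_heads=8, d_model=512, d_ff=2048):
        super().__init__()
        self.self_attention = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(d_ff, activation="relu"),
            tf.keras.layers.Dense(d_model),
        ])

    def call(self, x, use_causal_mask=False):
        # Residual connections around the two shared sublayers.
        x = x + self.self_attention(x, x, use_causal_mask=use_causal_mask)
        return x + self.ffn(x)

shared = SharedSublayers()

def d_enc_layer(ctx):
    # D_ENC path: bidirectional self-attention over a fuzzy match tCtx_i,
    # running through the *decoder's* weights (no causal mask).
    return shared(ctx)

def dec_layer(tgt):
    # DEC path: the very same weights, but with a causal mask for decoding.
    return shared(tgt, use_causal_mask=True)
```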
Training commands (note that RAT-CAT reuses `src/BaselineTransformer.py`: it only changes the encoder input, not the architecture):

```bash
onmt-main --model src/BaselineTransformer.py --config config/wmt14_ende_BaselineTransformer.yml --auto_config train --with_eval > run/wmt14_ende_BaselineTransformer.log 2>&1
onmt-main --model src/BaselineTransformer.py --config config/wmt14_ende_RatCATTransformer.yml --auto_config train --with_eval > run/wmt14_ende_RatCATTransformer.log 2>&1
onmt-main --model src/RatSEPTransformer.py --config config/wmt14_ende_RatSEPTransformer.yml --auto_config train --with_eval > run/wmt14_ende_RatSEPTransformer.log 2>&1
onmt-main --model src/RatSITransformer.py --config config/wmt14_ende_RatSITransformer.yml --auto_config train --with_eval > run/wmt14_ende_RatSITransformer.log 2>&1
onmt-main --model src/RatSEPDTransformer.py --config config/wmt14_ende_RatSEPDTransformer.yml --auto_config train --with_eval > run/wmt14_ende_RatSEPDTransformer.log 2>&1
```
Using the top 3 retrieved contexts (k = 3) with RAT-SI:

```bash
onmt-main --model src/RatSITransformer_top3.py --config config/wmt14_ende_RatSITransformer_top3.yml --auto_config train --with_eval > run/wmt14_ende_RatSITransformer_top3.log 2>&1 &
```
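Once a model has trained, translations can be produced with OpenNMT-tf's standard `infer` run type. A sketch for the baseline model (the test file names are placeholders, not files shipped with this repository; for the RAT models, the features files must also supply the retrieved contexts in the format expected by the corresponding config):

```bash
onmt-main --model src/BaselineTransformer.py --config config/wmt14_ende_BaselineTransformer.yml --auto_config \
  infer --features_file test.src --predictions_file test.hyp
```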