Multiple model architectures were pretrained on Dutch clinical reports using a masked language modeling (MLM) objective: BERT base, RoBERTa base, RoBERTa large, Longformer base, and Longformer large. Each architecture was pretrained with one of three strategies: general-domain pretraining, domain-specific pretraining, or mixed-domain pretraining. Please check out the paper [PENDING] and the model cards below for more information.

| Model | #params | Language |
|------------------------------------------------------|---------|------------------|
| joeranbosma/dragon-bert-base-mixed-domain            | 109M    | Dutch → Dutch    |
| joeranbosma/dragon-roberta-base-mixed-domain         | 278M    | Multiple → Dutch |
| joeranbosma/dragon-roberta-large-mixed-domain        | 560M    | Multiple → Dutch |
| joeranbosma/dragon-longformer-base-mixed-domain      | 149M    | English → Dutch  |
| joeranbosma/dragon-longformer-large-mixed-domain     | 435M    | English → Dutch  |
| joeranbosma/dragon-bert-base-domain-specific         | 109M    | Dutch            |
| joeranbosma/dragon-roberta-base-domain-specific      | 278M    | Dutch            |
| joeranbosma/dragon-roberta-large-domain-specific     | 560M    | Dutch            |
| joeranbosma/dragon-longformer-base-domain-specific   | 149M    | Dutch            |
| joeranbosma/dragon-longformer-large-domain-specific  | 435M    | Dutch            |
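
Since the checkpoints were pretrained with an MLM objective, they can be loaded with the Hugging Face `transformers` Auto classes. Below is a minimal sketch of a fill-mask sanity check; any model identifier from the table can be substituted, and the Dutch example sentence is made up for illustration. For downstream use, these checkpoints are intended as backbones for fine-tuning rather than as ready-made predictors.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

# Any model identifier from the table above can be used here.
model_name = "joeranbosma/dragon-longformer-base-mixed-domain"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Fill-mask example on a made-up Dutch clinical sentence.
# tokenizer.mask_token handles the different mask tokens of BERT ([MASK]) and RoBERTa/Longformer (<mask>).
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask(f"De patiënt heeft een {tokenizer.mask_token} in de linker long."))
```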