BriefGPT.xyz
Jun, 2020
AMALGUM -- A Free, Balanced, Multilayer English Web Corpus
Luke Gessler, Siyao Peng, Yang Liu, Yilun Zhu, Shabnam Behzad...
TL;DR
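The study introduces a freely available English web corpus that uses high-quality automatic annotation layers to offer a large-scale alternative to manually annotated datasets, and evaluates the accuracy of the resulting annotations.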
Abstract
We present a freely available, genre-balanced English web corpus totaling 4M tokens and featuring a large number of high-quality automatic annotation layers, including dependency trees, non-named entity annotatio...