Our text classification datasets are manually annotated by experts and cover sentiment polarity, topic classification, industry attribution, intent labeling, and multi-dimensional classification tasks. The large sample pool includes short comments, long articles, consultation dialogues, public opinion content, and industry documents, all annotated under unified classification rules and a standardized label system.
The datasets span multiple languages and balance the number of samples in each category to reduce training data bias. They also include edge cases and ambiguous samples, so that models trained on them are exposed to hard-to-classify inputs.
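As an illustration of what per-category balance means in practice, the following minimal sketch counts samples per label and downsamples every category to the size of the smallest one. It does not describe how these datasets were actually constructed: the `(text, label)` pair layout, the function names `label_counts` and `downsample_to_balance`, and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def label_counts(samples):
    """Count samples per label; each sample is a (text, label) pair."""
    return Counter(label for _, label in samples)


def downsample_to_balance(samples, seed=0):
    """Randomly downsample every category to the size of the smallest one."""
    if not samples:
        return []
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append((text, label))
    # The smallest category sets the per-category target size.
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    return balanced


# Hypothetical toy data, not drawn from the datasets described above.
samples = [
    ("great product", "positive"),
    ("love it", "positive"),
    ("works well", "positive"),
    ("terrible service", "negative"),
    ("it arrived", "neutral"),
    ("okay I guess", "neutral"),
]

balanced = downsample_to_balance(samples)
print(label_counts(samples))   # per-label counts before balancing
print(label_counts(balanced))  # per-label counts after balancing
```

Downsampling is only one option and discards data from larger categories; oversampling or class weighting during training are common alternatives when the smallest category is very small.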