Background

Named Entity Recognition (NER) is a fundamental task in information extraction that locates mentions of named entities in unstructured text and classifies them into categories such as person, organization and location. NER has traditionally been solved as a sequence labeling problem in which entity boundaries and category labels are jointly predicted. Chinese NER is more difficult than English NER. Chinese is logographic and offers no conventional surface features such as capitalization. In addition, because Chinese text has no delimiters between words, Chinese NER is closely tied to word segmentation: named entity boundaries are also word boundaries, so incorrectly segmented boundaries propagate errors into NER. For example, in a particular context, the disease entity “思覺失調症” (schizophrenia) may be incorrectly segmented into three words: “思覺” (thinking and feeling), “失調” (disorder) and “症” (disease).

In the digital era, users seeking healthcare information often search and browse web content, following click-through trails, before making a doctor’s appointment for diagnosis and treatment. Web texts such as health-related news, digital health magazines and medical question/answer forums are valuable sources of healthcare information. Domain-specific healthcare information contains many proper names, mainly in the form of named entities, such as “葡萄糖六磷酸鹽去氫酶” (Glucose-6-Phosphate Dehydrogenase; G6PD), “電腦斷層掃描” (computed tomography; CT), and “靜脈免疫球蛋白注射” (intravenous immunoglobulin; IVIG). In summary, Chinese healthcare NER is an essential natural language processing task that automatically identifies healthcare entities such as symptoms, chemicals, diseases, and treatments for machine reading and understanding.

Following the ROCLING-2022 shared task focused on Chinese healthcare NER, we organize a MultiNER-Health shared task for multi-genre NER in the healthcare domain. In this shared task, we have three genres:

1. Formal texts (FT): this includes health news and articles written by professional editors or journalists.

2. Social media (SM): this contains texts written by crowd users in medical question/answer forums.

3. Wikipedia articles (WA): this free online encyclopedia includes articles created and edited by volunteers worldwide.

Named entities may appear in different word forms across genres. For example, “後天免疫缺乏症候群” (Acquired Immunodeficiency Syndrome; AIDS) commonly appears in its colloquial form “愛滋病” in medical forums. Similarly, “三酸甘油酯” (triglyceride; TG) appears under the variant form “甘油三酯” in Wikipedia.

Task Description

A total of 10 entity types are defined for Chinese healthcare NER; they are described with examples in Table 1. In this task, participants are asked to predict the named entity boundaries and categories for each given sentence. We use the common BIO (Beginning, Inside, and Outside) format for NER tasks. A B- prefix before a tag indicates that the character is the beginning of a named entity, and an I- prefix indicates that the character is inside a named entity. An O tag indicates that the character belongs to no named entity. Example sentences are shown in Table 2.
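The character-level BIO scheme described above can be illustrated with a minimal Python sketch. The helper name `encode_bio` and the `(start, end, type)` span representation (with an exclusive `end`) are assumptions made for illustration and are not part of the task's official tooling. The sentence is one of the social media examples in Table 2.

```python
def encode_bio(sentence, entities):
    """Return one BIO tag per character of `sentence`.

    `entities` is a list of (start, end, type) tuples; `end` is exclusive.
    This span format is an illustrative assumption, not the official one.
    """
    tags = ["O"] * len(sentence)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining characters of the entity
    return tags


sentence = "如何治療胃食道逆流症？"
tags = encode_bio(sentence, [(4, 10, "DISE")])  # "胃食道逆流症" is a disease

for char, tag in zip(sentence, tags):
    print(char, tag)
```

Because tags are assigned per character, the sketch needs no word segmentation step, which sidesteps the segmentation errors discussed in the Background.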

Table 1. Named Entity Types
Entity Type	Description	Examples
Body (BODY)	The whole physical structure that forms a person or animal, including biological cells, tissues, organs and systems.	“細胞核” (nucleus), “神經組織” (nerve tissue), “左心房” (left atrium), “脊髓” (spinal cord), “呼吸系統” (respiratory system)
Symptom (SYMP)	Any feeling of illness or physical or mental change that is caused by a particular disease.	“流鼻水” (rhinorrhea), “咳嗽” (cough), “貧血” (anemia), “失眠” (insomnia), “心悸” (palpitation), “耳鳴” (tinnitus)
Instrument (INST)	A tool or other device used for performing a particular medical task such as diagnosis and treatments.	“血壓計” (blood pressure meter), “達文西手臂” (DaVinci Robots), “體脂肪計” (body fat monitor), “雷射手術刀” (laser scalpel)
Examination (EXAM)	The act of looking at or checking something carefully in order to discover possible diseases.	“聽力檢查” (hearing test), “腦電波圖” (electroencephalography; EEG), “核磁共振造影” (magnetic resonance imaging; MRI)
Chemical (CHEM)	Any basic chemical element typically found in the human body.	“去氧核糖核酸” (deoxyribonucleic acid; DNA), “糖化血色素” (glycated hemoglobin), “膽固醇” (cholesterol), “尿酸” (uric acid)
Disease (DISE)	An illness of people or animals caused by infection or a failure of health rather than by an accident.	“小兒麻痺症” (poliomyelitis; polio), “帕金森氏症” (Parkinson’s disease), “青光眼” (glaucoma), “肺結核” (tuberculosis)
Drug (DRUG)	Any natural or artificially made chemical used as a medicine.	“阿斯匹靈” (aspirin), “普拿疼” (acetaminophen), “青黴素” (penicillin), “流感疫苗” (influenza vaccine)
Supplement (SUPP)	Something added to something else to improve human health.	“維他命” (vitamin), “膠原蛋白” (collagen), “益生菌” (probiotics), “葡萄糖胺” (glucosamine), “葉黃素” (lutein)
Treatment (TREAT)	A method of behavior used to treat diseases.	“藥物治療” (pharmacotherapy), “胃切除術” (gastrectomy), “標靶治療” (targeted therapy), “外科手術” (surgery)
Time (TIME)	Element of existence measured in minutes, days, years.	“嬰兒期” (infancy), “幼兒時期” (early childhood), “青春期” (adolescence), “生理期” (menstrual period), “孕期” (pregnancy)

Table 2. Shared Task Examples
Genre	Example	Input & Output
Formal Texts	Ex 1	Input: 早起也能預防老化，甚至降低阿茲海默症的風險 Output: O, O, O, O, O, O, B-SYMP, I-SYMP, O, O, O, O, O, B-DISE, I-DISE, I-DISE, I-DISE, I-DISE, O, O, O
Formal Texts	Ex 2	Input: 壓力、月經引起的痘痘患者 Output: B-SYMP, I-SYMP, O, B-TIME, I-TIME, O, O, O, B-DISE, I-DISE, O, O
Social Media	Ex 3	Input: 如何治療胃食道逆流症？ Output: O, O, O, O, B-DISE, I-DISE, I-DISE, I-DISE, I-DISE, I-DISE, O
Social Media	Ex 4	Input: 請問長期打善思達針劑是不是會變胖? Output: O, O, O, O, O, B-DRUG, I-DRUG, I-DRUG, I-DRUG, I-DRUG, O, O, O, O, B-SYMP, I-SYMP, O
Wikipedia Articles	Ex 5	Input: 抗生素和維生素Ａ酸可用於口服治療痤瘡 Output: B-DRUG, I-DRUG, I-DRUG, O, B-DRUG, I-DRUG, I-DRUG, I-DRUG, I-DRUG, O, O, O, O, O, O, O, B-DISE, I-DISE (“痤瘡” is the formal form of “痘痘” in Example 2)
Wikipedia Articles	Ex 6	Input: 抑酸劑，又稱抗酸劑，抑制胃酸分泌，緩解燒心。 Output: B-CHEM, I-CHEM, I-CHEM, O, O, O, B-CHEM, I-CHEM, I-CHEM, O, O, O, B-CHEM, I-CHEM, O, O, O, O, O, B-DISE, I-DISE, O (“燒心” is the colloquial form of “胃食道逆流症” in Example 3)
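To read the outputs in Table 2 as entities rather than tag sequences, the tags can be decoded back into spans. The following is a minimal Python sketch, not the organizers' scorer; the function name `decode_bio` and the handling of stray I- tags (they are ignored) are assumptions. It is applied to Ex 2.

```python
def decode_bio(sentence, tags):
    """Collect (text, type, start, end) spans from character-level BIO tags.

    `end` is exclusive. An I- tag that does not continue an open entity of
    the same type is ignored (an assumption; other conventions exist).
    """
    entities = []
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.append((sentence[start:i], etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities


sentence = "壓力、月經引起的痘痘患者"
tags = ("B-SYMP, I-SYMP, O, B-TIME, I-TIME, O, O, O, "
        "B-DISE, I-DISE, O, O").split(", ")

for text, etype, start, end in decode_bio(sentence, tags):
    print(etype, text, start, end)
```

For Ex 2 this yields three entities: “壓力” (SYMP), “月經” (TIME) and “痘痘” (DISE).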

Table 3. Datasets
Training Set
Genre	Formal Texts	Social Media	Wikipedia Articles
#Sentences	23,008	7,684	3,205
#Characters	1,109,918	403,570	118,116
#Named Entities	42,070	26,390	13,369
Source	Chinese HealthNER Corpus (Lee and Lu, 2021)	Chinese HealthNER Corpus (Lee and Lu, 2021)	CHNER Dataset (Lee et al., 2022)
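The counts in Table 3 can be turned into per-genre ratios to compare the three training sets. The short Python sketch below only does arithmetic on the numbers in the table; the variable names are illustrative.

```python
# (sentences, characters, named entities) per genre, copied from Table 3
stats = {
    "Formal Texts": (23008, 1109918, 42070),
    "Social Media": (7684, 403570, 26390),
    "Wikipedia Articles": (3205, 118116, 13369),
}

# Average number of named entities per sentence in each genre
density = {genre: ents / sents for genre, (sents, _, ents) in stats.items()}

for genre, (sents, chars, ents) in stats.items():
    print(f"{genre}: {chars / sents:.1f} characters/sentence, "
          f"{density[genre]:.2f} entities/sentence")

total_sentences = sum(sents for sents, _, _ in stats.values())
total_entities = sum(ents for _, _, ents in stats.values())
print(f"Total: {total_sentences} sentences, {total_entities} named entities")
```

By these ratios, Wikipedia sentences carry roughly 4.2 entities each versus about 1.8 for formal texts, so the genres differ in entity density as well as in size.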

Results

Table 4. Results
Team	Run#	Formal Texts F1 (%)	Social Media F1 (%)	Wikipedia Articles F1 (%)	Macro-Averaged F1 (%)	Rank
CrowNER [1]	Run 2	65.49	69.43	73.63	69.55	1
YNU-HPCC [2]	Run 2	61.96	71.11	72.13	68.40	2
ISLab [3]	Run 1	62.52	71.42	71.19	68.38	3
SCU-MESCLab [4]	Run 1	62.51	71.33	70.57	68.14	4
YNU-ISE-ZXW [5]	Run 3	62.79	70.22	70.37	67.79	5
LingX [6]	Run 2	51.23	59.28	60.54	57.02	6
Baseline [7] (BiLSTM-CRF)	Word2vec	60.99	67.16	67.91	65.35	-
Baseline [7] (BiLSTM-CRF)	BERT	61.08	70.77	72.54	68.13	-
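The macro-averaged column in Table 4 is the unweighted mean of the three genre-level F1 scores. The Python sketch below reproduces it for one row. It also includes a span-level F1 function under the assumption that an entity counts as correct only when its boundaries and type both match exactly; the scoring details are not stated in this section, so `span_f1` and the sample prediction are hypothetical illustrations rather than the official evaluation script.

```python
def span_f1(gold, predicted):
    """Exact-match F1 over sets of (start, end, type) spans (assumed criterion)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def macro_average(scores):
    """Unweighted mean: every genre counts equally, regardless of its size."""
    return sum(scores) / len(scores)


# Genre-level F1 scores of YNU-HPCC Run 2, copied from Table 4
genre_f1 = {"Formal Texts": 61.96, "Social Media": 71.11, "Wikipedia Articles": 72.13}
macro = macro_average(list(genre_f1.values()))
print(f"{macro:.2f}")  # 68.40, matching the table

# Hypothetical prediction for Ex 2: two of three gold entities are recovered
gold = [(0, 2, "SYMP"), (3, 5, "TIME"), (8, 10, "DISE")]
predicted = [(0, 2, "SYMP"), (5, 7, "BODY"), (8, 10, "DISE")]
print(f"{span_f1(gold, predicted):.3f}")
```

Macro-averaging keeps the small Wikipedia test genre from being dominated by the larger genres, which a pooled (micro) average would not.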

[1] Yin-Chieh Wang, Wen-Hong Wu, Feng-Yu Kuo, Han-Chun Wu, Te-Yu Chi, Te-Lun Yang, Sheh Chen, and Jyh-Shing Roger Jang. 2023. CrowNER at ROCLING 2023 MultiNER-Health Task: enhancing NER task with GPT paraphrase augmentation on sparsely labeled data.

[2] Chonglin Pang, You Zhang, and Xiaobing Zhou. 2023. YNU-HPCC at ROCLING 2023 MultiNER-Health Task: a transformer-based approach for Chinese healthcare NER.

[3] Jun-Jie Wu, Tao-Hsing Chang, and Fu-Yuan Hsu. 2023. ISLab at ROCLING 2023 MultiNER-Health Task: a three-stage NER model combining textual content and label semantics.

[4] Tzu-En Su, Ruei-Cyuan Su, Ming-Hsiang Su, and Tsung-Hsien Yang. 2023. SCU-MESCLab at ROCLING 2023 MultiNER-Health Task: named entity recognition using multiple classifier model.

[5] Xingwei Zhang, Jin Wang, and Xuejie Zhang. 2023. YNU-ISE-ZXW at ROCLING 2023 MultiNER-Health Task: a transformer-based model with LoRA for Chinese healthcare named entity recognition.

[6] Xuelin Wang and Qihao Yang. 2023. LingX at ROCLING 2023 MultiNER-Health Task: intelligent capture of Chinese medical named entities by LLMs.

[7] Lung-Hao Lee, Chien-Huan Lu, and Tzu-Mi Lin. 2022. NCUEE-NLP at SemEval-2022 Task 11: Chinese named entity recognition using the BERT-BiLSTM-CRF model. In Proceedings of the 16th International Workshop on Semantic Evaluation. Association for Computational Linguistics, pages 1597-1602.


ROCLING 2023 Shared Task 1
MultiNER-Health Chinese Multi-genre Named Entity Recognition in the Healthcare Domain
