Building a Korean morphological analyzer using two Korean BERT models.

PeerJ Comput Sci

Department of Radio and Information Communications Engineering, Chungnam National University, Daejeon, South Korea.

Published: May 2022


Article Abstract

A morphological analyzer plays an essential role in identifying functional suffixes of Korean words. The analyzer's input and output differ from each other in length and string content, which can be handled by an encoder-decoder architecture. We adopt a Transformer architecture, an encoder-decoder architecture that uses self-attention rather than recurrent connections, to implement a Korean morphological analyzer. Bidirectional Encoder Representations from Transformers (BERT) is one of the most popular pretrained representation models; it produces an encoded sequence of input words that takes contextual information into account. We initialize both the Transformer encoder and decoder with two types of Korean BERT, one pretrained on a raw corpus and the other on a morphologically analyzed dataset. Implementing a Korean morphological analyzer based on the Transformer therefore becomes a fine-tuning process with a relatively small corpus. A series of experiments showed that parameter initialization from pretrained models can alleviate the chronic lack of training data and reduce the time required for training. In addition, we can determine the number of layers required for the encoder and decoder to optimize the performance of a Korean morphological analyzer.
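The initialization scheme the abstract describes can be sketched with the Hugging Face `transformers` library, whose `EncoderDecoderModel` warm-starts both halves of a Transformer from pretrained BERT checkpoints. Everything below is an illustrative assumption rather than the paper's code: the checkpoint name `klue/bert-base` stands in for the two (unnamed here) Korean BERTs, and the example sentence and its Sejong-style analysis are toy data.

```python
# Minimal sketch (not the paper's released code): build a Transformer
# encoder-decoder whose encoder and decoder are both initialized from
# pretrained Korean BERT checkpoints, then fine-tune it on
# (raw sentence -> morpheme/POS sequence) pairs.
from transformers import BertTokenizerFast, EncoderDecoderModel

ENCODER_CKPT = "klue/bert-base"  # stand-in for the raw-corpus Korean BERT
DECODER_CKPT = "klue/bert-base"  # stand-in for the analyzed-corpus Korean BERT

tokenizer = BertTokenizerFast.from_pretrained(ENCODER_CKPT)

# Combine the two BERTs as encoder and decoder. The decoder copy gains
# cross-attention layers and a causal mask; the rest of its weights come
# from the pretrained checkpoint, so training is a fine-tuning process.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(ENCODER_CKPT, DECODER_CKPT)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# One fine-tuning step on a toy pair (Sejong-style tags, purely illustrative:
# "I go to school" -> pronoun+topic marker, noun+case marker, verb+ending).
inputs = tokenizer("나는 학교에 간다", return_tensors="pt")
labels = tokenizer("나/NP+는/JX 학교/NNG+에/JKB 가/VV+ㄴ다/EF",
                   return_tensors="pt").input_ids
outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels)
outputs.loss.backward()  # gradients for a fine-tuning optimizer step
```

Warm-starting both halves this way is what turns the task into the fine-tuning problem the abstract describes: only the newly added cross-attention weights start from scratch, so a relatively small annotated corpus can suffice.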

Download full-text PDF

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9137944
DOI: http://dx.doi.org/10.7717/peerj-cs.968

Publication Analysis

Top Keywords

morphological analyzer: 20
korean morphological: 16
korean bert: 8
encoder-decoder architecture: 8
encoder decoder: 8
analyzer: 6
korean: 6
morphological: 5
building korean: 4
analyzer korean: 4
