ResearchHub | Open Science Community

A Survey on Deep Learning for Named Entity Recognition

Jing Li et al., Mar 17, 2020

Named entity recognition (NER) is the task of identifying mentions of rigid designators in text that belong to predefined semantic types such as person, location, and organization. NER always serves as the foundation for many natural language applications such as question answering, text summarization, and machine translation. Early NER systems achieved good performance at the cost of human engineering in designing domain-specific features and rules. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recently applied deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
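To make the "tag decoder" axis concrete: a decoder emits one tag per token, and those tags still have to be grouped into entity mentions. The sketch below assumes the common BIO tagging scheme, which the abstract does not name, and uses a hypothetical helper called bio_to_spans. It illustrates the final grouping step only and is not code from the survey.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group per-token BIO tags into (entity_text, entity_type) pairs.

    Assumption: tags look like "B-PER", "I-PER", or "O". An "I-" tag that
    does not continue an open entity of the same type is treated as "O".
    """
    spans: List[Tuple[str, str]] = []
    current: List[str] = []
    current_type: Optional[str] = None

    def close() -> None:
        # Emit the entity collected so far, if any.
        if current and current_type is not None:
            spans.append((" ".join(current), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close()
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            close()
            current, current_type = [], None
    close()
    return spans


tokens = ["Michael", "Jordan", "was", "born", "in", "Brooklyn"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-LOC"]
mentions = bio_to_spans(tokens, tags)
```

Here the two mentions recovered are the person "Michael Jordan" and the location "Brooklyn".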

Artificial Intelligence

Paleontology


Time-aware point-of-interest recommendation

Quan Yuan et al., Jul 28, 2013

The availability of large volumes of user check-in data from rapidly growing location-based social networks (LBSNs) enables many important location-aware services for users. Point-of-interest (POI) recommendation is one such service: it recommends places that users have not visited before. Several techniques have recently been proposed for this recommendation service. However, no existing work has considered temporal information for POI recommendation in LBSNs. We believe that time plays an important role in POI recommendation because most users tend to visit different places at different times of day, e.g., visiting a restaurant at noon and a bar at night. In this paper, we define a new problem, namely time-aware POI recommendation, which is to recommend POIs for a given user at a specified time of day. To solve the problem, we develop a collaborative recommendation model that is able to incorporate temporal information. Moreover, based on the observation that users tend to visit nearby POIs, we further enhance the recommendation model by considering geographical information. Our experimental results on two real-world datasets show that the proposed approach outperforms the state-of-the-art POI recommendation methods substantially.
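One simple way to picture a collaborative model that incorporates time is user-based collaborative filtering over (time slot, POI) pairs instead of POIs alone. The sketch below is an illustration under that assumption, not the authors' model: the function names, the slot labels, and the toy check-ins are all hypothetical, and the geographical enhancement mentioned in the abstract is left out.

```python
from collections import defaultdict
from math import sqrt


def build_profiles(checkins):
    """Map each user to a sparse vector keyed by (time_slot, poi)."""
    profiles = defaultdict(lambda: defaultdict(int))
    for user, slot, poi in checkins:
        profiles[user][(slot, poi)] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profiles, user, slot, top_k=3):
    """Rank unvisited POIs by similar users' check-ins in the same slot."""
    mine = profiles.get(user, {})
    visited = {poi for (_, poi) in mine}
    scores = defaultdict(float)
    for other, theirs in profiles.items():
        if other == user:
            continue
        sim = cosine(mine, theirs)
        if sim == 0.0:
            continue
        for (other_slot, poi), count in theirs.items():
            if other_slot == slot and poi not in visited:
                scores[poi] += sim * count
    # Sort by score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_k]


# Hypothetical toy data: (user, time_slot, poi)
checkins = [
    ("a", "noon", "cafe"), ("a", "night", "bar1"),
    ("b", "noon", "cafe"), ("b", "noon", "diner"), ("b", "night", "bar2"),
    ("c", "night", "club"),
]
profiles = build_profiles(checkins)
night_recs = recommend(profiles, "a", "night")
noon_recs = recommend(profiles, "a", "noon")
```

Because similarity is computed on (slot, POI) pairs, user "c" shares nothing with user "a" and contributes no candidates, while user "b" supplies a different suggestion for each time slot.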

Artificial Intelligence

Information Systems


Exploiting Geographical Neighborhood Characteristics for Location Recommendation

Yong Liu et al., Nov 3, 2014

Geographical characteristics derived from historical check-in data have been reported to be effective in improving location recommendation accuracy. However, previous studies mainly exploit geographical characteristics from a user's perspective, by modeling the geographical distribution of each individual user's check-ins. In this paper, we are interested in exploiting geographical characteristics from a location perspective, by modeling the geographical neighborhood of a location. The neighborhood is modeled at two levels: the instance-level neighborhood defined by a few nearest neighbors of the location, and the region-level neighborhood for the geographical region where the location exists. We propose a novel recommendation approach, namely Instance-Region Neighborhood Matrix Factorization (IRenMF), which exploits two levels of geographical neighborhood characteristics: a) instance-level characteristics, i.e., nearest neighboring locations tend to share more similar user preferences; and b) region-level characteristics, i.e., locations in the same geographical region may share similar user preferences. In IRenMF, the two levels of geographical characteristics are naturally incorporated into the learning of latent features of users and locations, so that IRenMF predicts users' preferences on locations more accurately. Extensive experiments on real data collected from Gowalla, a popular LBSN, demonstrate the effectiveness and advantages of our approach.
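The instance-level idea, that a location's nearest neighbors share similar user preferences, can be pictured as blending a user's latent-factor score for a location with the scores of that location's neighbors. The sketch below is one possible reading and not the IRenMF objective itself: the helper names, the mixing weight alpha, and the neighbor weights are assumptions, and the region-level component is omitted.

```python
def dot(a, b):
    """Inner product of two equal-length latent vectors."""
    return sum(x * y for x, y in zip(a, b))


def blended_score(user_vec, loc_vecs, target, neighbors, alpha=0.5):
    """Mix the target location's score with its neighbors' scores.

    neighbors: list of (location_id, weight); weights are assumed to sum
    to 1 (for example, normalized distance-based similarities).
    alpha: hypothetical trade-off between the location and its neighbors.
    """
    own = dot(user_vec, loc_vecs[target])
    neighborhood = sum(w * dot(user_vec, loc_vecs[n]) for n, w in neighbors)
    return alpha * own + (1 - alpha) * neighborhood


# Hypothetical latent vectors for one user and three locations.
user_vec = [1.0, 0.0]
loc_vecs = {"A": [0.2, 0.5], "B": [0.8, 0.1], "C": [0.4, 0.9]}
score = blended_score(user_vec, loc_vecs, "A", [("B", 0.5), ("C", 0.5)])
```

In this toy case the neighbors pull the score for location "A" above the value its own latent vector would give.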

Artificial Intelligence

Signal Processing


Twevent

Chenliang Li et al., Oct 29, 2012

Event detection from tweets is an important task for understanding the current events and topics that attract a large number of common users. However, the unique characteristics of tweets (e.g., short and noisy content, diverse and fast-changing topics, and large data volume) make event detection a challenging task. Most existing techniques proposed for well-written documents (e.g., news articles) cannot be directly adopted. In this paper, we propose a segment-based event detection system for tweets, called Twevent. Twevent first detects bursty tweet segments as event segments and then clusters the event segments into events considering both their frequency distribution and content similarity. More specifically, each tweet is split into non-overlapping segments (i.e., phrases that possibly refer to named entities or semantically meaningful information units). The bursty segments are identified within a fixed time window based on their frequency patterns, and each bursty segment is described by the set of tweets containing the segment published within that time window. The similarity between a pair of bursty segments is computed using their associated tweets. After clustering bursty segments into candidate events, Wikipedia is exploited to identify the realistic events and to derive the most newsworthy segments to describe the identified events. We evaluate Twevent and compare it with the state-of-the-art method using 4.3 million tweets published by Singapore-based users in June 2010. In our experiments, Twevent outperforms the state-of-the-art method by a large margin in terms of both precision and recall. More importantly, the events detected by Twevent can be easily interpreted with little background knowledge because of the newsworthy segments. We also show that Twevent is efficient and scalable, leading to a desirable solution for event detection from tweets.
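The abstract says bursty segments are identified from frequency patterns within a fixed time window but does not give the criterion. As an illustration only, the sketch below assumes a binomial background model: a segment is flagged when its count in the window sits several standard deviations above the count expected from its long-run probability. The function name, the z-score threshold, and the toy numbers are hypothetical.

```python
from math import sqrt


def bursty_segments(window_counts, window_size, background_prob, z_threshold=2.0):
    """Return {segment: z_score} for segments that look bursty.

    window_counts: tweets in the current window containing each segment.
    window_size: total number of tweets in the window.
    background_prob: assumed long-run probability that a tweet contains
        the segment (must lie strictly between 0 and 1 to be scored).
    """
    bursty = {}
    for segment, count in window_counts.items():
        p = background_prob.get(segment)
        if p is None or not 0.0 < p < 1.0:
            continue  # no usable background estimate for this segment
        expected = window_size * p
        std = sqrt(window_size * p * (1.0 - p))
        z = (count - expected) / std
        if z >= z_threshold:
            bursty[segment] = z
    return bursty


# Hypothetical window of 1000 tweets.
background = {"world cup": 0.01, "good morning": 0.05}
counts = {"world cup": 60, "good morning": 52}
flagged = bursty_segments(counts, 1000, background)
```

With these numbers "world cup" appears six times more often than expected and is flagged, while "good morning" stays close to its usual rate and is not.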

Artificial Intelligence

Information Systems


Topic Modeling for Short Texts with Auxiliary Word Embeddings

Chenliang Li et al., Jul 7, 2016

For many applications that require semantic understanding of short texts, inferring discriminative and coherent latent topics from short texts is a critical and fundamental task. Conventional topic models largely rely on word co-occurrences to derive topics from a collection of documents. However, because each document is so short, short texts are much sparser in terms of word co-occurrences. Data sparsity therefore becomes a bottleneck for conventional topic models to achieve good results on short texts. On the other hand, when a human being interprets a piece of short text, the understanding is based not solely on its content words, but also on her background knowledge (e.g., semantically related words). Recent advances in word embedding offer effective learning of word semantic relations from a large corpus. Exploiting such auxiliary word embeddings to enrich topic modeling for short texts is the main focus of this paper. To this end, we propose a simple, fast, and effective topic model for short texts, named GPU-DMM. Based on the Dirichlet Multinomial Mixture (DMM) model, GPU-DMM promotes the semantically related words under the same topic during the sampling process by using the generalized Polya urn (GPU) model. In this sense, the background knowledge about word semantic relatedness learned from millions of external documents can be easily exploited to improve topic modeling for short texts. Through extensive experiments on two real-world short text collections in two languages, we show that GPU-DMM achieves comparable or better topic representations than state-of-the-art models, measured by topic coherence. The learned topic representation leads to the best accuracy in a text classification task, which is used as an indirect evaluation.
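The promotion step of a generalized Polya urn can be sketched as a count update: when a word is assigned to a topic, its own count rises by one and each semantically related word receives a smaller fractional increase under the same topic. The code below illustrates only that idea. The function name, the promotion weight, and the related-word table (which would come from word embeddings) are assumptions, and the full GPU-DMM sampler is not reproduced.

```python
from collections import defaultdict


def gpu_increment(topic_word_counts, topic, word, related, promotion=0.3):
    """Add `word` to `topic` and promote its related words.

    related: hypothetical mapping word -> list of semantically related
        words, e.g. nearest neighbors in an embedding space.
    promotion: assumed fractional count given to each related word.
    """
    topic_word_counts[topic][word] += 1.0
    for other in related.get(word, []):
        topic_word_counts[topic][other] += promotion


counts = defaultdict(lambda: defaultdict(float))
related_words = {"soccer": ["football", "goal"]}
gpu_increment(counts, 0, "soccer", related_words)
gpu_increment(counts, 0, "weather", related_words)
```

After the two updates, topic 0 holds full counts for "soccer" and "weather" plus fractional counts for "football" and "goal", even though neither of the latter appeared in the text. This is how related words come to share a topic despite sparse co-occurrence.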

Philosophy

Artificial Intelligence


TwiNER

Chenliang Li et al., Aug 12, 2012

Many private and/or public organizations have been reported to create and monitor targeted Twitter streams to collect and understand users' opinions about the organizations. A targeted Twitter stream is usually constructed by filtering tweets with user-defined selection criteria, e.g., tweets published by users from a selected region, or tweets that match one or more predefined keywords. The targeted Twitter stream is then monitored to collect and understand users' opinions about the organizations. There is an emerging need for early crisis detection and response with such targeted streams. Such applications require a good named entity recognition (NER) system for Twitter, which is able to automatically discover emerging named entities that are potentially linked to the crisis. In this paper, we present a novel 2-step unsupervised NER system for targeted Twitter streams, called TwiNER. In the first step, it leverages the global context obtained from Wikipedia and the Web N-Gram corpus to partition tweets into valid segments (phrases) using a dynamic programming algorithm. Each such tweet segment is a candidate named entity. It is observed that the named entities in the targeted stream usually exhibit a gregarious property, due to the way the targeted stream is constructed. In the second step, TwiNER constructs a random walk model to exploit the gregarious property in the local context derived from the Twitter stream. The highly ranked segments have a higher chance of being true named entities. We evaluated TwiNER on two sets of real-life tweets simulating two targeted streams. Evaluated using labeled ground truth, TwiNER achieves performance comparable to that of conventional approaches in both streams. Various settings of TwiNER have also been examined to verify our idea of combining global context and local context.
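The first step, partitioning a tweet into segments with dynamic programming, can be sketched as choosing the split that maximizes the total score of its segments. In the sketch below the scoring function is supplied by the caller and stands in for the global-context evidence from Wikipedia and the Web N-Gram corpus. The function name, the maximum segment length, and the toy scores are hypothetical, not taken from the paper.

```python
def segment(tokens, stickiness, max_len=3):
    """Split tokens into contiguous segments with maximum total score.

    stickiness: callable taking a tuple of tokens and returning a score;
        a hypothetical stand-in for phrase evidence from external corpora.
    max_len: assumed upper bound on segment length, in tokens.
    """
    n = len(tokens)
    best = [float("-inf")] * (n + 1)  # best[i]: best score for tokens[:i]
    back = [0] * (n + 1)              # back[i]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + stickiness(tuple(tokens[start:end]))
            if score > best[end]:
                best[end], back[end] = score, start
    # Walk the back-pointers to recover the segments in order.
    segments, end = [], n
    while end > 0:
        start = back[end]
        segments.append(" ".join(tokens[start:end]))
        end = start
    return segments[::-1]


# Toy scores: known phrases score high, single words get a small score,
# and unknown multi-word spans are penalized.
phrase_scores = {("new", "york"): 2.0}


def toy_stickiness(seg):
    return phrase_scores.get(seg, 0.1 if len(seg) == 1 else -1.0)


result = segment(["i", "love", "new", "york"], toy_stickiness)
```

The dynamic program keeps "new york" together as one candidate named entity and leaves the remaining words as single-token segments.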

Artificial Intelligence

Communication


Graph-based Point-of-interest Recommendation with Geographical and Temporal Influences

Quan Yuan et al., Nov 3, 2014

The availability of large volumes of user check-in data from rapidly growing location-based social networks (LBSNs) enables a number of important location-aware services. Point-of-interest (POI) recommendation is one such service: it recommends POIs that users have not visited before. It has been observed that: (i) users tend to visit nearby places, and (ii) users tend to visit different places in different time slots, and in the same time slot, users tend to periodically visit the same places. For example, users usually visit a restaurant during lunch hours and visit a pub at night. In this paper, we focus on the problem of time-aware POI recommendation, which aims at recommending a list of POIs for a user to visit at a given time. To exploit both geographical and temporal influences in time-aware POI recommendation, we propose the Geographical-Temporal influences Aware Graph (GTAG) to model check-in records, geographical influence, and temporal influence. For effective and efficient recommendation based on GTAG, we develop a preference propagation algorithm named Breadth-first Preference Propagation (BPP). The algorithm follows a relaxed breadth-first search strategy and returns recommendation results within at most 6 propagation steps. Our experimental results on two real-world datasets show that the proposed graph-based approach outperforms state-of-the-art POI recommendation methods substantially.
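Preference propagation over a graph can be pictured as pushing a unit of preference outward from the user node, level by level, and reading off how much reaches each POI node. The sketch below is a generic weighted breadth-first propagation written under that assumption. It is not the BPP algorithm or the GTAG structure from the paper: the node names, edge weights, and function name are hypothetical, and only the step limit of 6 comes from the abstract.

```python
from collections import defaultdict


def propagate(graph, source, max_steps=6):
    """Spread preference from `source` along weighted edges.

    graph: node -> list of (neighbor, weight); the weights leaving a node
        are assumed to sum to at most 1.
    Returns the total preference received by every reached node.
    """
    received = defaultdict(float)
    frontier = {source: 1.0}
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, mass in frontier.items():
            for neighbor, weight in graph.get(node, []):
                next_frontier[neighbor] += mass * weight
        if not next_frontier:
            break  # nothing left to propagate
        for node, mass in next_frontier.items():
            received[node] += mass
        frontier = next_frontier
    return received


# Hypothetical graph: a user, one time-slot node, and two POIs.
toy_graph = {
    "user": [("user@noon", 1.0)],
    "user@noon": [("cafe", 0.7), ("diner", 0.3)],
}
scores = propagate(toy_graph, "user")
ranked = sorted(["cafe", "diner"], key=lambda p: -scores[p])
```

POIs are then ranked by the preference they accumulate, so "cafe" is recommended ahead of "diner" in this toy graph.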

Artificial Intelligence

Theoretical Computer Science


On predicting the popularity of newly emerging hashtags in Twitter

Zongyang Ma et al., May 8, 2013

Because of Twitter's popularity and the viral nature of information dissemination on Twitter, predicting which Twitter topics will become popular in the near future becomes a task of considerable economic importance. Many Twitter topics are annotated by hashtags. In this article, we propose methods to predict the popularity of new hashtags on Twitter by formulating the problem as a classification task. We use five standard classification models (i.e., Naïve Bayes, k-nearest neighbors, decision trees, support vector machines, and logistic regression) for prediction. The main challenge is the identification of effective features for describing new hashtags. We extract 7 content features from a hashtag string and the collection of tweets containing the hashtag and 11 contextual features from the social graph formed by users who have adopted the hashtag. We conducted experiments on a Twitter data set consisting of 31 million tweets from 2 million Singapore-based users. The experimental results show that the standard classifiers using the extracted features significantly outperform the baseline methods that do not use these features. Among the five classifiers, the logistic regression model performs the best in terms of the Micro-F1 measure. We also observe that contextual features are more effective than content features.
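The classifiers are compared on the Micro-F1 measure, which pools true positives, false positives, and false negatives across all classes before computing precision and recall. A minimal sketch of that computation follows. The function name, the class labels, and the toy predictions are hypothetical and are not data from the article.

```python
def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP, FP, and FN over all labels."""
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1  # predicted this label, but the truth differs
            elif truth == label:
                fn += 1  # missed an instance of this label
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical popularity classes for four hashtags.
y_true = ["popular", "not_popular", "popular", "not_popular"]
y_pred = ["popular", "popular", "not_popular", "not_popular"]
f1 = micro_f1(y_true, y_pred, ["popular", "not_popular"])
```

When every instance has exactly one label and all labels are pooled, Micro-F1 equals plain accuracy, which is 0.5 for this toy example.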

Artificial Intelligence

Social Psychology


Who, where, when and what

Quan Yuan et al., Aug 11, 2013

Micro-blogging services, such as Twitter, and location-based social network applications have generated short text messages associated with geographic information, posting time, and user IDs. The availability of such data received from users offers a good opportunity to study users' spatial-temporal behavior and preferences. In this paper, we propose a probabilistic model, W4 (short for Who+Where+When+What), to exploit such data to discover individual users' mobility behaviors from spatial, temporal, and activity aspects. To the best of our knowledge, our work offers the first solution to jointly model individual users' mobility behavior from the three aspects. Our model has a variety of applications, such as user profiling and location prediction; it can be employed to answer questions such as "Can we infer the location of a user given a tweet posted by the user and the posting time?" Experimental results on two real-world datasets show that the proposed model is effective in discovering users' spatial-temporal topics, and outperforms state-of-the-art baselines significantly for the task of location prediction for tweets.

Artificial Intelligence

Signal Processing


A Survey of Location Prediction on Twitter

Xin Zheng et al., Feb 20, 2018

Locations, e.g., countries, states, cities, and points of interest, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on a daily basis. Due to the worldwide coverage of its users and the real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts have been spent on dealing with new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim at offering an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.

Artificial Intelligence

Signal Processing
