Semantic Analysis

Overview

Semantic analysis moves beyond syntactic structure to understand meaning. It encompasses word-level disambiguation, sentence-level inference, structured meaning representations, and opinion mining. These tasks require models to capture nuance, context, and world knowledge.


Word Sense Disambiguation (WSD)

Determining which sense of a polysemous word is intended in context.

Example: "bank" can mean a financial institution, a river bank, or the act of banking an airplane.

Approaches

| Method | Description |
|---|---|
| Lesk algorithm | Choose the sense whose dictionary definition has the most word overlap with the context |
| Knowledge-based | Use WordNet relations (hypernyms, synonyms) and graph algorithms (PageRank on sense graphs) |
| Supervised | Train a classifier per word on sense-annotated corpora (SemCor) |
| Neural | Fine-tune BERT to select the correct sense from candidate definitions |
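
As a rough illustration of the Lesk row above, the sketch below implements a simplified Lesk over a toy sense inventory. The `SENSES` glosses, the stopword list, and the function names are assumptions made for this example; a real system would draw glosses from WordNet.

```python
import re

# Toy sense inventory with made-up glosses (assumption; not taken from WordNet).
SENSES = {
    "bank": {
        "financial_institution": "an institution that accepts deposits and lends money",
        "river_bank": "the sloping land beside a river or other body of water",
    }
}

STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "i", "we"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}

def simplified_lesk(word, context):
    """Return the sense whose gloss shares the most words with the context."""
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(tokenize(gloss) & context_tokens)
        if overlap > best_overlap:  # ties keep the first-listed sense
            best_sense, best_overlap = sense, overlap
    return best_sense

print(simplified_lesk("bank", "We sat on the bank of the river and watched the water"))
print(simplified_lesk("bank", "I went to the bank to deposit money"))
```

The overlap is exact-match only ("deposit" does not match "deposits"), which is one reason plain Lesk is usually treated as a weak baseline.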

Resources

  • WordNet: Lexical database organizing words into synonym sets (synsets) linked by semantic relations (hypernymy, meronymy, etc.)
  • SemCor: Largest manually sense-annotated corpus (~230k annotations)
  • BabelNet: Multilingual encyclopedic dictionary combining WordNet and Wikipedia

Current state: BERT-based WSD models achieve ~80% F1 on the ALL set (the concatenation of all test sets) of the unified WSD Evaluation Framework, approaching the estimated inter-annotator agreement ceiling.


Semantic Similarity

Measuring how similar two text units are in meaning.

Word-Level Similarity

  • Path-based (WordNet): Shortest path between synsets in the taxonomy
  • Information content: Based on the information content IC(c) = -log P(c) of the least common subsumer of the two synsets (Resnik, Lin, Jiang-Conrath)
  • Embedding-based: Cosine similarity of word vectors
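
The embedding-based measure can be sketched in a few lines. The three-dimensional vectors below are made up for illustration; real word vectors would come from a trained embedding model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Made-up vectors (assumption), chosen so that "cat" and "dog" point in similar directions.
VECTORS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(round(cosine_similarity(VECTORS["cat"], VECTORS["dog"]), 3))
print(round(cosine_similarity(VECTORS["cat"], VECTORS["car"]), 3))
```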

Sentence-Level Similarity (STS)

Predict a continuous similarity score (0-5) between sentence pairs.

| Method | Approach |
|---|---|
| Word overlap | Jaccard similarity, BLEU |
| Embedding average | Average word embeddings, compute cosine |
| SBERT | Siamese BERT with cosine similarity |
| Cross-encoder | Concatenate both sentences in BERT, regress score |

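Cross-encoders are more accurate, but scoring every pair among n sentences takes n(n-1)/2 forward passes, i.e. O(n^2). Bi-encoders (SBERT) encode each sentence once (n passes) and then compare the cached embeddings with a cheap cosine similarity.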
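
A minimal sketch of two points from this section: the word-overlap baseline (Jaccard) and the difference in encoder calls between cross-encoders and bi-encoders. The function names are illustrative, and the call counts cover only model forward passes, not the cosine comparisons themselves.

```python
def jaccard(sentence_a, sentence_b):
    """Word-overlap similarity: |A ∩ B| / |A ∪ B| over lowercased whitespace tokens."""
    a, b = set(sentence_a.lower().split()), set(sentence_b.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def model_calls(n):
    """Forward passes needed to score all unordered pairs among n sentences."""
    cross_encoder = n * (n - 1) // 2  # one joint pass per sentence pair
    bi_encoder = n                    # one pass per sentence, embeddings are reused
    return cross_encoder, bi_encoder

print(jaccard("a man is playing a guitar", "a man plays a guitar"))  # 0.5
print(model_calls(10_000))  # (49995000, 10000)
```

Note that Jaccard scores "playing" and "plays" as unrelated tokens, which is the kind of surface mismatch that embedding-based methods are meant to absorb.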


Semantic Parsing

Translates natural language into a formal meaning representation.

Abstract Meaning Representation (AMR)

A rooted, directed, acyclic graph representing sentence meaning.

"The boy wants to go"

(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
      :ARG0 b))

  • Nodes: concepts (word senses, PropBank framesets)
  • Edges: semantic roles (ARG0, ARG1, etc.)
  • Abstracts away syntax: "the boy's desire to go" has the same AMR
  • Challenges: reentrancy (nodes with multiple parents), alignment to text
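  • Neural AMR parsers: pretrained sequence-to-sequence and sequence-to-graph models reach roughly 85 Smatch F1 (Smatch scores the overlap between predicted and gold graph triples)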
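
To make the graph structure and reentrancy concrete, the AMR for "The boy wants to go" can be written as plain triples. This is a sketch using an ad hoc representation (the `INSTANCES` and `EDGES` names are assumptions), not the format of any particular AMR library.

```python
from collections import Counter

# Variable -> concept, taken from the AMR above.
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-02"}

# (source variable, role, target variable)
EDGES = [
    ("w", "ARG0", "b"),
    ("w", "ARG1", "g"),
    ("g", "ARG0", "b"),  # "b" is reused: the boy is both the wanter and the goer
]

def incoming_counts(edges):
    """Count incoming edges per variable."""
    return Counter(target for _, _, target in edges)

def find_root(instances, edges):
    """Return the single variable with no incoming edge, or None if there is not exactly one."""
    incoming = incoming_counts(edges)
    roots = [v for v in instances if incoming[v] == 0]
    return roots[0] if len(roots) == 1 else None

def reentrant_nodes(edges):
    """Variables with more than one incoming edge."""
    return sorted(v for v, n in incoming_counts(edges).items() if n > 1)

print(find_root(INSTANCES, EDGES))   # w
print(reentrant_nodes(EDGES))        # ['b']
```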

Text-to-SQL

Converts natural language questions into SQL queries over a database.

"How many employees earn more than 50000?"
->
SELECT COUNT(*) FROM employees WHERE salary > 50000

Key challenges:

  • Schema linking: mapping mentions to tables and columns
  • Complex queries: joins, subqueries, aggregation, GROUP BY
  • Generalization to unseen databases

Approaches:

  • Grammar-based decoding: constrain output to valid SQL
  • Schema encoding: represent table/column names alongside the question

Benchmark and reported results:

  • Spider: cross-database evaluation benchmark (200 databases)
  • LLM-based approaches with schema prompting reach roughly 85% execution accuracy on Spider
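
Execution accuracy compares the results of running the predicted and gold queries rather than their text. The sketch below shows the idea with Python's built-in `sqlite3` module on a made-up `employees` table; the order-insensitive comparison is a simplification and not the official Spider evaluation script.

```python
import sqlite3

def run_query(conn, sql):
    """Execute a query and return its rows sorted, so row order does not matter."""
    return sorted(conn.execute(sql).fetchall())

def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries run and return the same rows; a failing query counts as a miss."""
    try:
        return run_query(conn, predicted_sql) == run_query(conn, gold_sql)
    except sqlite3.Error:
        return False

# Toy database (assumption): four employees, two of whom earn more than 50000.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 72000), ("Ben", 48000), ("Chi", 50000), ("Dev", 65000)],
)

gold = "SELECT COUNT(*) FROM employees WHERE salary > 50000"
same_result = "SELECT COUNT(name) FROM employees WHERE salary > 50000"
off_by_one = "SELECT COUNT(*) FROM employees WHERE salary >= 50000"

print(execution_match(conn, same_result, gold))  # True: different text, same result
print(execution_match(conn, off_by_one, gold))   # False: >= also counts the 50000 row
```

A textually different query can still be correct, which is why exact string match underestimates parser quality.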

Natural Language Inference (NLI)

Determine the logical relationship between a premise and hypothesis.

| Label | Meaning | Example |
|---|---|---|
| Entailment | Premise implies hypothesis | P: "A dog runs in a park" -> H: "An animal is outside" |
| Contradiction | Premise contradicts hypothesis | P: "A dog runs in a park" -> H: "No animals are present" |
| Neutral | Neither entails nor contradicts | P: "A dog runs in a park" -> H: "The dog is chasing a ball" |

Datasets

| Dataset | Size | Source | Notes |
|---|---|---|---|
| SNLI | 570k | Image captions | Simpler, less natural |
| MultiNLI | 433k | Multiple genres | More diverse |
| ANLI | 163k | Adversarially collected | Harder examples |
| XNLI | 7.5k per language | MultiNLI translations | 15 languages |

Models

  • Decomposable attention: Align words between premise and hypothesis, aggregate
  • ESIM: BiLSTM with cross-sentence attention
  • BERT cross-encoder: Concatenate [CLS] premise [SEP] hypothesis [SEP], classify
  • BERT-large reaches ~91% accuracy on SNLI and ~86% on MultiNLI; stronger pretrained encoders such as RoBERTa-large reach ~90% on MultiNLI
  • NLI models are used as general-purpose zero-shot classifiers (hypothesis = category description)
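
The zero-shot use of NLI can be sketched as follows: each candidate label is turned into a hypothesis, and the label whose hypothesis is most strongly entailed wins. The `toy_entailment_score` function and its `CUE_WORDS` table are placeholders invented for this example; in practice that callable would return the entailment probability from a fine-tuned NLI model. The hypothesis template is also an assumption.

```python
# Stand-in for an NLI model: cue words per label (hypothetical, illustration only).
CUE_WORDS = {
    "sports": {"team", "match", "goal"},
    "finance": {"stock", "market", "shares"},
}

def toy_entailment_score(premise, hypothesis):
    """Placeholder for P(entailment | premise, hypothesis) from a real NLI model."""
    premise_tokens = set(premise.lower().split())
    for label, cues in CUE_WORDS.items():
        if label in hypothesis.lower():
            return len(premise_tokens & cues) / len(cues)
    return 0.0

def zero_shot_classify(text, labels, entailment_score, template="This text is about {}."):
    """Score one hypothesis per label and return the best label with all scores."""
    scores = {label: entailment_score(text, template.format(label)) for label in labels}
    best = max(scores, key=scores.get)
    return best, scores

label, scores = zero_shot_classify(
    "The team won the match in overtime", ["sports", "finance"], toy_entailment_score
)
print(label, scores)
```

Only `zero_shot_classify` reflects the technique; swapping the scorer for a real model leaves that function unchanged.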

Sentiment Analysis

Determining the opinion, emotion, or attitude expressed in text.

Levels of Granularity

| Level | Task | Example |
|---|---|---|
| Document | Overall sentiment | Movie review: positive/negative |
| Sentence | Sentence polarity | "The food was great but service was slow" |
| Aspect-based | Per-aspect sentiment | food: positive, service: negative |
| Targeted | Sentiment toward entity | "I like iPhone but hate Android" |

Lexicon-Based Approaches

  • VADER: Rule-based, handles social media conventions (capitalization, emoticons, intensifiers)
  • SentiWordNet: Assigns positivity/negativity scores to WordNet synsets
  • AFINN, Opinion Lexicon: Curated word lists with polarity scores
  • Fast but limited by vocabulary coverage and inability to handle negation/sarcasm robustly
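
A minimal lexicon-based scorer, assuming a tiny made-up lexicon and a one-token negation window. The scores are not taken from AFINN, VADER, or any published resource.

```python
import re

# Made-up polarity scores (assumption), in the spirit of a curated word list.
LEXICON = {"great": 3, "good": 2, "slow": -1, "bad": -2, "terrible": -3}
NEGATORS = {"not", "never", "no"}

def lexicon_score(text):
    """Sum word polarities, flipping the sign of a word directly preceded by a negator."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        if token in LEXICON:
            value = LEXICON[token]
            if i > 0 and tokens[i - 1] in NEGATORS:
                value = -value
            score += value
    return score

print(lexicon_score("The food was great but service was slow"))  # 2
print(lexicon_score("The food was not good"))                     # -2
print(lexicon_score("Oh great, another delay"))                   # 3 (sarcasm is missed)
```

The last call shows the limitation noted above: without context, a sarcastic "great" is scored as positive, and out-of-lexicon words such as "delay" contribute nothing.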

Machine Learning Approaches

Classical ML:

  • Features: unigrams/bigrams, POS tags, negation scope, lexicon scores
  • Classifiers: SVM, logistic regression, Naive Bayes
  • Effective for well-defined domains with labeled data

Deep Learning:

  • CNN over word embeddings captures local n-gram patterns
  • BiLSTM captures sequential context and long-range negation
  • Attention mechanisms highlight relevant words

Pretrained Transformers:

  • Fine-tune BERT on sentiment datasets
  • State-of-the-art across benchmarks (SST-2: ~95% accuracy for BERT-large, ~97% for larger pretrained models)

Aspect-Based Sentiment Analysis (ABSA)

Subtasks:

  1. Aspect term extraction: identify aspect mentions ("food", "service")
  2. Aspect category detection: map to predefined categories
  3. Aspect sentiment classification: determine polarity per aspect

Approaches:

  • Joint extraction with sequence labeling (BIO tags for aspects)
  • Attention over aspect term to determine sentiment
  • Instruction-tuned LLMs with structured output achieve strong performance
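
For the sequence-labeling approach, aspect terms are recovered by decoding BIO tags into spans. The sketch below assumes a single aspect type with tags `B-ASP`, `I-ASP`, and `O`; the tag names and the example sentence are illustrative.

```python
def decode_aspects(tokens, tags):
    """Collect maximal B-ASP / I-ASP runs into aspect term strings."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-ASP":
            if current:                      # close the previous span
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-ASP" and current:
            current.append(token)            # extend the open span
        else:
            # "O", or a stray I-ASP with no open span (treated like "O")
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = ["The", "battery", "life", "is", "great", "but", "the", "screen", "is", "dim"]
tags   = ["O",   "B-ASP",   "I-ASP", "O", "O",     "O",   "O",   "B-ASP",  "O",  "O"]
print(decode_aspects(tokens, tags))  # ['battery life', 'screen']
```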

Relation Extraction

Identifying semantic relationships between entities in text.

"Steve Jobs founded Apple in 1976."
-> (Steve Jobs, founded, Apple)
-> (Apple, founded_in, 1976)

Approaches

| Method | Description |
|---|---|
| Pattern-based | Hand-crafted rules ("X founded Y", "X, founder of Y") |
| Supervised | Classify entity pairs with labeled data (SemEval, TACRED) |
| Distant supervision | Align knowledge base triples to text; noisy but scalable |
| Neural | BERT with entity markers: "[E1]Steve Jobs[/E1] founded [E2]Apple[/E2]" |
| Few-shot / zero-shot | Prompt LLMs with relation definitions |
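
Two of the rows above can be sketched briefly: pattern-based extraction with regular expressions, and the entity-marker input format used by neural classifiers. The regexes are deliberately naive (capitalized spans stand in for entity recognition) and the function names are assumptions.

```python
import re

# Hand-crafted patterns for the two rule shapes quoted in the table.
PATTERNS = [
    (re.compile(r"(?P<x>[A-Z][\w ]*?) founded (?P<y>[A-Z]\w*)"), "founded"),
    (re.compile(r"(?P<x>[A-Z][\w ]*?), founder of (?P<y>[A-Z]\w*)"), "founded"),
]

def extract_relations(text):
    """Return (subject, relation, object) triples matched by any pattern."""
    triples = []
    for pattern, relation in PATTERNS:
        for match in pattern.finditer(text):
            triples.append((match.group("x"), relation, match.group("y")))
    return triples

def add_entity_markers(text, head, tail):
    """Wrap the first mention of each entity in [E1]/[E2] markers for a neural classifier."""
    marked = text.replace(head, f"[E1]{head}[/E1]", 1)
    return marked.replace(tail, f"[E2]{tail}[/E2]", 1)

print(extract_relations("Steve Jobs founded Apple in 1976."))
print(extract_relations("Apple was founded by Steve Jobs."))  # [] : the passive form is missed
print(add_entity_markers("Steve Jobs founded Apple in 1976.", "Steve Jobs", "Apple"))
```

The empty result for the passive sentence illustrates why pattern-based systems tend to have high precision but low recall.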

Open Information Extraction (OpenIE)

Extracts relation triples without predefined schema.

  • Input: "Einstein was born in Ulm and developed the theory of relativity"
  • Output: (Einstein, was born in, Ulm), (Einstein, developed, theory of relativity)
  • Systems: OpenIE 5, Stanford OpenIE, neural OpenIE
  • Useful for knowledge base construction and question answering

Knowledge Graph Construction

Relation extraction feeds into knowledge graph construction:

  1. Entity recognition and linking
  2. Relation extraction between entity pairs
  3. Triple validation and canonicalization
  4. Integration into knowledge graphs (Wikidata, Freebase)
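
Steps 3 and 4 can be sketched as a small canonicalize-then-merge routine. The alias tables and canonical IDs below are hypothetical (they are not real Wikidata or Freebase identifiers), and validation here just means dropping triples with unknown entities or relations.

```python
from collections import defaultdict

# Hypothetical alias tables mapping surface forms to canonical IDs.
ENTITY_IDS = {
    "steve jobs": "Steve_Jobs",
    "jobs": "Steve_Jobs",
    "apple": "Apple_Inc",
    "apple inc.": "Apple_Inc",
}
RELATION_IDS = {"founded": "founder_of", "co-founded": "founder_of"}

def canonicalize(triple):
    """Map a raw (subject, relation, object) triple to canonical IDs, or None if unknown."""
    subj, rel, obj = (part.strip().lower() for part in triple)
    if subj not in ENTITY_IDS or obj not in ENTITY_IDS or rel not in RELATION_IDS:
        return None
    return ENTITY_IDS[subj], RELATION_IDS[rel], ENTITY_IDS[obj]

def build_graph(raw_triples):
    """Merge validated triples into an adjacency map; sets remove duplicates."""
    graph = defaultdict(set)
    for triple in raw_triples:
        canonical = canonicalize(triple)
        if canonical is not None:
            subj, rel, obj = canonical
            graph[subj].add((rel, obj))
    return dict(graph)

raw = [
    ("Steve Jobs", "founded", "Apple"),
    ("Jobs", "co-founded", "Apple Inc."),  # same fact, different surface forms
    ("Steve Jobs", "founded", "Pear"),     # unknown entity, dropped
]
print(build_graph(raw))  # {'Steve_Jobs': {('founder_of', 'Apple_Inc')}}
```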

Evaluation Across Tasks

| Task | Metric | Benchmark |
|---|---|---|
| WSD | F1 | WSD Evaluation Framework |
| STS | Spearman/Pearson correlation | STS Benchmark |
| NLI | Accuracy | MultiNLI, ANLI |
| Sentiment | Accuracy / Macro-F1 | SST-2, SemEval |
| Relation Extraction | F1 | TACRED, DocRED |
| Text-to-SQL | Execution accuracy | Spider |
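
Accuracy and macro-F1 can diverge on imbalanced data, which is why sentiment benchmarks often report both. The sketch below computes them from scratch on made-up labels; it is meant to show the definitions, not to replace a tested metrics library.

```python
def accuracy(gold, pred):
    """Fraction of positions where the prediction equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)

# Made-up predictions: the classifier misses one of the two "neg" examples.
gold = ["pos", "pos", "pos", "pos", "neg", "neg"]
pred = ["pos", "pos", "pos", "pos", "pos", "neg"]
print(round(accuracy(gold, pred), 3))  # 0.833
print(round(macro_f1(gold, pred), 3))  # 0.778
```

Macro-F1 is lower here because the minority class, where the error occurred, counts as much as the majority class.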

Key Takeaways

  • WSD resolves lexical ambiguity using context; BERT-based models approach the inter-annotator agreement ceiling
  • Semantic parsing maps language to formal representations (AMR, SQL) enabling precise reasoning
  • NLI is a versatile task used for zero-shot classification and textual reasoning
  • Sentiment analysis ranges from document-level polarity to fine-grained aspect-based opinion mining
  • Relation extraction bridges text understanding and knowledge graph construction
  • Pretrained transformers dominate all these tasks, but task-specific design choices still matter