Semantic Analysis

Overview

Semantic analysis moves beyond syntactic structure to understand meaning. It encompasses word-level disambiguation, sentence-level inference, structured meaning representations, and opinion mining. These tasks require models to capture nuance, context, and world knowledge.

Word Sense Disambiguation (WSD)

Determining which sense of a polysemous word is intended in context.

Example: "bank" can mean a financial institution, a river bank, or the act of banking an airplane.

Approaches

Method	Description
Lesk algorithm	Choose the sense whose dictionary definition has the most word overlap with the context
Knowledge-based	Use WordNet relations (hypernyms, synonyms) and graph algorithms (PageRank on sense graphs)
Supervised	Train a classifier per word on sense-annotated corpora (SemCor)
Neural	Fine-tune BERT to select the correct sense from candidate definitions
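
The Lesk row can be illustrated with a minimal sketch of the simplified variant. The two-sense inventory, the glosses, and the stopword list below are assumptions made up for illustration; a real system would read glosses from WordNet and would usually stem or lemmatize.

```python
import re

# Hypothetical two-sense inventory for "bank"; real glosses would come from WordNet.
SENSES = {
    "bank.financial": "an institution that accepts deposits and lends money to customers",
    "bank.river": "sloping land beside a body of water such as a river or lake",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "that", "such", "as", "or",
             "in", "on", "at", "by", "i", "my"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def simplified_lesk(context, senses):
    """Return the sense whose gloss shares the most words with the context."""
    context_tokens = tokenize(context)
    overlaps = {sense: len(context_tokens & tokenize(gloss))
                for sense, gloss in senses.items()}
    return max(overlaps, key=overlaps.get), overlaps


best, overlaps = simplified_lesk(
    "She sat on the bank of the river and watched the water", SENSES)
print(best, overlaps)
```

Without stemming, "deposit" in a context does not match "deposits" in a gloss, which is one reason plain overlap counts are brittle.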

Resources

WordNet: Lexical database organizing words into synonym sets (synsets) linked by semantic relations (hypernymy, meronymy, etc.)
SemCor: Largest manually sense-annotated corpus (~230k annotations)
BabelNet: Multilingual encyclopedic dictionary combining WordNet and Wikipedia

Current state: BERT-based WSD models achieve ~80% F1 on the ALL set of the unified WSD Evaluation Framework (the concatenation of its Senseval and SemEval test sets), approaching the estimated inter-annotator agreement ceiling.

Semantic Similarity

Measuring how similar two text units are in meaning.

Word-Level Similarity

Path-based (WordNet): Shortest path between synsets in the taxonomy
Information content: Uses the information content (negative log probability) of the least common subsumer (Resnik, Lin, Jiang-Conrath)
Embedding-based: Cosine similarity of word vectors
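
As a rough sketch of the path-based and embedding-based measures, the code below uses a hypothetical six-edge taxonomy and hand-picked vectors. The 1 / (1 + path length) form is one common way to turn a path length into a similarity; it is assumed here rather than taken from the text above.

```python
import math

# Hypothetical mini-taxonomy (child -> parent); WordNet's hierarchy is far larger.
HYPERNYM = {
    "dog": "canine", "wolf": "canine", "canine": "mammal",
    "cat": "feline", "feline": "mammal", "mammal": "animal",
}


def ancestors(node):
    """Return the chain from node up to the root, node included."""
    chain = [node]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_similarity(a, b):
    """1 / (1 + edges on the shortest path through the lowest common ancestor)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    common = [n for n in chain_a if n in chain_b]
    if not common:
        return 0.0
    lcs = common[0]  # first shared node going upward from a
    distance = chain_a.index(lcs) + chain_b.index(lcs)
    return 1.0 / (1.0 + distance)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


print(path_similarity("dog", "wolf"), path_similarity("dog", "cat"))
print(cosine([1.0, 0.0], [0.8, 0.6]))
```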

Sentence-Level Similarity (STS)

Predict a continuous similarity score (0-5) between sentence pairs.

Method	Approach
Word overlap	Jaccard similarity, BLEU
Embedding average	Average word embeddings, compute cosine
SBERT	Siamese BERT with cosine similarity
Cross-encoder	Concatenate both sentences in BERT, regress score
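
The first two rows of the table can be sketched in a few lines. The 2-dimensional word vectors are invented for illustration; real embeddings have hundreds of dimensions and come from a trained model.

```python
import math

# Hypothetical 2-d word vectors, chosen by hand for this example only.
VECTORS = {
    "a": [0.5, 0.5], "cat": [1.0, 0.0], "kitten": [0.8, 0.2],
    "sleeps": [0.0, 1.0], "naps": [0.1, 0.9],
}


def jaccard(s1, s2):
    """Word-overlap similarity: |intersection| / |union| of the token sets."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def average_embedding(sentence):
    """Mean of the vectors of known words; out-of-vocabulary words are skipped."""
    known = [VECTORS[w] for w in sentence.lower().split() if w in VECTORS]
    if not known:
        return [0.0, 0.0]
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


s1, s2 = "a cat sleeps", "a kitten naps"
overlap = jaccard(s1, s2)
embedding_sim = cosine(average_embedding(s1), average_embedding(s2))
print(overlap, embedding_sim)
```

The two sentences share only one word, so Jaccard is low, while the averaged vectors are nearly parallel. That gap is the usual argument for embedding-based methods on paraphrases.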

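Cross-encoders are more accurate, but every sentence pair needs its own forward pass, so comparing all pairs among n sentences costs O(n^2) passes. Bi-encoders (SBERT) encode each sentence once (n passes) and then compare the cached vectors cheaply with cosine similarity.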
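
The cost difference can be made concrete by counting model calls. `CountingEncoder` is a hypothetical stand-in that only counts how often it is invoked; it is not a real model and its outputs are placeholders.

```python
from itertools import combinations


class CountingEncoder:
    """Stand-in for a transformer forward pass; it only counts calls."""

    def __init__(self):
        self.calls = 0

    def encode(self, text):
        # Bi-encoder style: one pass per sentence. Toy output, not a real embedding.
        self.calls += 1
        return (len(text), sum(ch in "aeiou" for ch in text))

    def score_pair(self, a, b):
        # Cross-encoder style: one pass per sentence pair. Placeholder score.
        self.calls += 1
        return 0.0


sentences = ["first", "second", "third", "fourth", "fifth"]  # n = 5
pairs = list(combinations(range(len(sentences)), 2))         # n(n-1)/2 = 10 pairs

bi = CountingEncoder()
cached = [bi.encode(s) for s in sentences]    # n passes; pair comparison reuses `cached`

cross = CountingEncoder()
for i, j in pairs:
    cross.score_pair(sentences[i], sentences[j])  # one pass per pair

print(bi.calls, cross.calls)  # prints: 5 10
```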

Semantic Parsing

Translates natural language into a formal meaning representation.

Abstract Meaning Representation (AMR)

A rooted, directed, acyclic graph representing sentence meaning.

"The boy wants to go"

(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
      :ARG0 b))

Nodes: concepts (word senses, PropBank framesets)
Edges: semantic roles (ARG0, ARG1, etc.)
Abstracts away syntax: "the boy's desire to go" has the same AMR
Challenges: reentrancy (nodes with multiple parents), alignment to text
Neural AMR parsers: sequence-to-graph models achieve ~85 Smatch F1
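
One way to make the graph view concrete is to store the example AMR as triples and check the properties named above. The triple encoding is an assumption chosen for illustration; AMR tooling typically reads the bracketed PENMAN notation directly.

```python
from collections import Counter

# "The boy wants to go" as variable -> concept plus (source, role, target) edges.
INSTANCES = {"w": "want-01", "b": "boy", "g": "go-02"}
EDGES = [("w", ":ARG0", "b"), ("w", ":ARG1", "g"), ("g", ":ARG0", "b")]


def reentrant_nodes(edges):
    """Nodes with more than one incoming edge (multiple parents)."""
    incoming = Counter(target for _, _, target in edges)
    return sorted(node for node, count in incoming.items() if count > 1)


def is_acyclic(edges):
    """Depth-first search that fails when it revisits a node on the current path."""
    children = {}
    for source, _, target in edges:
        children.setdefault(source, []).append(target)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in visiting:
            return False
        visiting.add(node)
        ok = all(visit(child) for child in children.get(node, []))
        visiting.discard(node)
        done.add(node)
        return ok

    return all(visit(node) for node in list(children))


print(reentrant_nodes(EDGES), is_acyclic(EDGES))
```

The variable `b` is reentrant because the boy is both the wanter and the goer, which is exactly what a tree-shaped representation could not express.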

Text-to-SQL

Converts natural language questions into SQL queries over a database.

"How many employees earn more than 50000?"
->
SELECT COUNT(*) FROM employees WHERE salary > 50000
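
To check what the generated query returns, it can be run against a small in-memory SQLite database. The table contents below are invented for illustration; only the table name, the column name, and the query come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
# Hypothetical rows; note that 50000 itself does not satisfy "> 50000".
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 72000), ("Ben", 48000), ("Chen", 50000), ("Dee", 91000)],
)

(count,) = conn.execute(
    "SELECT COUNT(*) FROM employees WHERE salary > 50000"
).fetchone()
print(count)  # prints: 2
conn.close()
```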

Key challenges:

Schema linking: mapping mentions to tables and columns
Complex queries: joins, subqueries, aggregation, GROUP BY
Generalization to unseen databases
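
A naive string-matching baseline shows why schema linking is hard. The schema and the plural-stripping rule are assumptions for illustration; actual systems use learned encoders or fuzzy matching rather than exact token lookup.

```python
import re

# Hypothetical schema: table name -> column names.
SCHEMA = {
    "employees": ["name", "salary", "department"],
    "departments": ["id", "title"],
}


def link_schema(question, schema):
    """Link question tokens to tables and columns by exact (or singular) match."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    links = []
    for table, columns in schema.items():
        if table in tokens or table.rstrip("s") in tokens:
            links.append(("table", table))
        for column in columns:
            if column in tokens:
                links.append(("column", f"{table}.{column}"))
    return links


print(link_schema("How many employees earn more than 50000?", SCHEMA))
print(link_schema("What is the salary of each employee?", SCHEMA))
```

The first question never mentions "salary", so exact matching finds only the table; the link from "earn" to the salary column has to come from semantics, not surface form.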

Approaches:

Grammar-based decoding: constrain output to valid SQL
Schema encoding: represent table/column names alongside the question
Spider benchmark: cross-database evaluation (200 databases; test databases are not seen during training)
Current SOTA: LLM-based approaches with schema prompting achieve ~85% execution accuracy on Spider
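
Execution accuracy compares query results rather than query strings. The sketch below is a simplified version under stated assumptions: a hypothetical `employees` table, order-insensitive row comparison, and a predicted query that fails to run counted as wrong. Official benchmark scripts handle more cases.

```python
import sqlite3


def execution_match(conn, predicted_sql, gold_sql):
    """True if the predicted query runs and returns the same rows as the gold query."""
    try:
        predicted = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # invalid SQL counts as a miss
    gold = conn.execute(gold_sql).fetchall()
    # Compare as multisets: sort both sides with a type-agnostic key.
    return sorted(predicted, key=repr) == sorted(gold, key=repr)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 72000), ("Ben", 48000), ("Chen", 50000), ("Dee", 91000)],
)

# (predicted, gold) pairs; all queries here are made up for the example.
pairs = [
    ("SELECT COUNT(name) FROM employees WHERE salary >= 50001",
     "SELECT COUNT(*) FROM employees WHERE salary > 50000"),   # different text, same result
    ("SELECT name FROM employees WHERE salary >= 50000",
     "SELECT name FROM employees WHERE salary > 50000"),       # off-by-one boundary
    ("SELECT nme FROM employees",
     "SELECT name FROM employees"),                            # invalid column
]
results = [execution_match(conn, p, g) for p, g in pairs]
accuracy = sum(results) / len(results)
print(results, accuracy)
conn.close()
```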

Natural Language Inference (NLI)

Determine the logical relationship between a premise and hypothesis.

Label	Meaning	Example
Entailment	Premise implies hypothesis	P: "A dog runs in a park" -> H: "An animal is outside"
Contradiction	Premise contradicts hypothesis	P: "A dog runs in a park" -> H: "No animals are present"
Neutral	Neither entails nor contradicts	P: "A dog runs in a park" -> H: "The dog is chasing a ball"

Datasets

Dataset	Size	Source	Notes
SNLI	570k	Image captions	Single genre; short, concrete sentences
MultiNLI	433k	Multiple genres	More diverse
ANLI	163k	Adversarially collected	Harder examples
XNLI	7.5k per language	Translated MultiNLI-style examples	15 languages

Models

Decomposable attention: Align words between premise and hypothesis, aggregate
ESIM: BiLSTM with cross-sentence attention
BERT cross-encoder: Concatenate [CLS] premise [SEP] hypothesis [SEP], classify
BERT-large reaches roughly 91% accuracy on SNLI and 86% on MultiNLI; later models such as RoBERTa-large reach about 90% on MultiNLI
NLI models are used as general-purpose zero-shot classifiers (hypothesis = category description)
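
The zero-shot recipe in the last item can be sketched as follows. `toy_entailment_score` is a hypothetical keyword-based stand-in for an NLI model's entailment probability, and the hypothesis template is an assumption; in practice the score comes from a fine-tuned cross-encoder, and the tokenizer inserts the special tokens itself.

```python
def format_pair(premise, hypothesis):
    """Show the cross-encoder input layout; real tokenizers add these tokens."""
    return f"[CLS] {premise} [SEP] {hypothesis} [SEP]"


def zero_shot_classify(text, labels, entailment_score,
                       template="This text is about {}."):
    """Turn each label into a hypothesis and pick the most strongly entailed one."""
    scores = {label: entailment_score(text, template.format(label))
              for label in labels}
    return max(scores, key=scores.get), scores


def toy_entailment_score(premise, hypothesis):
    """Hypothetical scorer: fraction of a label's cue words found in the premise."""
    cues = {
        "sports": {"match", "goal", "team"},
        "finance": {"stocks", "market", "bank"},
    }
    words = set(premise.lower().replace(".", "").split())
    for label, keywords in cues.items():
        if label in hypothesis:
            return len(words & keywords) / len(keywords)
    return 0.0


label, scores = zero_shot_classify(
    "The team scored a late goal to win the match.",
    ["sports", "finance"],
    toy_entailment_score,
)
print(label, scores)
print(format_pair("A dog runs in a park", "An animal is outside"))
```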

Sentiment Analysis

Determining the opinion, emotion, or attitude expressed in text.

Levels of Granularity

Level	Task	Example
Document	Overall sentiment	Movie review: positive/negative
Sentence	Sentence polarity	"The food was great but service was slow"
Aspect-based	Per-aspect sentiment	food: positive, service: negative
Targeted	Sentiment toward entity	"I like iPhone but hate Android"

Lexicon-Based Approaches

VADER: Rule-based, handles social media conventions (capitalization, emoticons, intensifiers)
SentiWordNet: Assigns positivity/negativity scores to WordNet synsets
AFINN, Opinion Lexicon: Curated word lists with polarity scores
Fast, but limited by vocabulary coverage and by the inability to handle negation and sarcasm robustly
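
A minimal lexicon scorer makes both the speed and the limits visible. The word scores, negator list, intensifier weights, and two-token lookback window are all assumptions for illustration; they are not VADER's actual rules or values.

```python
# Hypothetical polarity lexicon and modifiers.
LEXICON = {"great": 2.0, "good": 1.0, "slow": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}
INTENSIFIERS = {"very": 1.5, "really": 1.5}


def lexicon_score(text):
    """Sum word polarities, scaling for intensifiers and flipping for negators
    found in the two tokens before each sentiment word."""
    tokens = text.lower().replace(",", " ").replace(".", " ").split()
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in LEXICON:
            continue
        value = LEXICON[token]
        previous = tokens[max(0, i - 2):i]
        for word in previous:
            if word in INTENSIFIERS:
                value *= INTENSIFIERS[word]
        if any(word in NEGATORS for word in previous):
            value = -value
        total += value
    return total


for sentence in ["The food was great", "The food was not good",
                 "The service was very slow", "Oh great, another delay"]:
    print(sentence, "->", lexicon_score(sentence))
```

The last sentence is sarcastic, yet it scores as positive: a fixed window can patch simple negation, but nothing in a word list signals irony.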

Machine Learning Approaches

Classical ML:

Features: unigrams/bigrams, POS tags, negation scope, lexicon scores
Classifiers: SVM, logistic regression, Naive Bayes
Effective for well-defined domains with labeled data
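
The negation-scope feature can be sketched with a common heuristic: prefix every token after a negator with `NOT_` until the next punctuation mark. The negator list and the scope rule are assumptions; the resulting unigram and bigram counts would then be fed to a classifier such as an SVM or Naive Bayes.

```python
import re
from collections import Counter

NEGATORS = {"not", "no", "never"}   # illustrative list
PUNCTUATION = set(".,!?;")          # punctuation that ends a negation scope


def negation_marked_tokens(text):
    """Prefix tokens inside a negation scope with NOT_."""
    tokens = re.findall(r"[a-z']+|[.,!?;]", text.lower())
    marked, in_scope = [], False
    for token in tokens:
        if token in PUNCTUATION:
            in_scope = False
            continue
        if token in NEGATORS or token.endswith("n't"):
            in_scope = True
            marked.append(token)
            continue
        marked.append("NOT_" + token if in_scope else token)
    return marked


def ngram_features(text):
    """Bag of unigrams and bigrams over the negation-marked tokens."""
    tokens = negation_marked_tokens(text)
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


tokens = negation_marked_tokens("I did not like the food, but the service was good.")
print(tokens)
```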

Deep Learning:

CNN over word embeddings captures local n-gram patterns
BiLSTM captures sequential context and long-range negation
Attention mechanisms highlight relevant words

Pretrained Transformers:

Fine-tune BERT on sentiment datasets
Strong across benchmarks: fine-tuned BERT-large reaches about 95% accuracy on SST-2, and larger successors (e.g. XLNet, T5) reach about 97%

Aspect-Based Sentiment Analysis (ABSA)

Subtasks:

Aspect term extraction: identify aspect mentions ("food", "service")
Aspect category detection: map to predefined categories
Aspect sentiment classification: determine polarity per aspect

Approaches:

Joint extraction with sequence labeling (BIO tags for aspects)
Attention over aspect term to determine sentiment
Instruction-tuned LLMs with structured output achieve strong performance
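
For the sequence-labeling approach, the tagger's output still has to be decoded into aspect spans. A small sketch follows, assuming a single aspect type tagged `B-ASP` / `I-ASP` / `O`; the tag names and the tag sequences are hypothetical model output.

```python
def decode_bio(tokens, tags):
    """Collect aspect terms from BIO tags. A stray I-ASP with no open span is dropped."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-ASP":
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I-ASP" and current:
            current.append(token)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = "The food was great but service was slow".split()
tags = ["O", "B-ASP", "O", "O", "O", "B-ASP", "O", "O"]
aspects = decode_bio(tokens, tags)
print(aspects)  # prints: ['food', 'service']
```

Each extracted span would then go to the sentiment classification step, for example paired with the sentence as input to an aspect-aware classifier.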

Relation Extraction

Identifying semantic relationships between entities in text.

"Steve Jobs founded Apple in 1976."
-> (Steve Jobs, founded, Apple)
-> (Apple, founded_in, 1976)

Approaches

Method	Description
Pattern-based	Hand-crafted rules ("X founded Y", "X, founder of Y")
Supervised	Classify entity pairs with labeled data (SemEval, TACRED)
Distant supervision	Align knowledge base triples to text; noisy but scalable
Neural	BERT with entity markers: "[E1]Steve Jobs[/E1] founded [E2]Apple[/E2]"
Few-shot / zero-shot	Prompt LLMs with relation definitions
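
Two rows of the table lend themselves to a short sketch: the hand-crafted patterns and the entity-marker input format. The regular expressions treat any run of capitalized words as an entity, which is a crude assumption; the marker function uses plain string replacement and would need character offsets in a real pipeline.

```python
import re

# Crude entity pattern (assumption): one or more capitalized words.
NAME = r"[A-Z]\w+(?: [A-Z]\w+)*"
PATTERNS = [
    (re.compile(rf"(?P<x>{NAME}) founded (?P<y>{NAME})"), "founded"),
    (re.compile(rf"(?P<x>{NAME}), founder of (?P<y>{NAME})"), "founded"),
]


def extract_relations(sentence):
    """Apply each hand-written pattern and return (head, relation, tail) triples."""
    triples = []
    for pattern, relation in PATTERNS:
        for match in pattern.finditer(sentence):
            triples.append((match.group("x"), relation, match.group("y")))
    return triples


def add_entity_markers(sentence, head, tail):
    """Wrap the first mention of each entity in marker tokens for a neural classifier."""
    marked = sentence.replace(head, f"[E1]{head}[/E1]", 1)
    return marked.replace(tail, f"[E2]{tail}[/E2]", 1)


print(extract_relations("Steve Jobs founded Apple in 1976."))
print(extract_relations("Apple was founded by Steve Jobs."))  # passive voice: no match
print(add_entity_markers("Steve Jobs founded Apple in 1976.", "Steve Jobs", "Apple"))
```

The passive sentence slips through both patterns, which illustrates why pattern-based systems tend to have high precision but low recall.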

Open Information Extraction (OpenIE)

Extracts relation triples without predefined schema.

Input: "Einstein was born in Ulm and developed the theory of relativity"
Output: (Einstein, was born in, Ulm), (Einstein, developed, theory of relativity)
Systems: OpenIE 5, Stanford OpenIE, neural OpenIE
Useful for knowledge base construction and question answering

Knowledge Graph Construction

Relation extraction feeds into knowledge graph construction:

Entity recognition and linking
Relation extraction between entity pairs
Triple validation and canonicalization
Integration into knowledge graphs (e.g. Wikidata; Freebase has been discontinued and its data migrated to Wikidata)
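
The canonicalization and integration steps can be sketched with lookup tables. The alias table and relation mapping are hypothetical; in practice they come from entity linking and from a curated relation schema.

```python
from collections import defaultdict

# Hypothetical alias and relation tables.
ALIASES = {"Jobs": "Steve Jobs", "Apple Inc.": "Apple", "Apple Computer": "Apple"}
RELATION_MAP = {"founded": "founder_of", "co-founded": "founder_of"}


def canonicalize(triple):
    """Map surface forms to canonical entity and relation names."""
    subject, relation, obj = triple
    return (ALIASES.get(subject, subject),
            RELATION_MAP.get(relation, relation),
            ALIASES.get(obj, obj))


def build_graph(triples):
    """Adjacency map subject -> set of (relation, object); sets remove duplicates."""
    graph = defaultdict(set)
    for triple in triples:
        subject, relation, obj = canonicalize(triple)
        graph[subject].add((relation, obj))
    return graph


extracted = [
    ("Steve Jobs", "founded", "Apple"),
    ("Jobs", "co-founded", "Apple Inc."),   # same fact, different surface form
    ("Apple", "founded_in", "1976"),
]
graph = build_graph(extracted)
print(dict(graph))
```

Collapsing "co-founded" into the same relation loses a distinction; whether that is acceptable depends on the target schema.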

Evaluation Across Tasks

Task	Metric	Benchmark
WSD	F1	WSD Evaluation Framework
STS	Spearman/Pearson correlation	STS Benchmark
NLI	Accuracy	MultiNLI, ANLI
Sentiment	Accuracy / Macro-F1	SST-2, SemEval
Relation Extraction	F1	TACRED, DocRED
Text-to-SQL	Execution accuracy	Spider
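
Two of the metrics in the table are easy to compute by hand, which helps when reading reported numbers. The sketch below implements set-based precision/recall/F1 and Pearson/Spearman correlation in plain Python; the example scores and triples are invented.

```python
import math


def precision_recall_f1(predicted, gold):
    """Set-based scores, e.g. over extracted relation triples."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def pearson(x, y):
    """Linear correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


gold_sts = [0.5, 2.0, 3.5, 5.0]      # hypothetical human similarity scores
model_sts = [0.1, 0.4, 0.5, 0.9]     # hypothetical model outputs, same ordering
print(pearson(gold_sts, model_sts), spearman(gold_sts, model_sts))
print(precision_recall_f1({"t1", "t2", "t3"}, {"t1", "t2", "t4", "t5"}))
```

Spearman rewards getting the ordering right even when the scale is off, which is why the two correlations can differ for the same predictions.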

Key Takeaways

WSD resolves lexical ambiguity using context; BERT-based models approach the estimated inter-annotator agreement ceiling
Semantic parsing maps language to formal representations (AMR, SQL) enabling precise reasoning
NLI is a versatile task used for zero-shot classification and textual reasoning
Sentiment analysis ranges from document-level polarity to fine-grained aspect-based opinion mining
Relation extraction bridges text understanding and knowledge graph construction
Pretrained transformers dominate all these tasks, but task-specific design choices still matter