
Semantic Analysis

Overview

Semantic analysis moves beyond syntactic structure to understand meaning. It encompasses word-level disambiguation, sentence-level inference, structured meaning representations, and opinion mining. These tasks require models to capture nuance, context, and world knowledge.


Word Sense Disambiguation (WSD)

Determining which sense of a polysemous word is intended in context.

Example: "bank" can mean a financial institution, a river bank, or the act of banking an airplane.

Approaches

| Method | Description |
|---|---|
| Lesk algorithm | Choose the sense whose dictionary definition has the most word overlap with the context |
| Knowledge-based | Use WordNet relations (hypernyms, synonyms) and graph algorithms (PageRank on sense graphs) |
| Supervised | Train a classifier per word on sense-annotated corpora (SemCor) |
| Neural | Fine-tune BERT to select the correct sense from candidate definitions |
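
The Lesk idea in the table can be sketched in a few lines. The sense inventory below is a hypothetical toy dictionary, not WordNet; a real system would pull glosses from a lexical resource:

```python
def lesk(context_words, sense_inventory):
    """Pick the sense whose gloss shares the most words with the context."""
    context = set(w.lower() for w in context_words)
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_inventory.items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Hypothetical mini inventory for "bank"
senses = {
    "bank.financial": "institution that accepts deposits and channels money into lending",
    "bank.river": "sloping land beside a body of water such as a river",
}
context = "he sat on the sloping land by the river watching the water".split()
print(lesk(context, senses))  # → bank.river
```

The original Lesk algorithm compares glosses of all words in the context pairwise; this simplified variant (context-vs-gloss overlap) is what most textbooks present first.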

Resources

  • WordNet: Lexical database organizing words into synonym sets (synsets) linked by semantic relations (hypernymy, meronymy, etc.)
  • SemCor: Largest manually sense-annotated corpus (~230k annotations)
  • BabelNet: Multilingual encyclopedic dictionary combining WordNet and Wikipedia

Current state: BERT-based WSD models achieve ~80% F1 on the ALL evaluation framework, approaching the estimated inter-annotator agreement ceiling.


Semantic Similarity

Measuring how similar two text units are in meaning.

Word-Level Similarity

  • Path-based (WordNet): Shortest path between synsets in the taxonomy
  • Information content: Probability of the least common subsumer (Lin similarity, Jiang-Conrath)
  • Embedding-based: Cosine similarity of word vectors
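
The embedding-based measure is just cosine similarity over dense vectors. The 4-d vectors below are made up for illustration; real embeddings come from word2vec, GloVe, or fastText:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical 4-d embeddings for illustration only
vectors = {
    "king":  [0.8, 0.65, 0.1, 0.05],
    "queen": [0.75, 0.7, 0.12, 0.08],
    "apple": [0.05, 0.1, 0.9, 0.7],
}
print(cosine(vectors["king"], vectors["queen"]))  # high (~0.99)
print(cosine(vectors["king"], vectors["apple"]))  # low  (~0.19)
```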

Sentence-Level Similarity (STS)

Predict a continuous similarity score (0-5) between sentence pairs.

| Method | Approach |
|---|---|
| Word overlap | Jaccard similarity, BLEU |
| Embedding average | Average word embeddings, compute cosine |
| SBERT | Siamese BERT with cosine similarity |
| Cross-encoder | Concatenate both sentences in BERT, regress score |

Cross-encoders are more accurate but need a full forward pass per pair, i.e. O(n^2) passes for all-pairs comparison of n sentences. Bi-encoders (SBERT) encode each sentence once and compare the cached embeddings cheaply.
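
The simplest baseline in the table, word overlap, can be sketched as Jaccard similarity over token sets. It also shows the method's weakness: paraphrases with little lexical overlap score near zero:

```python
def jaccard(s1, s2):
    """Word-overlap similarity: |intersection| / |union| of token sets."""
    a, b = set(s1.lower().split()), set(s2.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

# High overlap, high score
print(jaccard("a dog runs in a park", "a dog plays in the park"))  # → 4/7 ≈ 0.57
# Paraphrase with little overlap scores near zero despite similar meaning
print(jaccard("the movie was great", "I loved the film"))          # → 1/7 ≈ 0.14
```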


Semantic Parsing

Translates natural language into a formal meaning representation.

Abstract Meaning Representation (AMR)

A rooted, directed, acyclic graph representing sentence meaning.

"The boy wants to go"

(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
      :ARG0 b))
  • Nodes: concepts (word senses, PropBank framesets)
  • Edges: semantic roles (ARG0, ARG1, etc.)
  • Abstracts away syntax: "the boy's desire to go" has the same AMR
  • Challenges: reentrancy (nodes with multiple parents), alignment to text
  • Neural AMR parsers: sequence-to-graph models achieve ~85 Smatch F1
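
The AMR above can be held as a simple adjacency structure, which makes the reentrancy challenge concrete: node `b` (boy) is the ARG0 of both `want-01` and `go-02`, so it has two parents. This dict encoding is an illustrative sketch, not a standard AMR library format:

```python
# "(w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))" as a graph
amr = {
    "w": {"concept": "want-01", "edges": {"ARG0": "b", "ARG1": "g"}},
    "b": {"concept": "boy", "edges": {}},
    "g": {"concept": "go-02", "edges": {"ARG0": "b"}},
}

def parents(graph, node):
    """All nodes with an edge pointing at `node`."""
    return [src for src, n in graph.items() if node in n["edges"].values()]

print(parents(amr, "b"))  # → ['w', 'g']: two parents, i.e. reentrancy
```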

Text-to-SQL

Converts natural language questions into SQL queries over a database.

"How many employees earn more than 50000?"
->
SELECT COUNT(*) FROM employees WHERE salary > 50000
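
The mapping above can be checked end to end against an in-memory SQLite database. The `employees` schema and rows below are assumed for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ada", 72000), ("Ben", 48000), ("Cara", 55000)],
)

# The SQL produced for "How many employees earn more than 50000?"
count = conn.execute(
    "SELECT COUNT(*) FROM employees WHERE salary > 50000"
).fetchone()[0]
print(count)  # → 2
```

Execution accuracy, the standard Spider metric, compares exactly this kind of result against the gold query's result rather than comparing SQL strings.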

Key challenges:

  • Schema linking: mapping mentions to tables and columns
  • Complex queries: joins, subqueries, aggregation, GROUP BY
  • Generalization to unseen databases

Approaches:

  • Grammar-based decoding: constrain output to valid SQL
  • Schema encoding: represent table/column names alongside the question
  • Spider benchmark: cross-database evaluation (200 databases)
  • Current SOTA: LLM-based approaches with schema prompting achieve ~85% execution accuracy on Spider

Natural Language Inference (NLI)

Determine the logical relationship between a premise and hypothesis.

| Label | Meaning | Example |
|---|---|---|
| Entailment | Premise implies hypothesis | P: "A dog runs in a park" -> H: "An animal is outside" |
| Contradiction | Premise contradicts hypothesis | P: "A dog runs in a park" -> H: "No animals are present" |
| Neutral | Neither entails nor contradicts | P: "A dog runs in a park" -> H: "The dog is chasing a ball" |

Datasets

| Dataset | Size | Source | Notes |
|---|---|---|---|
| SNLI | 570k | Image captions | Simpler, less natural |
| MultiNLI | 433k | Multiple genres | More diverse |
| ANLI | 163k | Adversarially collected | Harder examples |
| XNLI | 7.5k per language | MultiNLI translations | 15 languages |

Models

  • Decomposable attention: Align words between premise and hypothesis, aggregate
  • ESIM: BiLSTM with cross-sentence attention
  • BERT cross-encoder: Concatenate [CLS] premise [SEP] hypothesis [SEP], classify
  • BERT achieves ~92% on SNLI, ~90% on MultiNLI
  • NLI models are used as general-purpose zero-shot classifiers (hypothesis = category description)

Sentiment Analysis

Determining the opinion, emotion, or attitude expressed in text.

Levels of Granularity

| Level | Task | Example |
|---|---|---|
| Document | Overall sentiment | Movie review: positive/negative |
| Sentence | Sentence polarity | "The food was great but service was slow" |
| Aspect-based | Per-aspect sentiment | food: positive, service: negative |
| Targeted | Sentiment toward entity | "I like iPhone but hate Android" |

Lexicon-Based Approaches

  • VADER: Rule-based, handles social media conventions (capitalization, emoticons, intensifiers)
  • SentiWordNet: Assigns positivity/negativity scores to WordNet synsets
  • AFINN, Opinion Lexicon: Curated word lists with polarity scores
  • Fast but limited by vocabulary coverage and inability to handle negation/sarcasm robustly
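
A minimal lexicon scorer shows both the approach and its negation problem; the one-word negation flip below is a naive fix. The polarity lexicon here is a toy stand-in for curated lists like AFINN:

```python
# Toy polarity lexicon; real lists (AFINN, Opinion Lexicon) cover thousands of words
LEXICON = {"great": 3, "good": 2, "slow": -1, "terrible": -3}
NEGATORS = {"not", "never", "no"}

def lexicon_score(text):
    """Sum word polarities, flipping the sign of the word after a negator."""
    score, negate = 0, False
    for token in text.lower().split():
        if token in NEGATORS:
            negate = True
            continue
        polarity = LEXICON.get(token, 0)
        score += -polarity if negate else polarity
        negate = False  # negation scope: only the single following word
    return score

print(lexicon_score("the food was great"))      # → 3
print(lexicon_score("the food was not great"))  # → -3
```

VADER handles negation, intensifiers ("very"), and emphasis (capitalization, "!!!") with richer rules than this single-word flip.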

Machine Learning Approaches

Classical ML:

  • Features: unigrams/bigrams, POS tags, negation scope, lexicon scores
  • Classifiers: SVM, logistic regression, Naive Bayes
  • Effective for well-defined domains with labeled data

Deep Learning:

  • CNN over word embeddings captures local n-gram patterns
  • BiLSTM captures sequential context and long-range negation
  • Attention mechanisms highlight relevant words

Pretrained Transformers:

  • Fine-tune BERT on sentiment datasets
  • State-of-the-art across benchmarks (SST-2: ~97% accuracy)

Aspect-Based Sentiment Analysis (ABSA)

Subtasks:

  1. Aspect term extraction: identify aspect mentions ("food", "service")
  2. Aspect category detection: map to predefined categories
  3. Aspect sentiment classification: determine polarity per aspect

Approaches:

  • Joint extraction with sequence labeling (BIO tags for aspects)
  • Attention over aspect term to determine sentiment
  • Instruction-tuned LLMs with structured output achieve strong performance
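
Subtask 1 with BIO tags reduces to decoding labeled spans. The tags below are assumed to come from a trained sequence labeler; only the decoding step is shown:

```python
def extract_aspects(tokens, tags):
    """Collect spans tagged B-ASP / I-ASP into aspect term strings."""
    aspects, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-ASP":
            if current:
                aspects.append(" ".join(current))
            current = [token]
        elif tag == "I-ASP" and current:
            current.append(token)
        else:
            if current:
                aspects.append(" ".join(current))
            current = []
    if current:
        aspects.append(" ".join(current))
    return aspects

tokens = ["The", "spring", "rolls", "were", "great", "but", "service", "was", "slow"]
tags   = ["O", "B-ASP", "I-ASP", "O", "O", "O", "B-ASP", "O", "O"]
print(extract_aspects(tokens, tags))  # → ['spring rolls', 'service']
```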

Relation Extraction

Identifying semantic relationships between entities in text.

"Steve Jobs founded Apple in 1976."
-> (Steve Jobs, founded, Apple)
-> (Apple, founded_in, 1976)

Approaches

| Method | Description |
|---|---|
| Pattern-based | Hand-crafted rules ("X founded Y", "X, founder of Y") |
| Supervised | Classify entity pairs with labeled data (SemEval, TACRED) |
| Distant supervision | Align knowledge base triples to text; noisy but scalable |
| Neural | BERT with entity markers: "[E1]Steve Jobs[/E1] founded [E2]Apple[/E2]" |
| Few-shot / zero-shot | Prompt LLMs with relation definitions |
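
The pattern-based row can be sketched with a single regex for the "X founded Y" rule. Real pattern systems maintain many such rules per relation; the capitalization heuristic here is a crude stand-in for entity recognition:

```python
import re

# Hand-written pattern for the "founded" relation (illustrative only)
PATTERN = re.compile(r"(?P<subj>[A-Z][\w ]*?) founded (?P<obj>[A-Z]\w*)")

def extract_founded(text):
    """Return (subject, relation, object) triples matching the pattern."""
    return [(m.group("subj"), "founded", m.group("obj"))
            for m in PATTERN.finditer(text)]

print(extract_founded("Steve Jobs founded Apple in 1976."))
# → [('Steve Jobs', 'founded', 'Apple')]
```

High precision on sentences the pattern anticipates, near-zero recall on everything else ("Apple, founded by Steve Jobs, ..."), which is exactly why supervised and neural methods took over.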

Open Information Extraction (OpenIE)

Extracts relation triples without predefined schema.

  • Input: "Einstein was born in Ulm and developed the theory of relativity"
  • Output: (Einstein, was born in, Ulm), (Einstein, developed, theory of relativity)
  • Systems: OpenIE 5, Stanford OpenIE, neural OpenIE
  • Useful for knowledge base construction and question answering

Knowledge Graph Construction

Relation extraction feeds into knowledge graph construction:

  1. Entity recognition and linking
  2. Relation extraction between entity pairs
  3. Triple validation and canonicalization
  4. Integration into knowledge graphs (Wikidata, Freebase)

Evaluation Across Tasks

| Task | Metric | Benchmark |
|---|---|---|
| WSD | F1 | WSD Evaluation Framework |
| STS | Spearman/Pearson correlation | STS Benchmark |
| NLI | Accuracy | MultiNLI, ANLI |
| Sentiment | Accuracy / Macro-F1 | SST-2, SemEval |
| Relation Extraction | F1 | TACRED, DocRED |
| Text-to-SQL | Execution accuracy | Spider |
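
The STS metric, Spearman correlation, is just Pearson correlation computed over ranks. A minimal version (ignoring tie handling for brevity) with made-up gold scores and model similarities:

```python
def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks (no tie handling)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0.0] * len(values)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

gold = [4.8, 1.2, 3.5, 0.5]      # human STS scores (0-5), hypothetical
pred = [0.92, 0.30, 0.75, 0.10]  # model cosine similarities, hypothetical
print(spearman(gold, pred))  # → close to 1.0: the rankings agree perfectly
```

Rank correlation is preferred over raw Pearson here because a model's cosine scores need not be on the 0-5 scale; only the ordering has to match.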


Key Takeaways

  • WSD resolves lexical ambiguity using context; BERT-based models approach human performance
  • Semantic parsing maps language to formal representations (AMR, SQL) enabling precise reasoning
  • NLI is a versatile task used for zero-shot classification and textual reasoning
  • Sentiment analysis ranges from document-level polarity to fine-grained aspect-based opinion mining
  • Relation extraction bridges text understanding and knowledge graph construction
  • Pretrained transformers dominate all these tasks, but task-specific design choices still matter