25 patterns
Data Extraction Regex Patterns
Data extraction patterns are designed to find and capture structured information within unstructured text. Unlike validation patterns, extraction patterns use global matching to pull out all occurrences from a larger string.
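As a minimal sketch of global matching in JavaScript, the snippet below applies the hashtag pattern from this page to a longer string. The helper name `extractHashtags` is hypothetical and only serves the illustration.

```javascript
// Hypothetical helper: returns every hashtag in `text`, or [] when none occur.
function extractHashtags(text) {
  // The g flag makes match() return all occurrences instead of only the first.
  const matches = text.match(/#[A-Za-z0-9_]+\b/g);
  // match() returns null (not an empty array) when nothing matches.
  return matches || [];
}

const tags = extractHashtags("Shipping #regex tips for #data_extraction today");
console.log(tags);
```

Falling back to an empty array keeps callers from having to special-case the null that `match()` returns for text without any hashtag.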
All Data Extraction Patterns
Hashtag Extraction
Extracts all hashtags (#tag).
#[A-Za-z0-9_]+\b
Mention Extraction
Extracts user mentions (@user).
@[A-Za-z0-9_]+\b
Number Extraction
Extracts integers and decimals (positive/negative).
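A minimal sketch of applying the number pattern shown just below in JavaScript, converting each match to a numeric value. The helper name `extractNumbers` is hypothetical.

```javascript
// Hypothetical helper: pulls every integer or decimal out of `text` as a Number.
function extractNumbers(text) {
  const matches = text.match(/-?\d+(?:\.\d+)?/g) || [];
  // match() yields strings, so convert each one explicitly.
  return matches.map(Number);
}

console.log(extractNumbers("Low -3.5, high 12, delta 0.25"));
```

Note that a hyphen directly before a digit is read as a minus sign, so a range written as "3-5" yields 3 and -5.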
-?\d+(?:\.\d+)?
URLs in Text
Extracts URLs that start with http://, https://, or a bare www. prefix.
\bhttps?:\/\/[^\s<>"]+|www\.[^\s<>"]+
HTML Entity
Matches HTML entities in named, decimal, and hexadecimal form (e.g. &nbsp;, &#160;, &#xA0;).
&[a-zA-Z]+;|&#\d+;|&#x[0-9a-fA-F]+;
HTML Comment
Matches HTML comments.
<!--[\s\S]*?-->
Markdown Link Extraction
Extracts the text and URL from a Markdown link [text](url).
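A minimal sketch of reading both capture groups of the Markdown link pattern shown just below. The helper name `extractLinks` and the returned object shape are assumptions made for illustration.

```javascript
// Hypothetical helper: returns { text, url } for each Markdown link in `markdown`.
function extractLinks(markdown) {
  const pattern = /\[([^\]]+)\]\(([^\)]+)\)/g;
  // matchAll() requires the g flag and exposes the capture groups of every match.
  return Array.from(markdown.matchAll(pattern), (m) => ({ text: m[1], url: m[2] }));
}

console.log(extractLinks("See [docs](https://example.com/docs) and [home](/index.html)."));
```

`matchAll()` is used here instead of `match()` because `match()` with the g flag returns only the full matches and drops the groups.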
\[([^\]]+)\]\(([^\)]+)\)
Emoji Extraction
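Matches common Unicode emojis in text. The pattern is written as UTF-16 surrogate pairs, so it is meant for JavaScript regexes without the u flag.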
\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDE4F]|\uD83D[\uDE80-\uDEFF]|[\u2600-\u27BF]
Markdown Link
Extracts or validates Markdown hyperlinks [text](url) whose URL starts with http:// or https://.
\[([^\[\]]+)\]\((https?:\/\/[^\s)]+)\)
HTML Tag
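Matches paired HTML tags with their content. The backreference \1 requires the closing tag to repeat the opening tag's name. Because . does not match newlines, the content must sit on one line unless the s (dotAll) flag is set.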
<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>.*?<\/\1>
Markdown Header
Matches Markdown heading lines (H1 to H6).
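Because the heading pattern shown just below is anchored with ^ and $, it needs the multiline flag to find headings inside a longer document. A minimal sketch, with `extractHeadings` as a hypothetical helper name:

```javascript
// Hypothetical helper: returns each Markdown heading line found in `markdown`.
function extractHeadings(markdown) {
  // m makes ^ and $ match at every line break; g collects all heading lines.
  return markdown.match(/^#{1,6}\s+.+$/gm) || [];
}

const doc = "# Title\nIntro text\n## Section one\nMore text #notaheading";
console.log(extractHeadings(doc));
```

Without the m flag, ^ would only match at the very start of the text, so at most the first line could be reported.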
^#{1,6}\s+.+$
Log Level Prefix
Matches standard log level prefixes at the start of a log line.
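A minimal sketch of collecting the level of each log line with the pattern shown just below. The helper name `extractLogLevels` is hypothetical, and the g and m flags are added here so the ^ anchor applies to every line of a multi-line log.

```javascript
// Hypothetical helper: returns the level of every line that starts with a [LEVEL] prefix.
function extractLogLevels(log) {
  const pattern = /^\[(DEBUG|INFO|WARN|WARNING|ERROR|FATAL|CRITICAL)\]/gm;
  // Group 1 holds the level name without the surrounding brackets.
  return Array.from(log.matchAll(pattern), (m) => m[1]);
}

const log = "[INFO] service started\n[ERROR] disk full\nretrying after [WARN] inline";
console.log(extractLogLevels(log));
```

The bracketed level in the third sample line is skipped because it does not sit at the start of the line.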
^\[(DEBUG|INFO|WARN|WARNING|ERROR|FATAL|CRITICAL)\]
Extract IPv4 Addresses from Text
Extracts all valid IPv4 addresses from a block of text (use with global flag).
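A minimal sketch of running the IPv4 pattern shown just below over a line of text. The names `IPV4` and `extractIPv4` are hypothetical.

```javascript
// The IPv4 pattern from this page, with the global flag added for extraction.
const IPV4 = /\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b/g;

// Hypothetical helper: returns every IPv4 address whose octets are all in 0-255.
function extractIPv4(text) {
  return text.match(IPV4) || [];
}

console.log(extractIPv4("Accepted 192.168.1.10 -> 10.0.0.1, rejected 999.1.1.1"));
```

The out-of-range address in the sample is not reported, since no octet alternative can consume 999.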
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
Markdown Image
Matches Markdown image syntax ![alt](src). Captures alt text and source URL.
!\[([^\]]*)\]\(([^)]+)\)
Markdown Bold Text
Matches **bold** Markdown text. Captures the inner text in group 1.
\*\*([^*\n]+)\*\*
Markdown Italic Text
Matches *italic* Markdown text without matching bold (**) text.
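A minimal sketch showing how the bold and italic patterns from this page divide the work on the same input. The helper name `extractEmphasis` is hypothetical, and the italic pattern assumes a JavaScript engine that supports lookbehind assertions.

```javascript
// Hypothetical helper: returns the inner text of bold and italic spans separately.
function extractEmphasis(markdown) {
  const bold = Array.from(markdown.matchAll(/\*\*([^*\n]+)\*\*/g), (m) => m[1]);
  // The lookbehind (?<!\*) and lookahead (?!\*) skip asterisks that belong to ** pairs.
  const italic = Array.from(markdown.matchAll(/(?<!\*)\*([^*\n]+)\*(?!\*)/g), (m) => m[1]);
  return { bold, italic };
}

console.log(extractEmphasis("This is *italic* and **bold** text"));
```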
(?<!\*)\*([^*\n]+)\*(?!\*)
Markdown Fenced Code Block
Matches ``` fenced code blocks with optional language hint. Captures language and code.
```([a-zA-Z0-9+#-]*)\n([\s\S]*?)```
HTML Self-closing Tag
Matches XHTML-style self-closing tags (e.g. <img />, <br />). Captures tag name and attributes.
<([a-zA-Z][a-zA-Z0-9-]*)([^>]*?)\/>
Double-quoted String
Matches a double-quoted string with support for escaped quotes inside.
"(?:[^"\\]|\\.)*"
Single-quoted String
Matches a single-quoted string supporting escaped single quotes.
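A minimal sketch of how the double-quoted and single-quoted string patterns on this page handle escaped quotes. The helper name `extractStringLiterals` is hypothetical.

```javascript
// Hypothetical helper: returns quoted string literals found in `source`, grouped by quote style.
function extractStringLiterals(source) {
  // In both patterns, \\. consumes a backslash plus the character it escapes,
  // so an escaped quote does not end the match early.
  const doubleQuoted = source.match(/"(?:[^"\\]|\\.)*"/g) || [];
  const singleQuoted = source.match(/'(?:[^'\\]|\\.)*'/g) || [];
  return { doubleQuoted, singleQuoted };
}

// String.raw keeps the backslashes in the sample exactly as typed.
const sample = String.raw`print("say \"hi\"") + 'it\'s'`;
console.log(extractStringLiterals(sample));
```

The two patterns run independently here, so an apostrophe inside a double-quoted string could still start a single-quoted match; a real tokenizer would need to scan both styles in one pass.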
'(?:[^'\\]|\\.)*'
SQL Comment
Matches both single-line (--) and multi-line (/* */) SQL comments.
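A minimal sketch of pulling both comment styles out of a query with the pattern shown just below. The helper name `extractSqlComments` is hypothetical.

```javascript
// Hypothetical helper: returns every -- line comment and /* */ block comment in `sql`.
function extractSqlComments(sql) {
  // [\s\S]*? lets a block comment span several lines while stopping at the first */.
  return sql.match(/(--[^\r\n]*|\/\*[\s\S]*?\*\/)/g) || [];
}

const sql = "SELECT id -- primary key\nFROM users /* active\n   only */ WHERE active = 1;";
console.log(extractSqlComments(sql));
```

The pattern is not aware of string literals, so a -- sequence inside a quoted SQL value would also be reported as a comment.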
(--[^\r\n]*|\/\*[\s\S]*?\*\/)
Log Timestamp
Extracts ISO-like log timestamps from text (e.g. 2026-01-15 14:30:45.123).
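A minimal sketch of applying the timestamp pattern shown just below to a multi-line log. The helper name `extractTimestamps` is hypothetical.

```javascript
// Hypothetical helper: returns every ISO-like timestamp in `log` as a string.
function extractTimestamps(log) {
  return log.match(/\b\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?\b/g) || [];
}

const lines = "2026-01-15 14:30:45.123 [INFO] start\n2026-01-15T14:30:46 [ERROR] fail";
console.log(extractTimestamps(lines));
```

The pattern checks only the shape of the timestamp, not its value, so a string such as 2026-13-45 99:99:99 would also match; validate the results with a date parser if that matters.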
\b\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?\b
Markdown List Item
Matches a Markdown unordered (-, *, +) or ordered (1.) list item line.
^\s*(?:[-*+]|\d+\.)\s+.+$
Markdown Blockquote
Matches a single Markdown blockquote line (> text).
^>\s+.+$
Markdown Table Row
Matches a Markdown table row line (| col1 | col2 |).
^\|.+\|\s*$
Frequently Asked Questions
What is the difference between validation and extraction regex?
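Validation uses ^ and $ anchors to match the entire string. Extraction drops the anchors and uses the global flag (g) to find all matches within a larger text. The line-oriented patterns on this page (such as Markdown Header, Log Level Prefix, and Markdown List Item) keep ^ and $; combine g with the multiline flag (m) so the anchors match at every line rather than only at the start and end of the whole text.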
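The distinction can be sketched with the mention pattern from this page, once anchored for validation and once unanchored for extraction. The names `isMention` and `findMentions` are hypothetical.

```javascript
// Validation: ^ and $ force the whole string to be exactly one mention.
const isMention = (s) => /^@[A-Za-z0-9_]+$/.test(s);

// Extraction: no anchors plus the g flag returns every mention in a longer text.
const findMentions = (s) => s.match(/@[A-Za-z0-9_]+\b/g) || [];

console.log(isMention("@alice"));
console.log(isMention("hi @alice"));
console.log(findMentions("hi @alice, meet @bob_2"));
```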
How do I extract all emails from a text?
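Use the Email Extraction pattern with the global flag: text.match(/\b[\w.-]+@[\w.-]+\.\w{2,4}\b/g). The \w{2,4} at the end caps the top-level domain at four characters, so widen it to \w{2,} if addresses with longer TLDs (for example .museum) must be matched.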
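A minimal sketch that wraps this email regex in a helper and removes duplicate addresses. The helper name `extractEmails` and the de-duplication step are assumptions added for illustration.

```javascript
// Hypothetical helper: returns each distinct email address found in `text`.
function extractEmails(text) {
  const matches = text.match(/\b[\w.-]+@[\w.-]+\.\w{2,4}\b/g) || [];
  // A Set drops repeated addresses while keeping first-seen order.
  return [...new Set(matches)];
}

console.log(extractEmails("Contact a@example.com, b.c@mail.example.org or a@example.com"));
```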
How do I extract all URLs from HTML?
Use the protocol-only form of the URLs in Text pattern with the global flag: text.match(/https?:\/\/[^\s<>"]+/g)
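A minimal sketch of running this URL regex over an HTML fragment. The helper name `extractUrls` is hypothetical.

```javascript
// Hypothetical helper: returns every http(s) URL found in `html`.
function extractUrls(html) {
  // The class [^\s<>"] stops each match at whitespace, angle brackets, or a double quote.
  return html.match(/https?:\/\/[^\s<>"]+/g) || [];
}

const html = '<a href="https://example.com/a?x=1">A</a> <img src="http://cdn.example.com/i.png">';
console.log(extractUrls(html));
```

Attribute values wrapped in single quotes would keep the closing quote attached to the match, since the character class only excludes double quotes; for anything beyond quick scraping, an HTML parser is the more reliable route.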