inlinepatterns
INLINE PATTERNS.
Inline patterns such as emphasis are handled by means of auxiliary objects, one per pattern. Pattern objects must be instances of classes that extend markdown.Pattern. Each pattern object uses a single regular expression and needs to support the following methods:
pattern.getCompiledRegExp() # returns a regular expression
pattern.handleMatch(m) # takes a match object and returns
# an ElementTree element or just plain text
All of Python Markdown's built-in patterns subclass Pattern, but you can add additional patterns that don't.
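As a rough illustration of the two-method interface, the sketch below defines a stand-in pattern that does not subclass Pattern. The class name UpperPattern and the !!text!! syntax are hypothetical, and xml.etree.ElementTree is assumed as the element implementation; it is not taken from the library.

```python
import re
import xml.etree.ElementTree as etree


class UpperPattern(object):
    """Hypothetical duck-typed inline pattern: !!text!! becomes <span>TEXT</span>."""

    def __init__(self, pattern):
        # Wrap the expression so that it captures the whole block.
        self.compiled_re = re.compile(r'^(.*?)%s(.*)$' % pattern,
                                      re.DOTALL | re.UNICODE)

    def getCompiledRegExp(self):
        # Returns the compiled regular expression.
        return self.compiled_re

    def handleMatch(self, m):
        # group(1) is the leading text, so the pattern's own group is group(2).
        el = etree.Element('span')
        el.text = m.group(2).upper()
        return el


pattern = UpperPattern(r'!!(.+?)!!')
match = pattern.getCompiledRegExp().match('say !!hello!! now')
element = pattern.handleMatch(match)
print(etree.tostring(element, encoding='unicode'))
```

Only the two method names come from the interface described above; everything else is an assumption made for the sake of a runnable example.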
Also note that all the regular expressions used by inline patterns must capture the whole block. For this reason, they all start with '^(.*?)' and end with '(.*)$'. For the built-in expressions, Pattern takes care of adding the '^(.*?)' and '(.*)$'.
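One consequence of this wrapping is that every group in an expression is shifted by one, which is why the built-in expressions use backreferences such as \2 where \1 might be expected. A minimal sketch, using STRONG_RE as listed under Attributes and the same wrapping shown for compiled_re (the sample text is made up):

```python
import re

# STRONG_RE as listed under Attributes; note the backreference is \2, not \1.
STRONG_RE = r'(\*{2}|_{2})(.+?)\2'

# The same wrapping that Pattern applies to the expression it is given.
compiled = re.compile(r'^(.*?)%s(.*)$' % STRONG_RE, re.DOTALL | re.UNICODE)

m = compiled.match('some **bold** text')
leading = m.group(1)    # text before the match
delimiter = m.group(2)  # the expression's first group, shifted by one
content = m.group(3)    # the expression's second group
trailing = m.group(4)   # text after the match
print(repr(leading), repr(delimiter), repr(content), repr(trailing))
```

This also explains why the class descriptions further down refer to group(2) and group(3) rather than group(1) and group(2).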
Finally, the order in which regular expressions are applied is very important. For example, if we first replace http://.../ links with <a> tags and then try to replace inline html, we would end up with a mess. So, we apply the expressions in the following order:
- escape and backticks have to go before everything else, so that we can preempt any markdown patterns by escaping them.
- then we handle auto-links (must be done before inline html)
- then we handle inline HTML. At this point we will simply replace all inline HTML strings with a placeholder and add the actual HTML to a hash.
- then inline images (must be done before links)
- then bracketed links, first regular then reference-style
- finally we apply strong and emphasis
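The effect of ordering can be seen with a toy pipeline. The sketch below is not the library's implementation: the two expressions are simplified stand-ins, and the placeholder format (a number between STX and ETX control characters) is an assumption chosen for illustration.

```python
import re

# Hypothetical, simplified stand-ins for two of the built-in expressions.
BACKTICK = re.compile(r'`([^`]+)`')
EMPHASIS = re.compile(r'\*([^*]+)\*')


def render(text, order):
    """Apply the named substitutions to text in the given order."""
    stash = []

    def stash_code(m):
        # Replace the code span with a placeholder so later steps skip it.
        stash.append('<code>%s</code>' % m.group(1))
        return '\x02%d\x03' % (len(stash) - 1)

    steps = {
        'backtick': lambda s: BACKTICK.sub(stash_code, s),
        'emphasis': lambda s: EMPHASIS.sub(r'<em>\1</em>', s),
    }
    for name in order:
        text = steps[name](text)
    # Restore the stashed code spans.
    return re.sub('\x02(\\d+)\x03', lambda m: stash[int(m.group(1))], text)


source = 'use `a*b*c` here'
good = render(source, ['backtick', 'emphasis'])
bad = render(source, ['emphasis', 'backtick'])
print(good)
print(bad)
```

With backticks handled first the asterisks inside the code span survive untouched; with the order reversed, emphasis markup leaks into the code span.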
Attributes
NOBRACKET = '[^\\]\\[]*'
module-attribute
BRK = '\\[(' + (NOBRACKET + '(\\[') * 6 + (NOBRACKET + '\\])*') * 6 + NOBRACKET + ')\\]'
module-attribute
NOIMG = '(?<!\\!)'
module-attribute
BACKTICK_RE = '(?:(?<!\\\\)((?:\\\\{2})+)(?=`+)|(?<!\\\\)(`+)(.+?)(?<!`)\\3(?!`))'
module-attribute
ESCAPE_RE = '\\\\(.)'
module-attribute
EMPHASIS_RE = '(\\*)([^\\*]+)\\2'
module-attribute
STRONG_RE = '(\\*{2}|_{2})(.+?)\\2'
module-attribute
EM_STRONG_RE = '(\\*|_)\\2{2}(.+?)\\2(.*?)\\2{2}'
module-attribute
STRONG_EM_RE = '(\\*|_)\\2{2}(.+?)\\2{2}(.*?)\\2'
module-attribute
SMART_EMPHASIS_RE = '(?<!\\w)(_)(?!_)(.+?)(?<!_)\\2(?!\\w)'
module-attribute
EMPHASIS_2_RE = '(_)(.+?)\\2'
module-attribute
LINK_RE = NOIMG + BRK + '\\(\\s*(<.*?>|((?:(?:\\(.*?\\))|[^\\(\\)]))*?)\\s*(([\'"])(.*?)\\12\\s*)?\\)'
module-attribute
IMAGE_LINK_RE = '\\!' + BRK + '\\s*\\(\\s*(<.*?>|([^"\\)\\s]+\\s*"[^"]*"|[^\\)\\s]*))\\s*\\)'
module-attribute
REFERENCE_RE = NOIMG + BRK + '\\s?\\[([^\\]]*)\\]'
module-attribute
SHORT_REF_RE = NOIMG + '\\[([^\\]]+)\\]'
module-attribute
IMAGE_REFERENCE_RE = '\\!' + BRK + '\\s?\\[([^\\]]*)\\]'
module-attribute
NOT_STRONG_RE = '((^| )(\\*|_)( |$))'
module-attribute
AUTOLINK_RE = '<((?:[Ff]|[Hh][Tt])[Tt][Pp][Ss]?://[^>]*)>'
module-attribute
AUTOMAIL_RE = '<([^> \\!]*@[^> ]*)>'
module-attribute
HTML_RE = '(\\<([a-zA-Z/][^\\>]*?|\\!--.*?--)\\>)'
module-attribute
ENTITY_RE = '(&[\\#a-zA-Z0-9]*;)'
module-attribute
LINE_BREAK_RE = ' \\n'
module-attribute
ATTR_RE = re.compile('\\{@([^\\}]*)=([^\\}]*)}')
module-attribute
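Expressions that contain no backreferences can be tried on their own, without the wrapping that Pattern adds. A short sketch with three of the attributes above (the sample strings are made up):

```python
import re

# Expressions copied from the attribute list above.
AUTOLINK_RE = r'<((?:[Ff]|[Hh][Tt])[Tt][Pp][Ss]?://[^>]*)>'
AUTOMAIL_RE = r'<([^> \!]*@[^> ]*)>'
ENTITY_RE = r'(&[\#a-zA-Z0-9]*;)'

url = re.search(AUTOLINK_RE, 'see <https://example.com/docs> for details').group(1)
mail = re.search(AUTOMAIL_RE, 'write to <foo@example.com>').group(1)
entity = re.search(ENTITY_RE, 'fish &amp; chips').group(1)

# AUTOLINK_RE only accepts ftp(s) and http(s) schemes, so this does not match.
not_a_link = re.search(AUTOLINK_RE, 'see <mailto:foo@example.com>')
print(url, mail, entity, not_a_link)
```

Expressions such as STRONG_RE or LINK_RE refer to groups by their position after wrapping, so they should be tested in wrapped form instead.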
Classes
Pattern(pattern, markdown_instance=None)
Bases: object
Base class that inline patterns subclass.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
handleMatch(m)
Return an ElementTree element from the given match.
Subclasses should override this method.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
m | match | A re match object containing a match of the pattern. | required |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
SimpleTextPattern(pattern, markdown_instance=None)
Bases: Pattern
Return a simple text of group(2) of a Pattern.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
EscapePattern(pattern, markdown_instance=None)
Bases: Pattern
Return an escaped character.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
SimpleTagPattern(pattern, tag)
Bases: Pattern
Return an element of type tag with a text attribute of group(3) of a Pattern.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
tag = tag
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
SubstituteTagPattern(pattern, tag)
Bases: SimpleTagPattern
Return an element of type tag with no children.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
tag = tag
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
BacktickPattern(pattern)
Bases: Pattern
Return a <code> element containing the matching text.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
ESCAPED_BSLASH = '%s%s%s' % (util.STX, ord('\\'), util.ETX)
instance-attribute
tag = 'code'
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
DoubleTagPattern(pattern, tag)
Bases: SimpleTagPattern
Return an ElementTree element nested in tag2 nested in tag1.
Useful for strong emphasis etc.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
tag = tag
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
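The nesting described for DoubleTagPattern can be sketched as follows. This is an illustration under assumptions, not the library's code: the 'tag1,tag2' string format and the helper name nested_element are hypothetical, and xml.etree.ElementTree stands in for whatever element implementation the module uses. EM_STRONG_RE is copied from the attribute list and wrapped the way compiled_re is.

```python
import re
import xml.etree.ElementTree as etree

# EM_STRONG_RE as listed under Attributes, wrapped as Pattern would wrap it.
EM_STRONG_RE = r'(\*|_)\2{2}(.+?)\2(.*?)\2{2}'
compiled = re.compile(r'^(.*?)%s(.*)$' % EM_STRONG_RE, re.DOTALL | re.UNICODE)


def nested_element(m, tag):
    """Build tag2 nested in tag1 from a 'tag1,tag2' string (assumed format)."""
    tag1, tag2 = tag.split(',')
    outer = etree.Element(tag1)
    inner = etree.SubElement(outer, tag2)
    # group(3) is the matched text once the leading-text group is accounted for.
    inner.text = m.group(3)
    return outer


m = compiled.match('***both***')
element = nested_element(m, 'strong,em')
html = etree.tostring(element, encoding='unicode')
print(html)
```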
HtmlPattern(pattern, markdown_instance=None)
Bases: Pattern
Store raw inline html and return a placeholder.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
handleMatch(m)
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
LinkPattern(pattern, markdown_instance=None)
Bases: Pattern
Return a link element from the given match.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
sanitize_url(url)
Sanitize a url against xss attacks in "safe_mode".
Rather than specifically blacklisting javascript:alert("XSS") and all its aliases (see http://ha.ckers.org/xss.html), we whitelist known safe url formats. Most urls contain a network location; however, some are known not to (e.g. mailto links). Script urls do not contain a location. Additionally, for javascript:..., the scheme would be "javascript" but some aliases will appear to urlparse() to have no scheme. On top of that, relative links (e.g. "foo/bar.html") have no scheme. Therefore we must check "path", "parameters", "query" and "fragment" for any literal colons. We don't check "scheme" for colons because it should never have any, and "netloc" must allow the form: username:password@host:port.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
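The whitelisting approach described for sanitize_url can be sketched roughly as below. This is a standalone approximation, not the method itself: the function name, the two scheme lists, and the choice to return an empty string for rejected urls are assumptions, and the "safe_mode" check is left out. The import is the Python 3 location of urlparse.

```python
from urllib.parse import urlparse, urlunparse

# Assumed whitelists for this sketch; the library's actual lists may differ.
LOCLESS_SCHEMES = ['', 'mailto', 'news']
ALLOWED_SCHEMES = LOCLESS_SCHEMES + ['http', 'https', 'ftp', 'ftps']


def sanitize_url_sketch(url):
    """Return url if it passes the whitelist checks, else an empty string."""
    try:
        parts = urlparse(url)
    except ValueError:
        return ''
    if parts.scheme not in ALLOWED_SCHEMES:
        # Not a known safe scheme, e.g. "javascript".
        return ''
    if parts.netloc == '' and parts.scheme not in LOCLESS_SCHEMES:
        # A scheme that needs a network location but has none.
        return ''
    # Check path, params, query and fragment for literal colons.
    for part in parts[2:]:
        if ':' in part:
            return ''
    return urlunparse(parts)


print(sanitize_url_sketch('http://example.com/page?x=1#top'))
print(repr(sanitize_url_sketch("javascript:alert('XSS')")))
```

Note that netloc is deliberately not checked for colons, so a url of the form username:password@host:port still passes.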
ImagePattern(pattern, markdown_instance=None)
Bases: LinkPattern
Return an img element from the given match.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
sanitize_url(url)
Sanitize a url against xss attacks in "safe_mode".
Rather than specifically blacklisting javascript:alert("XSS") and all its aliases (see http://ha.ckers.org/xss.html), we whitelist known safe url formats. Most urls contain a network location; however, some are known not to (e.g. mailto links). Script urls do not contain a location. Additionally, for javascript:..., the scheme would be "javascript" but some aliases will appear to urlparse() to have no scheme. On top of that, relative links (e.g. "foo/bar.html") have no scheme. Therefore we must check "path", "parameters", "query" and "fragment" for any literal colons. We don't check "scheme" for colons because it should never have any, and "netloc" must allow the form: username:password@host:port.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
ReferencePattern(pattern, markdown_instance=None)
Bases: LinkPattern
Match to a stored reference and return link element.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
NEWLINE_CLEANUP_RE = re.compile('[ ]?\\n', re.MULTILINE)
class-attribute
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
sanitize_url(url)
Sanitize a url against xss attacks in "safe_mode".
Rather than specifically blacklisting javascript:alert("XSS") and all its aliases (see http://ha.ckers.org/xss.html), we whitelist known safe url formats. Most urls contain a network location; however, some are known not to (e.g. mailto links). Script urls do not contain a location. Additionally, for javascript:..., the scheme would be "javascript" but some aliases will appear to urlparse() to have no scheme. On top of that, relative links (e.g. "foo/bar.html") have no scheme. Therefore we must check "path", "parameters", "query" and "fragment" for any literal colons. We don't check "scheme" for colons because it should never have any, and "netloc" must allow the form: username:password@host:port.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
ImageReferencePattern(pattern, markdown_instance=None)
Bases: ReferencePattern
Match to a stored reference and return img element.
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
NEWLINE_CLEANUP_RE = re.compile('[ ]?\\n', re.MULTILINE)
class-attribute
instance-attribute
Functions
getCompiledRegExp()
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
sanitize_url(url)
Sanitize a url against xss attacks in "safe_mode".
Rather than specifically blacklisting javascript:alert("XSS") and all its aliases (see http://ha.ckers.org/xss.html), we whitelist known safe url formats. Most urls contain a network location; however, some are known not to (e.g. mailto links). Script urls do not contain a location. Additionally, for javascript:..., the scheme would be "javascript" but some aliases will appear to urlparse() to have no scheme. On top of that, relative links (e.g. "foo/bar.html") have no scheme. Therefore we must check "path", "parameters", "query" and "fragment" for any literal colons. We don't check "scheme" for colons because it should never have any, and "netloc" must allow the form: username:password@host:port.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
makeTag(href, title, text)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
AutolinkPattern(pattern, markdown_instance=None)
Bases: Pattern
Return a link Element given an autolink (<http://example/com>).
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
AutomailPattern(pattern, markdown_instance=None)
Bases: Pattern
Return a mailto link Element given an automail link (<foo@example.com>).
Create an instance of an inline pattern.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pattern | str | A regular expression that matches a pattern | required |
markdown_instance | Markdown | Instance of Markdown | None |
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Attributes
pattern = pattern
instance-attribute
compiled_re = re.compile('^(.*?)%s(.*)$' % pattern, re.DOTALL | re.UNICODE)
instance-attribute
safe_mode = False
instance-attribute
markdown = markdown_instance
instance-attribute
Functions
getCompiledRegExp()
type()
unescape(text)
Return unescaped text given text with an inline placeholder.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
handleMatch(m)
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
Functions
build_inlinepatterns(md_instance, **kwargs)
Build the default set of inline patterns for Markdown.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
dequote(string)
Remove quotes from around a string.
Source code in pyrevitlib/pyrevit/coreutils/markdown/inlinepatterns.py
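A minimal sketch of what removing quotes from around a string could look like. The function name dequote_sketch is hypothetical, and the assumption that only a matching pair of single or double quotes is stripped is mine, not confirmed by the description above.

```python
def dequote_sketch(string):
    """Strip one pair of matching single or double quotes, if present."""
    if len(string) >= 2 and string[0] == string[-1] and string[0] in ('"', "'"):
        return string[1:-1]
    # Unquoted or mismatched input is returned unchanged.
    return string


print(dequote_sketch('"A title"'))
print(dequote_sketch('plain'))
```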
handleAttributes(text, parent)
Set values of an element based on attribute definitions ({@id=123}).
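The attribute-definition syntax can be illustrated with ATTR_RE from the attribute list. The sketch below is an approximation under assumptions: the function name is hypothetical, xml.etree.ElementTree stands in for the module's element type, and removing the definition from the returned text is my reading of the intended behaviour.

```python
import re
import xml.etree.ElementTree as etree

# ATTR_RE as listed under Attributes.
ATTR_RE = re.compile(r'\{@([^\}]*)=([^\}]*)}')


def handle_attributes_sketch(text, parent):
    """Move {@name=value} definitions from text onto parent; return the rest."""
    def set_attribute(match):
        # Newlines inside a value are flattened to spaces.
        parent.set(match.group(1), match.group(2).replace('\n', ' '))
        return ''
    return ATTR_RE.sub(set_attribute, text)


el = etree.Element('p')
remaining = handle_attributes_sketch('Some text{@id=123}', el)
print(remaining, el.attrib)
```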