author | Juhani Krekelä | 2021-07-08 09:14:41 +0300 |
---|---|---|
committer | Wolfgang Müller | 2021-07-09 11:53:32 +0200 |
commit | 59767124ed7768c82621c414549d24d33096b16e (patch) | |
tree | f51b17df5e25d0656eb898305f4b1726ea27e980 | |
parent | 8ad81eb268df5984fcffd0a18b9c68e254094956 (diff) | |
download | weltschmerz-59767124ed7768c82621c414549d24d33096b16e.tar.gz |
Improve URL matching

Currently the weltschmerz URL regex does not match URLs with quotes or
parentheses, considering the URL to end when one is encountered. On the
other hand, if a URL is surrounded by angle brackets, it includes the
closing '>' in the URL match. Additionally, the regex allows a URL to
contain a space if and only if it is the second character of the host
component of the URL.

Most of this appears to be down to bugs in the regex as it is currently
written. This rewrites the regex to be cleaner and easier to read, while
maintaining the intended logic of the original.
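
The behaviours described above can be reproduced outside the terminal with a rough Python translation of the two patterns. This is only a sketch and not part of the commit: Python's `re` module has no POSIX classes, so `[:space:]` is approximated with `\s` and `[:punct:]` with `string.punctuation`, and the atomic groups `(?>...)` are written as plain non-capturing groups, which should give the same matches here because nothing follows them in either pattern. The names `OLD_URL_REGEX`, `NEW_URL_REGEX` and `first_url` are made up for the illustration.

```python
import re
import string

# Old pattern from the diff. Assumption: (?>...) is replaced by (?:...),
# which is equivalent here because the group ends the pattern.
OLD_URL_REGEX = re.compile(
    r'(?:https?|ftp):\/\/[^\s\$.?#].(?:[^\s()"]*|\([^\s]*\)|"[^\s"]*")'
)

# New pattern from the diff, translated by hand. Assumptions:
# [:space:] is approximated by \s and [:punct:] by ASCII punctuation.
_PUNCT = re.escape(string.punctuation)
NEW_URL_REGEX = re.compile(
    r'(?:https?|ftp):\/\/[^\s' + _PUNCT + r']'
    r'(?:[^\]\[)(><"“”\s]+|\([^)(\s]*\)|"[^"\s]*")+'
)


def first_url(regex, text):
    """Return the first match of regex in text, or None if there is none."""
    match = regex.search(text)
    return match.group(0) if match else None


if __name__ == "__main__":
    samples = [
        "see <http://example.com> now",   # URL wrapped in angle brackets
        "http://example.com/a_(b)",       # URL containing parentheses
        "http://a b",                     # space as second host character
    ]
    for sample in samples:
        print(repr(sample))
        print("  old:", repr(first_url(OLD_URL_REGEX, sample)))
        print("  new:", repr(first_url(NEW_URL_REGEX, sample)))
```

In the old pattern the first alternative of the final group uses `*` and can match the empty string, so the parenthesis and quote alternatives are never reached. That would explain why the old match stops at the first `(` or `"`.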
-rw-r--r-- | terminal.vala | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/terminal.vala b/terminal.vala
index 86f086f..b2a610f 100644
--- a/terminal.vala
+++ b/terminal.vala
@@ -1,6 +1,6 @@
 [GtkTemplate (ui = "/weltschmerz/ui/terminal.ui")]
 class Terminal : Gtk.Overlay {
-	const string URL_REGEX = """(?>https?|ftp):\/\/[^\s\$.?#].(?>[^\s()"]*|\([^\s]*\)|"[^\s"]*")""";
+	const string URL_REGEX = """(?>https?|ftp):\/\/[^[:punct:][:space:]](?>[^][)(><"“”[:space:]]+|\([^)([:space:]]*\)|"[^"[:space:]]*")+""";
 	const uint PCRE2_CASELESS = 0x00000008u;
 	const uint PCRE2_MULTILINE = 0x00000400u;
 	const uint PCRE2_NO_UTF_CHECK = 0x00080000u;