summaryrefslogtreecommitdiffstatshomepage
path: root/parsing.c (follow)
Commit message (Collapse)AuthorAgeLines
* Replace most uses of strncmp() with prefixcmp()Lukas Fleischer2014-01-10-6/+6
| | | | | | | This is a preparation for replacing all prefix checks with either strip_prefix() or starts_with() when Git 1.8.6 is released. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
* Update copyright informationLukas Fleischer2014-01-08-1/+1
| | | | | | | | | * Name "cgit Development Team" as copyright holder to avoid listing every single developer. * Update copyright ranges. Signed-off-by: Lukas Fleischer <cgit@crytocrack.de>
* Mark several functions/variables staticLukas Fleischer2013-03-04-3/+3
| | | | | | Spotted by parsing the output of `gcc -Wmissing-prototypes [...]`. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
* White space around control verbs.Jason A. Donenfeld2013-03-04-2/+2
| | | | Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
* do not write outside heap bufferJim Meyering2012-10-02-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * parsing.c (substr): Handle tail < head. This started when I noticed some cgit segfaults on savannah.gnu.org. Finding the offending URL/commit and then constructing a stand-alone reproducer were far more time-consuming than writing the actual patch. The problem arises with a commit like this, in which the user name part of the "Author" field is empty: $ git log -1 commit 6f3f41d73393278f3ede68a2cb1e7a2a23fa3421 Author: <T at h.or> Date: Mon Apr 23 22:29:16 2012 +0200 Here's what happens: (this is due to buf=malloc(0); strncpy (buf, head, -1); where "head" may point to plenty of attacker-specified non-NUL bytes, so we can overwrite a zero-length heap buffer with arbitrary data) Invalid write of size 1 at 0x4A09361: strncpy (mc_replace_strmem.c:463) by 0x408977: substr (parsing.c:61) by 0x4089EF: parse_user (parsing.c:73) by 0x408D10: cgit_parse_commit (parsing.c:153) by 0x40A540: cgit_mk_refinfo (shared.c:171) by 0x40A581: cgit_refs_cb (shared.c:181) by 0x43DEB3: do_for_each_ref (refs.c:690) by 0x41075E: cgit_print_branches (ui-refs.c:191) by 0x416EF2: cgit_print_summary (ui-summary.c:56) by 0x40780A: summary_fn (cmd.c:120) by 0x40667A: process_request (cgit.c:544) by 0x404078: cache_process (cache.c:322) Address 0x4c718d0 is 0 bytes after a block of size 0 alloc'd at 0x4A0884D: malloc (vg_replace_malloc.c:263) by 0x455C85: xmalloc (wrapper.c:35) by 0x40894C: substr (parsing.c:60) by 0x4089EF: parse_user (parsing.c:73) by 0x408D10: cgit_parse_commit (parsing.c:153) by 0x40A540: cgit_mk_refinfo (shared.c:171) by 0x40A581: cgit_refs_cb (shared.c:181) by 0x43DEB3: do_for_each_ref (refs.c:690) by 0x41075E: cgit_print_branches (ui-refs.c:191) by 0x416EF2: cgit_print_summary (ui-summary.c:56) by 0x40780A: summary_fn (cmd.c:120) by 0x40667A: process_request (cgit.c:544) Invalid write of size 1 at 0x4A09400: strncpy (mc_replace_strmem.c:463) by 0x408977: substr (parsing.c:61) by 0x4089EF: parse_user (parsing.c:73) by 0x408D10: cgit_parse_commit (parsing.c:153) by 0x40A540: cgit_mk_refinfo (shared.c:171) by 0x40A581: cgit_refs_cb (shared.c:181) by 0x43DEB3: do_for_each_ref (refs.c:690) by 0x41075E: cgit_print_branches (ui-refs.c:191) by 0x416EF2: cgit_print_summary (ui-summary.c:56) by 0x40780A: summary_fn (cmd.c:120) by 0x40667A: process_request (cgit.c:544) by 0x404078: cache_process (cache.c:322) Address 0x4c7192b is not stack'd, malloc'd or (recently) free'd Invalid write of size 1 at 0x4A0940E: strncpy (mc_replace_strmem.c:463) by 0x408977: substr (parsing.c:61) by 0x4089EF: parse_user (parsing.c:73) by 0x408D10: cgit_parse_commit (parsing.c:153) by 0x40A540: cgit_mk_refinfo (shared.c:171) by 0x40A581: cgit_refs_cb (shared.c:181) by 0x43DEB3: do_for_each_ref (refs.c:690) by 0x41075E: cgit_print_branches (ui-refs.c:191) by 0x416EF2: cgit_print_summary (ui-summary.c:56) by 0x40780A: summary_fn (cmd.c:120) by 0x40667A: process_request (cgit.c:544) by 0x404078: cache_process (cache.c:322) Address 0x4c7192d is not stack'd, malloc'd or (recently) free'd Process terminating with default action of signal 11 (SIGSEGV) Access not within mapped region at address 0x502F000 at 0x4A09400: strncpy (mc_replace_strmem.c:463) by 0x408977: substr (parsing.c:61) by 0x4089EF: parse_user (parsing.c:73) by 0x408D10: cgit_parse_commit (parsing.c:153) by 0x40A540: cgit_mk_refinfo (shared.c:171) by 0x40A581: cgit_refs_cb (shared.c:181) by 0x43DEB3: do_for_each_ref (refs.c:690) by 0x41075E: cgit_print_branches (ui-refs.c:191) by 0x416EF2: cgit_print_summary (ui-summary.c:56) by 0x40780A: summary_fn (cmd.c:120) by 0x40667A: process_request (cgit.c:544) by 0x404078: cache_process (cache.c:322) This happens when tail - head == -1 here: (parsing.c) char *substr(const char *head, const char *tail) { char *buf; buf = xmalloc(tail - head + 1); strncpy(buf, head, tail - head); buf[tail - head] = '\0'; return buf; } char *parse_user(char *t, char **name, char **email, unsigned long *date) { char *p = t; int mode = 1; while (p && *p) { if (mode == 1 && *p == '<') { *name = substr(t, p - 1); t = p; mode++; } else if (mode == 1 && *p == '\n') { The fix is to handle the case of (tail < head) before calling xmalloc, thus avoiding passing an invalid value to xmalloc. And here's the reproducer: It was tricky to reproduce, because git prohibits use of an empty "name" in a commit ID. To construct the offending commit, I had to resort to using "git hash-object". git init -q foo && ( cd foo && echo a > j && git add . && git ci -q --author='au <T at h.or>' -m. . && h=$(git cat-file commit HEAD|sed 's/au //' \ |git hash-object -t commit -w --stdin) && git co -q -b test $h && git br -q -D master && git br -q -m test master) git clone -q --bare foo foo.git cat <<EOF > in repo.url=foo.git repo.path=foo.git EOF CGIT_CONFIG=in QUERY_STRING=url=foo.git valgrind ./cgit The valgrind output is what you see above. AFAICS, this is not exploitable thanks (ironically) to the use of strncpy. Since that -1 translates to SIZE_MAX and this is strncpy, not only does it copy whatever is in "head" (up to first NUL), but it also writes SIZE_MAX - strlen(head) NUL bytes into the destination buffer, and that latter is guaranteed to evoke a segfault. Since cgit is single-threaded, AFAICS, there is no way that the buffer clobbering can be turned into an exploit.
* Remove dead initialization in cgit_parse_commit()Lukas Fleischer2011-07-22-1/+1
| | | | | | | | The value stored to "t" during its initialization gets overwritten in any case, so just leave it uninitialized. Spotted by clang-analyzer. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Avoid null pointer dereference in reencode().Lukas Fleischer2011-05-23-1/+4
| | | | | | | | Returning "*txt" if "txt" is a null pointer is a bad thing. Spotted with clang-analyzer. Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* fix two encoding bugsJulius Plenz2011-03-26-9/+15
| | | | | | | | | | | | | | reencode() takes three arguments in the order (txt, from, to), opposed to reencode_string, which will, like iconv, handle the arguments with from and to swapped. Fix that (this makes reencode more intuitive). If src and dst encoding are equivalent, don't do any encoding. If no special encoding parameter is found within the commit, assume UTF-8 and explicitly convert to PAGE_ENCODING. The change to reencode() mentioned above avoids re-encoding a UTF-8 string to UTF-8, for example. Signed-off-by: Julius Plenz <plenz@cis.fu-berlin.de> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Reencode author and committerRémi Lagacé2010-07-13-0/+4
| | | | | | | When a commit has a specific encoding, this encoding also applies to the author and committer name and email. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* parsing.c: enable builds with NO_ICONV definedLars Hjemli2008-12-05-0/+4
| | | | Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* parsing.c: be prepared for unexpected content in commit/tag objectsLars Hjemli2008-09-15-63/+96
| | | | | | | | When parsing commits and tags cgit made too many assumptions about the formatting of said objects. This patch tries to make the code be more prepared to handle 'malformed' objects. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Move cgit_parse_query() from parsing.c to html.c as http_parse_querystring()Lars Hjemli2008-04-08-49/+0
| | | | | | This is a generic http-function. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Move function for configfile parsing into configfile.[ch]Lars Hjemli2008-03-28-75/+0
| | | | | | This is a generic function which wanted its own little object file. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add command dispatcherLars Hjemli2008-03-24-2/+2
| | | | | | | This simplifies the code in cgit.c and makes it easier to extend cgit with new pages/commands. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Move cgit_repo into cgit_contextLars Hjemli2008-02-16-8/+8
| | | | | | | | This removes the global variable which is used to keep track of the currently selected repository, and adds a new variable in the cgit_context structure. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Introduce struct cgit_contextLars Hjemli2008-02-16-4/+4
| | | | | | | | | This struct will hold all the cgit runtime information currently found in a multitude of global variables. The first cleanup removes all querystring-related variables. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Merge branch 'stable'Lars Hjemli2007-12-02-3/+3
|\ | | | | | | | | | | * stable: Handle missing timestamp in commit/tag objects Set commit date on snapshot contents
| * Handle missing timestamp in commit/tag objectsLars Hjemli2007-12-02-3/+3
| | | | | | | | | | | | | | | | | | | | When a commit or tag lacks author/committer/tagger timestamp, do not skip the next line in the commit/tag object. Also, do not bother to print timestamps with value 0 as it is close to certain to be bogus. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* | Use utf8::reencode_string from gitLars Hjemli2007-11-05-60/+4
| | | | | | | | | | | | | | This replaces the iconv-support in cgit with similar functions already existing in git. Signed-off-by: Lars Hjemli <hjemli@gmai.com>
* | Convert subject and message with iconv_msg.Jonathan Bastien-Filiatrault2007-11-05-0/+14
| |
* | Add iconv_msg function.Jonathan Bastien-Filiatrault2007-11-05-0/+58
| |
* | Set msg_encoding according to the header.Jonathan Bastien-Filiatrault2007-11-05-0/+8
| |
* | Add commit->msg_encoding, allocate msg dynamicly.Jonathan Bastien-Filiatrault2007-11-05-0/+1
|/
* cgit_parse_commit(): Add missing call to xstrdup()Lars Hjemli2007-10-27-2/+2
| | | | | | | It's rather silly to point into random memory-locations. Also, remove a call to strdup() used on a literal char *. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Skip unknown header fields when parsing tags and commitsLars Hjemli2007-10-27-0/+6
| | | | | | | | Both the commit- and tagparser failed to handle unexpected header fields. This adds futureproofing by simply skipping any header we don't know/care about. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add trim_end() and use it to remove trailing slashes from repo pathsLars Hjemli2007-06-26-1/+1
| | | | | | | | The new function removes all trailing instances of an arbitrary character from a copy of the supplied char array. This is then used to remove any trailing slashes from cgit_query_path. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Check for NULL commit buffer in cgit_parse_commit()Ondrej Jirman2007-05-31-0/+3
| | | | | | This can be NULL, so try not to segfault. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Handle single-line and empty commit subjectsOndrej Jirman2007-05-31-3/+8
| | | | | | | | | | If commit object ends with \0 after subject line, then info->subject was not set. This commit fixes this and also sets subject to ** empty ** if it would otherwise be empty, so that there is something to click on. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Don't be fooled by trailing '/' in url-parameterLars Hjemli2007-05-18-1/+2
| | | | | | | | cgit_parse_url() didn't check if the path-part of urls contained a real path or just a trailing slash. This made the log-page die since the path filtering supplied an invalid path argument. This fixes it. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Enable url=value querystring parameterLars Hjemli2007-05-18-0/+43
| | | | | | | This makes is possible to use repo-urls like '/pub/scm/git/git.git' and even add path specifications, like '/pub/scm/git/git.git/log/documentation'. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Restrict deep nesting of configfilesLars Hjemli2007-05-15-4/+6
| | | | | | | | | | | | There is no point in restricting the number of included config- files, but there is a point in restricting the nestinglevel of configfiles: to avoid recursive inclusions. This is easily achieved by decrementing the static nesting-variable upon exit from cgit_read_config(). Also fix some whitespace breakage. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add include-parameter to config filesLars Hjemli2007-05-14-6/+8
| | | | | | | | | This parameter can be used to include another config-file, like a standalone repository listing. Suggested in a patch by Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Update to libgit 1.5.2-rc2Lars Hjemli2007-05-08-3/+3
| | | | Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Do not die if tag has no messageLars Hjemli2007-02-04-2/+2
| | | | Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add function cgit_parse_tag()Lars Hjemli2007-01-17-0/+47
| | | | | | Teach cgit how to extract author info from a tag. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Handle empty/malformed commit messagesLars Hjemli2007-01-16-6/+13
| | | | | | | An empty commit message would trigger a segfault in the current cgit_parse_commit(). Also, make sure that all char-pointers are properly initialized.
* Handle %xx encoding in querystringLars Hjemli2007-01-04-0/+21
| | | | | | | Convert valid %xx expressions in querystring to ascii, ignore invalid expressions (i.e. eat the three characters %xx). Signed-off-by: Lars Hjemli <larsh@hal-2004.(none)>
* Handle '+' in querystringLars Hjemli2006-12-28-0/+2
| | | | | | Translate '+' to ' ' in querystring parser (still doesn't handle %xx) Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Teach commit parser about author/committer email + timestampLars Hjemli2006-12-16-2/+10
| | | | | | | | | | | We want all four of these when showing a commit, so save them in the commitinfo struct. Btw: There's probably no good reason to save committer timestamp since it's already available in commit->date. But it doesn't hurt us either, and it makes the parser look more complete, so we just do it. Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add ui-commit.c + misc ui cleanupsLars Hjemli2006-12-16-0/+1
| | | | Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Add a common commit parserLars Hjemli2006-12-15-0/+53
| | | | | | Make a better commit parser, replacing the ugly one in ui-log.c Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* Rename config.c to parsing.c + move cgit_parse_query from cgit.c to parsing.cLars Hjemli2006-12-11-0/+106
Signed-off-by: Lars Hjemli <hjemli@gmail.com>