| Commit message (Collapse) | Author | Age | Lines |
|
This commit enables ruff's flake8-builtins linter, which emits warnings
when builtin functions are shadowed. This is useful for builtins like
"dict", "list", or "str", which we use often.
Given the nature of this program, we have historically relied heavily on
"id", "hash", and "filter" as variable names, which also shadow Python
builtins. For now let's ignore those: we do not call any of these
builtins in our code, and renaming the variables would have a
considerable impact on the codebase. This might be revisited in the
future.
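To illustrate the kind of bug this linter guards against, here is a
minimal sketch. The function names are hypothetical and not taken from
this codebase. Ruff's flake8-builtins settings include a
`builtins-ignorelist` option, which is one way to exempt names such as
"id", "hash", and "filter"; whether this commit uses that option is an
assumption.

```python
def unique_shadowed(list):
    # The parameter "list" hides the builtin inside this function, so the
    # call below tries to call the argument itself and raises TypeError.
    return list(set(list))


def unique_clean(items):
    # A non-shadowing name keeps the builtin "list" reachable.
    return list(set(items))
```

Calling `unique_shadowed([1, 2])` fails with a TypeError because a list
object is not callable, while `unique_clean([1, 2])` works as intended.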
|
We might get a valid (possibly empty) response from the API that lacks
the 'gmetadata' field. In that case we do not want to simply crash
because we expected the field to be present. Instead, raise a proper
ScrapeError for it.
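The check described above could look roughly like the following sketch.
The `ScrapeError` class here is a stand-in defined only for this example,
and `extract_gmetadata` is a hypothetical helper name; the real code may
be structured differently.

```python
class ScrapeError(Exception):
    """Stand-in for the project's ScrapeError (assumption for this sketch)."""


def extract_gmetadata(response):
    # A syntactically valid but empty response has no 'gmetadata' key.
    # Raise a descriptive error instead of letting a KeyError escape.
    if not isinstance(response, dict) or "gmetadata" not in response:
        raise ScrapeError("API response is missing the 'gmetadata' field")
    return response["gmetadata"]
```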
|
Non-H usually has nothing to censor, so this should be a safe default.
We have not come across anything where this would have been a false
positive.
|
If a parser function returns None, we yield it regardless, even though
it has no effect further down the line. Instead, filter such values out
of the collect() stream as early as possible.
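As a rough sketch of the idea, a generator can skip None results before
yielding them. The signature of `collect` below (a list of parser
callables plus the data to parse) is an assumption made for this
example and may not match the real function.

```python
def collect(parsers, data):
    # Run every parser and drop None results right away, so downstream
    # consumers never have to handle them.
    for parser in parsers:
        result = parser(data)
        if result is not None:
            yield result
```

With parsers `[lambda d: None, str.upper]` and data `"a"`, the stream
contains only the single value produced by the second parser.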
|
We can expect a number of scraper sources to either give languages as
ISO 639-3 or as their English name, so it makes sense to implement a
simple parser method on our side.
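A minimal sketch of such a parser might look like this. The tiny lookup
table and the name `parse_language` are illustrative assumptions; a real
implementation would need a complete ISO 639-3 table.

```python
# Small illustrative subset of ISO 639-3 codes (assumption: the real
# table is far larger and may live elsewhere).
ISO_639_3 = {
    "eng": "English",
    "jpn": "Japanese",
    "deu": "German",
    "fra": "French",
}
_BY_NAME = {name.lower(): name for name in ISO_639_3.values()}


def parse_language(value):
    # Accept either an ISO 639-3 code or an English language name,
    # ignoring case and surrounding whitespace.
    key = value.strip().lower()
    if key in ISO_639_3:
        return ISO_639_3[key]
    # Returns None when the input matches neither a code nor a name.
    return _BY_NAME.get(key)
```

Returning None for unknown input leaves it to the caller to decide
whether an unrecognized language is an error.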
|