From d1d654ebac2d51e3841675faeb56480e440f622f Mon Sep 17 00:00:00 2001
From: Wolfgang Müller
Date: Tue, 5 Mar 2024 18:08:09 +0100
Subject: Initial commit

---
 docs/plugins/writing/scrapers.rst | 48 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 docs/plugins/writing/scrapers.rst

(limited to 'docs/plugins/writing/scrapers.rst')

diff --git a/docs/plugins/writing/scrapers.rst b/docs/plugins/writing/scrapers.rst
new file mode 100644
index 0000000..258d3a8
--- /dev/null
+++ b/docs/plugins/writing/scrapers.rst
@@ -0,0 +1,48 @@
+Scrapers
+========
+
+A scraper extends the abstract :class:`~hircine.scraper.Scraper` class and
+implements its :meth:`~hircine.scraper.Scraper.scrape` method. The latter is a
+generator function yielding :ref:`scraped-data`.
+
+.. autoclass:: hircine.scraper.Scraper
+   :members:
+   :special-members: __init__
+
+Exceptions
+----------
+
+A scraper may raise two kinds of exceptions:
+
+.. autoexception:: hircine.scraper.ScrapeWarning
+
+.. autoexception:: hircine.scraper.ScrapeError
+
+Utility functions
+-----------------
+
+.. automodule:: hircine.scraper.utils
+   :members:
+
+Registering a scraper
+---------------------
+
+To register your class as a scraper, place it into the ``hircine.scraper``
+:ref:`entry point group `. For example, put the
+following in a ``pyproject.toml`` file:
+
+.. code-block:: toml
+
+   [project.entry-points.'hircine.scraper']
+   my_scraper = 'myscraper.MyScraper'
+
+Example
+-------
+
+.. literalinclude:: /_examples/example_scraper.py
+   :language: python
+
+The scraper above will scrape a JSON file with the following structure:
+
+.. literalinclude:: /_examples/example_scraper.json
+   :language: json
-- 
cgit v1.2.3-2-gb3c3