async_classes

IAsyncScraper

IAsyncScraper(max_workers: int = 4)

Bases: IScraper

Base class for asynchronous scrapers. Implements the IScraper interface. This class provides a synchronous wrapper around the asynchronous scraping method. Subclasses must implement the async_scrape() method.

Parameters:

Name Type Description Default
max_workers int

The maximum number of concurrent workers.

4

async_scrape abstractmethod async

async_scrape(link: str) -> ScrapeResult

Asynchronously scrape the given URL.

Parameters:

Name Type Description Default
link str

The URL to scrape.

required

Returns:

Name Type Description
ScrapeResult ScrapeResult

The result of the scrape.

scrape

scrape(link: str) -> ScrapeResult

Synchronously scrape the given URL. Wraps async_scrape().

Parameters:

Name Type Description Default
link str

The URL to scrape.

required

Returns:

Name Type Description
ScrapeResult ScrapeResult

The result of the scrape.
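
The relationship between scrape() and async_scrape() can be pictured with a minimal sketch. The reference above does not show how the wrapper is implemented, so this version assumes it simply drives the coroutine with asyncio.run(). SketchAsyncScraper, EchoScraper, and the ScrapeResult fields are hypothetical stand-ins, not the library's own classes.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ScrapeResult:
    """Hypothetical stand-in; the real ScrapeResult fields are not documented here."""
    link: str
    content: str


class SketchAsyncScraper(ABC):
    """Stand-in for IAsyncScraper: one abstract coroutine plus a blocking wrapper."""

    @abstractmethod
    async def async_scrape(self, link: str) -> ScrapeResult:
        ...

    def scrape(self, link: str) -> ScrapeResult:
        # Assumption: the wrapper runs the coroutine to completion on a fresh event loop.
        return asyncio.run(self.async_scrape(link))


class EchoScraper(SketchAsyncScraper):
    """Hypothetical subclass that only has to supply the async method."""

    async def async_scrape(self, link: str) -> ScrapeResult:
        await asyncio.sleep(0)  # placeholder for real network I/O
        return ScrapeResult(link=link, content=f"fetched {link}")


result = EchoScraper().scrape("https://example.com")
print(result.content)
```

A wrapper built on asyncio.run() cannot be called from code that is already running inside an event loop; such code would await async_scrape() directly instead.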

scrape_multiple

scrape_multiple(links) -> Generator[Tuple[str, ScrapeResult], None, None]

Asynchronously scrape multiple URLs and yield the results in a synchronous context. Blocks while waiting for results.

Parameters:

Name Type Description Default
links Iterable

A collection of URLs to scrape.

required

Returns:

Type Description
Generator[Tuple[str, ScrapeResult], None, None]

A generator yielding (URL, ScrapeResult) tuples.
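
One way a synchronous generator could drive concurrent scrapes while honoring a max_workers limit is sketched below. It is an illustration under stated assumptions, not the library's implementation: scrape_multiple_sketch, _scrape_bounded, and fake_fetch are hypothetical names, a semaphore stands in for whatever limiting mechanism the real class uses, and the worker returns a plain string rather than a ScrapeResult.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Generator, Iterable, Tuple


async def _scrape_bounded(
    links: Iterable[str],
    worker: Callable[[str], Awaitable[str]],
    max_workers: int,
) -> AsyncIterator[Tuple[str, str]]:
    # Create the semaphore inside the running loop so it is bound to that loop.
    semaphore = asyncio.Semaphore(max_workers)

    async def scrape_one(link: str) -> Tuple[str, str]:
        async with semaphore:  # at most max_workers workers run at once
            return link, await worker(link)

    tasks = [asyncio.ensure_future(scrape_one(link)) for link in links]
    for finished in asyncio.as_completed(tasks):
        yield await finished  # pairs arrive in completion order


def scrape_multiple_sketch(
    links: Iterable[str],
    worker: Callable[[str], Awaitable[str]],
    max_workers: int = 4,
) -> Generator[Tuple[str, str], None, None]:
    loop = asyncio.new_event_loop()
    pairs = _scrape_bounded(links, worker, max_workers)
    try:
        while True:
            try:
                # Block the caller until the next (link, result) pair is ready.
                yield loop.run_until_complete(pairs.__anext__())
            except StopAsyncIteration:
                break
    finally:
        loop.run_until_complete(pairs.aclose())
        loop.close()


async def fake_fetch(link: str) -> str:
    """Hypothetical worker standing in for async_scrape()."""
    await asyncio.sleep(0)
    return link.upper()


scraped = dict(scrape_multiple_sketch(["a", "b", "c"], fake_fetch, max_workers=2))
print(scraped)
```

Because pairs are yielded as they complete, each result is paired with its link rather than relying on input order. The sketch assumes the caller consumes the generator to the end; abandoning it early would leave unfinished tasks on the closed loop.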

IAsyncAnalyzer

IAsyncAnalyzer(max_workers: int = 2)

Bases: IAnalyzer

Base class for asynchronous analyzers. Implements the IAnalyzer interface. This class provides a synchronous wrapper around the asynchronous analysis method. Subclasses must implement the async_analyze() method.

Parameters:

Name Type Description Default
max_workers int

The maximum number of concurrent workers.

2

async_analyze abstractmethod async

async_analyze(content: str) -> AnalysisResult

Asynchronously analyze the given content.

Parameters:

Name Type Description Default
content str

The content to analyze.

required

Returns:

Name Type Description
AnalysisResult AnalysisResult

The result of the analysis.

analyze

analyze(content: str) -> AnalysisResult

Synchronously analyze the given content. Wraps async_analyze().

Parameters:

Name Type Description Default
content str

The content to analyze.

required

Returns:

Name Type Description
AnalysisResult AnalysisResult

The result of the analysis.

analyze_multiple

analyze_multiple(contents: dict) -> Generator[Tuple[str, AnalysisResult], None, None]

Asynchronously analyze multiple contents and yield the results in a synchronous context. Blocks while waiting for results.

Parameters:

Name Type Description Default
contents dict

A dictionary of contents to analyze, with keys as identifiers and values as content.

required

Returns:

Type Description
Generator[Tuple[str, AnalysisResult], None, None]

A generator yielding (identifier, AnalysisResult) tuples.
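
The identifier-to-result pairing can be illustrated with a short sketch that takes a dictionary of contents and yields (identifier, result) tuples. It is a simplification under assumptions: analyze_multiple_sketch and count_words are hypothetical names, an int stands in for AnalysisResult, the max_workers limit is ignored, and results are yielded in input order, which the real method may not guarantee.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Generator, Tuple


def analyze_multiple_sketch(
    contents: Dict[str, str],
    analyze_one: Callable[[str], Awaitable[int]],
) -> Generator[Tuple[str, int], None, None]:
    identifiers = list(contents)

    async def run_all():
        # gather() returns results in the same order as the awaitables passed in,
        # so each result can be matched back to its identifier by position.
        return await asyncio.gather(*(analyze_one(contents[key]) for key in identifiers))

    results = asyncio.run(run_all())  # blocks until every analysis has finished
    yield from zip(identifiers, results)


async def count_words(content: str) -> int:
    """Hypothetical analyzer standing in for async_analyze()."""
    await asyncio.sleep(0)
    return len(content.split())


pages = {"home": "welcome to the site", "about": "who we are"}
for identifier, analysis in analyze_multiple_sketch(pages, count_words):
    print(identifier, analysis)
```

Keying the input by identifier means the caller never has to track which result belongs to which piece of content, even if a different implementation yields results out of order.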

IAsyncLinkCollector

Bases: ILinkCollector

Base class for asynchronous link collectors. Implements the ILinkCollector interface. This class provides a synchronous wrapper around the asynchronous link retrieval method. Subclasses must implement the async_collect_links() method.

async_collect_links

async_collect_links() -> AsyncIterable[str]

Asynchronously retrieve links to scrape.

collect_links

collect_links() -> Iterable[str]

Synchronously retrieve links to scrape. Wraps async_collect_links().

Returns:

Type Description
Iterable[str]

An iterable of URLs to scrape.
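
Turning an AsyncIterable[str] into a plain Iterable[str] can be done by draining the asynchronous source inside an event loop, as in the sketch below. The reference does not say how collect_links() does this; collect_links_sketch and fake_async_links are hypothetical names, and the example URLs are placeholders.

```python
import asyncio
from typing import AsyncIterable, AsyncIterator, List


async def fake_async_links() -> AsyncIterator[str]:
    """Hypothetical link source standing in for async_collect_links()."""
    for page in range(1, 4):
        await asyncio.sleep(0)  # placeholder for real I/O, e.g. fetching an index page
        yield f"https://example.com/page/{page}"


def collect_links_sketch(source: AsyncIterable[str]) -> List[str]:
    async def drain() -> List[str]:
        # An async comprehension consumes the source until it is exhausted.
        return [link async for link in source]

    # Assumption: the wrapper blocks until every link has been collected.
    return asyncio.run(drain())


links = collect_links_sketch(fake_async_links())
print(links)
```

This version buffers every link before returning, so it only suits finite sources; a wrapper that yields links lazily would have to step the async iterator one item at a time.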