Browser

Data Retrieval

GetGrid retrieves the matched selector data into an IGPALGrid<string> for further processing. GetPageSource extracts the full HTML of the current page. GetSiteMap retrieves the sitemap XML and GetSiteMapUrls returns the individual URLs as a list. GetHydratedData extracts Next.js hydration data embedded in the page. GetLLMDigest produces a cleaned, structured summary of the page content optimized for use as LLM input. The Save variants write these outputs directly to files.
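To give a sense of what "hydration data" means here, the following is a minimal sketch in plain C#, independent of GPAL. It assumes the page is a Next.js site that embeds its data as JSON inside a `<script id="__NEXT_DATA__">` tag. How GetHydratedData works internally is not documented in this section, so the helper name `TryExtractNextData`, the regular expression, and the sample HTML are all illustrative assumptions.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class HydrationSketch
{
    // Hypothetical helper (not a GPAL API): pulls the JSON payload out of the
    // <script id="__NEXT_DATA__"> tag, if the page contains one.
    public static bool TryExtractNextData(string html, out string json)
    {
        Match match = Regex.Match(
            html,
            @"<script[^>]*\bid=""__NEXT_DATA__""[^>]*>(.*?)</script>",
            RegexOptions.Singleline | RegexOptions.IgnoreCase);

        json = match.Success ? match.Groups[1].Value.Trim() : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        // Made-up sample page; in practice the HTML could come from GetPageSource.
        const string html =
            "<html><body><script id=\"__NEXT_DATA__\" type=\"application/json\">" +
            "{\"props\":{\"pageProps\":{\"title\":\"Hello\"}},\"page\":\"/\"}" +
            "</script></body></html>";

        if (TryExtractNextData(html, out string json))
        {
            // Parse the payload and read one nested value from it.
            using JsonDocument doc = JsonDocument.Parse(json);
            var title = doc.RootElement
                .GetProperty("props")
                .GetProperty("pageProps")
                .GetProperty("title")
                .GetString();
            Console.WriteLine(title); // prints "Hello"
        }
    }
}
```

A regular expression is enough for a sketch like this; a real HTML parser would likely be more robust against unusual markup.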

Examples

GPAL Fluent: High-level fluent C# API

// GetGrid populates the out parameter with a grid where each row corresponds to one matched element.
// Columns in the grid correspond to the element attributes GPAL extracts based on the selector configuration.

// Extract a table to a grid
var grid = GPAL.Grid.ToGPALObject();
GPAL.Browser
    .GoTo("https://example.com/data")
    .WithSelector(".data-row")
    .WithAllThatMatch(1000)
    .GetGrid(out grid);

// Get page HTML source
string html;
GPAL.Browser
    .GoTo("https://example.com")
    .GetPageSource(out html);

// Get all URLs from a sitemap
List<string> urls;
GPAL.Browser
    .GoTo("https://example.com")
    .GetSiteMapUrls(out urls);
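
For readers who want to see what a list of sitemap URLs is derived from, here is a small sketch in plain C#, independent of GPAL. It assumes a standard sitemap document in which each URL sits in a `<loc>` element. The helper name `ExtractUrls` and the sample XML are hypothetical, and GPAL's own implementation of GetSiteMapUrls may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SitemapSketch
{
    // Hypothetical helper (not a GPAL API): returns the text of every <loc>
    // element in the sitemap XML, regardless of its XML namespace.
    public static List<string> ExtractUrls(string sitemapXml)
    {
        XDocument doc = XDocument.Parse(sitemapXml);
        return doc.Descendants()
                  .Where(e => e.Name.LocalName == "loc")
                  .Select(e => e.Value.Trim())
                  .ToList();
    }

    public static void Main()
    {
        // Made-up sample sitemap; in practice this could be the XML from GetSiteMap.
        const string xml =
            "<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">" +
            "<url><loc>https://example.com/</loc></url>" +
            "<url><loc>https://example.com/data</loc></url>" +
            "</urlset>";

        foreach (string url in ExtractUrls(xml))
        {
            Console.WriteLine(url);
        }
    }
}
```

Matching on the local element name keeps the sketch working whether or not the sitemap declares the usual namespace. It also means `<loc>` entries in a sitemap index file would be returned, which may or may not be what a caller wants.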

Coding Blocks

GPAL
    .CallIfNotFound(GenericCallIfNotFound)
    .WithPublishToConsole();

//System.Drawing.Rectangle windowSize = new System.Drawing.Rectangle(10, 10, 1500, 1024);

// NOTE: assign browser before executing any steps.
// GenericCallIfNotFound might throw an exception, and browser is not set until the complete
// fluent chain has executed (every step in the chain must succeed first). If a step in the same
// chain threw, BankScraper would not have the browser set when it calls scraper.Close().
browser = GPAL.Browser
    .WithBrowserType(Enums.BrowserType.Chrome)
    .WithProfileDataDirectory(ChromeProfileLocation)
    .WithUseAutomationEngine(AutomationEngine.Selenium)
    .WithWindowSize(new System.Drawing.Rectangle(0, 0, 1920, 1080))
    .ToGPALObject();
